1
Pinto S, Cardoso R, Atkinson-Clement C, Guimarães I, Sadat J, Santos H, Mercier C, Carvalho J, Cuartero MC, Oliveira P, Welby P, Frota S, Cavazzini E, Vigário M, Letanneux A, Cruz M, Brulefert C, Desmoulins M, Martins IP, Rothe-Neves R, Viallet F, Ferreira JJ. Do Acoustic Characteristics of Dysarthria in People With Parkinson's Disease Differ Across Languages? Journal of Speech, Language, and Hearing Research (JSLHR) 2024:1-20. [PMID: 38754039 DOI: 10.1044/2024_jslhr-23-00525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2024]
Abstract
PURPOSE Cross-language studies suggest more similarities than differences in how dysarthria affects the speech of people with Parkinson's disease (PwPD) who speak different languages. In this study, we aimed to identify the relative contribution of acoustic variables to distinguish PwPD from controls who spoke varieties of two Romance languages, French and Portuguese. METHOD This bi-national, cross-sectional, and case-controlled study included 129 PwPD and 124 healthy controls who spoke French or Portuguese. All participants underwent the same clinical examinations, voice/speech recordings, and self-assessment questionnaires. PwPD were evaluated off and on optimal medication. Inferential analyses included Disease (controls vs. PwPD) and Language (French vs. Portuguese) as factors, and random decision forest algorithms identified relevant acoustic variables able to distinguish participants: (a) by language (French vs. Portuguese) and (b) by clinical status (PwPD on and off medication vs. controls). RESULTS French-speaking and Portuguese-speaking individuals were distinguished from each other with over 90% accuracy by five acoustic variables (the mean fundamental frequency and the shimmer of the sustained vowel /a/ production, the oral diadochokinesis performance index, and the relative sound pressure level and its standard deviation in the text reading). A distinct set of parameters discriminated between controls and PwPD: for men, maximum phonation time and the oral diadochokinesis speech proportion were the most significant variables; for women, variables calculated from the oral diadochokinesis were the most discriminative. CONCLUSIONS Acoustic variables related to phonation and voice quality distinguished between speakers of the two languages. Variables related to pneumophonic coordination and articulation rate were the most effective in distinguishing PwPD from controls.
Thus, our findings suggest that respiration and diadochokinesis tasks are the most appropriate to pinpoint signs of dysarthria, which are largely homogeneous and language-universal. In contrast, identifying language-specific variables with the speech tasks and acoustic variables studied was less conclusive.
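The abstract names shimmer of the sustained vowel as one of the discriminating variables but does not define it. As a rough illustration only, the sketch below uses one common definition ("local shimmer": the mean absolute difference between consecutive cycle amplitudes, divided by the mean amplitude). The function name `local_shimmer` and the amplitude values are hypothetical and are not taken from the study, which may have used a different shimmer variant.

```python
def local_shimmer(amplitudes):
    """Mean absolute difference between consecutive cycle amplitudes,
    divided by the mean amplitude (one common 'local shimmer' definition)."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two cycle amplitudes")
    # Cycle-to-cycle amplitude changes
    diffs = [abs(b - a) for a, b in zip(amplitudes, amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(amplitudes) / len(amplitudes)
    return mean_diff / mean_amp


# Hypothetical per-cycle peak amplitudes (arbitrary units), not study data.
example_amplitudes = [1.0, 1.1, 0.9, 1.0]
shimmer = local_shimmer(example_amplitudes)
print(f"local shimmer: {shimmer:.4f}")
```

A perfectly steady voice (identical amplitudes in every cycle) gives a shimmer of zero under this definition; larger values indicate more cycle-to-cycle amplitude instability.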
Affiliation(s)
- Serge Pinto
  - Aix-Marseille Univ, CNRS, LPL, Aix-en-Provence, France
- Rita Cardoso
  - CNS - Campus Neurológico Sénior, Torres Vedras, Portugal
  - Instituto de Medicina Molecular, Faculdade de Medicina, University of Lisbon, Portugal
- Cyril Atkinson-Clement
  - Aix-Marseille Univ, CNRS, LPL, Aix-en-Provence, France
  - Precision Imaging Beacon, School of Medicine, University of Nottingham, United Kingdom
- Isabel Guimarães
  - Instituto de Medicina Molecular, Faculdade de Medicina, University of Lisbon, Portugal
  - Speech Therapy Department, Alcoitão Health School of Sciences, Alcabideche, Portugal
- Jasmin Sadat
  - Aix-Marseille Univ, CNRS, LPL, Aix-en-Provence, France
- Helena Santos
  - CNS - Campus Neurológico Sénior, Torres Vedras, Portugal
- Céline Mercier
  - Aix-Marseille Univ, CNRS, LPL, Aix-en-Provence, France
  - Neurology Department, Centre Hospitalier Intercommunal du Pays d'Aix, Aix-en-Provence, France
- Joana Carvalho
  - CNS - Campus Neurológico Sénior, Torres Vedras, Portugal
- Pauline Welby
  - Aix-Marseille Univ, CNRS, LPL, Aix-en-Provence, France
- Sónia Frota
  - Center of Linguistics, School of Arts and Humanities, University of Lisbon, Portugal
- Marina Vigário
  - Center of Linguistics, School of Arts and Humanities, University of Lisbon, Portugal
- Alban Letanneux
  - ESPE Université Paris-Est Créteil, Laboratoire CHArt-UPEC (EA 4004), Bonneuil-sur-Marne, France
- Marisa Cruz
  - Center of Linguistics, School of Arts and Humanities, University of Lisbon, Portugal
- Isabel Pavão Martins
  - Language Research Laboratory, Department of Neurology, University of Lisbon, Portugal
- Rui Rothe-Neves
  - Laboratório de Fonética, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- François Viallet
  - Aix-Marseille Univ, CNRS, LPL, Aix-en-Provence, France
  - Neurology Department, Centre Hospitalier Intercommunal du Pays d'Aix, Aix-en-Provence, France
- Joaquim J Ferreira
  - CNS - Campus Neurológico Sénior, Torres Vedras, Portugal
  - Instituto de Medicina Molecular, Faculdade de Medicina, University of Lisbon, Portugal
2
Wang M, Zhao X, Li F, Wu L, Li Y, Tang R, Yao J, Lin S, Zheng Y, Ling Y, Ren K, Chen Z, Yin X, Wang Z, Gao Z, Zhang X. Using sustained vowels to identify patients with mild Parkinson's disease in a Chinese dataset. Front Aging Neurosci 2024; 16:1377442. [PMID: 38765774 PMCID: PMC11102047 DOI: 10.3389/fnagi.2024.1377442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 04/15/2024] [Indexed: 05/22/2024]
Abstract
Introduction Parkinson's disease (PD) is the second most common neurodegenerative disease and affects millions of people. Accurate diagnosis and subsequent treatment in the early stages can slow down disease progression. However, making an accurate diagnosis of PD at an early stage is challenging. Previous studies have revealed that even movement disorder specialists found it difficult to differentiate patients with PD from healthy individuals until the average modified Hoehn-Yahr staging (mH&Y) reached 1.8. Recent research has shown that dysarthria provides good indicators for computer-assisted diagnosis of patients with PD. However, few studies have focused on diagnosing patients with PD in the early stages, specifically those with mH&Y ≤ 1.5. Method We used a machine learning algorithm to analyze voice features and developed diagnostic models for differentiating between healthy controls (HCs) and patients with PD, and for differentiating between HCs and patients with mild PD (mH&Y ≤ 1.5). The models were independently validated using separate datasets. Results The model showed strong diagnostic performance in distinguishing patients with mild PD (mH&Y ≤ 1.5) from HCs, with an area under the ROC curve of 0.93 (95% CI: 0.85-1.00), accuracy of 0.85, sensitivity of 0.95, and specificity of 0.75. Conclusion The results of our study are helpful for screening PD in the early stages in the community and in primary medical institutions that lack movement disorder specialists and special equipment.
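To make the reported accuracy, sensitivity, and specificity easier to relate to one another, the following sketch computes all three from a confusion matrix. The counts are hypothetical (20 patients and 20 controls, chosen only because they reproduce the reported figures under an assumed balanced validation set); the abstract does not state the actual validation sample sizes, and `diagnostic_metrics` is an illustrative name.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true-positive rate among patients
    specificity = tn / (tn + fp)            # true-negative rate among controls
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, NOT from the paper: 20 patients with mild PD, 20 HCs.
sens, spec, acc = diagnostic_metrics(tp=19, fn=1, tn=15, fp=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

With equal group sizes, accuracy is simply the average of sensitivity and specificity, which is consistent with the reported 0.95, 0.75, and 0.85; with unequal groups the three numbers would not line up this way.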
Affiliation(s)
- Miao Wang
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Xingli Zhao
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Fengzhu Li
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Lingyu Wu
  - Gyenno Science Co., Ltd., Shenzhen, China
  - HUST-GYENNO CNS Intelligent Digital Medicine Technology Center, Wuhan, China
- Yifan Li
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Ruonan Tang
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Jiarui Yao
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Shinuan Lin
  - Gyenno Science Co., Ltd., Shenzhen, China
  - HUST-GYENNO CNS Intelligent Digital Medicine Technology Center, Wuhan, China
- Yuan Zheng
  - Gyenno Science Co., Ltd., Shenzhen, China
  - HUST-GYENNO CNS Intelligent Digital Medicine Technology Center, Wuhan, China
- Yun Ling
  - Gyenno Science Co., Ltd., Shenzhen, China
  - HUST-GYENNO CNS Intelligent Digital Medicine Technology Center, Wuhan, China
- Kang Ren
  - Gyenno Science Co., Ltd., Shenzhen, China
  - HUST-GYENNO CNS Intelligent Digital Medicine Technology Center, Wuhan, China
- Zhonglue Chen
  - Gyenno Science Co., Ltd., Shenzhen, China
  - HUST-GYENNO CNS Intelligent Digital Medicine Technology Center, Wuhan, China
- Xi Yin
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Zhenfu Wang
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Zhongbao Gao
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Xi Zhang
  - Department of Geriatric Neurology, The Second Medical Center and National Clinical Research Center for Geriatric Disease, Chinese PLA General Hospital, Beijing, China
3
Fahed VS, Doheny EP, Collazo C, Krzysztofik J, Mann E, Morgan-Jones P, Mills L, Drew C, Rosser AE, Cousins R, Witkowski G, Cubo E, Busse M, Lowery MM. Language-Independent Acoustic Biomarkers for Quantifying Speech Impairment in Huntington's Disease. American Journal of Speech-Language Pathology 2024; 33:1390-1405. [PMID: 38530396 DOI: 10.1044/2024_ajslp-23-00175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2024]
Abstract
PURPOSE Changes in voice and speech are characteristic symptoms of Huntington's disease (HD). Objective methods for quantifying speech impairment that can be used across languages could facilitate assessment of disease progression and intervention strategies. The aim of this study was to analyze acoustic features to identify language-independent features that could be used to quantify speech dysfunction in English-, Spanish-, and Polish-speaking participants with HD. METHOD Ninety participants with HD and 83 control participants performed sustained vowel, syllable repetition, and reading passage tasks recorded with previously validated methods using mobile devices. Language-independent features that differed between HD and controls were identified. Principal component analysis (PCA) and unsupervised clustering were applied to the language-independent features of the HD data set to identify subgroups within the HD data. RESULTS Forty-six language-independent acoustic features that were significantly different between control participants and participants with HD were identified. Following dimensionality reduction using PCA, four speech clusters were identified in the HD data set. Unified Huntington's Disease Rating Scale (UHDRS) total motor score, total functional capacity, and composite UHDRS were significantly different for pairwise comparisons of subgroups. The percentage of HD participants with higher dysarthria score and disease stage also increased across clusters. CONCLUSION The results support the application of acoustic features to objectively quantify speech impairment and disease severity in HD in multilanguage studies. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.25447171.
Affiliation(s)
- Vitória S Fahed
  - School of Electrical and Electronic Engineering, University College Dublin, Ireland
  - Insight Centre for Data Analytics, University College Dublin, Ireland
- Emer P Doheny
  - School of Electrical and Electronic Engineering, University College Dublin, Ireland
  - Insight Centre for Data Analytics, University College Dublin, Ireland
- Elliot Mann
  - Centre for Trials Research, Cardiff University, United Kingdom
- Philippa Morgan-Jones
  - Centre for Trials Research, Cardiff University, United Kingdom
  - School of Engineering, Cardiff University, United Kingdom
- Laura Mills
  - Centre for Trials Research, Cardiff University, United Kingdom
- Cheney Drew
  - Centre for Trials Research, Cardiff University, United Kingdom
- Anne E Rosser
  - Brain Repair Centre and BRAIN Unit, Schools of Medicine and Biosciences, Cardiff University, United Kingdom
- Monica Busse
  - Centre for Trials Research, Cardiff University, United Kingdom
- Madeleine M Lowery
  - School of Electrical and Electronic Engineering, University College Dublin, Ireland
  - Insight Centre for Data Analytics, University College Dublin, Ireland
4
Xue Z, Lu H, Zhang T, Little MA. Patient-specific game-based transfer method for Parkinson's disease severity prediction. Artif Intell Med 2024; 150:102810. [PMID: 38553149 DOI: 10.1016/j.artmed.2024.102810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 11/02/2023] [Accepted: 02/11/2024] [Indexed: 04/02/2024]
Abstract
Dysphonia is one of the early symptoms of Parkinson's disease (PD). Most existing methods use feature selection to find the optimal subset of voice features for all PD patients. Few have considered the heterogeneity between patients, which implies the need to provide specific prediction models for different patients. However, building a patient-specific model faces the challenge of small sample size, which limits its generalization ability. Instance transfer is an effective way to solve this problem. Therefore, this paper proposes a patient-specific game-based transfer (PSGT) method for PD severity prediction. First, a selection mechanism is used to select PD patients with similar disease trends to the target patient from the source domain, which reduces the risk of negative transfer. Then, the contribution of the transferred subjects and their instances to the disease estimation of the target subject is fairly evaluated by the Shapley value, which improves the interpretability of the method. Next, the proportion of valid instances in the transferred subjects is determined, and the instances with higher contribution are transferred to further reduce the difference between the transferred instance subset and the target subject. Finally, the selected subset of instances is added to the training set of the target subject, and the extended data is fed into a random forest to improve the performance of the method. The Parkinson's telemonitoring dataset is used to evaluate feasibility and effectiveness. The mean values of mean absolute error, root mean square error, and volatility obtained by predicting motor-UPDRS and total-UPDRS for target patients are 1.59, 1.95, 1.56 and 1.98, 2.54, 1.94, respectively. Experimental results show that PSGT achieves lower prediction error and better stability than the compared methods.
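The abstract relies on the Shapley value to score how much each transferred subject contributes. As a generic illustration of that concept (not the PSGT procedure itself), the sketch below computes exact Shapley values for a toy cooperative game by averaging each player's marginal contribution over all orderings. The players "A" and "B", the `GAINS` table, and the function names are hypothetical stand-ins for source subjects and the improvement they bring to a target model.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal gain from adding player p to the current coalition
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical coalition values (e.g., error reduction), not from the paper.
GAINS = {
    frozenset(): 0.0,
    frozenset({"A"}): 1.0,
    frozenset({"B"}): 0.5,
    frozenset({"A", "B"}): 2.0,
}


def toy_value(coalition):
    return GAINS[frozenset(coalition)]


phi = shapley_values(["A", "B"], toy_value)
print(phi)
```

The values always sum to the value of the full coalition, which is what makes the attribution "fair" in the game-theoretic sense. Exact enumeration grows factorially with the number of players, so practical methods typically approximate it by sampling orderings.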
Affiliation(s)
- Zaifa Xue
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
  - Hebei Key Laboratory of Information Transmission and Signal Processing, Qinhuangdao, China
- Huibin Lu
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
  - Hebei Key Laboratory of Information Transmission and Signal Processing, Qinhuangdao, China
- Tao Zhang
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
  - Hebei Key Laboratory of Information Transmission and Signal Processing, Qinhuangdao, China
- Max A Little
  - School of Computer Science, University of Birmingham, Birmingham, United Kingdom
  - Media Lab, Massachusetts Institute of Technology, Cambridge, USA
5
Bernard M, Poli M, Karadayi J, Dupoux E. Shennong: A Python toolbox for audio speech features extraction. Behav Res Methods 2023; 55:4489-4501. [PMID: 36750521 DOI: 10.3758/s13428-022-02029-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/17/2022] [Indexed: 02/09/2023]
Abstract
We introduce Shennong, a Python toolbox and command-line utility for audio speech features extraction. It implements a wide range of well-established state-of-the-art algorithms: spectro-temporal filters such as Mel-Frequency Cepstral Filterbank or Predictive Linear Filters, pre-trained neural networks, pitch estimators, speaker normalization methods, and post-processing algorithms. Shennong is an open source, reliable and extensible framework built on top of the popular Kaldi speech processing library. The Python implementation makes it easy to use by non-technical users and integrates with third-party speech modeling and machine learning tools from the Python ecosystem. This paper describes the Shennong software architecture, its core components, and implemented algorithms. Then, three applications illustrate its use. We first present a benchmark of speech features extraction algorithms available in Shennong on a phone discrimination task. We then analyze the performances of a speaker normalization model as a function of the speech duration used for training. We finally compare pitch estimation algorithms on speech under various noise conditions.
Affiliation(s)
- Mathieu Bernard
  - Cognitive Machine Learning, PSL Research University, CNRS, EHESS, ENS, Inria, Paris, France
  - EconomiX (UMR 7235), Université Paris Nanterre, CNRS, Nanterre, France
- Maxime Poli
  - Cognitive Machine Learning, PSL Research University, CNRS, EHESS, ENS, Inria, Paris, France
- Julien Karadayi
  - Cognitive Machine Learning, PSL Research University, CNRS, EHESS, ENS, Inria, Paris, France
- Emmanuel Dupoux
  - Cognitive Machine Learning, PSL Research University, CNRS, EHESS, ENS, Inria, Paris, France
  - Meta AI Research, Paris, France
6
Iyer A, Kemp A, Rahmatallah Y, Pillai L, Glover A, Prior F, Larson-Prior L, Virmani T. A machine learning method to process voice samples for identification of Parkinson's disease. Sci Rep 2023; 13:20615. [PMID: 37996478 PMCID: PMC10667335 DOI: 10.1038/s41598-023-47568-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 11/15/2023] [Indexed: 11/25/2023]
Abstract
Machine learning approaches have been used for the automatic detection of Parkinson's disease with voice recordings being the most used data type due to the simple and non-invasive nature of acquiring such data. Although voice recordings captured via telephone or mobile devices allow much easier and wider access for data collection, current conflicting performance results limit their clinical applicability. This study has two novel contributions. First, we show the reliability of personal telephone-collected voice recordings of the sustained vowel /a/ in natural settings by collecting samples from 50 people with specialist-diagnosed Parkinson's disease and 50 healthy controls and applying machine learning classification with voice features related to phonation. Second, we utilize a novel application of a pre-trained convolutional neural network (Inception V3) with transfer learning to analyze the spectrograms of the sustained vowel from these samples. This approach considers speech intensity estimates across time and frequency scales rather than collapsing measurements across time. We show the superiority of our deep learning model for the task of classifying people with Parkinson's disease as distinct from healthy controls.
Affiliation(s)
- Anu Iyer
  - Georgia Institute of Technology, Atlanta, 30332, USA
- Aaron Kemp
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Yasir Rahmatallah
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Lakshmi Pillai
  - Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Aliyah Glover
  - Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Fred Prior
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Linda Larson-Prior
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
  - Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
  - Neurobiology and Developmental Sciences, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Tuhin Virmani
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
  - Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
7
Rowe HP, Shellikeri S, Yunusova Y, Chenausky KV, Green JR. Quantifying articulatory impairments in neurodegenerative motor diseases: A scoping review and meta-analysis of interpretable acoustic features. International Journal of Speech-Language Pathology 2023; 25:486-499. [PMID: 36001500 PMCID: PMC9950294 DOI: 10.1080/17549507.2022.2089234] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/29/2023]
Abstract
PURPOSE Neurodegenerative motor diseases (NMDs) have devastating effects on the lives of patients and their loved ones, in part due to the impact of neurologic abnormalities on speech, which significantly limits functional communication. Clinical speech researchers have thus spent decades investigating speech features in populations suffering from NMDs. Features of impaired articulatory function are of particular interest given their detrimental impact on intelligibility, their ability to encode a variety of distinct movement disorders, and their potential as diagnostic indicators of neurodegenerative diseases. The objectives of this scoping review were to identify (1) which components of articulation (i.e. coordination, consistency, speed, precision, and repetition rate) are the most represented in the acoustic literature on NMDs; (2) which acoustic articulatory features demonstrate the most potential for detecting speech motor dysfunction in NMDs; and (3) which articulatory components are the most impaired within each NMD. METHOD This review examined literature published between 1976 and 2020. Studies were identified from six electronic databases using predefined key search terms. The first research objective was addressed using a frequency count of studies investigating each articulatory component, while the second and third objectives were addressed using meta-analyses. RESULT Findings from 126 studies revealed a considerable emphasis on articulatory precision. Of the 24 features included in the meta-analyses, vowel dispersion/distance and stop gap duration exhibited the largest effects when comparing the NMD population to controls. The meta-analyses also revealed divergent patterns of articulatory performance across disease types, providing evidence of unique profiles of articulatory impairment. CONCLUSION This review illustrates the current state of the literature on acoustic articulatory features in NMDs. 
By highlighting the areas of need within each articulatory component and disease group, this work provides a foundation on which clinical researchers, speech scientists, neurologists, and computer science engineers can develop research questions that will both broaden and deepen the understanding of articulatory impairments in NMDs.
Affiliation(s)
- Hannah P Rowe
  - MGH Institute of Health Professions, Boston, MA, USA
- Sanjana Shellikeri
  - Department of Speech-Language Pathology & Rehabilitation Sciences Institute, University of Toronto, Toronto, ON, Canada
  - Sunnybrook Research Institute, Toronto, ON, Canada
  - Penn Frontotemporal Degeneration Center, University of Pennsylvania, Philadelphia, PA, USA
- Yana Yunusova
  - Department of Speech-Language Pathology & Rehabilitation Sciences Institute, University of Toronto, Toronto, ON, Canada
  - Sunnybrook Research Institute, Toronto, ON, Canada
- Karen V Chenausky
  - MGH Institute of Health Professions, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Jordan R Green
  - MGH Institute of Health Professions, Boston, MA, USA
  - Speech and Hearing Biosciences and Technology Program, Harvard University, Cambridge, MA, USA
8
Hemmerling D, Wodzinski M, Orozco-Arroyave JR, Sztaho D, Daniol M, Jemiolo P, Wojcik-Pedziwiatr M. Vision Transformer for Parkinson's Disease Classification using Multilingual Sustained Vowel Recordings. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2023; 2023:1-4. [PMID: 38083719 DOI: 10.1109/embc40787.2023.10340478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
Parkinson's disease (PD) is the second most prevalent neurodegenerative disease in the world. Thus, the early detection of PD has recently been the subject of several scientific and commercial studies. In this paper, we propose a pipeline using a Vision Transformer applied to mel-spectrograms for PD classification using multilingual sustained vowel recordings. Furthermore, our proposed transformer-based model shows great potential for using voice as a single-modality biomarker for automatic PD detection without language restrictions and across a wide range of vowels, with an F1-score equal to 0.78. The results of our study fall within the range of the estimated prevalence of voice and speech disorders in Parkinson's disease, which ranges from 70% to 90%. Our study demonstrates a high potential for adaptation in clinical decision-making, allowing for increasingly systematic and fast diagnosis of PD with the potential for use in telemedicine. Clinical relevance: There is an urgent need to develop a noninvasive biomarker of Parkinson's disease that is effective enough to detect the onset of the disease, so that neuroprotective treatment can be introduced at the earliest possible stage and the results of that intervention can be followed. Voice disorders in PD are very frequent and are expected to be utilized as an early diagnostic biomarker. Voice analysis using deep neural networks opens new opportunities to assess the symptoms of neurodegenerative diseases, to support fast diagnosis, to guide treatment initiation, and to predict risk. The detection accuracy for voice biomarkers according to our method reached close to the maximum achievable value.
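The abstract feeds mel-spectrograms to the model without saying how the mel axis is built. As background only, the sketch below shows one widely used Hz-to-mel mapping, m = 2595 · log10(1 + f/700), and how band centers evenly spaced on that scale translate back to Hz. Other mel variants exist, and the band count and frequency range here are arbitrary illustrative choices, not the paper's settings.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel using the common 2595*log10(1 + f/700) formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of n_bands bands evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]


# Illustrative settings only: 5 bands between 0 Hz and 8 kHz.
centers = mel_band_centers(n_bands=5, f_min_hz=0.0, f_max_hz=8000.0)
print([round(c, 1) for c in centers])
```

Because the mapping is roughly logarithmic above about 1 kHz, the centers crowd together at low frequencies and spread out at high ones, which is why mel-spectrograms devote more resolution to the range where voice pitch and the lower formants sit.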
9
Ishikawa K, Pietrowicz M, Charney S, Orbelo D. Landmark-based analysis of speech differentiates conversational from clear speech in speakers with muscle tension dysphonia. JASA Express Letters 2023; 3:2888596. [PMID: 37140265 DOI: 10.1121/10.0019354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 04/18/2023] [Indexed: 05/05/2023]
Abstract
This study evaluated the feasibility of differentiating conversational and clear speech produced by individuals with muscle tension dysphonia (MTD) using landmark-based analysis of speech (LMBAS). Thirty-four adult speakers with MTD recorded conversational and clear speech, with 27 of them able to produce clear speech. The recordings of these individuals were analyzed with the open-source LMBAS program, SpeechMark® MATLAB Toolbox version 1.1.2. The results indicated that glottal landmarks, burst onset landmarks, and the duration between glottal landmarks differentiated conversational speech from clear speech. LMBAS shows potential as an approach for detecting the difference between conversational and clear speech in dysphonic individuals.
Affiliation(s)
- Keiko Ishikawa
  - Department of Communication Sciences and Disorders, University of Kentucky, 900 South Limestone, Lexington, Kentucky 40536-0200, USA
- Mary Pietrowicz
  - Applied Research Institute, University of Illinois at Urbana-Champaign, 2100 South Oak Street, Suite 206, Champaign, Illinois 61820, USA
- Sara Charney
  - Department of Otolaryngology-Head and Neck Surgery, Mayo Clinic Arizona, 5777 East Mayo Boulevard, Phoenix, Arizona 85054, USA
- Diana Orbelo
  - Department of Otolaryngology-Head and Neck Surgery, Mayo Medical School, 200 1st Street Southwest, Rochester, Minnesota 55905, USA
10
Faragó P, Ștefănigă SA, Cordoș CG, Mihăilă LI, Hintea S, Peștean AS, Beyer M, Perju-Dumbravă L, Ileșan RR. CNN-Based Identification of Parkinson's Disease from Continuous Speech in Noisy Environments. Bioengineering (Basel) 2023; 10:531. [PMID: 37237601 DOI: 10.3390/bioengineering10050531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 04/21/2023] [Accepted: 04/24/2023] [Indexed: 05/28/2023]
Abstract
Parkinson's disease is a progressive neurodegenerative disorder caused by dopaminergic neuron degeneration. Parkinsonian speech impairment is one of the earliest presentations of the disease and, along with tremor, is suitable for pre-diagnosis. It is defined by hypokinetic dysarthria and accounts for respiratory, phonatory, articulatory, and prosodic manifestations. This article addresses artificial-intelligence-based identification of Parkinson's disease from continuous speech recorded in a noisy environment. The novelty of this work is twofold. First, the proposed assessment workflow performed speech analysis on samples of continuous speech. Second, we analyzed and quantified Wiener filter applicability for speech denoising in the context of Parkinsonian speech identification. We argue that the Parkinsonian features of loudness, intonation, phonation, prosody, and articulation are contained in the speech, speech energy, and Mel spectrograms. Thus, the proposed workflow follows a feature-based speech assessment to determine the feature variation ranges, followed by speech classification using convolutional neural networks. We report best classification accuracies of 96% on speech energy, 93% on speech, and 92% on Mel spectrograms. We conclude that the Wiener filter improves both feature-based analysis and convolutional-neural-network-based classification performance.
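The abstract credits a Wiener filter with improving results but gives no formula. As a textbook-style illustration (an assumption about the general technique, not the authors' implementation), the sketch below computes a per-frequency-bin Wiener gain from a noisy power spectrum and a noise power estimate: G = max(P_noisy - P_noise, 0) / P_noisy. The function name and the spectral values are hypothetical.

```python
def wiener_gain(noisy_power, noise_power):
    """Per-bin Wiener gain, estimating clean-speech power as
    max(noisy - noise, 0). Returns gains in the range [0, 1]."""
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        if p_noisy <= 0.0:
            gains.append(0.0)  # nothing to keep in an empty bin
            continue
        clean_estimate = max(p_noisy - p_noise, 0.0)
        gains.append(clean_estimate / p_noisy)
    return gains


# Hypothetical power spectra for three frequency bins (arbitrary units).
noisy = [4.0, 1.0, 2.0]
noise = [1.0, 1.0, 3.0]
gains = wiener_gain(noisy, noise)
print(gains)
```

Bins dominated by speech keep most of their energy (gain near 1), while bins at or below the estimated noise floor are suppressed (gain 0). In a full denoiser these gains would multiply the short-time spectrum frame by frame before resynthesis.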
Affiliation(s)
- Paul Faragó
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Sebastian-Aurelian Ștefănigă
  - Department of Computer Science, Faculty of Mathematics and Computer Science, West University of Timisoara, 300223 Timisoara, Romania
- Claudia-Georgiana Cordoș
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Laura-Ioana Mihăilă
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Sorin Hintea
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Ana-Sorina Peștean
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
- Michel Beyer
  - Clinic of Oral and Cranio-Maxillofacial Surgery, University Hospital Basel, CH-4031 Basel, Switzerland
  - Medical Additive Manufacturing Research Group (Swiss MAM), Department of Biomedical Engineering, University of Basel, CH-4123 Allschwil, Switzerland
- Lăcrămioara Perju-Dumbravă
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
- Robert Radu Ileșan
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
  - Clinic of Oral and Cranio-Maxillofacial Surgery, University Hospital Basel, CH-4031 Basel, Switzerland
| |
Collapse
|
11
|
Wang R, Kuang C, Guo C, Chen Y, Li C, Matsumura Y, Ishimaru M, Van Pelt AJ, Chen F. Automatic Detection of Putative Mild Cognitive Impairment from Speech Acoustic Features in Mandarin-Speaking Elders. J Alzheimers Dis 2023; 95:901-914. [PMID: 37638439 DOI: 10.3233/jad-230373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/29/2023]
Abstract
BACKGROUND To date, the reliable detection of mild cognitive impairment (MCI) remains a significant challenge for clinicians. Very few studies investigated the sensitivity of acoustic features in detecting Mandarin-speaking elders at risk for MCI, defined as "putative MCI" (pMCI). OBJECTIVE This study sought to investigate the possibility of using automatically extracted speech acoustic features to detect elderly people with pMCI and reveal the potential acoustic markers of cognitive decline at an early stage. METHODS Forty-one older adults with pMCI and 41 healthy elderly controls completed four reading tasks (syllable utterance, tongue twister, diadochokinesis, and short sentence reading), from which acoustic features were extracted automatically to train machine learning classifiers. Correlation analysis was employed to evaluate the relationship between classifier predictions and participants' cognitive ability measured by Mini-Mental State Examination 2. RESULTS Classification results revealed that some temporal features (e.g., speech rate, utterance duration, and the number of silent pauses), spectral features (e.g., variability of F1 and F2), and energy features (e.g., SD of peak intensity and SD of intensity range) were effective predictors of pMCI. The best classification result was achieved in the Random Forest classifier (accuracy = 0.81, AUC = 0.81). Correlation analysis uncovered a strong negative correlation between participants' cognitive test scores and the probability estimates of pMCI in the Random Forest classifier, and a modest negative correlation in the Support Vector Machine classifier. CONCLUSIONS The automatic acoustic analysis of speech could provide a promising non-invasive way to assess and monitor the early cognitive decline in Mandarin-speaking elders.
Affiliation(s)
- Rumi Wang
  - Rehabilitation Medicine Department, Speech and Language Pathology Therapy Section, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China
- Chen Kuang
  - School of Foreign Languages, Hunan University, Hunan, China
- Chengyu Guo
  - School of Foreign Languages, Hunan University, Hunan, China
- Yong Chen
  - Laboratory of Food Oral Processing, School of Food Science & Biotechnology, Zhejiang Gongshang University, Hangzhou, Zhejiang, China
- Canyang Li
  - Rehabilitation Medicine Department, Speech and Language Pathology Therapy Section, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China
- Alice J Van Pelt
  - Section of Gastroenterology, Edward Hines, Jr. VA Hospital, Hines, IL, USA
  - Division of Gastroenterology and Nutrition, Loyola University Stritch School of Medicine, Maywood, IL, USA
- Fei Chen
  - School of Foreign Languages, Hunan University, Hunan, China

12
Rowe HP, Gochyyev P, Lammert AC, Lowit A, Spencer KA, Dickerson BC, Berry JD, Green JR. The efficacy of acoustic-based articulatory phenotyping for characterizing and classifying four divergent neurodegenerative diseases using sequential motion rates. J Neural Transm (Vienna) 2022; 129:1487-1511. [PMID: 36305960 PMCID: PMC9859630 DOI: 10.1007/s00702-022-02550-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/26/2022] [Accepted: 10/13/2022] [Indexed: 01/25/2023]
Abstract
Despite the impacts of neurodegeneration on speech function, little is known about how to comprehensively characterize the resulting speech abnormalities using a set of objective measures. Quantitative phenotyping of speech motor impairments may have important implications for identifying clinical syndromes and their underlying etiologies, monitoring disease progression over time, and improving treatment efficacy. The goal of this research was to investigate the validity and classification accuracy of comprehensive acoustic-based articulatory phenotypes in speakers with distinct neurodegenerative diseases. Articulatory phenotypes were characterized based on acoustic features that were selected to represent five components of motor performance: Coordination, Consistency, Speed, Precision, and Rate. The phenotypes were first used to characterize the articulatory abnormalities across four progressive neurologic diseases known to have divergent speech motor deficits: amyotrophic lateral sclerosis (ALS), progressive ataxia (PA), Parkinson's disease (PD), and the nonfluent variant of primary progressive aphasia and progressive apraxia of speech (nfPPA + PAOS). We then examined the efficacy of articulatory phenotyping for disease classification. Acoustic analyses were conducted on audio recordings of 217 participants (i.e., 46 ALS, 52 PA, 60 PD, 20 nfPPA + PAOS, and 39 controls) during a sequential speech task. Results revealed evidence of distinct articulatory phenotypes for the four clinical groups and that the phenotypes demonstrated strong classification accuracy for all groups except ALS. Our results highlight the phenotypic variability present across neurodegenerative diseases, which, in turn, may inform (1) the differential diagnosis of neurological diseases and (2) the development of sensitive outcome measures for monitoring disease progression or assessing treatment efficacy.
Affiliation(s)
- Hannah P Rowe
  - Department of Rehabilitation Sciences, MGH Institute of Health Professions, Charlestown, Boston, MA, USA
- Perman Gochyyev
  - School of Healthcare Leadership, MGH Institute of Health Professions, Boston, MA, USA
  - Berkeley Evaluation and Assessment Research Center, University of California at Berkeley, Berkeley, CA, USA
- Adam C Lammert
  - Department of Biomedical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA
- Anja Lowit
  - Department of Speech and Language Therapy, University of Strathclyde, Glasgow, Scotland, UK
- Kristie A Spencer
  - Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
- Bradford C Dickerson
  - Department of Neurology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
- James D Berry
  - Department of Neurology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
- Jordan R Green
  - Department of Rehabilitation Sciences, MGH Institute of Health Professions, Charlestown, Boston, MA, USA

13
Wang Q, Fu Y, Shao B, Chang L, Ren K, Chen Z, Ling Y. Early detection of Parkinson’s disease from multiple signal speech: Based on Mandarin language dataset. Front Aging Neurosci 2022; 14:1036588. [DOI: 10.3389/fnagi.2022.1036588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2022] [Accepted: 10/20/2022] [Indexed: 11/11/2022]
Abstract
Parkinson’s disease (PD) is a neurodegenerative disorder that negatively affects millions of people. Early detection is of vital importance. Recent research has shown that dysarthria level provides good indicators for the computer-assisted diagnosis and remote monitoring of patients at the early stages. The goal of this study is to develop an automatic detection method based on a newly collected Chinese dataset. Unlike for English, no agreement has been reached on the main features indicating language disorders due to vocal organ dysfunction. Thus, one of our approaches is to classify the speech phonation and articulation with a machine learning-based feature selection model. Based on a relatively large sample, three feature selection algorithms (LASSO, mRMR, Relief-F) were tested to select the vocal features extracted from speech signals collected in a controlled setting, followed by four classifiers (Naïve Bayes, K-Nearest Neighbor, Logistic Regression and Stochastic Gradient Descent) to detect the disorder. The proposed approach shows an accuracy of 75.76%, sensitivity of 82.44%, specificity of 73.15% and precision of 76.57%, indicating the feasibility of, and a promising future for, automatic and unobtrusive detection of PD in Chinese speakers. The comparison among the three selection algorithms reveals that the LASSO selector has the best performance regardless of the type of vocal features. The best detection accuracy is obtained by the SGD classifier, while the best sensitivity is obtained by the LR classifier. More interestingly, articulation features are more representative and indicative than phonation features across all the selection and classification algorithms. The most prominent articulation features are F1, F2, DDF1, DDF2, BBE and MFCC.
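The four figures quoted in this abstract (accuracy, sensitivity, specificity, precision) all derive from a single binary confusion matrix. The following is a minimal sketch of that computation, assuming labels coded 1 for PD and 0 for control; the function name `binary_metrics` and the toy labels are illustrative assumptions and are not taken from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity, specificity and precision for 0/1 labels (1 = PD)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
    }


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters here because the abstract notes that different classifiers win on different metrics (SGD on accuracy, LR on sensitivity).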

14
Yousif NR, Balaha HM, Haikal AY, El-Gendy EM. A generic optimization and learning framework for Parkinson disease via speech and handwritten records. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING 2022; 14:1-21. [PMID: 36042792 PMCID: PMC9411848 DOI: 10.1007/s12652-022-04342-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/15/2021] [Accepted: 07/11/2022] [Indexed: 06/15/2023]
Abstract
Parkinson's disease (PD) is a neurodegenerative disorder with slow progression whose symptoms can be identified at late stages. Early diagnosis and treatment of PD can help to relieve the symptoms and delay progression. However, this is very challenging due to the similarities between the symptoms of PD and other diseases. The current study proposes a generic framework for the diagnosis of PD using handwritten images and (or) speech signals. For the handwriting images, 8 pre-trained convolutional neural networks (CNN) via transfer learning tuned by Aquila Optimizer were trained on the NewHandPD dataset to diagnose PD. For the speech signals, features from the MDVR-KCL dataset are extracted numerically using 16 feature extraction algorithms and fed to 4 different machine learning algorithms tuned by the Grid Search algorithm, and graphically using 5 different techniques and fed to the 8 pre-trained CNN structures. The authors propose a new technique for extracting the features from the voice dataset based on the segmentation of variable speech-signal-segment-durations, i.e., the use of different durations in the segmentation phase. Using the proposed technique, 5 datasets with 281 numerical features are generated. Results from different experiments are collected and recorded. For the NewHandPD dataset, the best-reported metric is 99.75% using the VGG19 structure. For the MDVR-KCL dataset, the best-reported metrics are 99.94% using the KNN and SVM ML algorithms and the combined numerical features; and 100% using the combined mel-specgram graphical features and the VGG19 structure. These results are better than those reported in other state-of-the-art research.
Affiliation(s)
- Nada R. Yousif
  - Computer and Control Systems Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
- Hossam Magdy Balaha
  - Computer and Control Systems Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
- Amira Y. Haikal
  - Computer and Control Systems Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
- Eman M. El-Gendy
  - Computer and Control Systems Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt

15
Pakravan M, Jahed M. Significant pathological voice discrimination by computing posterior distribution of balanced accuracy. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

16
García AM, Orozco-Arroyave JR. Reply to: "Does Cognitive Impairment Influence Motor Speech Performance in De Novo Parkinson's Disease". Mov Disord 2021; 36:2982-2983. [PMID: 34921457 DOI: 10.1002/mds.28831] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2021] [Accepted: 09/29/2021] [Indexed: 11/06/2022]
Affiliation(s)
- Adolfo M García
  - Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
  - National Scientific and Technical Research Council, Buenos Aires, Argentina
  - Departamento de Lingüística y Literatura, Facultad de Humanidades, Universidad de Santiago de Chile, Santiago, Chile
  - Global Brain Health Institute, University of California, San Francisco, San Francisco, California, USA
- Juan Rafael Orozco-Arroyave
  - GITA Lab, Faculty of Engineering, Universidad de Antioquia UdeA, Medellín, Colombia
  - Pattern Recognition Lab, Friedrich-Alexander University, Erlangen, Germany

17
Amato F, Borzì L, Olmo G, Orozco-Arroyave JR. An algorithm for Parkinson's disease speech classification based on isolated words analysis. Health Inf Sci Syst 2021; 9:32. [PMID: 34422258 PMCID: PMC8324609 DOI: 10.1007/s13755-021-00162-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/24/2020] [Accepted: 07/14/2021] [Indexed: 12/04/2022]
Abstract
INTRODUCTION Automatic assessment of speech impairment is a cutting edge topic in Parkinson's disease (PD). Language disorders are known to occur several years earlier than typical motor symptoms, thus speech analysis may contribute to the early diagnosis of the disease. Moreover, the remote monitoring of dysphonia could allow achieving an effective follow-up of PD clinical condition, possibly performed in the home environment. METHODS In this work, we performed a multi-level analysis, progressively combining features extracted from the entire signal, the voiced segments, and the on-set/off-set regions, leading to a total number of 126 features. Furthermore, we compared the performance of early and late feature fusion schemes, aiming to identify the best model configuration and taking advantage of having 25 isolated words pronounced by each subject. We employed data from the PC-GITA database (50 healthy controls and 50 PD patients) for validation and testing. RESULTS We implemented an optimized k-Nearest Neighbours model for the binary classification of PD patients versus healthy controls. We achieved an accuracy of 99.4% in 10-fold cross-validation and 94.3% in testing on the PC-GITA database (average value of male and female subjects). CONCLUSION The promising performance yielded by our model confirms the feasibility of automatic assessment of PD using voice recordings. Moreover, a post-hoc analysis of the most relevant features discloses the option of voice processing using a simple smartphone application.
Affiliation(s)
- Federica Amato
  - Department of Control and Computing Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, Italy
- Luigi Borzì
  - Department of Control and Computing Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, Italy
- Gabriella Olmo
  - Department of Control and Computing Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, Italy
- Juan Rafael Orozco-Arroyave
  - GITA Lab, Faculty of Engineering, University of Antioquia, Medellín, Colombia
  - Pattern Recognition Lab., Friedrich-Alexander-Universität Erlangen-Nürnberg, Martenstrasse 3, Erlangen, Germany

18
Carrón J, Campos-Roca Y, Madruga M, Pérez CJ. A mobile-assisted voice condition analysis system for Parkinson's disease: assessment of usability conditions. Biomed Eng Online 2021; 20:114. [PMID: 34802448 PMCID: PMC8607631 DOI: 10.1186/s12938-021-00951-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/17/2021] [Accepted: 11/04/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND AND OBJECTIVE Automatic voice condition analysis systems to detect Parkinson's disease (PD) are generally based on speech data recorded under acoustically controlled conditions and professional supervision. The performance of these approaches in a free-living scenario is unknown. The aim of this research is to investigate the impact of uncontrolled conditions (realistic acoustic environment and lack of supervision) on the performance of automatic PD detection systems based on speech. METHODS A mobile-assisted voice condition analysis system is proposed to aid in the detection of PD using speech. The system is based on a server-client architecture. In the server, feature extraction and machine learning algorithms are designed and implemented to discriminate subjects with PD from healthy ones. The Android app allows patients to submit phonations and physicians to check the complete record of every patient. Six different machine learning classifiers are applied to compare their performance on two different speech databases. One of them is an in-house database (UEX database), collected under professional supervision by using the same Android-based smartphone in the same room, whereas the other one is an age, sex and health-status balanced subset of mPower study for PD, which provides real-world data. By applying identical methodology, single-database experiments have been performed on each database, and also cross-database tests. Cross-validation has been applied to assess generalization performance and hypothesis tests have been used to report statistically significant differences. RESULTS In the single-database experiments, a best accuracy rate of 0.92 (AUC = 0.98) has been obtained on UEX database, while a considerably lower best accuracy rate of 0.71 (AUC = 0.76) has been achieved using the mPower-based database. The cross-database tests provided very degraded accuracy metrics. 
CONCLUSION The results clearly show the potential of the proposed system as an aid for general practitioners to conduct triage or an additional tool for neurologists to perform diagnosis. However, due to the performance degradation observed using data from the mPower study, semi-controlled conditions are encouraged, i.e., voices recorded at home by the patients themselves following a strict recording protocol, with control of the information about patients by the medical doctor in charge.
Affiliation(s)
- Javier Carrón
  - Departamento de Matemáticas, Universidad de Extremadura, Cáceres, Spain
- Yolanda Campos-Roca
  - Departamento de Tecnología de los Computadores y las Comunicaciones, Universidad de Extremadura, Cáceres, Spain
- Mario Madruga
  - Departamento de Matemáticas, Universidad de Extremadura, Cáceres, Spain
- Carlos J Pérez
  - Departamento de Matemáticas, Universidad de Extremadura, Cáceres, Spain

19
Assessing Parkinson's Disease at Scale Using Telephone-Recorded Speech: Insights from the Parkinson's Voice Initiative. Diagnostics (Basel) 2021; 11:diagnostics11101892. [PMID: 34679590 PMCID: PMC8534584 DOI: 10.3390/diagnostics11101892] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/31/2021] [Revised: 10/08/2021] [Accepted: 10/10/2021] [Indexed: 01/07/2023]
Abstract
Numerous studies have reported on the high accuracy of using voice tasks for the remote detection and monitoring of Parkinson’s Disease (PD). Most of these studies, however, report findings on a small number of voice recordings, often collected under acoustically controlled conditions, and therefore cannot scale at large without specialized equipment. In this study, we aimed to evaluate the potential of using voice as a population-based PD screening tool in resource-constrained settings. Using the standard telephone network, we processed 11,942 sustained vowel /a/ phonations from a US-English cohort comprising 1078 PD and 5453 control participants. We characterized each phonation using 304 dysphonia measures to quantify a range of vocal impairments. Given that this is a highly unbalanced problem, we used the following strategy: we selected a balanced subset (n = 3000 samples) for training and testing using 10-fold cross-validation (CV), and the remaining (unbalanced held-out dataset, n = 8942) samples for further model validation. Using robust feature selection methods we selected 27 dysphonia measures to present into a radial-basis-function support vector machine and demonstrated differentiation of PD participants from controls with 67.43% sensitivity and 67.25% specificity. These findings could help pave the way forward toward the development of an inexpensive, remote, and reliable diagnostic support tool for PD using voice as a digital biomarker.
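The sampling strategy described in this abstract (a class-balanced subset for cross-validation, with everything else kept as an unbalanced held-out set, so that 11,942 − 3000 = 8942 samples remain) can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the function name `balanced_split`, the equal per-class draw, and the toy label counts are all hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def balanced_split(labels: Sequence[int], n_per_class: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Draw n_per_class indices from every class; return (balanced, held_out) index lists."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class: Dict[int, List[int]] = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    selected: List[int] = []
    for label, indices in by_class.items():
        if len(indices) < n_per_class:
            raise ValueError(f"class {label} has only {len(indices)} samples")
        selected.extend(rng.sample(indices, n_per_class))

    # Everything not selected stays in the (unbalanced) held-out set.
    selected_set = set(selected)
    held_out = [i for i in range(len(labels)) if i not in selected_set]
    return sorted(selected), held_out


# Hypothetical toy cohort with a 1:5 class imbalance (1 = PD, 0 = control).
labels = [1] * 20 + [0] * 100
balanced, held_out = balanced_split(labels, n_per_class=15)
```

Because the held-out set keeps the natural class imbalance, sensitivity and specificity are more informative there than raw accuracy, which is consistent with the metrics the abstract reports.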

20
Roldan-Vasco S, Orozco-Duque A, Suarez-Escudero JC, Orozco-Arroyave JR. Machine learning based analysis of speech dimensions in functional oropharyngeal dysphagia. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021; 208:106248. [PMID: 34260973 DOI: 10.1016/j.cmpb.2021.106248] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/23/2021] [Accepted: 06/15/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVE The normal swallowing process requires a complex coordination of anatomical structures driven by sensory and cranial nerves. Alterations in such coordination cause swallowing malfunctions, namely dysphagia. The dysphagia screening methods are quite subjective and experience dependent. Bearing in mind that the swallowing process and speech production share some anatomical structures and mechanisms of neurological control, this work aims to evaluate the suitability of automatic speech processing and machine learning techniques for screening of functional dysphagia. METHODS Speech recordings were collected from 46 patients with functional oropharyngeal dysphagia produced by neurological causes, and 46 healthy controls. The dimensions of speech including phonation, articulation, and prosody were considered through different speech tasks. Specific features per dimension were extracted and analyzed using statistical tests. Machine learning models were applied per dimension via nested cross-validation. Hyperparameters were selected using the AUC - ROC as optimization criterion. RESULTS The Random Forest in the articulation related speech tasks retrieved the highest performance measures (AUC=0.86±0.10, sensitivity=0.91±0.12) for individual analysis of dimensions. In addition, the combination of speech dimensions with a voting ensemble improved the results, which suggests a contribution of information from different feature sets extracted from speech signals in dysphagia conditions. CONCLUSIONS The proposed approach based on speech related models is suitable for the automatic discrimination between dysphagic and healthy individuals. These findings seem to have potential use in the screening of functional oropharyngeal dysphagia in a non-invasive and inexpensive way.
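Nested cross-validation, as mentioned in the METHODS above, keeps hyperparameter selection (inner loop) separate from performance estimation (outer loop). The sketch below illustrates the control flow only, under several assumptions that differ from the study: it uses a toy 1-D k-nearest-neighbour classifier instead of a Random Forest, accuracy instead of AUC-ROC as the selection criterion, and synthetic data. All function names are hypothetical.

```python
import random
from statistics import mean
from typing import List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, label)


def k_folds(samples: Sequence[Sample], k: int, seed: int) -> List[List[Sample]]:
    """Shuffle the samples and deal them into k folds."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def knn_predict(train: Sequence[Sample], x: float, k: int) -> int:
    """Majority vote among the k training samples nearest to x."""
    nearest = sorted(train, key=lambda s: abs(s[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def score(train: Sequence[Sample], test: Sequence[Sample], k: int) -> float:
    """Accuracy of k-NN fitted on train and evaluated on test."""
    return mean([1.0 if knn_predict(train, x, k) == y else 0.0 for x, y in test])


def nested_cv(samples, ks, outer_k=4, inner_k=3, seed=0):
    """Return one (selected k, outer test accuracy) pair per outer fold."""
    outer = k_folds(samples, outer_k, seed)
    results = []
    for i, outer_test in enumerate(outer):
        outer_train = [s for j, fold in enumerate(outer) if j != i for s in fold]
        inner = k_folds(outer_train, inner_k, seed + 1)

        def inner_score(k: int) -> float:
            # Inner loop: uses outer-training data only, never the outer test fold.
            scores = []
            for m, val in enumerate(inner):
                fit = [s for n, fold in enumerate(inner) if n != m for s in fold]
                scores.append(score(fit, val, k))
            return mean(scores)

        best_k = max(ks, key=inner_score)
        # Outer loop: refit on all outer-training data, evaluate once on the test fold.
        results.append((best_k, score(outer_train, outer_test, best_k)))
    return results


# Synthetic, well-separated toy data (hypothetical): 12 controls and 12 patients.
data = [(0.1 * i, 0) for i in range(12)] + [(5 + 0.1 * i, 1) for i in range(12)]
results = nested_cv(data, ks=[1, 3, 5])
```

The key design point is that the outer test fold plays no part in choosing the hyperparameter, so the outer scores are not optimistically biased by the tuning step.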
Affiliation(s)
- Sebastian Roldan-Vasco
  - Faculty of Engineering, Instituto Tecnológico Metropolitano, Medellín, Colombia
  - Faculty of Engineering, Universidad de Antioquia, Medellín, Colombia
- Andres Orozco-Duque
  - Faculty of Pure and Applied Sciences, Instituto Tecnológico Metropolitano, Medellín, Colombia
- Juan Camilo Suarez-Escudero
  - School of Health Sciences, Faculty of Medicine, Universidad Pontificia Bolivariana, Medellín, Colombia
  - Faculty of Pure and Applied Sciences, Instituto Tecnológico Metropolitano, Medellín, Colombia
- Juan Rafael Orozco-Arroyave
  - Faculty of Engineering, Universidad de Antioquia, Medellín, Colombia
  - Pattern Recognition Lab, Friedrich-Alexander-Universität, Erlangen-Nürnberg, Germany

21
Corrales-Astorgano M, Escudero-Mancebo D, González-Ferreras C, Cardeñoso Payo V, Martínez-Castilla P. Analysis of atypical prosodic patterns in the speech of people with Down syndrome. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

22
Wisler A, Teplansky K, Heitzman D, Wang J. The Effects of Symptom Onset Location on Automatic Amyotrophic Lateral Sclerosis Detection Using the Correlation Structure of Articulatory Movements. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:2276-2286. [PMID: 33647219 PMCID: PMC8740667 DOI: 10.1044/2020_jslhr-20-00288] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/25/2020] [Revised: 09/22/2020] [Accepted: 11/19/2020] [Indexed: 06/12/2023]
Abstract
Purpose Kinematic measurements of speech have demonstrated some success in automatic detection of early symptoms of amyotrophic lateral sclerosis (ALS). In this study, we examined how the region of symptom onset (bulbar vs. spinal) affects the ability of data-driven models to detect ALS. Method We used a correlation structure of articulatory movements combined with a machine learning model (i.e., artificial neural network) to detect differences between people with ALS and healthy controls. The performance of this system was evaluated separately for participants with bulbar onset and spinal onset to examine how region of onset affects classification performance. We then performed a regression analysis to examine how different severity measures and region of onset affects model performance. Results The proposed model was significantly more accurate in classifying the bulbar-onset participants, achieving an area under the curve of 0.809 relative to the 0.674 achieved for spinal-onset participants. The regression analysis, however, found that differences in classifier performance across participants were better explained by their speech performance (intelligible speaking rate), and no significant differences were observed based on region of onset when intelligible speaking rate was accounted for. Conclusions Although we found a significant difference in the model's ability to detect ALS depending on the region of onset, this disparity can be primarily explained by observable differences in speech motor symptoms. Thus, when the severity of speech symptoms (e.g., intelligible speaking rate) was accounted for, symptom onset location did not affect the proposed computational model's ability to detect ALS.
Affiliation(s)
- Alan Wisler
  - Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin
- Kristin Teplansky
  - Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin
- Jun Wang
  - Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin
  - Department of Neurology, Dell Medical School, The University of Texas at Austin

23
Jain A, Abedinpour K, Polat O, Çalışkan MM, Asaei A, Pfister FMJ, Fietzek UM, Cernak M. Voice Analysis to Differentiate the Dopaminergic Response in People With Parkinson's Disease. Front Hum Neurosci 2021; 15:667997. [PMID: 34135742 PMCID: PMC8200849 DOI: 10.3389/fnhum.2021.667997] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/16/2021] [Accepted: 04/16/2021] [Indexed: 11/13/2022]
Abstract
Humans' voice offers the widest variety of motor phenomena of any human activity. However, its clinical evaluation in people with movement disorders such as Parkinson's disease (PD) lags behind current knowledge on advanced analytical automatic speech processing methodology. Here, we use deep learning-based speech processing to differentially analyze voice recordings in 14 people with PD before and after dopaminergic medication using personalized Convolutional Recurrent Neural Networks (p-CRNN) and Phone Attribute Codebooks (PAC). p-CRNN yields an accuracy of 82.35% in the binary classification of ON and OFF motor states at a sensitivity/specificity of 0.86/0.78. The PAC-based approach's accuracy was slightly lower with 73.08% at a sensitivity/specificity of 0.69/0.77, but this method offers easier interpretation and understanding of the computational biomarkers. Both p-CRNN and PAC provide a differentiated view and novel insights into the distinctive components of the speech of persons with PD. Both methods detect voice qualities that are amenable to dopaminergic treatment, including active phonetic and prosodic features. Our findings may pave the way for quantitative measurements of speech in persons with PD.
Collapse
Affiliation(s)
- Anubhav Jain
- Center for Innovation and Business Creation at Technical University of Munich (UnternehmerTUM), Munich, Germany
| | - Kian Abedinpour
- Department of Neurology and Clinical Neurophysiology, Schön Klinik München Schwabing, Munich, Germany
| | - Ozgur Polat
- Center for Innovation and Business Creation at Technical University of Munich (UnternehmerTUM), Munich, Germany
| | - Mine Melodi Çalışkan
- xMint, Yalova, Turkey.,Department of Mechatronics Engineering, Boğaziçi University, Istanbul, Turkey
| | - Afsaneh Asaei
- Center for Innovation and Business Creation at Technical University of Munich (UnternehmerTUM), Munich, Germany
| | - Franz M J Pfister
- Department of Data Science, Ludwig Maximilians Universität, Munich, Germany
| | - Urban M Fietzek
- Department of Neurology and Clinical Neurophysiology, Schön Klinik München Schwabing, Munich, Germany.,Department of Neurology, University of Munich, Munich, Germany
| | - Milos Cernak
- Logitech Europe, École polytechnique fédérale de Lausanne - Quartier de l'Innovation, Lausanne, Switzerland
|
24
|
Monitoring Parkinson's disease progression based on recorded speech with missing ordinal responses and replicated covariates. Comput Biol Med 2021; 134:104503. [PMID: 34091382 DOI: 10.1016/j.compbiomed.2021.104503] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2021] [Revised: 05/10/2021] [Accepted: 05/15/2021] [Indexed: 11/19/2022]
Abstract
Monitoring Parkinson's Disease (PD) progression is an important task to improve the life quality of the affected people. This task can be performed by extracting features from voice recordings and applying specifically designed statistical models, leading to systems that improve the ability of monitoring the progression of PD in an objective, remote, non-invasive, fast, and economically sustainable way. An experiment has been conducted with 36 subjects to study the progression of PD over 4 years by using the Hoehn and Yahr (HY) scale and features extracted from the phonation of the vowel /a/. The collected dataset had many missing values, which had to be addressed jointly with the non-decreasing nature of the disease and the within-subject variability due to the use of replicated features. In order to handle these issues, a Hidden Markov model for longitudinal data was designed and implemented by using a data augmentation scheme based on different latent variables. Markov chain Monte Carlo methods were used to sample from the posterior distribution. The proposed approach has been tested on simulated data, providing good accuracy rates in the context of a multiclass problem. It has also been applied to the real data obtained from the conducted experiment, providing imputed and predicted HY stages compatible with the progression of PD. The conducted experiment and the proposed approach help fill a gap in the scientific literature on experiments and methodologies for tracking PD progression based on acoustic features and the HY scale. This would help to derive an expert system that can be integrated into the protocols of neurology units in hospital centers.
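The "non-decreasing nature of the disease" mentioned in this abstract can be pictured as a Markov chain over ordered stages whose transition matrix has zeros below the diagonal. The sketch below illustrates only that constraint. It is not the authors' Hidden Markov model or their data augmentation scheme, and the three stages, the probabilities, and the function name are all hypothetical.

```python
import random

# Hypothetical transition probabilities between three ordered stages.
# Entries below the diagonal are zero, so a stage can never decrease.
TRANSITIONS = [
    [0.80, 0.15, 0.05],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]


def simulate_stages(n_visits, start=0, seed=0):
    """Simulate a sequence of stage indices, one per follow-up visit."""
    rng = random.Random(seed)  # fixed seed keeps the illustration reproducible
    stages = [start]
    for _ in range(n_visits - 1):
        probs = TRANSITIONS[stages[-1]]
        next_stage = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        stages.append(next_stage)
    return stages


path = simulate_stages(n_visits=8)
print(path)
```

Because backward moves have zero probability, every simulated path is monotone, which is the property an imputation scheme for missing stages would also need to respect.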
|
25
|
Rusz J, Hlavnička J, Novotný M, Tykalová T, Pelletier A, Montplaisir J, Gagnon JF, Dušek P, Galbiati A, Marelli S, Timm PC, Teigen LN, Janzen A, Habibi M, Stefani A, Holzknecht E, Seppi K, Evangelista E, Rassu AL, Dauvilliers Y, Högl B, Oertel W, St Louis EK, Ferini-Strambi L, Růžička E, Postuma RB, Šonka K. Speech Biomarkers in Rapid Eye Movement Sleep Behavior Disorder and Parkinson Disease. Ann Neurol 2021; 90:62-75. [PMID: 33856074 PMCID: PMC8252762 DOI: 10.1002/ana.26085] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2021] [Revised: 03/16/2021] [Accepted: 04/11/2021] [Indexed: 01/19/2023]
Abstract
Objective This multilanguage study used simple speech recording and high‐end pattern analysis to provide sensitive and reliable noninvasive biomarkers of prodromal versus manifest α‐synucleinopathy in patients with idiopathic rapid eye movement sleep behavior disorder (iRBD) and early‐stage Parkinson disease (PD). Methods We performed a multicenter study across the Czech, English, German, French, and Italian languages at 7 centers in Europe and North America. A total of 448 participants (337 males), including 150 with iRBD (mean duration of iRBD across language groups 0.5–3.4 years), 149 with PD (mean duration of disease across language groups 1.7–2.5 years), and 149 healthy controls were recorded; 350 of the participants completed the 12‐month follow‐up. We developed a fully automated acoustic quantitative assessment approach for the 7 distinctive patterns of hypokinetic dysarthria. Results No differences in language that impacted clinical parkinsonian phenotypes were found. Compared with the controls, we found significant abnormalities of an overall acoustic speech severity measure via composite dysarthria index for both iRBD (p = 0.002) and PD (p < 0.001). However, only PD (p < 0.001) was perceptually distinct in a blinded subjective analysis. We found significant group differences between PD and controls for monopitch (p < 0.001), prolonged pauses (p < 0.001), and imprecise consonants (p = 0.03); only monopitch was able to differentiate iRBD patients from controls (p = 0.004). At the 12‐month follow‐up, a slight progression of overall acoustic speech impairment was noted for the iRBD (p = 0.04) and PD (p = 0.03) groups. Interpretation Automated speech analysis might provide a useful additional biomarker of parkinsonism for the assessment of disease progression and therapeutic interventions. ANN NEUROL 2021;90:62–75
Affiliation(s)
- Jan Rusz
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic.,Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
| | - Jan Hlavnička
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
| | - Michal Novotný
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
| | - Tereza Tykalová
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
| | - Amelie Pelletier
- Department of Neurology, Research Institute of the McGill University Health Centre, Montreal General Hospital, Montreal, Quebec, Canada.,Center for Advanced Research in Sleep Medicine, CIUSSS-NIM - Hôpital du Sacré-Coeur de Montréal, Montreal, Quebec, Canada
| | - Jacques Montplaisir
- Center for Advanced Research in Sleep Medicine, CIUSSS-NIM - Hôpital du Sacré-Coeur de Montréal, Montreal, Quebec, Canada
| | - Jean-Francois Gagnon
- Center for Advanced Research in Sleep Medicine, CIUSSS-NIM - Hôpital du Sacré-Coeur de Montréal, Montreal, Quebec, Canada
| | - Petr Dušek
- Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
| | - Andrea Galbiati
- Sleep Disorders Center, Division of Neuroscience, Ospedale San Raffaele, Università Vita-Salute, Milan, Italy
| | - Sara Marelli
- Sleep Disorders Center, Division of Neuroscience, Ospedale San Raffaele, Università Vita-Salute, Milan, Italy
| | - Paul C Timm
- Mayo Center for Sleep Medicine, Division of Pulmonary and Critical Care Medicine, Mayo Clinic College of Medicine and Science, Rochester, MN.,Department of Neurology, Mayo Clinic College of Medicine and Science, Rochester, MN
| | - Luke N Teigen
- Mayo Center for Sleep Medicine, Division of Pulmonary and Critical Care Medicine, Mayo Clinic College of Medicine and Science, Rochester, MN.,Department of Neurology, Mayo Clinic College of Medicine and Science, Rochester, MN
| | - Annette Janzen
- Department of Neurology, Philipps University Marburg, Marburg, Germany
| | - Mahboubeh Habibi
- Department of Neurology, Philipps University Marburg, Marburg, Germany
| | - Ambra Stefani
- Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
| | - Evi Holzknecht
- Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
| | - Klaus Seppi
- Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
| | - Elisa Evangelista
- National Reference Network for Narcolepsy, Sleep-Wake Disorder Unit, Department of Neurology, Gui-de-Chauliac Hospital, CHU Montpellier, INSERM, University of Montpellier, Montpellier, France
| | - Anna Laura Rassu
- National Reference Network for Narcolepsy, Sleep-Wake Disorder Unit, Department of Neurology, Gui-de-Chauliac Hospital, CHU Montpellier, INSERM, University of Montpellier, Montpellier, France
| | - Yves Dauvilliers
- National Reference Network for Narcolepsy, Sleep-Wake Disorder Unit, Department of Neurology, Gui-de-Chauliac Hospital, CHU Montpellier, INSERM, University of Montpellier, Montpellier, France
| | - Birgit Högl
- Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
| | - Wolfgang Oertel
- Department of Neurology, Philipps University Marburg, Marburg, Germany
| | - Erik K St Louis
- Mayo Center for Sleep Medicine, Division of Pulmonary and Critical Care Medicine, Mayo Clinic College of Medicine and Science, Rochester, MN.,Department of Neurology, Mayo Clinic College of Medicine and Science, Rochester, MN.,Mayo Clinic Health System Southwest Wisconsin, La Crosse, WI
| | - Luigi Ferini-Strambi
- Sleep Disorders Center, Division of Neuroscience, Ospedale San Raffaele, Università Vita-Salute, Milan, Italy
| | - Evžen Růžička
- Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
| | - Ronald B Postuma
- Department of Neurology, Research Institute of the McGill University Health Centre, Montreal General Hospital, Montreal, Quebec, Canada.,Center for Advanced Research in Sleep Medicine, CIUSSS-NIM - Hôpital du Sacré-Coeur de Montréal, Montreal, Quebec, Canada
| | - Karel Šonka
- Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
|
26
|
Advances in Parkinson's Disease detection and assessment using voice and speech: A review of the articulatory and phonatory aspects. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102418] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
|
27
|
A machine learning perspective on the emotional content of Parkinsonian speech. Artif Intell Med 2021; 115:102061. [PMID: 34001321 DOI: 10.1016/j.artmed.2021.102061] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2020] [Revised: 02/26/2021] [Accepted: 03/29/2021] [Indexed: 12/23/2022]
Abstract
Patients with Parkinson's disease (PD) have distinctive voice patterns, often perceived as expressing sad emotion. While this characteristic of Parkinsonian speech has been supported through the perspective of listeners, where both PD and healthy control (HC) subjects repeat the same speaking tasks, it has never been explored through a machine learning modelling approach. Our work provides an objective evaluation of this characteristic of the PD speech, by building a transfer learning system to assess how the PD pathology affects the sadness perception. To do so we introduce a Mixture-of-Experts (MoE) architecture for speech emotion recognition designed to be transferable across datasets. Firstly, by relying on publicly available emotional speech corpora, we train the MoE model and then we use it to quantify perceived sadness in never seen before PD and matched HC speech recordings. To build our models (experts), we extracted spectral features of the voicing parts of speech and we trained a gradient boosting decision trees model in each corpus to predict happiness vs. sadness. MoE predictions are created by weighting each expert's prediction according to the distance between the new sample and the expert-specific training samples. The MoE approach systematically infers more negative emotional characteristics in PD speech than in HC. Crucially, these judgments are related to the disease severity and the severity of speech impairment in the PD patients: the more impairment, the more likely the speech is to be judged as sad. Our findings pave the way towards a better understanding of the characteristics of PD speech and show how publicly available datasets can be used to train models that provide interesting insights on clinical data.
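The distance-based weighting of expert predictions described in this abstract can be sketched as follows. The abstract does not give the exact weighting formula, so the inverse-distance weights, the use of a single centroid to summarize each expert's training samples, and all names below are illustrative assumptions.

```python
import math


def moe_predict(sample, experts, eps=1e-9):
    """Combine expert scores with weights that shrink as distance grows.

    `experts` is a list of (centroid, predict_fn) pairs. Summarizing each
    expert's training data by one centroid and using inverse-distance
    weights are assumptions made for this sketch.
    """
    weights, scores = [], []
    for centroid, predict_fn in experts:
        weights.append(1.0 / (math.dist(sample, centroid) + eps))
        scores.append(predict_fn(sample))
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


# Two toy experts that each return a fixed sadness probability.
experts = [
    ((0.0, 0.0), lambda x: 0.2),
    ((4.0, 0.0), lambda x: 0.8),
]
near_first = moe_predict((1.0, 0.0), experts)
print(round(near_first, 3))
```

A sample close to one corpus is scored mostly by the expert trained on that corpus, which is one way to make a model transferable to recordings it has never seen.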
|
28
|
Classification of ALS patients based on acoustic analysis of sustained vowel phonations. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2020.102350] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
29
|
Jeancolas L, Petrovska-Delacrétaz D, Mangone G, Benkelfat BE, Corvol JC, Vidailhet M, Lehéricy S, Benali H. X-Vectors: New Quantitative Biomarkers for Early Parkinson's Disease Detection From Speech. Front Neuroinform 2021; 15:578369. [PMID: 33679361 PMCID: PMC7935511 DOI: 10.3389/fninf.2021.578369] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2020] [Accepted: 01/18/2021] [Indexed: 01/18/2023] Open
Abstract
Many articles have used voice analysis to detect Parkinson's disease (PD), but few have focused on the early stages of the disease and the gender effect. In this article, we have adapted the latest speaker recognition system, called x-vectors, in order to detect PD at an early stage using voice analysis. X-vectors are embeddings extracted from Deep Neural Networks (DNNs), which provide robust speaker representations and improve speaker recognition when large amounts of training data are used. Our goal was to assess whether, in the context of early PD detection, this technique would outperform the more standard classifier MFCC-GMM (Mel-Frequency Cepstral Coefficients—Gaussian Mixture Model) and, if so, under which conditions. We recorded 221 French speakers (recently diagnosed PD subjects and healthy controls) with a high-quality microphone and via the telephone network. Men and women were analyzed separately in order to have more precise models and to assess a possible gender effect. Several experimental and methodological aspects were tested in order to analyze their impacts on classification performance. We assessed the impact of the audio segment durations, data augmentation, type of dataset used for the neural network training, kind of speech tasks, and back-end analyses. X-vectors technique provided better classification performances than MFCC-GMM for the text-independent tasks, and seemed to be particularly suited for the early detection of PD in women (7–15% improvement). This result was observed for both recording types (high-quality microphone and telephone).
Affiliation(s)
- Laetitia Jeancolas
- Paris Brain Institute-ICM, Centre de NeuroImagerie de Recherche-CENIR, Paris, France.,Laboratoire SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Palaiseau, France
| | | | - Graziella Mangone
- Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France.,Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
| | - Badr-Eddine Benkelfat
- Laboratoire SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Palaiseau, France
| | - Jean-Christophe Corvol
- Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France.,Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
| | - Marie Vidailhet
- Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France.,Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
| | - Stéphane Lehéricy
- Paris Brain Institute-ICM, Centre de NeuroImagerie de Recherche-CENIR, Paris, France.,Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France.,Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neuroradiology, Paris, France
| | - Habib Benali
- Department of Electrical & Computer Engineering, PERFORM Center, Concordia University, Montreal, QC, Canada
|
30
|
Li Y, Zhang X, Wang P, Zhang X, Liu Y. Insight into an unsupervised two-step sparse transfer learning algorithm for speech diagnosis of Parkinson's disease. Neural Comput Appl 2021; 33:9733-9750. [PMID: 33584015 PMCID: PMC7871026 DOI: 10.1007/s00521-021-05741-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2020] [Accepted: 01/16/2021] [Indexed: 11/26/2022]
Abstract
Speech diagnosis of Parkinson’s disease (PD) as a non-invasive and simple diagnosis method is particularly worth exploring. However, the number of samples of speech-based PD is relatively small, and there exist discrepancies in the distribution between subjects. In order to solve these two problems, a novel unsupervised two-step sparse transfer learning algorithm is proposed in this paper to tackle PD speech diagnosis. In the first step, convolution sparse coding with the coordinate selection of samples and features is designed to learn speech structure from the source domain to replenish sample information of the target domain. In the second step, joint local structure distribution alignment is designed to maintain the neighbor relationship between the respective samples of the training set and test set, and reduce the distribution difference between the two domains at the same time. Two representative public PD speech datasets and one real-world PD speech dataset were exploited to verify the proposed method on PD speech diagnosis. Experimental results demonstrate that each step of the proposed method has a positive effect on the PD speech classification results, and it also delivers superior performance over the existing related methods.
Affiliation(s)
- Yongming Li
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400030 China
| | - Xinyue Zhang
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400030 China
| | - Pin Wang
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400030 China
| | - Xiaoheng Zhang
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400030 China
- Chongqing Radio and TV University, Chongqing, 400052 China
| | - Yuchuan Liu
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400030 China
|
31
|
Tsanas A, Little MA, Ramig LO. Remote Assessment of Parkinson's Disease Symptom Severity Using the Simulated Cellular Mobile Telephone Network. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2021; 9:11024-11036. [PMID: 33495722 PMCID: PMC7821632 DOI: 10.1109/access.2021.3050524] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/05/2020] [Accepted: 12/25/2020] [Indexed: 06/12/2023]
Abstract
Telemonitoring of Parkinson's Disease (PD) has attracted considerable research interest because of its potential to make a lasting, positive impact on the life of patients and their carers. Purpose-built devices have been developed that record various signals which can be associated with average PD symptom severity, as quantified on standard clinical metrics such as the Unified Parkinson's Disease Rating Scale (UPDRS). Speech signals are particularly promising in this regard, because they can be easily recorded without the use of expensive, dedicated hardware. Previous studies have demonstrated replication of UPDRS to within less than 2 points of a clinical rater's assessment of symptom severity, using high-quality speech signals collected using dedicated telemonitoring hardware. Here, we investigate the potential of using the standard voice-over-GSM (2G) or UMTS (3G) cellular mobile telephone networks for PD telemonitoring, networks that, together, have greater than 5 billion subscribers worldwide. We test the robustness of this approach using a simulated noisy mobile communication network over which speech signals are transmitted, and approximately 6000 recordings from 42 PD subjects. We show that UPDRS can be estimated to within less than 3.5 points difference from the clinical raters' assessment, which is clinically useful given that the inter-rater variability for UPDRS can be as high as 4-5 UPDRS points. This provides compelling evidence that the existing voice telephone network has potential towards facilitating inexpensive, mass-scale PD symptom telemonitoring applications.
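A statement such as "estimated to within less than 3.5 points" is typically backed by an average absolute difference between model output and clinician scores. The sketch below computes a mean absolute error. That this is the exact metric used in the study is an assumption, and the scores shown are hypothetical.

```python
def mean_absolute_error(predicted, reference):
    """Average absolute difference between paired scores."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


# Hypothetical UPDRS scores for illustration only; not the study's data.
clinician = [20.0, 35.0, 28.0, 41.0]
estimated = [23.0, 33.0, 30.0, 40.0]
mae = mean_absolute_error(estimated, clinician)
print(mae)
```

Comparing such an error with the 4-5 point inter-rater variability quoted in the abstract is what supports the claim of clinical usefulness.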
Affiliation(s)
- Athanasios Tsanas
- Edinburgh Medical School, Usher Institute, The University of Edinburgh, Edinburgh EH16 4UX, U.K.
| | - Max A. Little
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K.
| | - Lorraine O. Ramig
- Department of Speech, Language, and Hearing Science, University of Colorado Boulder, Boulder, CO 80309, USA
- National Center for Voice and Speech, Denver, CO 80014, USA
|
32
|
Kamikubo R, Dwivedi U, Kacorri H. Sharing Practices for Datasets Related to Accessibility and Aging. ASSETS. ANNUAL ACM CONFERENCE ON ASSISTIVE TECHNOLOGIES 2021; 1:10.1145/3441852.3471208. [PMID: 35187541 PMCID: PMC8855358 DOI: 10.1145/3441852.3471208] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Datasets sourced from people with disabilities and older adults play an important role in innovation, benchmarking, and mitigating bias for both assistive and inclusive AI-infused applications. However, they are scarce. We conduct a systematic review of 137 accessibility datasets manually located across different disciplines over the last 35 years. Our analysis highlights how researchers navigate tensions between benefits and risks in data collection and sharing. We uncover patterns in data collection purpose, terminology, sample size, data types, and data sharing practices across communities of focus. We conclude by critically reflecting on challenges and opportunities related to locating and sharing accessibility datasets calling for technical, legal, and institutional privacy frameworks that are more attuned to concerns from these communities.
Affiliation(s)
- Rie Kamikubo
- College of Information Studies, University of Maryland, College Park
| | - Utkarsh Dwivedi
- College of Information Studies, University of Maryland, College Park
| | - Hernisa Kacorri
- College of Information Studies, University of Maryland, College Park
|
33
|
Sidorova J, Carbonell P, Čukić M. Blood Glucose Estimation From Voice: First Review of Successes and Challenges. J Voice 2020; 36:737.e1-737.e10. [PMID: 33041176 DOI: 10.1016/j.jvoice.2020.08.034] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2020] [Revised: 08/25/2020] [Accepted: 08/26/2020] [Indexed: 11/25/2022]
Abstract
The ability to estimate glucose values from voice would be a breakthrough in diabetes treatment: it would remove the delay in nonintrusive instantaneous blood glucose estimation, relieve medical budgets, and significantly improve the wellbeing of diabetics. In this review, different approaches have been described and systematized, in order to provide an objective snapshot of the state of the art. Since nonintrusive glucose estimation is notoriously difficult, we included a Transparence and Reproducibility Score aimed at revealing the biases in the primary research articles. The review concludes with a discussion of future research pathways.
Affiliation(s)
- Julia Sidorova
- Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Hospital Clinic, Barcelona, Spain
| | - Pablo Carbonell
- Instituto de Automatica e Informatica Industrial, Universidad Politecnica de Valencia, Valencia, Spain
| | - Milena Čukić
- Instituto de Tecnología del Conocimiento, Universidad Complutense de Madrid, Madrid, Spain
|
34
|
Prosody-Based Measures for Automatic Severity Assessment of Dysarthric Speech. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10196999] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
One of the first cues for many neurological disorders is impairment in speech. The traditional method of diagnosing speech disorders such as dysarthria involves a perceptual evaluation from a trained speech therapist. However, this approach is known to be difficult to use for assessing speech impairments due to the subjective nature of the task. As prosodic impairments are one of the earliest cues of dysarthria, the current study presents an automatic method of assessing dysarthria in a range of severity levels using prosody-based measures. We extract prosodic measures related to pitch, speech rate, and rhythm from speakers with dysarthria and healthy controls in English and Korean datasets, despite the fact that these two languages differ in terms of prosodic characteristics. These prosody-based measures are then used as inputs to random forest, support vector machine and neural network classifiers to automatically assess different severity levels of dysarthria. Compared to baseline MFCC features, 18.13% and 11.22% relative accuracy improvements are achieved for English and Korean datasets, respectively, when including prosody-based features. Furthermore, most improvements are obtained with a better classification of mild dysarthric utterances: a recall improvement from 42.42% to 83.34% for English speakers with mild dysarthria and a recall improvement from 36.73% to 80.00% for Korean speakers with mild dysarthria.
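The abstract mixes relative improvements (for accuracy) with before-and-after percentages (for recall), which are easy to confuse. As a rough sketch, the helper below applies the usual relative-improvement formula, (improved - baseline) / baseline, to the recall figures quoted above. The function name is hypothetical, and the derived percentages are simple arithmetic on the quoted numbers rather than values reported by the authors.

```python
def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, as a fraction of the baseline."""
    return (improved - baseline) / baseline


# Recall for mild dysarthria (percent), as quoted in the abstract.
english_mild = relative_improvement(42.42, 83.34)
korean_mild = relative_improvement(36.73, 80.00)

print(f"English: +{83.34 - 42.42:.2f} points absolute, {english_mild:.1%} relative")
print(f"Korean:  +{80.00 - 36.73:.2f} points absolute, {korean_mild:.1%} relative")
```

The absolute gain is measured in percentage points, while the relative gain scales that difference by the baseline, so the two numbers should not be compared directly.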
|
35
|
Romana A, Bandon J, Carlozzi N, Roberts A, Provost EM. Classification of Manifest Huntington Disease using Vowel Distortion Measures. INTERSPEECH 2020; 2020:4966-4970. [PMID: 33244474 PMCID: PMC7685306 DOI: 10.21437/interspeech.2020-2724] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Huntington disease (HD) is a fatal autosomal dominant neurocognitive disorder that causes cognitive disturbances, neuropsychiatric symptoms, and impaired motor abilities (e.g., gait, speech, voice). Due to its progressive nature, HD treatment requires ongoing clinical monitoring of symptoms. Individuals with the Huntingtin gene mutation, which causes HD, may exhibit a range of speech symptoms as they progress from premanifest to manifest HD. Speech-based passive monitoring has the potential to augment clinical information by more continuously tracking manifestation symptoms. Differentiating between premanifest and manifest HD is an important yet under-studied problem, as this distinction marks the need for increased treatment. In this work we present the first demonstration of how changes in speech can be measured to differentiate between premanifest and manifest HD. To do so, we focus on one speech symptom of HD: distorted vowels. We introduce a set of Filtered Vowel Distortion Measures (FVDM) which we extract from read speech. We show that FVDM, coupled with features from existing literature, can differentiate between premanifest and manifest HD with 80% accuracy.
Affiliation(s)
- Amrit Romana
- Computer Science and Engineering, University of Michigan, Ann Arbor, Michigan, USA
| | - John Bandon
- Computer Science and Engineering, University of Michigan, Ann Arbor, Michigan, USA
| | - Noelle Carlozzi
- Physical Medicine & Rehabilitation, University of Michigan, Ann Arbor, Michigan, USA
| | - Angela Roberts
- Communication Sciences and Disorders, Northwestern University, Evanston, Illinois, USA
| | - Emily Mower Provost
- Computer Science and Engineering, University of Michigan, Ann Arbor, Michigan, USA
|
36
|
Gómez-Rodellar A, Palacios-Alonso D, Ferrández Vicente JM, Mekyska J, Álvarez-Marquina A, Gómez-Vilda P. A Methodology to Differentiate Parkinson's Disease and Aging Speech Based on Glottal Flow Acoustic Analysis. Int J Neural Syst 2020; 30:2050058. [PMID: 32880202 DOI: 10.1142/s0129065720500586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Speech is controlled by axial neuromotor systems; it is therefore highly sensitive to the effects of neurodegenerative illnesses such as Parkinson's Disease (PD). Patients suffering from PD present important alterations in speech, which are manifested in phonation, articulation, prosody, and fluency. These alterations may be evaluated using statistical methods on features obtained from glottal, spectral, cepstral, or fractal descriptions of speech. This work introduces an evaluation paradigm based on Information Theory (IT) to differentiate the effects of PD and aging on glottal amplitude distributions. The study is conducted on a database including 48 PD patients (24 males, 24 females), 48 age-matched healthy controls (HC, 24 males, 24 females), and 48 mid-age normative subjects (NS, 24 males, 24 females). It may be concluded from the study that Hierarchical Clustering (HiCl) methods produce a clear separation between the phonation of PD patients and NS subjects (accuracy of 89.6% for both male and female subsets), but the separation between PD patients and HC subjects is less efficient (accuracy of 75.0% for the male subset and 70.8% for the female subset). Conversely, using feature selection and Support Vector Machine (SVM) classification, the differentiation between PD and HC is substantially improved (accuracy of 94.8% for the male subset and 92.8% for the female subset). This improvement was mainly boosted by feature selection, at a cost of information and generalization losses. The results point to the possibility that speech deterioration may affect HC phonation with aging, reducing its difference to PD phonation.
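One common information-theoretic way to compare two amplitude distributions is the Jensen-Shannon divergence between their normalized histograms. The abstract does not say which measure the authors used, so this choice is an assumption made purely for illustration, as are the function names and the toy histograms.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two normalized histograms."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Toy normalized histograms; hypothetical values, not data from the study.
speaker_a = [0.10, 0.40, 0.40, 0.10]
speaker_b = [0.25, 0.25, 0.25, 0.25]
print(round(js_divergence(speaker_a, speaker_b), 4))
```

With base-2 logarithms the divergence is symmetric and bounded between 0 (identical histograms) and 1 (non-overlapping histograms), which makes pairwise values usable as distances for a clustering step.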
Affiliation(s)
- Andrés Gómez-Rodellar
- Usher Institute, Medical School, University of Edinburgh, Old Medical School, Teviot Place, Edinburgh, EH8 9AG UK
| | - Daniel Palacios-Alonso
- Escuela Técnica Superior de Ingeniería Informática, Universidad Rey Juan Carlos, Calle Tulipán, s/n, 28933 Móstoles, Madrid, Spain
| | - José M Ferrández Vicente
- Universidad Politécnica de Cartagena, Campus Universitario Muralla del Mar, Pza. Hospital 1, 30202 Cartagena, Spain
| | - Jiri Mekyska
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Agustín Álvarez-Marquina
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
| | - Pedro Gómez-Vilda
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
|
37
|
Thijs Z, Watts CR. Perceptual Characterization of Voice Quality in Nonadvanced Stages of Parkinson's Disease. J Voice 2020; 36:293.e11-293.e18. [PMID: 32703725 DOI: 10.1016/j.jvoice.2020.05.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/29/2019] [Revised: 02/17/2020] [Accepted: 05/04/2020] [Indexed: 10/23/2022]
Abstract
INTRODUCTION Parkinson's disease (PD) is a neurodegenerative disorder that impacts motor and nonmotor systems, and consequently influences voice. In later stages of the disease, people with PD develop salient hypokinetic dysarthria. However, it is unclear how extensive the voice impairment is in the nonadvanced stages of PD. Therefore, the aim of the current research was to investigate the auditory-perceptual characteristics of voice in people with Parkinson's disease (PWPD) in nonadvanced stages. METHODS 29 PWPD and 32 healthy older controls were recruited. For each participant, a recording of the sentence "We were away a year ago" was acquired. These recordings were evaluated by 2 licensed and experienced speech-language pathologists, who provided perceptual ratings of overall dysphonia severity, breathiness, roughness, and perceived age. RESULTS MANCOVA analysis showed that, when controlling for age and intensity, there was a significant effect of group (P = 0.001) on perceptual voice quality. PWPD were perceived to be significantly older, more breathy and more severely dysphonic than the older healthy controls. No differences were found for the perceived roughness. CONCLUSIONS The results suggest that perceptual features of hypokinetic dysarthria in voice, specifically breathiness, are present in nonadvanced stages of PWPD and may contribute to listener perceptions of speaker age. Moreover, the perceptual voice profiles in PWPD showed great variability, possibly reflecting the heterogeneity of disease impact on individuals. The results of this study may inform how research targets rehabilitation and maintenance of voice and laryngeal function in PWPD at nonadvanced stages.
Affiliation(s)
- Zoë Thijs
- Texas Christian University, Fort Worth, Texas, USA.
|
38
|
Iakovakis D, Diniz JA, Trivedi D, Chaudhuri RK, Hadjileontiadis LJ, Hadjidimitriou S, Charisis V, Bostanjopoulou S, Katsarou Z, Klingelhoefer L, Mayer S, Reichmann H, Dias SB. Early Parkinson's Disease Detection via Touchscreen Typing Analysis using Convolutional Neural Networks. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2020; 2019:3535-3538. [PMID: 31946641 DOI: 10.1109/embc.2019.8857211] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Parkinson's Disease (PD) is the second most common neurodegenerative disorder worldwide, causing both motor and non-motor symptoms. In the early stages, symptoms are mild and patients may ignore their existence. As a result, they do not undergo any related clinical examination, which delays their PD diagnosis. In an effort to remedy this delay, analysis of data passively captured from users' interaction with consumer technologies has recently been explored for remote screening of early PD motor signs. In the current study, a smartphone-based method analyzing subjects' finger interaction with the smartphone screen is developed for the quantification of fine-motor skills decline in early PD using Convolutional Neural Networks. Experimental results from the analysis of in-the-clinic keystroke typing data from 18 early PD patients and 15 healthy controls have shown a classification performance of 0.89 Area Under the Curve (AUC) with 0.79/0.79 sensitivity/specificity, respectively. The generalization ability of the proposed approach was evaluated by applying it to typing data arising from a separate self-reported cohort of 27 PD patients' and 84 healthy controls' daily usage of their personal smartphones (data in-the-wild), achieving 0.79 AUC with 0.74/0.78 sensitivity/specificity, respectively. The results show the potential of the proposed approach to process keystroke dynamics arising from users' natural typing activity to detect PD, which contributes to the development of digital tools for remote pathological symptom screening.
|
39
|
Khan T, Lundgren LE, Anderson DG, Nowak I, Dougherty M, Verikas A, Pavel M, Jimison H, Nowaczyk S, Aharonson V. Assessing Parkinson's disease severity using speech analysis in non-native speakers. COMPUT SPEECH LANG 2020. [DOI: 10.1016/j.csl.2019.101047] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
40
|
Drimalla H, Scheffer T, Landwehr N, Baskow I, Roepke S, Behnia B, Dziobek I. Towards the automatic detection of social biomarkers in autism spectrum disorder: introducing the simulated interaction task (SIT). NPJ Digit Med 2020; 3:25. [PMID: 32140568 PMCID: PMC7048784 DOI: 10.1038/s41746-020-0227-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2019] [Accepted: 01/17/2020] [Indexed: 12/28/2022] Open
Abstract
Social interaction deficits are evident in many psychiatric conditions and specifically in autism spectrum disorder (ASD), but are hard to assess objectively. We present a digital tool to automatically quantify biomarkers of social interaction deficits: the simulated interaction task (SIT), which entails a standardized 7-min simulated dialog via video and the automated analysis of facial expressions, gaze behavior, and voice characteristics. In a study with 37 adults with ASD without intellectual disability and 43 healthy controls, we show the potential of the tool as a diagnostic instrument and for better description of ASD-associated social phenotypes. Using machine-learning tools, we detected individuals with ASD with an accuracy of 73%, sensitivity of 67%, and specificity of 79%, based on their facial expressions and vocal characteristics alone. In particular, reduced social smiling and facial mimicry as well as a higher voice fundamental frequency and harmonics-to-noise ratio were characteristic of individuals with ASD. The time-effective and cost-effective computer-based analysis outperformed a majority vote and performed on par with clinical expert ratings.
Affiliation(s)
- Hanna Drimalla
- Department of Psychology, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
- Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Prof.-Dr.-Helmert-Str. 2-3, 14482 Potsdam, Germany
| | - Tobias Scheffer
- Institute of Computer Science, University of Potsdam, Am Neuen Palais 10, 14469 Potsdam, Germany
| | - Niels Landwehr
- Institute of Computer Science, University of Potsdam, Am Neuen Palais 10, 14469 Potsdam, Germany
- Leibniz Institute for Agricultural Engineering and Bioeconomy, Max-Eyth-Allee 100, 14469 Potsdam, Germany
| | - Irina Baskow
- Department of Psychology, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
- Department of Psychiatry, Charité-Universitätsmedizin Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203 Berlin, Germany
| | - Stefan Roepke
- Department of Psychiatry, Charité-Universitätsmedizin Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203 Berlin, Germany
| | - Behnoush Behnia
- Department of Psychiatry, Charité-Universitätsmedizin Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203 Berlin, Germany
| | - Isabel Dziobek
- Department of Psychology, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
- Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
|
41
|
Karan B, Sahu SS, Mahto K. Parkinson disease prediction using intrinsic mode function based features from speech signal. Biocybern Biomed Eng 2020. [DOI: 10.1016/j.bbe.2019.05.005] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
42
|
Yaman O, Ertam F, Tuncer T. Automated Parkinson's disease recognition based on statistical pooling method using acoustic features. Med Hypotheses 2019; 135:109483. [PMID: 31954340 DOI: 10.1016/j.mehy.2019.109483] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2019] [Revised: 11/06/2019] [Accepted: 11/08/2019] [Indexed: 02/08/2023]
Abstract
Parkinson's disease is one of the most common neurological diseases. It affects the nervous system and hinders people's vital activities. The majority of Parkinson's patients lose their ability to speak, write, and balance. Many machine learning methods have been proposed to automatically diagnose Parkinson's disease using acoustic, handwriting, and gait data. In this study, a statistical pooling method is proposed to recognize Parkinson's disease using vowels. The Parkinson's disease dataset used contains vowel features. In the proposed method, the number of features in the dataset is increased by applying the statistical pooling method. Then, the most heavily weighted features are selected from the enlarged feature vector using ReliefF. Classification is performed on the resulting feature vector using Support Vector Machine (SVM) and K Nearest Neighbor (KNN) algorithms. The success rate was calculated as 91.25% and 91.23% using SVM and KNN, respectively. The proposed method has two main contributions. The first is to obtain new features from the Parkinson's acoustic dataset using the statistical pooling method. The second is the selection of the most significant features from the many feature vectors obtained. Thus, successful results were obtained for both the KNN and SVM algorithms. The comparative results clearly show that the proposed method achieved the best success rate among the selected state-of-the-art methods. Considering the results obtained, the proposed method is successful for Parkinson's disease recognition.
Affiliation(s)
- Orhan Yaman
- Department of Informatics, Firat University, Elazig, Turkey.
| | - Fatih Ertam
- Department of Digital Forensics Engineering, Firat University, Elazig, Turkey.
| | - Turker Tuncer
- Department of Digital Forensics Engineering, Firat University, Elazig, Turkey.
|
43
|
Arora S, Visanji NP, Mestre TA, Tsanas A, AlDakheel A, Connolly BS, Gasca-Salas C, Kern DS, Jain J, Slow EJ, Faust-Socher A, Lang AE, Little MA, Marras C. Investigating Voice as a Biomarker for Leucine-Rich Repeat Kinase 2-Associated Parkinson's Disease. JOURNAL OF PARKINSONS DISEASE 2019; 8:503-510. [PMID: 30248062 DOI: 10.3233/jpd-181389] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
We investigate the potential association between leucine-rich repeat kinase 2 (LRRK2) mutations and voice. Sustained phonations ('aaah' sounds) were recorded from 7 individuals with LRRK2-associated Parkinson's disease (PD), 17 participants with idiopathic PD (iPD), 20 non-manifesting LRRK2-mutation carriers, 25 related non-carriers, and 26 controls. In distinguishing LRRK2-associated PD and iPD, the mean sensitivity was 95.4% (SD 17.8%) and mean specificity was 89.6% (SD 26.5%). Voice features for non-manifesting carriers, related non-carriers, and controls were much less discriminatory. Vocal deficits in LRRK2-associated PD may be different than those in iPD. These preliminary results warrant longitudinal analyses and replication in larger cohorts.
Affiliation(s)
| | - Naomi P. Visanji
- The Edmond J. Safra Program in Parkinson’s Disease and the Morton and Gloria Shulman Movement Disorders Centre and, Toronto Western Hospital, Toronto, ON, Canada
| | - Tiago A. Mestre
- Department of Medicine, Parkinson’s Disease and Movement Disorders Center, Division of Neurology, The Ottawa Hospital Research Institute, University of Ottawa Brain and Mind Institute, Ottawa, Canada
| | - Athanasios Tsanas
- Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
| | - Amaal AlDakheel
- The Edmond J. Safra Program in Parkinson’s Disease and the Morton and Gloria Shulman Movement Disorders Centre and, Toronto Western Hospital, Toronto, ON, Canada
| | - Barbara S. Connolly
- Department of Medicine, Division of Neurology, Hamilton Health Sciences, McMaster University, Hamilton, ON, Canada
| | - Carmen Gasca-Salas
- The Edmond J. Safra Program in Parkinson’s Disease and the Morton and Gloria Shulman Movement Disorders Centre and, Toronto Western Hospital, Toronto, ON, Canada
| | - Drew S. Kern
- Department of Neurology, Movement Disorders Center, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
| | - Jennifer Jain
- The Edmond J. Safra Program in Parkinson’s Disease and the Morton and Gloria Shulman Movement Disorders Centre and, Toronto Western Hospital, Toronto, ON, Canada
| | - Elizabeth J. Slow
- The Edmond J. Safra Program in Parkinson’s Disease and the Morton and Gloria Shulman Movement Disorders Centre and, Toronto Western Hospital, Toronto, ON, Canada
| | - Achinoam Faust-Socher
- The Edmond J. Safra Program in Parkinson’s Disease and the Morton and Gloria Shulman Movement Disorders Centre and, Toronto Western Hospital, Toronto, ON, Canada
| | - Anthony E. Lang
- The Edmond J. Safra Program in Parkinson’s Disease and the Morton and Gloria Shulman Movement Disorders Centre and, Toronto Western Hospital, Toronto, ON, Canada
| | - Max A. Little
- Engineering and Applied Science, Aston University, Birmingham, UK
- Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Connie Marras
- The Edmond J. Safra Program in Parkinson’s Disease and the Morton and Gloria Shulman Movement Disorders Centre and, Toronto Western Hospital, Toronto, ON, Canada
|
44
|
Ali L, Zhu C, Zhang Z, Liu Y. Automated Detection of Parkinson's Disease Based on Multiple Types of Sustained Phonations Using Linear Discriminant Analysis and Genetically Optimized Neural Network. IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE-JTEHM 2019; 7:2000410. [PMID: 32166050 PMCID: PMC6876932 DOI: 10.1109/jtehm.2019.2940900] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/17/2019] [Revised: 07/30/2019] [Accepted: 09/04/2019] [Indexed: 11/24/2022]
Abstract
Objective: Parkinson’s disease (PD) is a serious neurodegenerative disorder. It is reported that most PD patients have voice impairments, but these impairments are not perceptible to ordinary listeners. Therefore, different machine learning methods have been developed for automated PD detection. However, these methods either lack generalization and clinically significant classification performance or face the problem of subject overlap. Methods: To overcome these problems, we attempt to develop a hybrid intelligent system that can automatically perform acoustic analysis of voice signals in order to detect PD. The proposed system uses linear discriminant analysis (LDA) for dimensionality reduction and a genetic algorithm (GA) for hyperparameter optimization of the neural network (NN) that is used as a predictive model. Moreover, to avoid subject overlap, we use leave-one-subject-out (LOSO) validation. Results: The proposed method, namely LDA-NN-GA, is evaluated in numerical experiments on multiple types of sustained phonation data in terms of accuracy, sensitivity, specificity, and Matthews correlation coefficient. It achieves classification accuracy of 95% on the training database and 100% on the testing database using all the extracted features. However, because the dataset is imbalanced in terms of gender, we eliminated the gender-dependent features to obtain unbiased results and obtained accuracy of 80% for the training database and 82.14% for the testing database, which appear to be less biased results. Conclusion: Compared with previous machine learning methods, the proposed LDA-NN-GA method shows better performance and lower complexity. Clinical Impact: The experimental results suggest that the proposed automated diagnostic system has the potential to distinguish PD patients from healthy subjects. Additionally, in future the proposed method can also be exploited for prodromal and differential diagnosis, which are considered challenging tasks.
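The leave-one-subject-out (LOSO) validation mentioned in this abstract can be illustrated with a minimal Python sketch. It only shows the splitting idea (all recordings of one subject are held out together, so no subject appears in both training and test data); the function name `leave_one_subject_out` and the example subject labels are hypothetical and are not taken from the paper, and the LDA, NN, and GA stages are not reproduced here.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids[i] is the subject who produced recording i. Every recording
    of the held-out subject goes to the test fold, so there is no subject
    overlap between the training and test indices.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical example: six recordings from three subjects.
recording_subjects = ["a", "a", "b", "c", "c", "c"]
for subject, train_idx, test_idx in leave_one_subject_out(recording_subjects):
    print(subject, train_idx, test_idx)
```

A plain random split over recordings would let the same speaker contribute to both sides, which is the subject-overlap problem the abstract refers to.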
Affiliation(s)
- Liaqat Ali
- School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| | - Ce Zhu
- School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| | - Zhonghao Zhang
- School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| | - Yipeng Liu
- School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
|
45
|
Rozenstoks K, Novotny M, Horakova D, Rusz J. Automated Assessment of Oral Diadochokinesis in Multiple Sclerosis Using a Neural Network Approach: Effect of Different Syllable Repetition Paradigms. IEEE Trans Neural Syst Rehabil Eng 2019; 28:32-41. [PMID: 31545738 DOI: 10.1109/tnsre.2019.2943064] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Slow and irregular oral diadochokinesis represents an important manifestation of spastic and ataxic dysarthria in multiple sclerosis (MS). We aimed to develop a robust algorithm based on convolutional neural networks for the accurate detection of syllables from different types of alternating motion rate (AMR) and sequential motion rate (SMR) paradigms. Subsequently, we explored the sensitivity of AMR and SMR paradigms based on voiceless and voiced consonants in the detection of speech impairment. The four types of syllable repetition paradigms including /ta/, /da/, /pa/-/ta/-/ka/, and /ba/-/da/-/ga/ were collected from 120 MS patients and 60 matched healthy control speakers. Our neural network algorithm was able to correctly identify the position of individual syllables with a very high average accuracy of 97.8%, with the correct temporal detection of syllable position of 87.8% for 10 ms and 95.5% for 20 ms tolerance value. We found significantly altered diadochokinetic rate and regularity in MS compared to controls across all types of investigated tasks ( ). MS patients showed slower speech for SMR compared to AMR tasks, whereas voiced paradigms were more irregular. Objective evaluation of oral diadochokinesis using different AMR and SMR paradigms may provide important information regarding speech severity and pathophysiology of the underlying disease.
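The tolerance-based figures in this abstract (detection within 10 ms or 20 ms) can be read as the share of reference syllable positions that have a detected position close enough in time. The sketch below illustrates that reading only; it is an assumption about the scoring, not the authors' evaluation code, and the function name `detection_rate` and the example timestamps are hypothetical. It does not enforce one-to-one matching between detections and references.

```python
def detection_rate(reference_ms, detected_ms, tolerance_ms):
    """Fraction of reference syllable positions matched within a tolerance.

    reference_ms: hand-labelled syllable positions in milliseconds.
    detected_ms: positions produced by a detector, in milliseconds.
    A reference position counts as detected if any detected position lies
    within tolerance_ms of it.
    """
    if not reference_ms:
        raise ValueError("at least one reference position is required")
    hits = sum(
        1
        for ref in reference_ms
        if any(abs(ref - det) <= tolerance_ms for det in detected_ms)
    )
    return hits / len(reference_ms)


# Hypothetical positions: the detector is off by 5, 15, and 60 ms.
reference = [100, 300, 500]
detected = [105, 315, 560]
print(detection_rate(reference, detected, 10))  # one of three within 10 ms
print(detection_rate(reference, detected, 20))  # two of three within 20 ms
```

Widening the tolerance can only keep or raise the rate, which is consistent with the higher figure reported for 20 ms than for 10 ms.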
|
46
|
Lan BL, Yeo JHW. Comparison of computer-key-hold-time and alternating-finger-tapping tests for early-stage Parkinson's disease. PLoS One 2019; 14:e0219114. [PMID: 31247037 PMCID: PMC6597101 DOI: 10.1371/journal.pone.0219114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2018] [Accepted: 06/14/2019] [Indexed: 11/19/2022] Open
Abstract
Giancardo et al. recently introduced the neuroQWERTY index (nQi), which is a novel motor index derived from computer-key-hold-time data using an ensemble regression algorithm, to detect early-stage Parkinson's disease. Here, we derive a much simpler motor index from their hold-time data, which is the standard deviation (SD) of the hold-time fluctuations, where a fluctuation is defined as the difference between the natural logs of successive hold times. Our results show that the performance of the SD and nQi tests in discriminating early-stage subjects from controls does not differ, although the SD index is much simpler. There is also no difference in performance between the SD and alternating-finger-tapping tests.
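The SD index described in this abstract follows directly from its definition: take the natural log of each hold time, difference successive values, and compute the standard deviation of those differences. The Python sketch below is a minimal illustration under stated assumptions: the function name `hold_time_sd_index` and the example values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the abstract does not say which estimator was used.

```python
import math
import statistics


def hold_time_sd_index(hold_times):
    """SD of fluctuations, where a fluctuation is ln(h[i+1]) - ln(h[i]).

    hold_times: positive key hold times in typing order (any time unit;
    the log differences do not depend on the unit).
    Assumption: sample standard deviation, which needs at least three
    hold times (two fluctuations).
    """
    logs = [math.log(t) for t in hold_times]
    fluctuations = [later - earlier for earlier, later in zip(logs, logs[1:])]
    return statistics.stdev(fluctuations)


# Hypothetical hold times in seconds.
example_hold_times = [0.10, 0.12, 0.09, 0.15, 0.11]
print(hold_time_sd_index(example_hold_times))
```

Because only log differences enter the index, rescaling every hold time by a constant factor leaves the value unchanged.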
Affiliation(s)
- Boon Leong Lan
- Electrical and Computer Systems Engineering & Advanced Engineering Platform, School of Engineering, Monash University, Bandar Sunway, Malaysia
| | - Jacob Hsiao Wen Yeo
- Electrical and Computer Systems Engineering & Advanced Engineering Platform, School of Engineering, Monash University, Bandar Sunway, Malaysia
|
47
|
Arora S, Baghai-Ravary L, Tsanas A. Developing a large scale population screening tool for the assessment of Parkinson's disease using telephone-quality voice. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:2871. [PMID: 31153319 PMCID: PMC6509044 DOI: 10.1121/1.5100272] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/28/2018] [Revised: 03/05/2019] [Accepted: 04/09/2019] [Indexed: 05/25/2023]
Abstract
Recent studies have demonstrated that analysis of laboratory-quality voice recordings can be used to accurately differentiate people diagnosed with Parkinson's disease (PD) from healthy controls (HCs). These findings could help facilitate the development of remote screening and monitoring tools for PD. In this study, 2759 telephone-quality voice recordings from 1483 PD and 15 321 recordings from 8300 HC participants were analyzed. To account for variations in phonetic backgrounds, data were acquired from seven countries. A statistical framework for analyzing voice was developed, whereby 307 dysphonia measures that quantify different properties of voice impairment, such as breathiness, roughness, monopitch, hoarse voice quality, and exaggerated vocal tremor, were computed. Feature selection algorithms were used to identify robust parsimonious feature subsets, which were used in combination with a random forests (RFs) classifier to accurately distinguish PD from HC. The best tenfold cross-validation performance was obtained using Gram-Schmidt orthogonalization and RF, leading to mean sensitivity of 64.90% (standard deviation, SD, 2.90%) and mean specificity of 67.96% (SD 2.90%). This large scale study is a step forward toward assessing the development of a reliable, cost-effective, and practical clinical decision support tool for screening the population at large for PD using telephone-quality voice.
Affiliation(s)
- Siddharth Arora
- Somerville College, University of Oxford, Oxford, OX2 6HD, United Kingdom
| | | | - Athanasios Tsanas
- Usher Institute of Population Health Sciences and Informatics, Medical School, University of Edinburgh, Edinburgh, EH16 4UX, United Kingdom
|
48
|
Changes in Phonation and Their Relations with Progress of Parkinson’s Disease. APPLIED SCIENCES-BASEL 2018. [DOI: 10.3390/app8122339] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Hypokinetic dysarthria, which is associated with Parkinson’s disease (PD), affects several speech dimensions, including phonation. Although the scientific community has dealt with quantitative analysis of phonation in PD patients, comprehensive research revealing probable relations between phonatory features and the progress of PD is missing. Therefore, the aim of this study is to explore these relations and model them mathematically to be able to estimate the progress of PD during a two-year follow-up. We enrolled 51 PD patients who were assessed by three commonly used clinical scales. In addition, we quantified eight possible phonatory disorders in five vowels. To identify the relationship between baseline phonatory features and changes in clinical scores, we performed a partial correlation analysis. Finally, we trained XGBoost models to predict the changes in clinical scores during a two-year follow-up. Over the two years, the patients’ voices became more aperiodic with increased microperturbations of frequency and amplitude. Next, the XGBoost models were able to predict changes in clinical scores with an error in the range of 11–26%. Although we identified some significant correlations between changes in phonatory features and clinical scores, they are less interpretable. This study suggests that it is possible to predict the progress of PD based on the acoustic analysis of phonation. Moreover, it recommends utilizing the sustained vowel /i/ instead of /a/.
|
49
|
Iakovakis D, Hadjidimitriou S, Charisis V, Bostantjopoulou S, Katsarou Z, Klingelhoefer L, Reichmann H, Dias SB, Diniz JA, Trivedi D, Chaudhuri KR, Hadjileontiadis LJ. Motor Impairment Estimates via Touchscreen Typing Dynamics Toward Parkinson's Disease Detection From Data Harvested In-the-Wild. Front ICT 2018. [DOI: 10.3389/fict.2018.00028] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]
|
50
|
|