1
|
Idrisoglu A, Dallora AL, Anderberg P, Berglund JS. Applied Machine Learning Techniques to Diagnose Voice-Affecting Conditions and Disorders: Systematic Literature Review. J Med Internet Res 2023; 25:e46105. [PMID: 37467031 PMCID: PMC10398366 DOI: 10.2196/46105] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/30/2023] [Revised: 04/26/2023] [Accepted: 05/23/2023] [Indexed: 07/20/2023]
Abstract
BACKGROUND Normal voice production depends on the synchronized cooperation of multiple physiological systems, which makes the voice sensitive to changes. Any systematic, neurological, and aerodigestive distortion is prone to affect voice production through reduced cognitive, pulmonary, and muscular functionality. This sensitivity inspired using voice as a biomarker to examine disorders that affect the voice. Technological improvements and emerging machine learning (ML) technologies have enabled possibilities of extracting digital vocal features from the voice for automated diagnosis and monitoring systems. OBJECTIVE This study aims to summarize a comprehensive view of research on voice-affecting disorders that uses ML techniques for diagnosis and monitoring through voice samples where systematic conditions, nonlaryngeal aerodigestive disorders, and neurological disorders are specifically of interest. METHODS This systematic literature review (SLR) investigated the state of the art of voice-based diagnostic and monitoring systems with ML technologies, targeting voice-affecting disorders without direct relation to the voice box from the point of view of applied health technology. Through a comprehensive search string, studies published from 2012 to 2022 from the databases Scopus, PubMed, and Web of Science were scanned and collected for assessment. To minimize bias, retrieval of the relevant references in other studies in the field was ensured, and 2 authors assessed the collected studies. Low-quality studies were removed through a quality assessment and relevant data were extracted through summary tables for analysis. The articles were checked for similarities between author groups to prevent cumulative redundancy bias during the screening process, where only 1 article was included from the same author group. 
RESULTS In the analysis of the 145 included studies, support vector machines were the most utilized ML technique (51/145, 35.2%), with the most studied disease being Parkinson disease (PD; reported in 87/145, 60%, studies). After 2017, 16 additional voice-affecting disorders were examined, in contrast to the 3 investigated previously. Furthermore, an upsurge in the use of artificial neural network-based architectures was observed after 2017. Almost half of the included studies were published in the last 2 years (2021 and 2022). A broad interest from many countries was observed. Notably, nearly one-half (n=75) of the studies relied on 10 distinct data sets, and 11/145 (7.6%) used demographic data as an input for ML models. CONCLUSIONS This SLR revealed considerable interest across multiple countries in using ML techniques for diagnosing and monitoring voice-affecting disorders, with PD being the most studied disorder. However, the review identified several gaps, including limited and unbalanced data set usage in studies, and a focus on diagnostic tests rather than disorder-specific monitoring. Despite the limitation of being restricted to peer-reviewed publications written in English, the SLR provides valuable insights into the current state of research on ML-based voice-affecting disorder diagnosis and monitoring and highlights areas to address in future research.
Affiliation(s)
- Alper Idrisoglu
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- Ana Luiza Dallora
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- Peter Anderberg
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- School of Health Sciences, University of Skövde, Skövde, Sweden
|
2
|
Mahmood A, Mehroz Khan M, Imran M, Alhajlah O, Dhahri H, Karamat T. End-to-End Deep Learning Method for Detection of Invasive Parkinson’s Disease. Diagnostics (Basel) 2023; 13:1088. [PMID: 36980396 PMCID: PMC10047182 DOI: 10.3390/diagnostics13061088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 02/18/2023] [Accepted: 02/23/2023] [Indexed: 03/18/2023]
Abstract
Parkinson’s disease directly affects the nervous system and causes a change in voice, lower efficiency in daily routine tasks, failure of organs, and death. As an estimate, nearly ten million people are suffering from Parkinson’s disease worldwide, and this number is increasing day by day. The main cause of an increase in Parkinson’s disease patients is the unavailability of reliable procedures for diagnosing Parkinson’s disease. In the literature, we observed different methods for diagnosing Parkinson’s disease such as gait movement, voice signals, and handwriting tests. The detection of Parkinson’s disease is a difficult task because the important features that can help in detecting Parkinson’s disease are unknown. Our aim in this study is to extract those essential voice features which play a vital role in detecting Parkinson’s disease and develop a reliable model which can diagnose Parkinson’s disease at its early stages. Early diagnostic systems for the detection of Parkinson’s disease are needed to diagnose Parkinson’s disease early so that it can be controlled at the initial stages, but existing models have limitations that can lead to misdiagnosis of the disease. Our proposed model can assist practitioners in continuously monitoring the Parkinson’s disease rating scale, known as the Total Unified Parkinson’s Disease Scale, which can help practitioners in treating their patients. The proposed model can detect Parkinson’s disease with an error of 0.10 RMSE, which is lower than that of existing models. The proposed model has the capability to extract vital voice features which can help detect Parkinson’s disease in its early stages.
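For readers unfamiliar with the error measure quoted in this abstract, the following is a minimal sketch of how a root-mean-square error (RMSE) is computed between true and predicted rating-scale scores. The function name `rmse` and the score values are hypothetical and invented for illustration; they are not taken from the paper, and the abstract does not say whether its targets were normalized before the 0.10 figure was calculated.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical rating-scale scores, invented for illustration only.
example_true = [30.0, 25.0, 40.0, 35.0]
example_pred = [31.0, 24.0, 42.0, 35.0]
example_rmse = rmse(example_true, example_pred)
```

Because RMSE is expressed in the units of the target variable, a value such as 0.10 can only be interpreted once the scale of the target is known.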
|
3
|
Ge W, Lueck C, Suominen H, Apthorp D. Has machine learning over-promised in healthcare? A critical analysis and a proposal for improved evaluation, with evidence from Parkinson’s disease. Artif Intell Med 2023; 139:102524. [PMID: 37100503 DOI: 10.1016/j.artmed.2023.102524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Revised: 02/22/2023] [Accepted: 02/28/2023] [Indexed: 03/17/2023]
Abstract
Adoption of artificial intelligence (AI) by the medical community has long been anticipated, endorsed by a stream of machine learning literature showcasing AI systems that yield extraordinary performance. However, many of these systems are likely over-promising and will under-deliver in practice. One key reason is the community's failure to acknowledge and address the presence of inflationary effects in the data. These simultaneously inflate evaluation performance and prevent a model from learning the underlying task, thus severely misrepresenting how that model would perform in the real world. This paper investigated the impact of these inflationary effects on healthcare tasks, as well as how these effects can be addressed. Specifically, we defined three inflationary effects that occur in medical data sets and allow models to easily reach small training losses and prevent skillful learning. We investigated two data sets of sustained vowel phonation from participants with and without Parkinson's disease, and revealed that published models which have achieved high classification performances on these were artificially enhanced due to the inflationary effects. Our experiments showed that removing each inflationary effect corresponded with a decrease in classification accuracy, and that removing all inflationary effects reduced the evaluated performance by up to 30%. Additionally, the performance on a more realistic test set increased, suggesting that the removal of these inflationary effects enabled the model to better learn the underlying task and generalize. Source code is available at https://github.com/Wenbo-G/pd-phonation-analysis under the MIT license.
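The abstract does not name its three inflationary effects, so the sketch below should not be read as the authors' method. It illustrates one widely discussed source of optimistic evaluation in phonation data sets, namely recordings from the same participant appearing in both the training and test sets, which may or may not be among the effects the paper defines. The function `subject_wise_split`, the record layout, and the toy data are all hypothetical.

```python
import random


def subject_wise_split(records, test_fraction=0.25, seed=0):
    """Split (subject_id, features, label) records so that no subject
    contributes recordings to both the training and the test set."""
    subjects = sorted({r[0] for r in records})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [r for r in records if r[0] not in test_subjects]
    test = [r for r in records if r[0] in test_subjects]
    return train, test


# Toy data: 8 hypothetical subjects with 3 recordings each.
toy_records = [
    (f"s{i}", [float(i), float(j)], i % 2)
    for i in range(8)
    for j in range(3)
]
toy_train, toy_test = subject_wise_split(toy_records)
```

Splitting by subject rather than by recording means the reported score reflects performance on unseen people, which is usually the clinically relevant question.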
|
4
|
Ngo QC, Motin MA, Pah ND, Drotár P, Kempster P, Kumar D. Computerized analysis of speech and voice for Parkinson's disease: A systematic review. Comput Methods Programs Biomed 2022; 226:107133. [PMID: 36183641 DOI: 10.1016/j.cmpb.2022.107133] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/23/2022] [Revised: 09/13/2022] [Accepted: 09/13/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Speech impairment is an early symptom of Parkinson's disease (PD). This study has summarized the literature related to speech and voice in detecting PD and assessing its severity. METHODS A systematic review of the literature from 2010 to 2021 to investigate analysis methods and signal features. The keywords "Automatic analysis" in conjunction with "PD speech" or "PD voice" were used, and the PubMed and ScienceDirect databases were searched. A total of 838 papers were found on the first run, of which 189 were selected. One hundred and forty-seven were found to be suitable for the review. The different datasets, recording protocols, signal analysis methods and features that were reported are listed. Values of the features that separate PD patients from healthy controls were tabulated. Finally, the barriers that limit the wide use of computerized speech analysis are discussed. RESULTS Speech and voice may be valuable markers for PD. However, large differences between the datasets make it difficult to compare different studies. In addition, speech analytic methods that are not informed by physiological understanding may alienate clinicians. CONCLUSIONS The potential usefulness of speech and voice for the detection and assessment of PD is confirmed by evidence from the classification and correlation results.
Affiliation(s)
- Mohammod Abdul Motin
- Biosignals Lab, RMIT University, Melbourne, Australia; Department of Electrical & Electronic Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh
- Nemuel Daniel Pah
- Biosignals Lab, RMIT University, Melbourne, Australia; Universitas Surabaya, Indonesia
- Peter Drotár
- Intelligent Information Systems Lab, Technical University of Kosice, Letna 9, 42001, Kosice, Slovakia
- Peter Kempster
- Neurosciences Department, Monash Health, Clayton, VIC, Australia; Department of Medicine, School of Clinical Sciences, Monash University, Clayton, VIC, Australia
- Dinesh Kumar
- Biosignals Lab, RMIT University, Melbourne, Australia
|
5
|
Senturk ZK. Layer recurrent neural network-based diagnosis of Parkinson’s disease using voice features. BIOMED ENG-BIOMED TE 2022; 67:249-266. [DOI: 10.1515/bmt-2022-0022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Accepted: 05/18/2022] [Indexed: 12/13/2022]
Abstract
Parkinson’s disease (PD), a slow-progressing neurological disease, affects a large percentage of the world’s elderly population, and this population is expected to grow over the next decade. As a result, early detection is crucial for community health, so that proper safeguards can be taken and treatment made less arduous. Recent research has begun to focus on the motor system deficits caused by PD. Because most PD patients suffer from voice abnormalities, researchers working on automated diagnostic systems investigate vocal impairments. In this paper, we undertake extensive experiments with features extracted from voice signals. We propose a layer Recurrent Neural Network (RNN) based diagnosis for PD. To demonstrate the efficiency of the model, different network models are compared. To the best of our knowledge, several neural network topologies, namely RNN, Cascade Forward Neural Networks (CFNN), and Feed Forward Neural Networks (FFNN), are used and compared for voice-based PD detection for the first time. In addition, the impacts of data normalization and feature selection (FS) are thoroughly examined. The findings reveal that normalization increases classifier performance and that Laplacian-based FS outperforms the other FS approaches examined. The proposed RNN model with 300 voice features achieves 99.74% accuracy.
Affiliation(s)
- Zehra Karapinar Senturk
- Computer Engineering Department, Faculty of Engineering, Duzce University, 81620, Duzce, Turkey
|
6
|
Pramanik M, Pradhan R, Nandy P, Qaisar SM, Bhoi AK. Assessment of Acoustic Features and Machine Learning for Parkinson's Detection. J Healthc Eng 2021; 2021:9957132. [PMID: 34471507 DOI: 10.1155/2021/9957132] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/15/2021] [Revised: 07/22/2021] [Accepted: 08/13/2021] [Indexed: 12/23/2022]
Abstract
This article presents a machine learning approach for Parkinson's disease detection. Potential multiple acoustic signal features of Parkinson's and control subjects are ascertained. A collaborated feature bank is created through correlated feature selection, Fisher score feature selection, and mutual information-based feature selection schemes. A detection model on top of the feature bank has been developed using the traditional Naïve Bayes, which proved state of the art. The Naïve Bayes detector on collaborative acoustic features can detect the presence of Parkinson's magnificently with a detection accuracy of 78.97% and precision of 0.926, under the hold-out cross validation. The collaborative feature bank on Naïve Bayes revealed distinguishable results as compared to many other recently proposed approaches. The simplicity of Naïve Bayes makes the system robust and effective throughout the detection process.
|
7
|
Rusz J, Tykalova T, Ramig LO, Tripoliti E. Reply to: "Fostering Voice Objective Analysis in Patients With Movement Disorders". Mov Disord 2021; 36:1042-1043. [PMID: 33851752 DOI: 10.1002/mds.28539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2021] [Accepted: 01/28/2021] [Indexed: 11/10/2022]
Affiliation(s)
- Jan Rusz
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Tereza Tykalova
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Lorraine O Ramig
- University of Colorado-Boulder, Boulder, Colorado, USA; National Center for Voice and Speech, Denver, Colorado, USA; Columbia University, New York, New York, USA; LSVT Global, Inc., Tucson, Arizona, USA
- Elina Tripoliti
- Institute of Neurology, Department of Clinical and Movement Neurosciences, and National Hospital for Neurology and Neurosurgery, University College London, CLH NHS Trust, London, United Kingdom
|
8
|
Norel R, Agurto C, Heisig S, Rice JJ, Zhang H, Ostrand R, Wacnik PW, Ho BK, Ramos VL, Cecchi GA. Speech-based characterization of dopamine replacement therapy in people with Parkinson's disease. NPJ Parkinsons Dis 2020; 6:12. [PMID: 32566741 PMCID: PMC7293295 DOI: 10.1038/s41531-020-0113-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/26/2019] [Accepted: 05/19/2020] [Indexed: 11/10/2022]
Abstract
People with Parkinson's disease (PWP) are under constant tension with respect to their dopamine replacement therapy (DRT) regimen. Waiting too long between doses results in more prominent symptoms, loss of motor function, and greater risk of falling per step. Shortened pill cycles can lead to accelerated habituation and faster development of disabling dyskinesias. The Unified Parkinson's Disease Rating Scale (MDS-UPDRS) is the gold standard for monitoring Parkinson's disease progression but requires a neurologist to administer and therefore is not an ideal instrument to continuously evaluate short-term disease fluctuations. We investigated the feasibility of using speech to detect changes in medication states, based on expectations of subtle changes in voice and content related to dopaminergic levels. We calculated acoustic and prosodic features for three speech tasks (picture description, reverse counting, and diadochokinetic rate) for 25 PWP, each evaluated "ON" and "OFF" DRT. Additionally, we generated semantic features for the picture description task. Classification of ON/OFF medication states using features generated from picture description, reverse counting and diadochokinetic rate tasks resulted in cross-validated accuracy rates of 0.89, 0.84, and 0.60, respectively. The most discriminating task was picture description, which provided evidence that participants are more likely to use action words in the ON state than in the OFF state. We also found that speech tempo was modified by DRT. Our results suggest that automatic speech assessment can capture changes associated with the DRT cycle. Given the ease of acquiring speech data, this method shows promise to remotely monitor DRT effects.
Affiliation(s)
- R Norel
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- C Agurto
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- S Heisig
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- J J Rice
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- H Zhang
- Pfizer Digital Medicine & Translational Imaging: Early Clinical Development, Cambridge, MA 02139 USA
- R Ostrand
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
- P W Wacnik
- Pfizer Digital Medicine & Translational Imaging: Early Clinical Development, Cambridge, MA 02139 USA
- B K Ho
- Department of Neurology, Tufts University School of Medicine and Tufts Medical Center, 800 Washington St, Boston, MA 02111 USA
- V L Ramos
- Pfizer Digital Medicine & Translational Imaging: Early Clinical Development, Cambridge, MA 02139 USA
- G A Cecchi
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA
|
9
|
Lauraitis A, Maskeliūnas R, Damaševičius R, Krilavičius T. A Mobile Application for Smart Computer-Aided Self-Administered Testing of Cognition, Speech, and Motor Impairment. Sensors (Basel) 2020; 20:E3236. [PMID: 32517223 PMCID: PMC7309061 DOI: 10.3390/s20113236] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/16/2020] [Revised: 05/29/2020] [Accepted: 06/03/2020] [Indexed: 11/16/2022]
Abstract
We present a model for digital neural impairment screening and self-assessment, which can evaluate cognitive and motor deficits for patients with symptoms of central nervous system (CNS) disorders, such as mild cognitive impairment (MCI), Parkinson's disease (PD), Huntington's disease (HD), or dementia. The data were collected with an Android mobile application that can track cognitive, hand tremor, energy expenditure, and speech features of subjects. We extracted 238 features as the model inputs using 16 tasks, 12 of which were based on a self-administered cognitive testing (SAGE) methodology, while the others used finger tapping and voice features acquired from the sensors of a smart mobile device (smartphone or tablet). Fifteen subjects were involved in the investigation: 7 patients with neurological disorders (1 with Parkinson's disease, 3 with Huntington's disease, 1 with early dementia, 1 with cerebral palsy, 1 post-stroke) and 8 healthy subjects. The finger tapping, SAGE, energy expenditure, and speech analysis features were used for neural impairment evaluations. The best results were achieved using a fusion of 13 classifiers for combined finger tapping and SAGE features (96.12% accuracy), and using bidirectional long short-term memory (BiLSTM) (94.29% accuracy) for speech analysis features.
Affiliation(s)
- Andrius Lauraitis
- Department of Multimedia Engineering, Kaunas University of Technology, 50186 Kaunas, Lithuania
- Rytis Maskeliūnas
- Department of Multimedia Engineering, Kaunas University of Technology, 50186 Kaunas, Lithuania
- Robertas Damaševičius
- Department of Applied Informatics, Vytautas Magnus University, 44404 Kaunas, Lithuania
- Faculty of Applied Mathematics, Silesian University of Technology, 44-100 Gliwice, Poland
- Tomas Krilavičius
- Department of Applied Informatics, Vytautas Magnus University, 44404 Kaunas, Lithuania
- Baltic Institute of Advanced Technology, 01124 Vilnius, Lithuania
|
10
|
Affiliation(s)
- B. Sai Jahnavi
- Department of Electronics and Communication Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
- B. Sai Supraja
- Department of Electronics and Communication Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
- S. Lalitha
- Department of Electronics and Communication Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
|
11
|
Novotný M, Dušek P, Daly I, Růžička E, Rusz J. Glottal Source Analysis of Voice Deficits in Newly Diagnosed Drug-naïve Patients with Parkinson’s Disease: Correlation Between Acoustic Speech Characteristics and Non-Speech Motor Performance. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2019.101818] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 10/25/2022]
|
12
|
|