1
Pommée T, Morsomme D. Voice Quality in Telephone Interviews: A preliminary Acoustic Investigation. J Voice 2025; 39:563.e1-563.e20. [PMID: 36192289 DOI: 10.1016/j.jvoice.2022.08.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 08/24/2022] [Accepted: 08/25/2022] [Indexed: 10/07/2022]
Abstract
OBJECTIVES To investigate the impact of standardized mobile phone recordings passed through a telecom channel on acoustic markers of voice quality and on its perception by voice experts in normophonic speakers. METHODS Continuous speech and a sustained vowel were recorded for fourteen female and ten male normophonic speakers. The recordings were done simultaneously with a head-mounted high-quality microphone and through the telephone network on a receiving smartphone. Twenty-two acoustic voice quality, breathiness and pitch-related measures were extracted from the recordings. Nine vocologists perceptually rated the G, R and B parameters of the GRBAS scale on each voice sample. The reproducibility, the recording type, the stimulus type and the gender effects, as well as the correlation between acoustic and perceptual measures were investigated. RESULTS The sustained vowel samples are damped after one second. Only the frequencies between 100 and 3700Hz are passed through the telecom channel and the frequency response is characterized by peaks and troughs. The acoustic measures show a good reproducibility over the three repetitions. All measures significantly differ between the recording types, except for the local jitter, the harmonics-to-noise ratio by Dejonckere and Lebacq, the period standard deviation and all six pitch measures. The AVQI score is higher in telephone recordings, while the ABI score is lower. Significant differences between genders are also found for most of the measures; while the AVQI is similar in men and women, the ABI is higher in women in both recording types. For the perceptual assessment, the interrater agreement is rather low, while the reproducibility over the three repetitions is good. Few significant differences between recording types are observed, except for lower breathiness ratings on telephone recordings. G ratings are significantly more severe on the sustained vowel on both recording types, R ratings only on telephone recordings. 
While roughness is rated higher in men on telephone recordings by most experts, no gender effect is observed for breathiness on either recording type. Finally, neither the AVQI nor the ABI yields strong correlations with any of the perceptual parameters. CONCLUSIONS Our results show that passing a voice signal through a telecom channel induces filter and noise effects that limit the use of common acoustic voice quality measures and indexes. The AVQI and ABI are both significantly impacted by the recording type. The most reliable acoustic measures seem to be pitch perturbation (local jitter and period standard deviation) as well as the harmonics-to-noise ratio from Dejonckere and Lebacq. Our results also underline that raters are not equally sensitive to the various factors, including the recording type, the stimulus type and the gender effects. None of the three perceptual parameters G, R and B seems to be reliably measurable on telephone recordings using the two investigated acoustic indexes. Future studies investigating the impact of voice quality in telephone conversations should thus focus on acoustic measures on continuous speech samples that are limited to the frequency response of the telecom channel and that are not too sensitive to environmental and additive noise.
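The abstract above singles out local jitter and period standard deviation as the measures least affected by the telephone channel. The following is a minimal sketch of how these two pitch-perturbation measures are commonly defined, assuming the glottal period durations (in seconds) have already been extracted from the recording. The function names and the sample values are illustrative and are not taken from the study; using the sample standard deviation is likewise an assumption.

```python
from statistics import mean, stdev


def local_jitter(periods):
    """Mean absolute difference between consecutive periods,
    divided by the mean period (a dimensionless ratio)."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return mean(diffs) / mean(periods)


def period_std(periods):
    """Sample standard deviation of the period durations, in seconds
    (assumption: sample rather than population standard deviation)."""
    return stdev(periods)


# Hypothetical period durations for a voice around 200 Hz.
periods = [0.0050, 0.0051, 0.0049, 0.0050, 0.0052]
print(f"local jitter: {100 * local_jitter(periods):.2f} %")
print(f"period SD:    {period_std(periods):.6f} s")
```

Both measures depend only on cycle-to-cycle timing, which may be one reason they are less sensitive to the band-limiting described in the abstract than spectral measures are.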
Affiliation(s)
- Timothy Pommée
- Research Unit for a life-Course perspective on Health and Education, Voice Unit, University of Liège, Belgium.
- Dominique Morsomme
- Research Unit for a life-Course perspective on Health and Education, Voice Unit, University of Liège, Belgium
2
Rusko M, Sabo R, Trnka M, Zimmermann A, Malaschitz R, Ružický E, Brandoburová P, Kevická V, Škorvánek M. Slovak database of speech affected by neurodegenerative diseases. Sci Data 2024; 11:1320. [PMID: 39632912 PMCID: PMC11618578 DOI: 10.1038/s41597-024-04171-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Accepted: 11/26/2024] [Indexed: 12/07/2024] Open
Abstract
A new Slovak speech database EWA-DB was created for research purposes aimed at early detection of neurodegenerative diseases from speech. It contains 1649 speakers performing various speech and language tasks, such as sustained vowel phonation, diadochokinesis, naming and picture description. The sample of speakers consists of individuals with Alzheimer's disease, mild cognitive impairment, Parkinson's disease, and healthy controls. In this article we describe the EWA-DB development process, the language and speech task selection, patient and healthy control recruitment, as well as the testing and recording protocol. The structure and content of the database and file formats are described in detail. We assume that the presented database could be suitable for the development of automatic systems predicting the diagnoses of Alzheimer's disease, mild cognitive impairment, and Parkinson's disease from language and speech features.
Affiliation(s)
- Milan Rusko
- Institute of Informatics of the Slovak Academy of Sciences, Bratislava, Slovakia.
- Róbert Sabo
- Institute of Informatics of the Slovak Academy of Sciences, Bratislava, Slovakia
- Marián Trnka
- Institute of Informatics of the Slovak Academy of Sciences, Bratislava, Slovakia
- Eugen Ružický
- Faculty of Informatics, Pan European University, Bratislava, Slovakia
- Petra Brandoburová
- Department of Psychology, Faculty of Arts, Comenius University, Bratislava, Slovakia
- MEMORY Centre, Bratislava, Slovakia
- 2nd Department of Neurology, University Hospital, Bratislava, Slovakia
- Viktória Kevická
- Institute of Informatics of the Slovak Academy of Sciences, Bratislava, Slovakia
- Department of Communication Disorders, Faculty of Education, Comenius University, Bratislava, Slovakia
- Matej Škorvánek
- Department of Neurology, Faculty of Medicine, P.J. Safarik University, Kosice, Slovakia
- Department of Neurology, University Hospital L. Pasteur, Kosice, Slovakia
3
Fumel J, Bahuaud D, Weed E, Fusaroli R, Basirat A. A Systematic Review and Bayesian Meta-Analysis of Acoustic Measures of Prosody in Parkinson's Disease. Journal of Speech, Language, and Hearing Research: JSLHR 2024; 67:2548-2564. [PMID: 39018262 DOI: 10.1044/2024_jslhr-23-00588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/19/2024]
Abstract
PURPOSE Linguistic prosody is affected in Parkinson's disease (PD), which implicates the basal ganglia's role in the production of prosody. However, there is no recent systematic synthesis of the available acoustic evidence of prosodic impairment in PD. This study aimed to identify the acoustic features of linguistic prosody that are consistently affected in PD. METHOD The authors systematically reviewed articles that reported acoustic features of prosodic production in PD. Articles focused on fundamental frequency (F0) and its variability, intensity and its variability, speech and articulation rate, and pause duration and ratio. From a total of 648 records identified, 36 met criteria for inclusion and exclusion. For each acoustic measurement and task, data from people with PD (PwPD) were compared with those from controls to extract effect sizes. Pooled effect sizes were estimated using robust Bayesian hierarchical regression models. RESULTS PD was associated with decreased F0 variability and increased pause duration. There was limited evidence of reduced intensity variability and speech rate in PwPD. No evidence was found to suggest that PD affects articulation rate or pause ratio. CONCLUSIONS The primary acoustic parameters of prosody affected by PD are F0 variability and pause duration. The identification of these acoustic parameters has important clinical implications for the selection of PD management strategies. The association of F0 variability and pause duration with PD suggests that the neural circuits controlling these parameters are at least partly shared and might include the basal ganglia. While the current study focused on the phonetic realization of prosodic cues, future studies should examine whether and how PD affects prosody at higher levels of processing. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.25892923.
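The abstract above states that patient and control data were compared to extract effect sizes but does not say which effect size was used. As an illustration only, the sketch below assumes a standardized mean difference with Hedges' small-sample correction, a common choice in meta-analyses of group comparisons. The function name and the input numbers are hypothetical.

```python
from math import sqrt


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (group A minus group B)
    with Hedges' small-sample bias correction."""
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled standard deviation across both groups.
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    d = (mean_a - mean_b) / sqrt(pooled_var)
    # Approximate correction factor J = 1 - 3 / (4 * df - 1), df = n_a + n_b - 2.
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return correction * d


# Hypothetical F0 variability (semitones): patients vs. controls.
g = hedges_g(mean_a=2.1, sd_a=0.8, n_a=20, mean_b=2.9, sd_b=0.9, n_b=22)
print(f"Hedges' g = {g:.3f}")
```

A negative value here would indicate lower F0 variability in the patient group, which is the direction the review reports.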
Affiliation(s)
- Jules Fumel
- Univ. Lille, CNRS, UMR 9193 - SCALab - Sciences Cognitives et Sciences Affectives, F-59000 Lille, France
- Delphine Bahuaud
- Department of Speech and Language Therapy, Faculty of Medicine, UFR3S, Univ. Lille, F-59000 Lille, France
- Ethan Weed
- Department of Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Denmark
- Interacting Minds Centre, School of Culture and Society, Aarhus University, Denmark
- Riccardo Fusaroli
- Department of Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Denmark
- Interacting Minds Centre, School of Culture and Society, Aarhus University, Denmark
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia
- Anahita Basirat
- Univ. Lille, CNRS, UMR 9193 - SCALab - Sciences Cognitives et Sciences Affectives, F-59000 Lille, France
4
Cantor-Cutiva LC, Ramani SA, Walden PR, Hunter EJ. Screening of Voice Pathologies: Identifying the Predictive Value of Voice Acoustic Parameters for Common Voice Pathologies. J Voice 2023:S0892-1997(23)00390-9. [PMID: 38143203 PMCID: PMC11193840 DOI: 10.1016/j.jvoice.2023.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 12/01/2023] [Accepted: 12/04/2023] [Indexed: 12/26/2023]
Abstract
BACKGROUND Voice acoustic analysis is important for objectively assessing voice production and diagnosing voice disorders. AIM This study aimed to investigate the sensitivity of various voice acoustic parameters in differentiating common voice pathology types. METHODS Data from the publicly available Perceptual Voice Qualities Database were analyzed; the database includes recordings of participants with and without voice disorders. A wide range of acoustic parameters was estimated from the recordings, such as alpha ratio, harmonics-to-noise ratio (HNR), cepstral peak prominence smoothed (CPPS), pitch period entropy (PPE), fundamental frequency, jitter, shimmer, and sound pressure levels. The predictive capabilities of the parameters were evaluated using receiver operating characteristic curves. Linear regression analysis determined the associations between parameters and voice disorders. Principal component analysis was conducted to identify important parameters for distinguishing voice disorders. RESULTS AND CONCLUSION This study has identified significant differences in acoustic parameters between those with and without voice disorders. Notably, the combination of five parameters (PPE, shimmer, jitter, CPPS, and HNR) was identified as a strong predictor in voice disorder screening. These findings contribute substantially to the field of voice disorders, offering valuable insights for screening and diagnosis.
Affiliation(s)
- Sai Aishwarya Ramani
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan
- Eric J Hunter
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, Iowa
5
Ceylan ME, Cangi ME, Yılmaz G, Peru BS, Yiğit Ö. Are smartphones and low-cost external microphones comparable for measuring time-domain acoustic parameters? Eur Arch Otorhinolaryngol 2023; 280:5433-5444. [PMID: 37584753 DOI: 10.1007/s00405-023-08179-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 08/05/2023] [Indexed: 08/17/2023]
Abstract
PURPOSE This study examined and compared the diagnostic accuracy and correlation levels of the acoustic parameters of the audio recordings obtained from smartphones on two operating systems and from dynamic and condenser types of external microphones. METHOD The study included 87 adults: 57 with voice disorder and 30 with a healthy voice. Each participant was asked to perform a sustained vowel phonation (/a/). The recordings were taken simultaneously using five microphones (AKG-P220, Shure-SM58, Samson Go Mic, Apple iPhone 6, and Samsung Galaxy J7 Pro) in an acoustically insulated cabinet. Acoustic examinations were performed using Praat version 6.2.09. The data were examined using Pearson correlation and receiver-operating characteristic (ROC) analyses. RESULTS The parameters with the highest area under the curve (AUC) values among all microphone recordings in the time-domain analyses were the frequency perturbation parameters. Additionally, considering the correlation coefficients obtained by synchronizing the microphones with each other and the AUC values together, the parameter with the highest correlation coefficient and diagnostic accuracy values was the jitter-local parameter. CONCLUSION Period-to-period perturbation parameters obtained from audio recordings made with smartphones show similar levels of diagnostic accuracy to external microphones used in clinical conditions.
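The study above ranks acoustic parameters by their area under the ROC curve. As a minimal sketch of what that number means, the AUC can be computed directly as the probability that a randomly chosen disordered voice scores higher on the parameter than a randomly chosen healthy voice, with ties counted as one half. The function name and the jitter values below are hypothetical, and the code assumes that higher values indicate pathology.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via pairwise comparison: fraction of (positive, negative)
    pairs in which the positive case has the higher score."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical local jitter values (%) for disordered and healthy voices.
disordered = [1.2, 0.9, 1.5]
healthy = [0.4, 0.9, 0.3]
print(f"AUC = {roc_auc(disordered, healthy):.3f}")
```

The pairwise form is quadratic in the sample size, which is acceptable for a cohort of this size; a rank-based implementation would be preferable for large data sets.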
Affiliation(s)
- M Enes Ceylan
- Üsküdar University, Speech and Language Therapy, Istanbul, Türkiye
- M Emrah Cangi
- University of Health Sciences, Speech and Language Therapy, Selimiye, Tıbbiye Cd No: 38, Istanbul, 34668, Üsküdar, Türkiye.
- Göksu Yılmaz
- Üsküdar University, Speech and Language Therapy, Istanbul, Türkiye
- Beyza Sena Peru
- Üsküdar University, Speech and Language Therapy, Istanbul, Türkiye
- Özgür Yiğit
- Istanbul Şişli Hamidiye Etfal Training and Research Hospital, Istanbul, Türkiye
6
Sara JDS, Orbelo D, Maor E, Lerman LO, Lerman A. Guess What We Can Hear-Novel Voice Biomarkers for the Remote Detection of Disease. Mayo Clin Proc 2023; 98:1353-1375. [PMID: 37661144 PMCID: PMC10043966 DOI: 10.1016/j.mayocp.2023.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Revised: 02/08/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023]
Abstract
The advancement of digital biomarkers and the provision of remote health care greatly progressed during the coronavirus disease 2019 global pandemic. Combining voice/speech data with artificial intelligence and machine-based learning offers a novel solution to the growing demand for telemedicine. Voice biomarkers, obtained from the extraction of characteristic acoustic and linguistic features, are associated with a variety of diseases and even coronavirus disease 2019. In the current review, we (1) describe the basis on which digital voice biomarkers could facilitate "telemedicine," (2) discuss potential mechanisms that may explain the association between voice biomarkers and disease, (3) offer a novel classification system to conceptualize voice biomarkers depending on different methods for recording and analyzing voice/speech samples, (4) outline evidence revealing an association between voice biomarkers and a number of disease states, and (5) describe the process of developing a voice biomarker from recording, storing voice samples, and extracting acoustic and linguistic features relevant to training and testing deep and machine-based learning algorithms to detect disease. We further explore several important future considerations in this area of research, including the necessity for clinical trials and the importance of safeguarding data and individual privacy. To this end, we searched PubMed and Google Scholar to identify studies evaluating the relationship between voice/speech features and biomarkers and various diseases. Search terms included digital biomarker, telemedicine, voice features, voice biomarker, speech features, speech biomarkers, acoustics, linguistics, cardiovascular disease, neurologic disease, psychiatric disease, and infectious disease. The search was limited to studies published in English in peer-reviewed journals between 1980 and the present. 
To identify potential studies not captured by our database search strategy, we also searched studies listed in the bibliography of relevant publications and reviews.
Affiliation(s)
- Diana Orbelo
- Division of Otolaryngology, Mayo Clinic College of Medicine and Science, Rochester, MN; Chaim Sheba Medical Center, Tel HaShomer, Israel
- Elad Maor
- Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Lilach O Lerman
- Division of Nephrology and Hypertension, Mayo Clinic Rochester, MN
- Amir Lerman
- Department of Cardiovascular Medicine, Mayo Clinic College of Medicine and Science, Rochester, MN.
7
Trejo-Gabriel-Galán JM, Cubo-Delgado E. [Telephone assistance for neurological diseases: a systematic review]. Rev Neurol 2023; 77:67-73. [PMID: 37466132 PMCID: PMC10662245 DOI: 10.33588/rn.7703.2022284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Indexed: 07/20/2023]
Abstract
INTRODUCTION AND OBJECTIVE Although part of the care for neurological patients is provided by telephone, it is not well known for which neurological diseases, and for which parts of that care, the telephone is used. Our goal was to find out through a literature review. MATERIALS AND METHODS References on telephone care for neurological diseases accessible through the PubMed, Embase, and Cochrane platforms were systematically reviewed, with an unspecified start date and up to March 2022. We found 618 references; 219 were excluded under the exclusion criteria, and 399 were finally included in the review. RESULTS Dementia is the area of neurology with the most publications on telephone assistance. It is followed by stroke, head trauma, multiple sclerosis, Parkinson's disease and movement disorders, epilepsy, neuromuscular disorders, and others. DISCUSSION AND CONCLUSIONS Dementias are the diseases with the most bibliographic references on telephone assistance, despite not being the most prevalent. The telephone is frequently used to administer diagnostic scales or to support caregivers, and it is particularly useful in diseases that limit mobility and make attending a medical practice difficult.
8
Idrisoglu A, Dallora AL, Anderberg P, Berglund JS. Applied Machine Learning Techniques to Diagnose Voice-Affecting Conditions and Disorders: Systematic Literature Review. J Med Internet Res 2023; 25:e46105. [PMID: 37467031 PMCID: PMC10398366 DOI: 10.2196/46105] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/30/2023] [Revised: 04/26/2023] [Accepted: 05/23/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND Normal voice production depends on the synchronized cooperation of multiple physiological systems, which makes the voice sensitive to changes. Any systematic, neurological, and aerodigestive distortion is prone to affect voice production through reduced cognitive, pulmonary, and muscular functionality. This sensitivity inspired using voice as a biomarker to examine disorders that affect the voice. Technological improvements and emerging machine learning (ML) technologies have enabled possibilities of extracting digital vocal features from the voice for automated diagnosis and monitoring systems. OBJECTIVE This study aims to summarize a comprehensive view of research on voice-affecting disorders that uses ML techniques for diagnosis and monitoring through voice samples where systematic conditions, nonlaryngeal aerodigestive disorders, and neurological disorders are specifically of interest. METHODS This systematic literature review (SLR) investigated the state of the art of voice-based diagnostic and monitoring systems with ML technologies, targeting voice-affecting disorders without direct relation to the voice box from the point of view of applied health technology. Through a comprehensive search string, studies published from 2012 to 2022 from the databases Scopus, PubMed, and Web of Science were scanned and collected for assessment. To minimize bias, retrieval of the relevant references in other studies in the field was ensured, and 2 authors assessed the collected studies. Low-quality studies were removed through a quality assessment and relevant data were extracted through summary tables for analysis. The articles were checked for similarities between author groups to prevent cumulative redundancy bias during the screening process, where only 1 article was included from the same author group. 
RESULTS In the analysis of the 145 included studies, support vector machines were the most utilized ML technique (51/145, 35.2%), with the most studied disease being Parkinson disease (PD; reported in 87/145, 60%, studies). After 2017, 16 additional voice-affecting disorders were examined, in contrast to the 3 investigated previously. Furthermore, an upsurge in the use of artificial neural network-based architectures was observed after 2017. Almost half of the included studies were published in the last 2 years (2021 and 2022). A broad interest from many countries was observed. Notably, nearly one-half (n=75) of the studies relied on 10 distinct data sets, and 11/145 (7.6%) used demographic data as an input for ML models. CONCLUSIONS This SLR revealed considerable interest across multiple countries in using ML techniques for diagnosing and monitoring voice-affecting disorders, with PD being the most studied disorder. However, the review identified several gaps, including limited and unbalanced data set usage in studies, and a focus on diagnostic tests rather than disorder-specific monitoring. Despite the limitation of being constrained to peer-reviewed publications written in English, the SLR provides valuable insights into the current state of research on ML-based voice-affecting disorder diagnosis and monitoring and highlights areas to address in future research.
Affiliation(s)
- Alper Idrisoglu
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- Ana Luiza Dallora
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- Peter Anderberg
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- School of Health Sciences, University of Skövde, Skövde, Sweden
9
Miller MI, Shih LC, Kolachalama VB. Machine Learning in Clinical Trials: A Primer with Applications to Neurology. Neurotherapeutics 2023; 20:1066-1080. [PMID: 37249836 PMCID: PMC10228463 DOI: 10.1007/s13311-023-01384-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Accepted: 04/21/2023] [Indexed: 05/31/2023] Open
Abstract
We reviewed foundational concepts in artificial intelligence (AI) and machine learning (ML) and discussed ways in which these methodologies may be employed to enhance progress in clinical trials and research, with particular attention to applications in the design, conduct, and interpretation of clinical trials for neurologic diseases. We discussed ways in which ML may help to accelerate the pace of subject recruitment, provide realistic simulation of medical interventions, and enhance remote trial administration via novel digital biomarkers and therapeutics. Lastly, we provide a brief overview of the technical, administrative, and regulatory challenges that must be addressed as ML achieves greater integration into clinical trial workflows.
Affiliation(s)
- Matthew I Miller
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA
- Ludy C Shih
- Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, 02118, USA
- Vijaya B Kolachalama
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA.
- Department of Computer Science and Faculty of Computing & Data Sciences, Boston University, Boston, MA, 02115, USA.
10
Parola A, Simonsen A, Lin JM, Zhou Y, Wang H, Ubukata S, Koelkebeck K, Bliksted V, Fusaroli R. Voice Patterns as Markers of Schizophrenia: Building a Cumulative Generalizable Approach Via a Cross-Linguistic and Meta-analysis Based Investigation. Schizophr Bull 2023; 49:S125-S141. [PMID: 36946527 PMCID: PMC10031745 DOI: 10.1093/schbul/sbac128] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 03/23/2023]
Abstract
BACKGROUND AND HYPOTHESIS Voice atypicalities are potential markers of clinical features of schizophrenia (eg, negative symptoms). A recent meta-analysis identified an acoustic profile associated with schizophrenia (reduced pitch variability and increased pauses), but also highlighted shortcomings in the field: small sample sizes, little attention to the heterogeneity of the disorder, and to generalizing findings to diverse samples and languages. STUDY DESIGN We provide a critical cumulative approach to vocal atypicalities in schizophrenia, where we conceptually and statistically build on previous studies. We aim at identifying a cross-linguistically reliable acoustic profile of schizophrenia and assessing sources of heterogeneity (symptomatology, pharmacotherapy, clinical and social characteristics). We relied on previous meta-analysis to build and analyze a large cross-linguistic dataset of audio recordings of 231 patients with schizophrenia and 238 matched controls (>4000 recordings in Danish, German, Mandarin and Japanese). We used multilevel Bayesian modeling, contrasting meta-analytically informed and skeptical inferences. STUDY RESULTS We found only a minimal generalizable acoustic profile of schizophrenia (reduced pitch variability), while duration atypicalities replicated only in some languages. We identified reliable associations between acoustic profile and individual differences in clinical ratings of negative symptoms, medication, age and gender. However, these associations vary across languages. CONCLUSIONS The findings indicate that a strong cross-linguistically reliable acoustic profile of schizophrenia is unlikely. Rather, if we are to devise effective clinical applications able to target different ranges of patients, we need first to establish larger and more diverse cross-linguistic datasets, focus on individual differences, and build self-critical cumulative approaches.
Affiliation(s)
- Alberto Parola
- Department of Linguistics, Cognitive Science and Semiotics, Aarhus University, Aarhus, Denmark
- The Interacting Minds Center, Institute of Culture and Society, Aarhus University, Aarhus, Denmark
- Department of Psychology, University of Turin, Turin, Italy
- Arndis Simonsen
- The Interacting Minds Center, Institute of Culture and Society, Aarhus University, Aarhus, Denmark
- Psychosis Research Unit, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Jessica Mary Lin
- Department of Linguistics, Cognitive Science and Semiotics, Aarhus University, Aarhus, Denmark
- The Interacting Minds Center, Institute of Culture and Society, Aarhus University, Aarhus, Denmark
- Yuan Zhou
- Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Huiling Wang
- Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan, China
- Shiho Ubukata
- Department of Psychiatry, Kyoto University, Kyoto, Japan
- Katja Koelkebeck
- LVR-Hospital Essen, Department of Psychiatry and Psychotherapy, Hospital and Institute of the University of Duisburg-Essen, Essen, Germany
- Center for Translational Neuro- and Behavioral Sciences (C-TNBS), University Duisburg-Essen, Germany
- Vibeke Bliksted
- The Interacting Minds Center, Institute of Culture and Society, Aarhus University, Aarhus, Denmark
- Psychosis Research Unit, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Riccardo Fusaroli
- Department of Linguistics, Cognitive Science and Semiotics, Aarhus University, Aarhus, Denmark
- The Interacting Minds Center, Institute of Culture and Society, Aarhus University, Aarhus, Denmark
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia, USA
11
Wen P, Zhang Y, Wen G. Intelligent personalized diagnosis modeling in advanced medical system for Parkinson's disease using voice signals. Mathematical Biosciences and Engineering: MBE 2023; 20:8085-8102. [PMID: 37161187 DOI: 10.3934/mbe.2023351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Currently, machine learning methods have been utilized to realize the early detection of Parkinson's disease (PD) by using voice signals. Because the vocal system of each person is unique, and the same person's pronunciation can be different at different times, the training samples used in machine learning become very different from the speech signal of the patient to be diagnosed, frequently resulting in poor diagnostic performance. On this account, this paper presents a new intelligent personalized diagnosis method (PDM) for Parkinson's disease. The method was designed to begin with constructing new training data by assigning the best classifier to each training sample composed of features from the speech signals of patients. Subsequently, a meta-classifier was trained on the new training data. Finally, for the signal of each test patient, the method used the meta-classifier to select the most appropriate classifier, followed by adopting the selected classifier to classify the signal so that the more accurate diagnosis result of the test patient can be obtained. The novelty of the proposed method is that the proposed method uses different classifiers to perform the diagnosis of PD for diversified patients, whereas the current method uses the same classifier to diagnose all patients to be tested. Results of a large number of experiments show that PDM not only improves the performance but also exceeds the existing methods in speed.
Affiliation(s)
- Pengcheng Wen
- College of Intelligent Systems Science and Engineering, Hubei University for Nationalities, Enshi 445000, China
| | - Yuhan Zhang
- Southern Medical University, Affiliated Dongguan Songshan Lake Central Hospital, Dongguan 523000, China
| | - Guihua Wen
- School of Computer Science & Engineering, South China University of Technology, Guangzhou 510000, China
| |
|
12
|
Addressing smartphone mismatch in Parkinson’s disease detection aid systems based on speech. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
13
|
Edgley K, Chun HYY, Whiteley WN, Tsanas A. New Insights into Stroke from Continuous Passively Collected Temperature and Sleep Data Using Wrist-Worn Wearables. SENSORS (BASEL, SWITZERLAND) 2023; 23:1069. [PMID: 36772109 PMCID: PMC9920931 DOI: 10.3390/s23031069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Revised: 01/07/2023] [Accepted: 01/10/2023] [Indexed: 06/18/2023]
Abstract
Actigraphy may provide new insights into clinical outcomes and symptom management of patients through passive, continuous data collection. We used the GENEActiv smartwatch to passively collect actigraphy, wrist temperature, and ambient light data from 27 participants after stroke or probable brain transient ischemic attack (TIA) over 42 periods of device wear. We computed 323 features using established algorithms and proposed 25 novel features to characterize sleep and temperature. We investigated statistical associations between the extracted features and clinical outcomes evaluated using clinically validated questionnaires to gain insight into post-stroke recovery. We subsequently fitted logistic regression models to replicate clinical diagnosis (stroke or TIA) and disability due to stroke. The model generalization performance was assessed using a leave-one-subject-out cross validation method with the selected feature subsets, reporting the area under the curve (AUC). We found that several novel features were strongly correlated (|r|>0.3) with stroke symptoms and mental health measures. Using selected novel features, we obtained an AUC of 0.766 to estimate diagnosis and an AUC of 0.749 to estimate whether disability due to stroke was present. Collectively, these findings suggest that features extracted from the temperature smartwatch sensor may reveal additional clinically useful information over and above existing actigraphy-based features.
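Because the study has more wear periods (42) than participants (27), a leave-one-subject-out split has to hold out every period belonging to the same person at once. The sketch below shows one way to generate such splits and to compute the AUC from pooled held-out scores. The function names, the toy subject IDs and the rank-based AUC formula are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_subject_out(subject_ids: Sequence[str]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices), holding out all rows of one subject per fold."""
    rows_by_subject: Dict[str, List[int]] = defaultdict(list)
    for row, subject in enumerate(subject_ids):
        rows_by_subject[subject].append(row)
    for held_out, test_idx in rows_by_subject.items():
        train_idx = [row for row, subject in enumerate(subject_ids) if subject != held_out]
        yield train_idx, test_idx


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative
    (ties count as one half)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example: five wear periods from three subjects (IDs are made up).
subject_ids = ["s1", "s1", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))

# Toy pooled held-out scores from some model (values are made up).
example_auc = auc([1, 0, 1, 0], [0.9, 0.4, 0.6, 0.6])
```

Splitting by subject rather than by row keeps periods from the same participant out of both the training and the test fold, which would otherwise inflate the estimate.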
Affiliation(s)
- Katherine Edgley
- MRC Centre for Reproductive Health, University of Edinburgh, Edinburgh EH16 4TJ, UK
| | - Ho-Yan Yvonne Chun
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh EH16 4SB, UK
| | - William N. Whiteley
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh EH16 4SB, UK
- Usher Institute, Edinburgh Medical School, University of Edinburgh, Edinburgh EH16 4UX, UK
| | - Athanasios Tsanas
- Usher Institute, Edinburgh Medical School, University of Edinburgh, Edinburgh EH16 4UX, UK
- Alan Turing Institute, London NW1 2DB, UK
| |
|
14
|
Worasawate D, Asawaponwiput W, Yoshimura N, Intarapanich A, Surangsrirat D. Classification of Parkinson's disease from smartphone recording data using time-frequency analysis and convolutional neural network. Technol Health Care 2023; 31:705-718. [PMID: 36155539 DOI: 10.3233/thc-220386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Abstract
BACKGROUND Parkinson's disease (PD) is a long-term neurodegenerative disease of the central nervous system. The current diagnosis is dependent on clinical observation and the abilities and experience of a trained specialist. One of the symptoms that affects most patients is voice impairment. OBJECTIVE Voice samples are non-invasive data that can be collected remotely for diagnosis and disease progression monitoring. In this study, we analyzed smartphone voice recordings as a possible medical self-diagnosis tool using only a one-second voice recording. Data from one of the largest mobile PD studies, the mPower study, were used. METHODS A total of 29,798 ten-second smartphone voice recordings from 4,051 participants were used for the analysis. The recordings were of sustained phonation, with participants saying /aa/ for ten seconds into an iPhone microphone. A dataset comprising 385,143 short one-second audio samples was generated from the original ten-second voice recordings. The samples were converted to spectrograms using a short-time Fourier transform. CNN models were then applied to classify the samples. RESULTS Classification accuracies of the proposed method with LeNet-5, ResNet-50, and VGGNet-16 are 97.7 ± 0.1%, 98.6 ± 0.2%, and 99.3 ± 0.1%, respectively. CONCLUSIONS We achieve a respectable classification performance using a generalized approach on a dataset with a large number of samples. The result emphasizes that an analysis based on a one-second clip recorded on a smartphone could be a promising non-invasive and remotely available PD biomarker.
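The preprocessing described in METHODS (cutting recordings into one-second samples and converting each to a spectrogram with a short-time Fourier transform) can be sketched with the standard library alone. The abstract does not state how the windows were cut or which STFT settings were used, so the non-overlapping clips, the Hann window, the frame length, the hop size and the toy 64 Hz sample rate below are all assumptions chosen to keep the example small. A real pipeline would use an FFT routine rather than this direct DFT.

```python
import cmath
import math
from typing import List, Sequence


def split_into_clips(signal: Sequence[float], sample_rate: int,
                     clip_seconds: float = 1.0) -> List[Sequence[float]]:
    """Cut a recording into consecutive, non-overlapping clips (assumed scheme)."""
    clip_len = int(sample_rate * clip_seconds)
    return [signal[start:start + clip_len]
            for start in range(0, len(signal) - clip_len + 1, clip_len)]


def stft_magnitude(clip: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Magnitude spectrogram: one row per frame, one column per frequency bin."""
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    spectrogram = []
    for start in range(0, len(clip) - frame_len + 1, hop):
        frame = [clip[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            row.append(abs(coeff))
        spectrogram.append(row)
    return spectrogram


# Toy signal: two seconds of an 8 Hz sine sampled at 64 Hz.
sample_rate = 64
signal = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(2 * sample_rate)]

clips = split_into_clips(signal, sample_rate)
spectrogram = stft_magnitude(clips[0], frame_len=32, hop=16)

# With 32-sample frames at 64 Hz each bin spans 2 Hz, so 8 Hz falls in bin 4.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spectrogram]
```

Each resulting two-dimensional array is what an image-style CNN would take as input in place of the raw waveform.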
Affiliation(s)
- Denchai Worasawate
- Department of Electrical Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand
| | - Warisara Asawaponwiput
- Department of Electrical Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand
| | - Natsue Yoshimura
- Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Japan
| | - Apichart Intarapanich
- Educational Technology Team, National Electronics and Computer Technology Center, Pathum Thani, Thailand
| | - Decho Surangsrirat
- Assistive Technology and Medical Devices Research Center, National Science and Technology Development Agency, Pathum Thani, Thailand
| |
|
15
|
Xu Z, Shen B, Tang Y, Wu J, Wang J. Deep Clinical Phenotyping of Parkinson's Disease: Towards a New Era of Research and Clinical Care. PHENOMICS (CHAM, SWITZERLAND) 2022; 2:349-361. [PMID: 36939759 PMCID: PMC9590510 DOI: 10.1007/s43657-022-00051-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/24/2021] [Revised: 03/12/2022] [Accepted: 03/28/2022] [Indexed: 11/27/2022]
Abstract
Despite recent advances in technology, clinical phenotyping of Parkinson's disease (PD) has remained relatively limited as current assessments are mainly based on empirical observation and subjective categorical judgment at the clinic. A lack of comprehensive, objective, and quantifiable clinical phenotyping data has hindered our capacity to diagnose, assess patients' conditions, discover pathogenesis, identify preclinical stages and clinical subtypes, and evaluate new therapies. Therefore, deep clinical phenotyping of PD patients is a necessary step towards understanding PD pathology and improving clinical care. In this review, we present a growing community consensus and perspective on how to clinically phenotype this disease, that is, to phenotype the entire course of disease progression by integrating capacity, performance, and perception approaches with state-of-the-art technology. We also explore the most studied aspects of PD deep clinical phenotypes, namely, bradykinesia, tremor, dyskinesia and motor fluctuation, gait impairment, speech impairment, and non-motor phenotypes.
Affiliation(s)
- Zhiheng Xu
- Department of Neurology and National Research Center for Aging and Medicine & National Center for Neurological Disorders, State Key Laboratory of Medical Neurobiology, Huashan Hospital, Fudan University, Shanghai, 200040 China
| | - Bo Shen
- Department of Neurology and National Research Center for Aging and Medicine & National Center for Neurological Disorders, State Key Laboratory of Medical Neurobiology, Huashan Hospital, Fudan University, Shanghai, 200040 China
| | - Yilin Tang
- Department of Neurology and National Research Center for Aging and Medicine & National Center for Neurological Disorders, State Key Laboratory of Medical Neurobiology, Huashan Hospital, Fudan University, Shanghai, 200040 China
| | - Jianjun Wu
- Department of Neurology and National Research Center for Aging and Medicine & National Center for Neurological Disorders, State Key Laboratory of Medical Neurobiology, Huashan Hospital, Fudan University, Shanghai, 200040 China
| | - Jian Wang
- Department of Neurology and National Research Center for Aging and Medicine & National Center for Neurological Disorders, State Key Laboratory of Medical Neurobiology, Huashan Hospital, Fudan University, Shanghai, 200040 China
| |
|
16
|
Maremmani C, Rovini E, Salvadori S, Pecori A, Pasquini J, Ciammola A, Rossi S, Berchina G, Monastero R, Cavallo F. Hands-feet wireless devices: Test-retest reliability and discriminant validity of motor measures in Parkinson's disease telemonitoring. Acta Neurol Scand 2022; 146:304-317. [PMID: 35788914 PMCID: PMC9541466 DOI: 10.1111/ane.13667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Revised: 06/18/2022] [Accepted: 06/21/2022] [Indexed: 11/26/2022]
Abstract
BACKGROUND Telemonitoring, a branch of telemedicine, involves the use of technological tools to remotely detect clinical data and evaluate patients. Telemonitoring of patients with Parkinson's disease (PD) should be performed using reliable and discriminant motor measures. Furthermore, the method of data collection and transmission, and the type of subjects suitable for telemonitoring must be well defined. OBJECTIVE To analyze differences in patients with PD and healthy controls (HC) with the wearable inertial device SensHands-SensFeet (SH-SF), adopting a standardized acquisition mode, to verify if motor measures provided by SH-SF have a high discriminating capacity and high intraclass correlation coefficient (ICC). METHODS Altogether, 64 patients with mild-to-moderate PD and 50 HC performed 14 standardized motor activities for assessing bradykinesia, postural and resting tremors, and gait parameters. SH-SF inertial devices were used to acquire movements and calculate objective motor measures of movement (total: 75). For each motor task, five or more biomechanical parameters were measured twice. The results were compared between patients with PD and HC. RESULTS Fifty-eight objective motor measures significantly differed between patients with PD and HC; among these, 32 demonstrated relevant discrimination power (Cohen's d > 0.8). The test-retest reliability was excellent in patients with PD (median ICC = 0.85 right limbs, 0.91 left limbs) and HC (median ICC = 0.78 right limbs, 0.82 left limbs). CONCLUSION In a supervised environment, the SH-SF device provides motor measures with good results in terms of reliability and discriminant ability. The reliability of SH-SF measurements should be evaluated in an unsupervised home setting in future studies.
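The discrimination threshold quoted in RESULTS (Cohen's d > 0.8) refers to a standardized difference between group means. A minimal sketch of the common pooled-standard-deviation form follows. The abstract does not say which variant of Cohen's d the authors computed, so the pooled form is an assumption, and the sample values are made up.

```python
import math
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d with a pooled standard deviation (sample variances, n - 1 denominator)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Made-up values of one motor measure for patients (PD) and healthy controls (HC).
pd_values = [2.0, 4.0, 6.0]
hc_values = [1.0, 2.0, 3.0]

d = cohens_d(pd_values, hc_values)
is_relevant = abs(d) > 0.8  # threshold quoted in the abstract
```

Here the group means differ by 2 and the pooled standard deviation is about 1.58, so d is about 1.26 and the measure would pass the 0.8 threshold.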
Affiliation(s)
- Carlo Maremmani
- Unit of Neurology, Ospedale Apuane, Azienda USL Toscana Nord Ovest, Massa, Italy
| | - Erika Rovini
- Department of Industrial Engineering, University of Florence, Florence, Italy
| | - Stefano Salvadori
- Institute of Clinical Physiology, National Research Council (CNR), Pisa, Italy
| | - Alessandro Pecori
- Institute of Clinical Physiology, National Research Council (CNR), Pisa, Italy
| | - Jacopo Pasquini
- Department of Neurology - Stroke Unit and Laboratory of Neuroscience, IRCCS Istituto Auxologico Italiano, Milan, Italy
- Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy
| | - Andrea Ciammola
- Department of Neurology - Stroke Unit and Laboratory of Neuroscience, IRCCS Istituto Auxologico Italiano, Milan, Italy
- Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy
| | - Simone Rossi
- Department of Biomedical and Neuromotor Sciences, University of Bologna, Bologna, Italy
| | - Giulia Berchina
- Unit of Neurology, Ospedale Apuane, Azienda USL Toscana Nord Ovest, Massa, Italy
| | - Roberto Monastero
- Department of Biomedicine, Neuroscience and Advanced Diagnostics, University of Palermo, Palermo, Italy
| | - Filippo Cavallo
- Department of Industrial Engineering, University of Florence, Florence, Italy
- The Biorobotics Institute, Scuola Superiore Sant'Anna, Pontedera, Pisa, Italy
| |
|
17
|
许 之, 张 梦, 王 坚. [Diagnostic Value of Speech Acoustic Analysis in Parkinson's Disease]. SICHUAN DA XUE XUE BAO. YI XUE BAN = JOURNAL OF SICHUAN UNIVERSITY. MEDICAL SCIENCE EDITION 2022; 53:726-731. [PMID: 35871748 PMCID: PMC10409472 DOI: 10.12182/20220760304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2021] [Accepted: 06/15/2022] [Indexed: 06/15/2023]
Abstract
Screening for and identifying patients with Parkinson's disease (PD) at an early stage and forming accurate diagnosis of PD during the course of the progression of the disease are of essential importance but still remain challenging for the clinical diagnosis and treatment of PD. One of the common clinical manifestations of PD is speech impairment, or voice impairment. Thanks to the recent advances in the field of acoustic analysis, a large number of acoustic parameters have been proposed for evaluating speech impairment quantitatively. Early identification and accurate diagnosis of PD was henceforth made possible through the application of speech acoustic analysis. Herein, we summarized the latest research findings on the application of acoustic analysis in PD diagnosis. We reported some acoustic parameters commonly used in the evaluation of voice impairment in PD patients. Then, we presented the diagnostic value of acoustic analysis in developing accurate diagnosis, early screening and differential diagnosis. Furthermore, we discussed the drawbacks and prospects of current studies, intending to enhance understanding of acoustic analysis of PD patients and its potential diagnostic values.
Affiliation(s)
- 之珩 许
- Department of Neurology, Huashan Hospital, Fudan University, Shanghai 200040, China
| | - 梦翰 张
- Department of Neurology, Huashan Hospital, Fudan University, Shanghai 200040, China
- Institute of Modern Languages and Linguistics, Fudan University, Shanghai 200433, China
| | - 坚 王
- Department of Neurology, Huashan Hospital, Fudan University, Shanghai 200040, China
| |
|
18
|
Tsanas A. Relevance, redundancy, and complementarity trade-off (RRCT): A principled, generic, robust feature-selection tool. PATTERNS (NEW YORK, N.Y.) 2022; 3:100471. [PMID: 35607618 PMCID: PMC9122960 DOI: 10.1016/j.patter.2022.100471] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/07/2021] [Revised: 01/19/2022] [Accepted: 02/24/2022] [Indexed: 12/21/2022]
Abstract
We present a new heuristic feature-selection (FS) algorithm that integrates in a principled algorithmic framework the three key FS components: relevance, redundancy, and complementarity. Thus, we call it relevance, redundancy, and complementarity trade-off (RRCT). The association strength between each feature and the response and between feature pairs is quantified via an information theoretic transformation of rank correlation coefficients, and the feature complementarity is quantified using partial correlation coefficients. We empirically benchmark the performance of RRCT against 19 FS algorithms across four synthetic and eight real-world datasets in indicative challenging settings evaluating the following: (1) matching the true feature set and (2) out-of-sample performance in binary and multi-class classification problems when presenting selected features into a random forest. RRCT is very competitive in both tasks, and we tentatively make suggestions on the generalizability and application of the best-performing FS algorithms across settings where they may operate effectively.
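The quantities named in this abstract (rank correlation, an information-theoretic transformation of it, and partial correlation) can be illustrated with a short sketch. The abstract does not give the formulas, so two assumptions are made here and should be checked against the paper: the transformation is taken to be the Gaussian mutual-information form, -0.5 ln(1 - r^2), and complementarity is illustrated with the standard first-order partial correlation. The sketch shows these building blocks only, not the RRCT selection procedure itself.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def information_transform(r: float) -> float:
    """Assumed transformation: -0.5 * ln(1 - r^2); undefined for |r| = 1."""
    return -0.5 * math.log(1.0 - r * r)


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Toy feature and response values (made up).
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [2.0, 1.0, 4.0, 3.0, 5.0]

rho = spearman(feature, response)
relevance = information_transform(rho)
complementarity = partial_correlation(0.8, 0.5, 0.5)
```

The transformation maps a correlation of 0 to 0 and grows without bound as the correlation approaches 1 in absolute value, which puts relevance and redundancy terms on a common, additive scale.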
Affiliation(s)
- Athanasios Tsanas
- Usher Institute, Edinburgh Medical School, University of Edinburgh, NINE Edinburgh BioQuarter, 9 Little France road, Edinburgh, UK
- School of Mathematics, University of Edinburgh, Edinburgh, UK
- Alan Turing Institute, British Library, London, UK
| |
|
19
|
Rybner A, Jessen ET, Mortensen MD, Larsen SN, Grossman R, Bilenberg N, Cantio C, Jepsen JRM, Weed E, Simonsen A, Fusaroli R. Vocal markers of autism: Assessing the generalizability of machine learning models. Autism Res 2022; 15:1018-1030. [PMID: 35385224 DOI: 10.1002/aur.2721] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/01/2021] [Revised: 02/24/2022] [Accepted: 03/22/2022] [Indexed: 01/09/2023]
Abstract
Machine learning (ML) approaches show increasing promise in their ability to identify vocal markers of autism. Nonetheless, it is unclear to what extent such markers generalize to new speech samples collected, for example, using a different speech task or in a different language. In this paper, we systematically assess the generalizability of ML findings across a variety of contexts. We train promising published ML models of vocal markers of autism on novel cross-linguistic datasets following a rigorous pipeline to minimize overfitting, including cross-validated training and ensemble models. We test the generalizability of the models by testing them on (i) different participants from the same study, performing the same task; (ii) the same participants, performing a different (but similar) task; (iii) a different study with participants speaking a different language, performing the same type of task. While model performance is similar to previously published findings when trained and tested on data from the same study (out-of-sample performance), there is considerable variance between studies. Crucially, the models do not generalize well to different, though similar, tasks and not at all to new languages. The ML pipeline is openly shared. Generalizability of ML models of vocal markers of autism is an issue. We outline three recommendations for strategies researchers could take to be more explicit about generalizability and improve it in future studies. LAY SUMMARY: Machine learning approaches promise to be able to identify autism from voice only. These models underestimate how diverse the contexts in which we speak are, how diverse the languages used are and how diverse autistic voices are. Machine learning approaches need to be more careful in defining their limits and generalizability.
Affiliation(s)
- Astrid Rybner
- Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
| | - Emil Trenckner Jessen
- Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
| | - Marie Damsgaard Mortensen
- Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
| | - Stine Nyhus Larsen
- Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
| | - Ruth Grossman
- Communication Sciences and Disorders, Emerson College, Boston, Massachusetts, USA
| | - Niels Bilenberg
- Child and Youth Psychiatry, University of Southern Denmark, Odense, Denmark
| | - Cathriona Cantio
- Child and Youth Psychiatry, University of Southern Denmark, Odense, Denmark
- Psychology, University of Southern Denmark, Odense, Denmark
| | - Jens Richardt Møllegaard Jepsen
- Child and Adolescent Mental Health Centre, Mental Health Services in the Capital Region of Denmark, Copenhagen, Denmark
- Center for Neuropsychiatric Schizophrenia Research and Center for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Services in the Capital Region of Denmark, Copenhagen, Denmark
| | - Ethan Weed
- Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
- Interacting Minds Center, School of Culture and Society, Aarhus University, Aarhus, Denmark
| | - Arndis Simonsen
- Interacting Minds Center, School of Culture and Society, Aarhus University, Aarhus, Denmark
- Psychosis Research Unit, Aarhus University Hospital, Aarhus, Denmark
| | - Riccardo Fusaroli
- Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
- Interacting Minds Center, School of Culture and Society, Aarhus University, Aarhus, Denmark
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| |
|
20
|
Automated methods for diagnosis of Parkinson’s disease and predicting severity level. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06626-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
21
|
Comparing the Effectiveness of Speech and Physiological Features in Explaining Emotional Responses during Voice User Interface Interactions. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12031269] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/01/2023]
Abstract
The rapid rise of voice user interface technology has changed the way users traditionally interact with interfaces, as tasks requiring gestural or visual attention are swapped by vocal commands. This shift has equally affected designers, required to disregard common digital interface guidelines in order to adapt to non-visual user interaction (No-UI) methods. The guidelines regarding voice user interface evaluation are far from the maturity of those surrounding digital interface evaluation, resulting in a lack of consensus and clarity. Thus, we sought to contribute to the emerging literature regarding voice user interface evaluation and, consequently, assist user experience professionals in their quest to create optimal vocal experiences. To do so, we compared the effectiveness of physiological features (e.g., phasic electrodermal activity amplitude) and speech features (e.g., spectral slope amplitude) to predict the intensity of users’ emotional responses during voice user interface interactions. We performed a within-subjects experiment in which the speech, facial expression, and electrodermal activity responses of 16 participants were recorded during voice user interface interactions that were purposely designed to elicit frustration and shock, resulting in 188 analyzed interactions. Our results suggest that the physiological measure of facial expression and its extracted feature, automatic facial expression-based valence, is most informative of emotional events lived through voice user interface interactions. By comparing the unique effectiveness of each feature, theoretical and practical contributions may be noted, as the results contribute to voice user interface literature while providing key insights favoring efficient voice user interface evaluation.
|
22
|
Kumar R, Tripathy M, Kumar N, Anand RS. Management of Parkinson's Disease Dysarthria: Can Artificial Intelligence Provide the Solution? Ann Indian Acad Neurol 2022; 25:810-816. [PMID: 36560994 PMCID: PMC9764905 DOI: 10.4103/aian.aian_554_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 08/06/2022] [Indexed: 12/24/2022] Open
Abstract
Speech disorder is a significant problem for people affected by Parkinson's disease (PD), leading to a substantial disability in communicating with others. PD affects the voice, including changes in pitch, intensity, articulation, and syllable rate. We aimed to study the current status of artificial intelligence (AI) using machine learning algorithms (MLAs) in the assessment of speech abnormalities in PD, along with the generation of intelligible synthetic speech for voice rehabilitation. We searched the literature for studies focusing on speech/voice disorder in PD and rehabilitation techniques up to June 18, 2022. We searched the PubMed and Engineering Village (Compendex and Inspec combined) databases. After careful screening of titles and evaluation of abstracts, we selected articles describing the use of AI or its various forms in the management of speech abnormalities in PD to synthesize this review. MLAs classify PD and non-PD patients with an accuracy of more than 90% using only voice features. Non-acoustic sensors can rehabilitate PD patients by converting dysarthric speech to highly intelligible speech using MLAs. MLAs can automatically assess several speech features and quantify the progression of speech abnormalities in PD. PD speech rehabilitation using MLAs may prove superior to other available therapies.
Affiliation(s)
- Raj Kumar
- Department of Electrical Engineering, Indian Institute of Technology, Roorkee, Uttarakhand, India
| | - Manoj Tripathy
- Department of Electrical Engineering, Indian Institute of Technology, Roorkee, Uttarakhand, India
| | - Niraj Kumar
- Department of Neurology, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India. Address for correspondence: Dr. Niraj Kumar, Department of Neurology, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India.
| | - Radhey Shyam Anand
- Department of Electrical Engineering, Indian Institute of Technology, Roorkee, Uttarakhand, India
| |
|
23
|
Keen EM, True EJ, Summers AR, Smith EC, Brew J, Grandjean Lapierre S. High-throughput digital cough recording on a university campus: A SARS-CoV-2-negative curated open database and operational template for acoustic screening of respiratory diseases. Digit Health 2022; 8:20552076221097513. [PMID: 35558638 PMCID: PMC9087241 DOI: 10.1177/20552076221097513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2021] [Accepted: 04/12/2022] [Indexed: 11/16/2022] Open
Abstract
OBJECTIVE Respiratory illnesses have information-rich acoustic biomarkers, such as cough, that can potentially play an important role in screening populations for disease risk. To realize that potential, datasets of paired acoustic-clinical samples are needed for the development and validation of acoustic screening models, and protocols for collecting acoustic samples must be efficient and safe. We collected cough acoustic signatures at a high-throughput SARS-CoV-2 testing site on a college campus. Here, we share logistical details and the dataset of acoustic cough signatures paired with the gold standard in SARS-CoV-2 testing of SARS-CoV-2 genomic sequences using qRT-PCR. METHODS Cough recordings were collected in winter-spring 2021 at a rural residential college (Sewanee, TN, USA), where approximately 2000 students were tested for SARS-CoV-2 on a weekly basis. Cough collection was managed by student volunteers using custom software. RESULTS 4302 coughs were recorded from 960 participants over 11 weeks. All coughs were COVID-19 negative. Approximately 30 s were required to check in a participant and collect their cough. CONCLUSION The value of acoustic screening tools depends upon our ability to develop and implement them reliably and quickly. For that to happen, high-quality datasets and logistical insights must be collected and shared on an ongoing basis.
Affiliation(s)
- Eric M. Keen
- Sewanee: The University of the South, Sewanee, TN, USA
- Hyfe, Inc., Wilmington, DE, USA
| | - Emily J. True
- Sewanee: The University of the South, Sewanee, TN, USA
| | | | | | | | - Simon Grandjean Lapierre
- Department of Microbiology, Infectious Diseases and Immunology, Université de Montréal, Montréal, Québec, Canada
- Immunopathology Axis, Centre de Recherche du Centre Hospitalier de l’Université de Montréal, Montréal, Québec, Canada
| |
|
24
|
Tufail AB, Ma YK, Zhang QN, Khan A, Zhao L, Yang Q, Adeel M, Khan R, Ullah I. 3D convolutional neural networks-based multiclass classification of Alzheimer's and Parkinson's diseases using PET and SPECT neuroimaging modalities. Brain Inform 2021; 8:23. [PMID: 34725741 PMCID: PMC8560868 DOI: 10.1186/s40708-021-00144-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/25/2020] [Accepted: 10/15/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Alzheimer’s disease (AD) is a neurodegenerative brain pathology formed due to piling up of amyloid proteins, development of plaques and disappearance of neurons. Another common subtype of dementia like AD, Parkinson’s disease (PD) is determined by the disappearance of dopaminergic neurons in the region known as substantia nigra pars compacta located in the midbrain. Both AD and PD affect the aged population worldwide and account for a major share of healthcare costs. Hence, there is a need for methods that help in the early diagnosis of these diseases. PD subjects, especially those who have confirmed postmortem plaque, are strong candidates for a second AD diagnosis. Modalities such as positron emission tomography (PET) and single photon emission computed tomography (SPECT) can be combined with deep learning methods to diagnose these two diseases for the benefit of clinicians. RESULT In this work, we deployed a 3D Convolutional Neural Network (CNN) to extract features for multiclass classification of both AD and PD in the frequency and spatial domains using PET and SPECT neuroimaging modalities to differentiate between AD, PD and Normal Control (NC) classes. Discrete Cosine Transform has been deployed as a frequency domain learning method along with random weak Gaussian blurring and random zooming in/out augmentation methods in both frequency and spatial domains. To select the hyperparameters of the 3D-CNN model, we deployed both 5- and 10-fold cross-validation (CV) approaches. The best performing model was found to be AD/NC(SPECT)/PD classification with random weak Gaussian blurred augmentation in the spatial domain using the fivefold CV approach, while the worst performing model was AD/NC(PET)/PD classification without augmentation in the frequency domain using the tenfold CV approach. We also found that spatial domain methods tend to perform better than their frequency domain counterparts. CONCLUSION The proposed model provides a good performance in discriminating AD and PD subjects due to minimal correlation between these two dementia types on the clinicopathological continuum between AD and PD subjects from a neuroimaging perspective.
Affiliation(s)
- Ahsan Bin Tufail
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
- Department of Electrical and Computer Engineering, COMSATS University Islamabad, Sahiwal Campus, Sahiwal, Pakistan
| | - Yong-Kui Ma
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
| | - Qiu-Na Zhang
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
| | - Adil Khan
- Department of Computer Science, University of Peshawar, Peshawar, Pakistan
| | | | - Qiang Yang
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
| | | | - Rahim Khan
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, 150001, China
| | | |
Collapse
|
25
|
Arora S, Tsanas A. Assessing Parkinson's Disease at Scale Using Telephone-Recorded Speech: Insights from the Parkinson's Voice Initiative. Diagnostics (Basel) 2021; 11:1892. [PMID: 34679590 PMCID: PMC8534584 DOI: 10.3390/diagnostics11101892] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/31/2021] [Revised: 10/08/2021] [Accepted: 10/10/2021] [Indexed: 01/07/2023] Open
Abstract
Numerous studies have reported on the high accuracy of using voice tasks for the remote detection and monitoring of Parkinson's Disease (PD). Most of these studies, however, report findings on a small number of voice recordings, often collected under acoustically controlled conditions, and therefore cannot scale at large without specialized equipment. In this study, we aimed to evaluate the potential of using voice as a population-based PD screening tool in resource-constrained settings. Using the standard telephone network, we processed 11,942 sustained vowel /a/ phonations from a US-English cohort comprising 1078 PD and 5453 control participants. We characterized each phonation using 304 dysphonia measures to quantify a range of vocal impairments. Given that this is a highly unbalanced problem, we used the following strategy: we selected a balanced subset (n = 3000 samples) for training and testing using 10-fold cross-validation (CV), and the remaining (unbalanced held-out dataset, n = 8942) samples for further model validation. Using robust feature selection methods we selected 27 dysphonia measures to present into a radial-basis-function support vector machine and demonstrated differentiation of PD participants from controls with 67.43% sensitivity and 67.25% specificity. These findings could help pave the way forward toward the development of an inexpensive, remote, and reliable diagnostic support tool for PD using voice as a digital biomarker.
Affiliation(s)
- Siddharth Arora
  - Somerville College, University of Oxford, Oxford OX2 6HD, UK
- Athanasios Tsanas
  - Usher Institute, Edinburgh Medical School, University of Edinburgh, Edinburgh EH16 4UX, UK
|
26
|
Laganas C, Iakovakis D, Hadjidimitriou S, Charisis V, Dias SB, Bostantzopoulou S, Katsarou Z, Klingelhoefer L, Reichmann H, Trivedi D, Chaudhuri KR, Hadjileontiadis LJ. Parkinson's Disease Detection Based on Running Speech Data From Phone Calls. IEEE Trans Biomed Eng 2021; 69:1573-1584. [PMID: 34596531 DOI: 10.1109/tbme.2021.3116935] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 11/09/2022]
Abstract
OBJECTIVE Parkinson's Disease (PD) is a progressive neurodegenerative disorder, manifesting with subtle early signs, which often hinder timely and early diagnosis and treatment. The development of accessible, technology-based methods for longitudinal PD symptoms tracking in daily living offers the potential for transforming the disease assessment and accelerating PD diagnosis. METHODS A privacy-aware method for classifying PD patients and healthy controls (HC), on the grounds of speech impairment present in PD, is proposed here. Voice features from running speech signals were extracted from recordings passively captured over voice phone calls. Features are fed in a language-aware training of multiple- and single-instance learning classifiers, along with demographic variables, exploiting a multilingual cohort of 498 subjects (392/106 self-reported HC/PD patients) to classify PD. RESULTS By means of leave-one-subject-out cross-validation, the best-performing models yielded 0.69/0.68/0.63/0.83 area under the Receiver Operating Characteristic curve (AUC) for the binary classification of PD patient vs. HC in sub-cohorts of English/Greek/German/Portuguese-speaking subjects, respectively. Out-of-sample testing of the best performing models was conducted in an additional dataset, generated by 63 clinically-assessed subjects (24/39 HC/early PD patients). Testing has resulted in 0.84/0.93/0.83 AUC for the English/Greek/German-speaking sub-cohorts, respectively. Comparative analysis with other approaches for language-aware PD detection justified the efficiency of the proposed one, considering the ecological validity of the acquired voice data. CONCLUSIONS The present work demonstrates increased robustness in PD detection using voice data captured in-the-wild. SIGNIFICANCE A high-frequency, privacy-aware and unobtrusive PD screening tool is introduced for the first time, based on analysis of voice samples captured during routine phone calls.
|
27
|
Xue C, Karjadi C, Paschalidis IC, Au R, Kolachalama VB. Detection of dementia on voice recordings using deep learning: a Framingham Heart Study. Alzheimers Res Ther 2021; 13:146. [PMID: 34465384 PMCID: PMC8409004 DOI: 10.1186/s13195-021-00888-3] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 03/02/2021] [Accepted: 08/12/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Identification of reliable, affordable, and easy-to-use strategies for detection of dementia is sorely needed. Digital technologies, such as individual voice recordings, offer an attractive modality to assess cognition but methods that could automatically analyze such data are not readily available. METHODS AND FINDINGS We used 1264 voice recordings of neuropsychological examinations administered to participants from the Framingham Heart Study (FHS), a community-based longitudinal observational study. The recordings were 73 min in duration, on average, and contained at least two speakers (participant and examiner). Of the total voice recordings, 483 were of participants with normal cognition (NC), 451 recordings were of participants with mild cognitive impairment (MCI), and 330 were of participants with dementia (DE). We developed two deep learning models (a two-level long short-term memory (LSTM) network and a convolutional neural network (CNN)), which used the audio recordings to classify if the recording included a participant with only NC or only DE and to differentiate between recordings corresponding to those that had DE from those who did not have DE (i.e., NDE (NC+MCI)). Based on 5-fold cross-validation, the LSTM model achieved a mean (±std) area under the receiver operating characteristic curve (AUC) of 0.740 ± 0.017, mean balanced accuracy of 0.647 ± 0.027, and mean weighted F1 score of 0.596 ± 0.047 in classifying cases with DE from those with NC. The CNN model achieved a mean AUC of 0.805 ± 0.027, mean balanced accuracy of 0.743 ± 0.015, and mean weighted F1 score of 0.742 ± 0.033 in classifying cases with DE from those with NC. For the task related to the classification of participants with DE from NDE, the LSTM model achieved a mean AUC of 0.734 ± 0.014, mean balanced accuracy of 0.675 ± 0.013, and mean weighted F1 score of 0.671 ± 0.015. 
The CNN model achieved a mean AUC of 0.746 ± 0.021, mean balanced accuracy of 0.652 ± 0.020, and mean weighted F1 score of 0.635 ± 0.031 in classifying cases with DE from those who were NDE. CONCLUSION This proof-of-concept study demonstrates that automated deep learning-driven processing of audio recordings of neuropsychological testing performed on individuals recruited within a community cohort setting can facilitate dementia screening.
Affiliation(s)
- Chonghua Xue
  - Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA
- Cody Karjadi
  - The Framingham Heart Study, Boston University, Boston, MA, 02118, USA
  - Departments of Anatomy & Neurobiology and Neurology, Boston University School of Medicine, Boston, MA, 02118, USA
- Ioannis Ch Paschalidis
  - Departments of Electrical & Computer Engineering, Systems Engineering and Biomedical Engineering; Faculty of Computing & Data Sciences, Boston University, Boston, MA, 02118, USA
- Rhoda Au
  - The Framingham Heart Study, Boston University, Boston, MA, 02118, USA
  - Departments of Anatomy & Neurobiology and Neurology, Boston University School of Medicine, Boston, MA, 02118, USA
  - Boston University Alzheimer's Disease Center, Boston, MA, 02118, USA
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA, 02118, USA
- Vijaya B Kolachalama
  - Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA
  - Boston University Alzheimer's Disease Center, Boston, MA, 02118, USA
  - Department of Computer Science and Faculty of Computing & Data Sciences, Boston University, Boston, MA, 02115, USA
|
28
|
Xue C, Karjadi C, Paschalidis IC, Au R, Kolachalama VB. Detection of dementia on voice recordings using deep learning: a Framingham Heart Study. Alzheimers Res Ther 2021. [PMID: 34465384 DOI: 10.1186/s13195-021-00888-3.pdf] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
|
29
|
García AM, Arias-Vergara T, Vasquez-Correa JC, Nöth E, Schuster M, Welch AE, Bocanegra Y, Baena A, Orozco-Arroyave JR. Cognitive Determinants of Dysarthria in Parkinson's Disease: An Automated Machine Learning Approach. Mov Disord 2021; 36:2862-2873. [PMID: 34390508 DOI: 10.1002/mds.28751] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 02/23/2021] [Revised: 07/20/2021] [Accepted: 07/23/2021] [Indexed: 11/06/2022] Open
Abstract
BACKGROUND Dysarthric symptoms in Parkinson's disease (PD) vary greatly across cohorts. Abundant research suggests that such heterogeneity could reflect subject-level and task-related cognitive factors. However, the interplay of these variables during motor speech remains underexplored, let alone by administering validated materials to carefully matched samples with varying cognitive profiles and combining automated tools with machine learning methods. OBJECTIVE We aimed to identify which speech dimensions best identify patients with PD in cognitively heterogeneous, cognitively preserved, and cognitively impaired groups through tasks with low (reading) and high (retelling) processing demands. METHODS We used support vector machines to analyze prosodic, articulatory, and phonemic identifiability features. Patient groups were compared with healthy control subjects and against each other in both tasks, using each measure separately and in combination. RESULTS Relative to control subjects, patients in cognitively heterogeneous and cognitively preserved groups were best discriminated by combined dysarthric signs during reading (accuracy = 84% and 80.2%). Conversely, patients with cognitive impairment were maximally discriminated from control subjects when considering phonemic identifiability during retelling (accuracy = 86.9%). This same pattern maximally distinguished between cognitively spared and impaired patients (accuracy = 72.1%). Also, cognitive (executive) symptom severity was predicted by prosody in cognitively preserved patients and by phonemic identifiability in cognitively heterogeneous and impaired groups. No measure predicted overall motor dysfunction in any group. CONCLUSIONS Predominant dysarthric symptoms appear to be best captured through undemanding tasks in cognitively heterogeneous and preserved cohorts and through cognitively loaded tasks in patients with cognitive impairment. 
Further applications of this framework could enhance dysarthria assessments in PD. © 2021 International Parkinson and Movement Disorder Society.
Affiliation(s)
- Adolfo M García
  - Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
  - National Scientific and Technical Research Council (CONICET), Buenos Aires, Argentina
  - Departamento de Lingüística y Literatura, Facultad de Humanidades, Universidad de Santiago de Chile, Santiago, Chile
  - Global Brain Health Institute, University of California, San Francisco, California, USA
- Tomás Arias-Vergara
  - GITA Lab, Faculty of Engineering, Universidad de Antioquia UdeA, Medellín, Colombia
  - Pattern Recognition Lab, Friedrich-Alexander University, Erlangen, Nürnberg, Germany
  - Department of Otorhinolaryngology, Head and Neck Surgery, Ludwig-Maximilians University, Munich, Germany
- Juan C Vasquez-Correa
  - GITA Lab, Faculty of Engineering, Universidad de Antioquia UdeA, Medellín, Colombia
  - Pattern Recognition Lab, Friedrich-Alexander University, Erlangen, Nürnberg, Germany
- Elmar Nöth
  - Friedrich-Alexander University Erlangen-Nuremberg
- Maria Schuster
  - Department of Otorhinolaryngology, Head and Neck Surgery, Ludwig-Maximilians University, Munich, Germany
- Ariane E Welch
  - Memory and Aging Center, University of California, San Francisco, California, USA
- Yamile Bocanegra
  - Grupo de Neurociencias de Antioquia, Facultad de Medicina, Universidad de Antioquia, Medellín, Colombia
- Ana Baena
  - Grupo de Neurociencias de Antioquia, Facultad de Medicina, Universidad de Antioquia, Medellín, Colombia
- Juan R Orozco-Arroyave
  - GITA Lab, Faculty of Engineering, Universidad de Antioquia UdeA, Medellín, Colombia
  - Pattern Recognition Lab, Friedrich-Alexander University, Erlangen, Nürnberg, Germany
|
30
|
Improved Estimation of Parkinsonian Vowel Quality through Acoustic Feature Assimilation. ScientificWorldJournal 2021; 2021:6076828. [PMID: 34335114 PMCID: PMC8298151 DOI: 10.1155/2021/6076828] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/09/2020] [Revised: 10/17/2020] [Accepted: 06/30/2021] [Indexed: 02/06/2023] Open
Abstract
This paper investigated the performance of a number of acoustic measures, both individually and in combination, in predicting the perceived quality of sustained vowels produced by people impaired with Parkinson's disease (PD). Sustained vowel recordings were collected from 51 PD patients before and after the administration of the Levodopa medication. Subjective ratings of the overall vowel quality were garnered using a visual analog scale. These ratings served to benchmark the effectiveness of the acoustic measures. Acoustic predictors of the perceived vowel quality included the harmonics-to-noise ratio (HNR), smoothed cepstral peak prominence (CPP), recurrence period density entropy (RPDE), Gammatone frequency cepstral coefficients (GFCCs), linear prediction (LP) coefficients and their variants, and modulation spectrogram features. Linear regression (LR) and support vector regression (SVR) models were employed to assimilate multiple features. Different feature dimensionality reduction methods were investigated to avoid model overfitting and enhance the prediction capabilities for the test dataset. Results showed that the RPDE measure performed the best among all individual features, while a regression model incorporating a subset of features produced the best overall correlation of 0.80 between the predicted and actual vowel quality ratings. This model may therefore serve as a surrogate for auditory-perceptual assessment of Parkinsonian vowel quality. Furthermore, the model may offer the clinician a tool to predict who may benefit from Levodopa medication in terms of enhanced voice quality.
|
31
|
Abstract
PURPOSE OF REVIEW The COVID-pandemic has facilitated the implementation of telemedicine in both clinical practice and research. We highlight recent developments in three promising areas of telemedicine: teleconsultation, telemonitoring, and teletreatment. We illustrate this using Parkinson's disease as a model for other chronic neurological disorders. RECENT FINDINGS Teleconsultations can reliably administer parts of the neurological examination remotely, but are typically not useful for establishing a reliable diagnosis. For follow-ups, teleconsultations can provide enhanced comfort and convenience to patients, and provide opportunities for blended and proactive care models. Barriers include technological challenges, limited clinician confidence, and a suboptimal clinician-patient relationship. Telemonitoring using wearable sensors and smartphone-based apps can support clinical decision-making, but we lack large-scale randomized controlled trials to prove effectiveness on clinical outcomes. Increasingly many trials are now incorporating telemonitoring as an exploratory outcome, but more work remains needed to demonstrate its clinical meaningfulness. Finding a balance between benefits and burdens for individual patients remains vital. Recent work emphasised the promise of various teletreatment solutions, such as remotely adjustable deep brain stimulation parameters, virtual reality enhanced exercise programs, and telephone-based cognitive behavioural therapy. Personal contact remains essential to ascertain adherence to teletreatment. SUMMARY The availability of different telemedicine tools for remote consultation, monitoring, and treatment is increasing. Future research should establish whether telemedicine improves outcomes in routine clinical care, and further underpin its merits both as intervention and outcome in research settings.
Affiliation(s)
- Robin van den Bergh
  - Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Department of Neurology, Center of Expertise for Parkinson & Movement Disorders
- Bastiaan R. Bloem
  - Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Department of Neurology, Center of Expertise for Parkinson & Movement Disorders
- Marjan J. Meinders
  - Radboud University Medical Center, Radboud Institute for Health Sciences, Scientific Center for Quality of Healthcare, Nijmegen, The Netherlands
- Luc J.W. Evers
  - Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Department of Neurology, Center of Expertise for Parkinson & Movement Disorders
|
32
|
Advances in Parkinson's Disease detection and assessment using voice and speech: A review of the articulatory and phonatory aspects. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102418] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 12/22/2022]
|
33
|
Gómez A, Tsanas A, Gómez P, Palacios-Alonso D, Rodellar V, Álvarez A. Acoustic to kinematic projection in Parkinson’s disease dysarthria. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102422] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/19/2023]
|
34
|
Jeancolas L, Petrovska-Delacrétaz D, Mangone G, Benkelfat BE, Corvol JC, Vidailhet M, Lehéricy S, Benali H. X-Vectors: New Quantitative Biomarkers for Early Parkinson's Disease Detection From Speech. Front Neuroinform 2021; 15:578369. [PMID: 33679361 PMCID: PMC7935511 DOI: 10.3389/fninf.2021.578369] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/30/2020] [Accepted: 01/18/2021] [Indexed: 01/18/2023] Open
Abstract
Many articles have used voice analysis to detect Parkinson's disease (PD), but few have focused on the early stages of the disease and the gender effect. In this article, we have adapted the latest speaker recognition system, called x-vectors, in order to detect PD at an early stage using voice analysis. X-vectors are embeddings extracted from Deep Neural Networks (DNNs), which provide robust speaker representations and improve speaker recognition when large amounts of training data are used. Our goal was to assess whether, in the context of early PD detection, this technique would outperform the more standard classifier MFCC-GMM (Mel-Frequency Cepstral Coefficients—Gaussian Mixture Model) and, if so, under which conditions. We recorded 221 French speakers (recently diagnosed PD subjects and healthy controls) with a high-quality microphone and via the telephone network. Men and women were analyzed separately in order to have more precise models and to assess a possible gender effect. Several experimental and methodological aspects were tested in order to analyze their impacts on classification performance. We assessed the impact of the audio segment durations, data augmentation, type of dataset used for the neural network training, kind of speech tasks, and back-end analyses. X-vectors technique provided better classification performances than MFCC-GMM for the text-independent tasks, and seemed to be particularly suited for the early detection of PD in women (7–15% improvement). This result was observed for both recording types (high-quality microphone and telephone).
Affiliation(s)
- Laetitia Jeancolas
  - Paris Brain Institute-ICM, Centre de NeuroImagerie de Recherche-CENIR, Paris, France
  - Laboratoire SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Palaiseau, France
- Graziella Mangone
  - Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France
  - Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
- Badr-Eddine Benkelfat
  - Laboratoire SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Palaiseau, France
- Jean-Christophe Corvol
  - Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France
  - Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
- Marie Vidailhet
  - Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France
  - Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
- Stéphane Lehéricy
  - Paris Brain Institute-ICM, Centre de NeuroImagerie de Recherche-CENIR, Paris, France
  - Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France
  - Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neuroradiology, Paris, France
- Habib Benali
  - Department of Electrical & Computer Engineering, PERFORM Center, Concordia University, Montreal, QC, Canada
|
35
|
Tsanas A, Little MA, Ramig LO. Remote Assessment of Parkinson's Disease Symptom Severity Using the Simulated Cellular Mobile Telephone Network. IEEE Access 2021; 9:11024-11036. [PMID: 33495722 PMCID: PMC7821632 DOI: 10.1109/access.2021.3050524] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 12/05/2020] [Accepted: 12/25/2020] [Indexed: 06/12/2023]
Abstract
Telemonitoring of Parkinson's Disease (PD) has attracted considerable research interest because of its potential to make a lasting, positive impact on the life of patients and their carers. Purpose-built devices have been developed that record various signals which can be associated with average PD symptom severity, as quantified on standard clinical metrics such as the Unified Parkinson's Disease Rating Scale (UPDRS). Speech signals are particularly promising in this regard, because they can be easily recorded without the use of expensive, dedicated hardware. Previous studies have demonstrated replication of UPDRS to within less than 2 points of a clinical raters' assessment of symptom severity, using high-quality speech signals collected using dedicated telemonitoring hardware. Here, we investigate the potential of using the standard voice-over-GSM (2G) or UMTS (3G) cellular mobile telephone networks for PD telemonitoring, networks that, together, have greater than 5 billion subscribers worldwide. We test the robustness of this approach using a simulated noisy mobile communication network over which speech signals are transmitted, and approximately 6000 recordings from 42 PD subjects. We show that UPDRS can be estimated to within less than 3.5 points difference from the clinical raters' assessment, which is clinically useful given that the inter-rater variability for UPDRS can be as high as 4-5 UPDRS points. This provides compelling evidence that the existing voice telephone network has potential towards facilitating inexpensive, mass-scale PD symptom telemonitoring applications.
Affiliation(s)
- Athanasios Tsanas
  - Edinburgh Medical School, Usher Institute, The University of Edinburgh, Edinburgh EH16 4UX, U.K.
- Max A. Little
  - School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K.
- Lorraine O. Ramig
  - Department of Speech, Language, and Hearing Science, University of Colorado Boulder, Boulder, CO 80309, USA
  - National Center for Voice and Speech, Denver, CO 80014, USA
|
36
|
Fagherazzi G, Fischer A, Ismael M, Despotovic V. Voice for Health: The Use of Vocal Biomarkers from Research to Clinical Practice. Digit Biomark 2021; 5:78-88. [PMID: 34056518 PMCID: PMC8138221 DOI: 10.1159/000515346] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Received: 01/05/2021] [Accepted: 02/18/2021] [Indexed: 12/17/2022] Open
Abstract
Diseases can affect organs such as the heart, lungs, brain, muscles, or vocal folds, which can then alter an individual's voice. Therefore, voice analysis using artificial intelligence opens new opportunities for healthcare. From using vocal biomarkers for diagnosis, risk prediction, and remote monitoring of various clinical outcomes and symptoms, we offer in this review an overview of the various applications of voice for health-related purposes. We discuss the potential of this rapidly evolving environment from a research, patient, and clinical perspective. We also discuss the key challenges to overcome in the near future for a substantial and efficient use of voice in healthcare.
Affiliation(s)
- Guy Fagherazzi
  - Deep Digital Phenotyping Research Unit, Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Aurélie Fischer
  - Deep Digital Phenotyping Research Unit, Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Muhannad Ismael
  - IT for Innovation in Services Department (ITIS), Luxembourg Institute of Science and Technology (LIST), Esch-sur-Alzette, Luxembourg
- Vladimir Despotovic
  - Department of Computer Science, Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
|
37
|
Sajal MSR, Ehsan MT, Vaidyanathan R, Wang S, Aziz T, Mamun KAA. Telemonitoring Parkinson's disease using machine learning by combining tremor and voice analysis. Brain Inform 2020; 7:12. [PMID: 33090328 PMCID: PMC7579898 DOI: 10.1186/s40708-020-00113-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/09/2020] [Accepted: 10/03/2020] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND As the aged population grows, the number of people affected by Parkinson's disease (PD) is also rising. Unfortunately, due to insufficient resources and awareness in underdeveloped countries, proper and timely PD detection is highly challenging. Moreover, PD patients' symptoms are not all the same, nor do they all become pronounced at the same stage of the illness. Therefore, this work aims to combine more than one symptom (rest tremor and voice degradation) by collecting data remotely using smartphones and detecting PD with the help of a cloud-based machine learning system for telemonitoring PD patients in developing countries. METHOD The proposed system receives rest tremor and vowel phonation data acquired by smartphones with built-in accelerometer and voice recorder sensors. The data are first collected from diagnosed PD patients and healthy people for building and optimizing machine learning models that exhibit higher performance. After that, data from newly suspected PD patients are collected, and the trained algorithms are evaluated to detect PD. Based on the majority vote of those algorithms, PD-detected patients are connected with a nearby neurologist for consultation. Upon receiving patients' feedback after diagnosis by the neurologist, the system may update the model by retraining on the latest data. The system also periodically requests the detected patients to upload new data to track their disease progress. RESULT The highest accuracy in PD detection using offline data was [Formula: see text] from voice data and [Formula: see text] from tremor data when used separately. In both cases, k-nearest neighbors (kNN) gave higher accuracy than support vector machine (SVM) and naive Bayes (NB). Applying the maximum relevance minimum redundancy (MRMR) feature selection method showed that selecting different feature sets based on the patient's gender could improve the detection accuracy. This study's novelty is the application of ensemble averaging to the combined decisions generated from the analysis of voice and tremor data. The average accuracy of PD detection became [Formula: see text] when ensemble averaging was performed on the majority vote from kNN, SVM, and NB. CONCLUSION The proposed system can detect PD using a cloud-based system for computation, data preservation, and regular monitoring of voice and tremor samples captured by smartphones. Thus, this system can be a solution for healthcare authorities to ensure the older population's access to a better medical diagnosis system in developing countries, especially in a pandemic situation such as COVID-19, when in-person monitoring is minimal.
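The decision scheme described in this abstract (a majority vote over kNN, SVM, and NB, followed by averaging across the voice and tremor modalities) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `majority_vote` and `combine_modalities`, the binary label encoding (1 = PD, 0 = healthy), and the 0.5 threshold are all assumptions made for illustration, and the classifier outputs are hard-coded rather than produced by trained models.

```python
from collections import Counter
from statistics import mean


def majority_vote(labels):
    """Return the most frequent label among the classifier outputs.

    Assumption: labels are binary (1 = PD, 0 = healthy) and come from an
    odd number of classifiers, so no tie can occur.
    """
    return Counter(labels).most_common(1)[0][0]


def combine_modalities(voice_votes, tremor_votes, threshold=0.5):
    """Average the per-modality majority decisions and threshold the result.

    Assumption: a score at or above `threshold` flags the subject as PD,
    so a positive decision from either modality is enough to flag.
    """
    voice_decision = majority_vote(voice_votes)
    tremor_decision = majority_vote(tremor_votes)
    score = mean([voice_decision, tremor_decision])
    return score, score >= threshold


# Hypothetical outputs of kNN, SVM, and NB for one subject.
voice_votes = [1, 1, 0]
tremor_votes = [0, 0, 1]
score, flagged = combine_modalities(voice_votes, tremor_votes)
print(f"averaged decision score: {score}, flagged for referral: {flagged}")
```

With this threshold, disagreement between the two modalities still flags the subject; a stricter threshold would require both modalities to agree.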
Affiliation(s)
- Md Sakibur Rahman Sajal
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh; Advanced Intelligent Multidisciplinary Systems Lab (AIMS Lab), Institute of Advanced Research, United International University, Dhaka, Bangladesh
- Md Tanvir Ehsan
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh; Advanced Intelligent Multidisciplinary Systems Lab (AIMS Lab), Institute of Advanced Research, United International University, Dhaka, Bangladesh
- Ravi Vaidyanathan
  - Department of Mechanical Engineering, Imperial College London, London, UK
- Shouyan Wang
  - Institute of Science and Technology for Brain-inspired Intelligence (ISTBI), Fudan University, Shanghai, People's Republic of China
- Tipu Aziz
  - Functional Neurosurgery and Experimental Neurology Group, University of Oxford, Oxford, UK
- Khondaker Abdullah Al Mamun
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh; Advanced Intelligent Multidisciplinary Systems Lab (AIMS Lab), Institute of Advanced Research, United International University, Dhaka, Bangladesh
|
38
|
Sidorova J, Anisimova M. Impact of Diabetes Mellitus on Voice: A Methodological Commentary. J Voice 2020; 36:294.e1-294.e12. [PMID: 32739034 DOI: 10.1016/j.jvoice.2020.05.015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/07/2020] [Revised: 05/14/2020] [Accepted: 05/26/2020] [Indexed: 11/18/2022]
Affiliation(s)
- Julia Sidorova
  - Blekinge Institute of Technology, Valhallavägen 1, 37141 Karlskrona, Sweden
- Maria Anisimova
  - Zurich University of Applied Sciences, Technikumstrasse 9, 8400 Winterthur
|
39
|
Lesot MJ, Vieira S, Reformat MZ, Carvalho JP, Wilbik A, Bouchon-Meunier B, Yager RR. Hybrid Model for Parkinson’s Disease Prediction. Information Processing and Management of Uncertainty in Knowledge-Based Systems 2020. [PMCID: PMC7274681 DOI: 10.1007/978-3-030-50143-3_49] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022]
Abstract
Parkinson’s is a chronic, progressive neurological disease with no known cause that affects the central nervous system of older people and compromises their movement. The disorder can impair aspects of people’s daily lives, so identifying it early helps in choosing treatments that can reduce the impact of the disease on the patient’s routine. This work aims to identify traces of Parkinson’s by applying a fuzzy neural network to a database of replicated voice recordings, in order to identify patterns and enable the extraction of knowledge about situations present in the data collected from patients. The results obtained by the hybrid model were superior to the state of the art for this task, showing that hybrid models can be used to extract knowledge and to classify Parkinson’s behavioral patterns with high accuracy.
Affiliation(s)
- Susana Vieira
  - IDMEC, IST, Universidade de Lisboa, Lisbon, Portugal
- Anna Wilbik
  - Eindhoven University of Technology, Eindhoven, The Netherlands
|