1. Larsen E, Murton O, Song X, Joachim D, Watts D, Kapczinski F, Venesky L, Hurowitz G. Validating the efficacy and value proposition of mental fitness vocal biomarkers in a psychiatric population: prospective cohort study. Front Psychiatry 2024; 15:1342835. [PMID: 38505797 PMCID: PMC10948552 DOI: 10.3389/fpsyt.2024.1342835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 02/14/2024] [Indexed: 03/21/2024] [Open access]
Abstract
Background: The utility of vocal biomarkers for mental health assessment has gained increasing attention. This study aims to further this line of research by introducing a novel vocal scoring system designed to provide mental fitness tracking insights to users in real-world settings.
Methods: A prospective cohort study with 104 outpatient psychiatric participants was conducted to validate the "Mental Fitness Vocal Biomarker" (MFVB) score. The MFVB score was derived from eight vocal features, selected based on literature review. Participants' mental health symptom severity was assessed using the M3 Checklist, which serves as a transdiagnostic tool for measuring depression, anxiety, post-traumatic stress disorder, and bipolar symptoms.
Results: The MFVB demonstrated an ability to stratify individuals by their risk of elevated mental health symptom severity. Continuous observation enhanced the MFVB's efficacy, with risk ratios improving from 1.53 (1.09-2.14, p=0.0138) for single 30-second voice samples to 2.00 (1.21-3.30, p=0.0068) for data aggregated over two weeks. A higher risk ratio of 8.50 (2.31-31.25, p=0.0013) was observed in participants who used the MFVB 5-6 times per week, underscoring the utility of frequent and continuous observation. Participant feedback confirmed the user-friendliness of the application and its perceived benefits.
Conclusions: The MFVB is a promising tool for objective mental health tracking in real-world conditions, with potential to be a cost-effective, scalable, and privacy-preserving adjunct to traditional psychiatric assessments. User feedback suggests that vocal biomarkers can offer personalized insights and support clinical therapy and other beneficial activities that are associated with improved mental health risks and outcomes.
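As a reading aid for the risk ratios and intervals quoted in the abstract above, here is a minimal Python sketch of how a risk ratio and an approximate 95% confidence interval can be computed from a 2x2 table. It assumes the common log-normal (Katz) approximation; the abstract does not say which interval method the authors used, and the function name `risk_ratio_ci` and the counts in the demo are hypothetical, not data from the study.

```python
import math


def risk_ratio_ci(exposed_events, exposed_total, control_events, control_total, z=1.96):
    """Return (risk ratio, lower bound, upper bound).

    Uses the log-normal (Katz) approximation for the interval.
    This is an assumed method, not necessarily the one used in the paper.
    """
    risk_exposed = exposed_events / exposed_total
    risk_control = control_events / control_total
    rr = risk_exposed / risk_control

    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / control_events - 1 / control_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 20 of 40 participants with a low vocal score show
    # elevated symptoms, versus 10 of 40 participants with a high vocal score.
    rr, lower, upper = risk_ratio_ci(20, 40, 10, 40)
    print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval whose lower bound stays above 1 indicates that the elevated risk in the first group is unlikely to be due to chance alone at the chosen confidence level.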
Affiliation(s)
- Devon Watts
  - Neuroscience Graduate Program, Department of Health Sciences, McMaster University, Hamilton, ON, Canada
  - St. Joseph’s Healthcare Hamilton, Hamilton, ON, Canada
- Flavio Kapczinski
  - Neuroscience Graduate Program, Department of Health Sciences, McMaster University, Hamilton, ON, Canada
  - Department of Psychiatry, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
2. Kutsuna I, Hoshino A, Morisugi A, Mori Y, Shirato A, Takeda M, Isaji H, Suwa M. Relationship between duration of sick leave and time variation of words used in return-to-work programs for depression. Work 2024; 77:981-991. [PMID: 37781845 DOI: 10.3233/wor-230083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/03/2023] [Open access]
Abstract
BACKGROUND: Return-to-work (RTW) programs are provided as rehabilitation for people who have taken sick leave from work because of mental health problems. However, methods to present this information to workplaces objectively remain limited.
OBJECTIVE: This study aimed to conduct an exploratory investigation of the relationship between duration of sick leave and time variation of words used in RTW programs for depression, using textual data collected from electronic medical records as a new evaluation indicator.
METHODS: The study subjects were those who had taken sick leave because of major depressive or adjustment disorder and had participated in an RTW program. The study data comprised demographic characteristics and texts. Textual data were collected from electronic medical records and classified based on the SOAP note. Thereafter, the textual data were quantified into category scores based on a standard text analysis dictionary. A generalized linear mixed model was used for the statistical analysis, with the score for each category (emotional, social, cognitive, perceptual, biological, motivational, relativity, and informal) as the dependent variable and the duration of sick leave, time, and interaction between the duration of sick leave and time as the independent variables. The level of statistical significance was set at 0.05.
RESULTS: In total, 42 participants were included in the analysis. The results revealed a significant interaction between duration of sick leave and time for the social (p = 0.001) and emotional (p = 0.002) categories.
CONCLUSION: The findings suggest a relationship between word changes in electronic medical records and the duration of sick leave.
Affiliation(s)
- Ichiro Kutsuna
  - Department of Health Science, Graduate School of Medicine, Nagoya University, Nagoya, Japan
  - Mental Clinic Anser, Medical Corporation Seiseikai, Aichi, Japan
- Aiko Hoshino
  - Department of Health Science, Graduate School of Medicine, Nagoya University, Nagoya, Japan
  - Mental Clinic Anser, Medical Corporation Seiseikai, Aichi, Japan
- Ami Morisugi
  - Mental Clinic Anser, Medical Corporation Seiseikai, Aichi, Japan
- Yukari Mori
  - Mental Clinic Anser, Medical Corporation Seiseikai, Aichi, Japan
- Aki Shirato
  - Hinaga General Center for Mental Care, Mie, Japan
- Mirai Takeda
  - Hinaga General Center for Mental Care, Mie, Japan
- Hikari Isaji
  - Department of Health Science, Graduate School of Medicine, Nagoya University, Nagoya, Japan
- Mami Suwa
  - Mental Clinic Anser, Medical Corporation Seiseikai, Aichi, Japan
3. Silva WJ, Lopes L, Galdino MKC, Almeida AA. Voice Acoustic Parameters as Predictors of Depression. J Voice 2024; 38:77-85. [PMID: 34353686 DOI: 10.1016/j.jvoice.2021.06.018] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 04/09/2021] [Revised: 05/24/2021] [Accepted: 06/02/2021] [Indexed: 10/20/2022]
Abstract
OBJECTIVE: To analyze whether voice acoustic parameters are discriminant and predictive in patients with and without depression.
METHODS: Observational case-control study. The following instruments were administered to the participants: Self-Reporting Questionnaire (SRQ-20), Beck Depression Inventory-Second Edition (BDI-II), and Voice Symptom Scale (VoiSS). Voice samples were also collected for subsequent extraction of the following acoustic parameters: mean, mode and standard deviation (SD) of the fundamental frequency (F0); jitter; shimmer; glottal to noise excitation ratio (GNE); cepstral peak prominence-smoothed (CPPS); and spectral tilt. A total of 144 individuals participated in the study: 54 patients diagnosed with depression (case group) and 90 without a diagnosis of depression (control group).
RESULTS: The means of the acoustic parameters showed differences between the groups: F0 (SD), jitter, and shimmer values were higher, while values for GNE, CPPS and spectral tilt were lower in the case group than in the control group. There was a significant association between BDI-II and jitter, shimmer, CPPS, and spectral tilt and between CPPS and the class of antidepressants used. The multiple linear regression model showed that jitter and CPPS were predictors of depression, as measured by the BDI-II.
CONCLUSION: Acoustic parameters were able to discriminate between patients with and without depression and were associated with BDI-II scores. The class of antidepressants used was associated with CPPS, and the jitter and CPPS parameters were able to predict the presence of depression, as measured by the BDI-II clinical score.
Affiliation(s)
- Wegina Jordana Silva
  - Department of Speech Therapy, Federal University of Paraíba (UFPB) and Federal University of Rio Grande do Norte (UFRN), João Pessoa, Paraíba, Brazil
- Leonardo Lopes
  - Department of Speech Therapy, Federal University of Paraíba (UFPB), Graduate Program in Speech Therapy, Federal University of Paraíba (UFPB) and Federal University of Rio Grande do Norte (UFRN - PPgFon), Graduate Program in Decision and Health Models (PPgMDS), and Graduate Program in Linguistic (PROLING) of UFPB, João Pessoa, Paraíba, Brazil
- Melyssa Kellyane Cavalcanti Galdino
  - Department of Psychology, Federal University of Paraíba (UFPB), Graduate Program in Cognitive Neuroscience and Behavior (PPgNeC) of UFPB, João Pessoa, Paraíba, Brazil
- Anna Alice Almeida
  - Department of Speech Therapy, Federal University of Paraíba (UFPB), Graduate Program in Speech Therapy, Federal University of Paraíba (UFPB) and Federal University of Rio Grande do Norte (UFRN - PPgFon), Graduate Program in Decision and Health Models (PPgMDS), and Graduate Program in Cognitive Neuroscience and Behavior (PPgNeC) of UFPB, João Pessoa, Paraíba, Brazil
4. Yang C, Zhang X, Chen Y, Li Y, Yu S, Zhao B, Wang T, Luo L, Gao S. Emotion-dependent language featuring depression. J Behav Ther Exp Psychiatry 2023; 81:101883. [PMID: 37290350 DOI: 10.1016/j.jbtep.2023.101883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 04/06/2023] [Accepted: 05/27/2023] [Indexed: 06/10/2023]
Abstract
BACKGROUND AND OBJECTIVES: Understanding language features of depression contributes to the detection of the disorder. Considering that depression is characterized by dysfunctions in emotion and individuals with depression often show emotion-dependent cognition, the present study investigated the speech features and word use of emotion-dependent narrations in patients with depression.
METHODS: Forty patients with depression and forty controls were required to narrate self-relevant memories under five basic human emotions (i.e., sad, angry, fearful, neutral, and happy). Recorded speech and transcribed texts were analyzed.
RESULTS: Patients with depression, as compared to non-depressed individuals, talked slower and less. They also differed in their use of negative emotion, work, family, sex, biology, health, and assent words regardless of emotion manipulation. Moreover, the use of words such as first person singular pronoun, past tense, causation, achievement, family, death, psychology, impersonal pronoun, quantifier and preposition words displayed emotion-dependent differences between groups. With the involvement of emotion, linguistic indicators associated with depressive symptoms were identified and explained 71.6% of the variance in depression severity.
LIMITATIONS: Word use was analyzed based on a dictionary that does not cover all the words spoken in the memory task, resulting in text data loss. In addition, a relatively small number of patients with depression were included in the present study, and therefore the results need confirmation in future research using larger emotion-dependent datasets of speech and texts.
CONCLUSIONS: Our findings suggest that consideration of different emotional contexts is an effective means to improve the accuracy of depression detection via the analysis of word use and speech features.
Affiliation(s)
- Chaoqing Yang
  - School of Foreign Languages, University of Electronic Science and Technology of China, Chengdu, China
- Xinying Zhang
  - School of Foreign Languages, University of Electronic Science and Technology of China, Chengdu, China
- Yuxuan Chen
  - School of Foreign Languages, University of Electronic Science and Technology of China, Chengdu, China
- Yunge Li
  - The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China
- Shu Yu
  - The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China
- Bingmei Zhao
  - The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China
- Tao Wang
  - School of Psychology, Qufu Normal University, Qufu, China
- Lizhu Luo
  - The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China; Singapore Institute for Clinical Sciences, A*STAR Research Entities, Singapore
- Shan Gao
  - School of Foreign Languages, University of Electronic Science and Technology of China, Chengdu, China; The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China
5. Achim AM, Roy MA, Fossard M. The other side of the social interaction: Theory of mind impairments in people with schizophrenia are linked to other people's difficulties in understanding them. Schizophr Res 2023; 259:150-157. [PMID: 35906170 DOI: 10.1016/j.schres.2022.07.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/28/2022] [Revised: 06/28/2022] [Accepted: 07/01/2022] [Indexed: 10/16/2022]
Abstract
BACKGROUND: People with schizophrenia (SZ) often present with theory of mind (ToM) deficits and with speech production deficits. While a link has been established between ToM abilities and symptoms of thought disorder, much less is known about other aspects of speech production in SZ.
STUDY DESIGN: This is a case-control study in which 25 stable outpatients with recent-onset SZ (27.1 years, 22 men) and 22 matched healthy controls (25.6 years, 16 men) performed a collaborative, verbal production task with a real interaction partner. Blind raters scored how easy participants made it to understand them (Facility ratings), how interesting they were to listen to (Interest ratings) and how expressive they were (Expressivity ratings). ToM was assessed with the Combined Stories Test and Sarfati's cartoon task. Symptoms were assessed with the PANSS five-factor version.
STUDY RESULTS: Compared to healthy controls, SZ participants received significantly lower ratings for all three aspects of their verbal productions (Facility, Interest and Expressivity), despite the raters being blind to group membership. Interestingly, the Facility ratings were linked to ToM performance in the SZ group, which suggests that SZ participants who have difficulties understanding others (ToM deficits) also make it harder for others to understand them. Other notable findings include a strong link between the Expressivity ratings and the Interest ratings for both groups, and significant correlations between the Facility ratings and Cognitive/Disorganisation symptoms, and between the Expressivity ratings and both Negative and Depression/Anxiety symptoms in SZ.
CONCLUSION: Studying speech production during real, collaborative social interactions could help move beyond the individual approach to SZ deficits, making it possible to involve the interaction partners to promote more efficient communication for people with schizophrenia.
Affiliation(s)
- Amélie M Achim
  - Département de psychiatrie et neurosciences, Université Laval, Pavillon Ferdinand-Vandry, (room 4873), 1050, avenue de la Médecine, Quebec City G1V 0A6, QC, Canada; Centre de recherche CERVO, 2601, de la Canardière, Quebec City G1J 2G3, QC, Canada
- Marc-André Roy
  - Département de psychiatrie et neurosciences, Université Laval, Pavillon Ferdinand-Vandry, (room 4873), 1050, avenue de la Médecine, Quebec City G1V 0A6, QC, Canada; Centre de recherche CERVO, 2601, de la Canardière, Quebec City G1J 2G3, QC, Canada
- Marion Fossard
  - Institut des sciences logopédiques, Université de Neuchâtel, Rue Pierre-à-Mazel 7, CH-2000 Neuchâtel, Switzerland
6. Wang Y, Liang L, Zhang Z, Xu X, Liu R, Fang H, Zhang R, Wei Y, Liu Z, Zhu R, Zhang X, Wang F. Fast and accurate assessment of depression based on voice acoustic features: a cross-sectional and longitudinal study. Front Psychiatry 2023; 14:1195276. [PMID: 37415683 PMCID: PMC10320390 DOI: 10.3389/fpsyt.2023.1195276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 06/02/2023] [Indexed: 07/08/2023] [Open access]
Abstract
Background: Depression is a widespread mental disorder that affects a significant portion of the population. However, the assessment of depression is often subjective, relying on standard questions or interviews. Acoustic features have been suggested as a reliable and objective alternative for depression assessment. Therefore, in this study, we aim to identify and explore voice acoustic features that can effectively and rapidly predict the severity of depression, as well as investigate the potential correlation between specific treatment options and voice acoustic features.
Methods: We utilized voice acoustic features correlated with depression scores to train a prediction model based on an artificial neural network. Leave-one-out cross-validation was performed to evaluate the performance of the model. We also conducted a longitudinal study to analyze the correlation between the improvement of depression and changes in voice acoustic features after an Internet-based cognitive-behavioral therapy (ICBT) program consisting of 12 sessions.
Results: Our study showed that the neural network model trained on the 30 voice acoustic features significantly correlated with HAMD scores can accurately predict the severity of depression, with a mean absolute error of 3.137 and a correlation coefficient of 0.684. Furthermore, four out of the 30 features significantly decreased after ICBT, indicating their potential correlation with specific treatment options and significant improvement in depression (p < 0.05).
Conclusion: Voice acoustic features can effectively and rapidly predict the severity of depression, providing a low-cost and efficient method for screening patients with depression on a large scale. Our study also identified potential acoustic features that may be significantly related to specific treatment options for depression.
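To make the evaluation procedure named in the abstract above concrete, the following Python sketch runs leave-one-out cross-validation and reports the two metrics the abstract quotes (mean absolute error and a correlation coefficient). It is only an illustration under stated assumptions: a one-feature least-squares line stands in for the authors' neural network, the correlation is assumed to be Pearson's r, and the feature values and scores in the demo are made up, not study data. All function names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loocv_predictions(xs, ys):
    """Predict each sample from a model trained on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    return preds


def mean_absolute_error(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def pearson_r(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    # Made-up acoustic feature values and depression scores (illustration only).
    feature = [0.8, 1.1, 1.9, 2.4, 3.0, 3.6, 4.1]
    score = [6.0, 9.0, 12.0, 17.0, 18.0, 24.0, 25.0]
    preds = loocv_predictions(feature, score)
    print("MAE:", round(mean_absolute_error(score, preds), 3))
    print("r:", round(pearson_r(score, preds), 3))
```

Because each prediction comes from a model that never saw the held-out sample, the resulting error estimate is less optimistic than one computed on the training data itself.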
Affiliation(s)
- Yang Wang
  - Psychology Institute, Inner Mongolia Normal University, Hohhot, Inner Mongolia, China
  - Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
  - Functional Brain Imaging Institute, Nanjing Medical University, Nanjing, China
- Lijuan Liang
  - Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
  - Functional Brain Imaging Institute, Nanjing Medical University, Nanjing, China
  - Laboratory of Psychology, The First Affiliated Hospital of Hainan Medical University, Haikou, Hainan, China
- Zhongguo Zhang
  - Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
  - Functional Brain Imaging Institute, Nanjing Medical University, Nanjing, China
  - The Fourth People’s Hospital of Yancheng, Yancheng, Jiangsu, China
- Xiao Xu
  - School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Rongxun Liu
  - Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
  - Functional Brain Imaging Institute, Nanjing Medical University, Nanjing, China
  - College of Medical Engineering, Xinxiang Medical University, Xinxiang, Henan, China
- Hanzheng Fang
  - School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Ran Zhang
  - Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
  - Functional Brain Imaging Institute, Nanjing Medical University, Nanjing, China
- Yange Wei
  - Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
  - Functional Brain Imaging Institute, Nanjing Medical University, Nanjing, China
- Zhongchun Liu
  - Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
- Rongxin Zhu
  - Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
  - Functional Brain Imaging Institute, Nanjing Medical University, Nanjing, China
- Xizhe Zhang
  - School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Fei Wang
  - Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
  - Functional Brain Imaging Institute, Nanjing Medical University, Nanjing, China
7. Tan EJ, Neill E, Kleiner JL, Rossell SL. Depressive symptoms are specifically related to speech pauses in schizophrenia spectrum disorders. Psychiatry Res 2023; 321:115079. [PMID: 36716551 DOI: 10.1016/j.psychres.2023.115079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2021] [Revised: 01/03/2023] [Accepted: 01/25/2023] [Indexed: 01/28/2023]
Abstract
Depression is a common and debilitating mental illness associated with sadness and negativity and is often comorbid with other psychiatric conditions, such as schizophrenia. Depressive symptoms are presently assessed primarily through clinical interviews; however, other behavioural indicators are being investigated as more objective methods of depressive symptom assessment. The present study aimed to evaluate the utility of assessing depression using quantitative speech parameters by comparing speech between 23 schizophrenia/schizoaffective patients with clinically significant depressive symptoms (DP), 19 schizophrenia/schizoaffective patients without depressive symptoms (NDP), and 22 healthy controls with no psychiatric history (HC). Participant audio recordings were transcribed and analyzed to extract five types of speech variables: utterances, words, speaking rate, formulation errors and pauses. The results indicated that DP patients produced significantly more pauses within utterances, and had more utterances with pauses, compared to NDP patients and HCs (p < .05), who performed similarly to each other. Word, speaking rate and formulation error variables were not significantly different between the patient groups (p > .05). The findings suggest that depressive symptoms may have a specific relationship to speech pauses, and support the potential future use of speech pause assessments as an alternative and objective depression rating and monitoring tool.
Affiliation(s)
- Eric J Tan
  - Centre for Mental Health and Brain Sciences, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St Vincent's Hospital, Melbourne, Australia
- Erica Neill
  - Centre for Mental Health and Brain Sciences, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St Vincent's Hospital, Melbourne, Australia
- Jacqui L Kleiner
  - Centre for Mental Health and Brain Sciences, Swinburne University of Technology, Melbourne, Australia
- Susan L Rossell
  - Centre for Mental Health and Brain Sciences, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St Vincent's Hospital, Melbourne, Australia
8. Suparatpinyo S, Soonthornphisaj N. Smart voice recognition based on deep learning for depression diagnosis. Artif Life Robotics 2023. [DOI: 10.1007/s10015-023-00852-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2023]
9. Adibi P, Kalani S, Zahabi SJ, Asadi H, Bakhtiar M, Heidarpour MR, Roohafza H, Shahoon H, Amouzadeh M. Emotion recognition support system: Where physicians and psychiatrists meet linguists and data engineers. World J Psychiatry 2023; 13:1-14. [PMID: 36687372 PMCID: PMC9850871 DOI: 10.5498/wjp.v13.i1.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2022] [Revised: 09/18/2022] [Accepted: 12/21/2022] [Indexed: 01/13/2023] [Open access]
Abstract
An important factor in daily medical diagnosis and treatment is the caregiving physician's understanding of patients' emotional states. However, patients usually avoid voicing their emotions when describing their somatic symptoms and complaints to a non-psychiatrist doctor. Clinicians, for their part, usually lack the expertise (or time) needed to pick up on patients' various verbal and non-verbal emotional signals. As a result, in many cases there is an emotion recognition barrier between the clinician and the patients, making all patients seem the same except for their different somatic symptoms. We aim to identify and combine approaches from three major disciplines (psychology, linguistics, and data science) for detecting emotions from verbal communication, and we propose an integrated solution for emotion recognition support. Such a platform may give emotional guides and indices to the clinician based on verbal communication at the consultation time.
Affiliation(s)
- Peyman Adibi
  - Isfahan Gastroenterology and Hepatology Research Center, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran
- Simindokht Kalani
  - Department of Psychology, University of Isfahan, Isfahan 8174673441, Iran
- Sayed Jalal Zahabi
  - Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 8415683111, Iran
- Homa Asadi
  - Department of Linguistics, University of Isfahan, Isfahan 8174673441, Iran
- Mohsen Bakhtiar
  - Department of Linguistics, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran
- Mohammad Reza Heidarpour
  - Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 8415683111, Iran
- Hamidreza Roohafza
  - Department of Psychocardiology, Cardiac Rehabilitation Research Center, Cardiovascular Research Institute (WHO-Collaborating Center), Isfahan University of Medical Sciences, Isfahan 8187698191, Iran
- Hassan Shahoon
  - Isfahan Gastroenterology and Hepatology Research Center, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran
- Mohammad Amouzadeh
  - Department of Linguistics, University of Isfahan, Isfahan 8174673441, Iran
  - School of International Studies, Sun Yat-sen University, Zhuhai 519082, Guangdong Province, China
10. Ostermann TA, Fuchs M, Hinz A, Engel C, Berger T. Associations of Personality, Physical and Mental Health with Voice Range Profiles. J Voice 2023:S0892-1997(22)00377-0. [PMID: 36599716 DOI: 10.1016/j.jvoice.2022.11.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 11/17/2022] [Accepted: 11/18/2022] [Indexed: 01/03/2023]
Abstract
OBJECTIVES: There is evidence in the literature that voice characteristics are linked to mental and physical health. The aim of this explorative study was to determine associations between voice parameters measured by a voice range profile (VRP) and personality, mental and physical health.
STUDY DESIGN: Cross-sectional population-based study.
METHODS: As part of the LIFE-Adult-Study, 2639 individuals aged 18-80 years, randomly sampled from the general population, completed both speaking and singing voice tasks and answered questionnaires on depression, anxiety, life satisfaction, personality and quality of life. The voice parameters used were fundamental frequency, sound pressure level, their ranges and maximum phonation time. The associations were examined with the help of correlation and regression analyses.
RESULTS: Wider ranges between the lowest and highest frequency, between the lowest and highest sound pressure level and longer maximum phonation time were significantly correlated with extraversion and quality of life in both sexes, as well as openness and agreeableness in women. Smaller ranges and shorter maximum phonation time were significantly correlated with depression. Neuroticism in men was inversely correlated with the maximum phonation time. In the speaking VRP, the associations for sound pressure level were more pronounced than for the fundamental frequency. The same was true in reverse for the singing VRP. Few associations were found for anxiety, life satisfaction and conscientiousness.
CONCLUSIONS: Weak associations between voice parameters derived from the VRP and mental and physical health, as well as personality were seen in this exploratory study. The results indicate that the VRP measurements in a clinical context are not significantly affected by these parameters and thus are a robust measurement method for voice parameters.
Affiliation(s)
- Thomas A Ostermann
  - Phoniatrics and Audiology, Department of Otorhinolaryngology, University of Leipzig, Leipzig, Germany; LIFE Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany; Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
- Michael Fuchs
  - Phoniatrics and Audiology, Department of Otorhinolaryngology, University of Leipzig, Leipzig, Germany; LIFE Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany
- Andreas Hinz
  - LIFE Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany; Department of Medical Psychology and Medical Sociology, University of Leipzig, Leipzig, Germany
- Christoph Engel
  - LIFE Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany; Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
- Thomas Berger
  - Phoniatrics and Audiology, Department of Otorhinolaryngology, University of Leipzig, Leipzig, Germany; LIFE Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany
11. Dikaios K, Rempel S, Dumpala SH, Oore S, Kiefte M, Uher R. Applications of Speech Analysis in Psychiatry. Harv Rev Psychiatry 2023; 31:1-13. [PMID: 36608078 DOI: 10.1097/HRP.0000000000000356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
Abstract
ABSTRACT The need for objective measurement in psychiatry has stimulated interest in alternative indicators of the presence and severity of illness. Speech may offer a source of information that bridges the subjective and objective in the assessment of mental disorders. We systematically reviewed the literature for articles exploring speech analysis for psychiatric applications. The utility of speech analysis depends on how accurately speech features represent clinical symptoms within and across disorders. We identified four domains of the application of speech analysis in the literature: diagnostic classification, assessment of illness severity, prediction of onset of illness, and prognosis and treatment outcomes. We discuss the findings in each of these domains, with a focus on how types of speech features characterize different aspects of psychopathology. Models that bring together multiple speech features can distinguish speakers with psychiatric disorders from healthy controls with high accuracy. Differentiating between types of mental disorders and symptom dimensions are more complex problems that expose the transdiagnostic nature of speech features. Convergent progress in speech research and computer sciences opens avenues for implementing speech analysis to enhance objectivity of assessment in clinical practice. Application of speech analysis will need to address issues of ethics and equity, including the potential to perpetuate discriminatory bias through models that learn from clinical assessment data. Methods that mitigate bias are available and should play a key role in the implementation of speech analysis.
|
12
|
Liu Z, Yu H, Li G, Chen Q, Ding Z, Feng L, Yao Z, Hu B. Ensemble learning with speaker embeddings in multiple speech task stimuli for depression detection. Front Neurosci 2023; 17:1141621. [PMID: 37034153 PMCID: PMC10076578 DOI: 10.3389/fnins.2023.1141621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Accepted: 03/09/2023] [Indexed: 04/11/2023] Open
Abstract
Introduction As a biomarker of depression, the speech signal has attracted the interest of many researchers because it is easy to collect and non-invasive. However, variation in subjects' speech under different scenes and emotional stimuli, the insufficient amount of depression speech data for deep learning, and the variable length of frame-level speech features all affect recognition performance. Methods To address these problems, this study proposes a multi-task ensemble learning method based on speaker embeddings for depression classification. First, we extract the Mel Frequency Cepstral Coefficients (MFCC), the Perceptual Linear Predictive Coefficients (PLP), and the Filter Bank (FBANK) from the out-of-domain dataset (CN-Celeb) and train the Resnet x-vector extractor, Time delay neural network (TDNN) x-vector extractor, and i-vector extractor. Then, we extract the corresponding speaker embeddings of fixed length from the depression speech database of the Gansu Provincial Key Laboratory of Wearable Computing. Support Vector Machine (SVM) and Random Forest (RF) are used to obtain the classification results of speaker embeddings in nine speech tasks. To make full use of the information of speech tasks with different scenes and emotions, we aggregate the classification results of nine tasks into new features and then obtain the final classification results by using Multilayer Perceptron (MLP). In order to take advantage of the complementary effects of different features, Resnet x-vectors based on different acoustic features are fused in the ensemble learning method.
Results Experimental results demonstrate that (1) MFCC-based Resnet x-vectors perform best among the nine speaker embeddings for depression detection; (2) interview speech performs better than picture description speech, and the neutral stimulus is the best among the three emotional valences in the depression recognition task; (3) our multi-task ensemble learning method with MFCC-based Resnet x-vectors can effectively identify depressed patients; (4) in all cases, the combination of MFCC-based Resnet x-vectors and PLP-based Resnet x-vectors in our ensemble learning method achieves the best results, outperforming other literature studies using the depression speech database. Discussion Our multi-task ensemble learning method with MFCC-based Resnet x-vectors can effectively fuse the depression-related information of different stimuli, which provides a new approach for depression detection. The limitation of this method is that the speaker embedding extractors were pre-trained on the out-of-domain dataset. We will consider using the augmented in-domain dataset for pre-training to further improve depression recognition performance.
Affiliation(s)
- Zhenyu Liu
  - Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Huimin Yu
  - Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Gang Li
  - Tianshui Third People’s Hospital, Tianshui, China
- Qiongqiong Chen
  - Second Provincial People’s Hospital of Gansu, Lanzhou, China
  - Affiliated Hospital of Northwest Minzu University, Lanzhou, China
- Zhijie Ding
  - Tianshui Third People’s Hospital, Tianshui, China
- Lei Feng
  - Department of Psychiatry, Beijing Anding Hospital of Capital Medical University, Beijing, China
- Zhijun Yao
  - Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Bin Hu
  - Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
  - *Correspondence: Bin Hu
|
13
|
Moon AM, Kim HP, Cook S, Blanchard RT, Haley KL, Jacks A, Shafer JS, Fried MW. Speech patterns and enunciation for encephalopathy determination-A prospective study of hepatic encephalopathy. Hepatol Commun 2022; 6:2876-2885. [PMID: 35861546 PMCID: PMC9512449 DOI: 10.1002/hep4.2054] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/19/2022] [Revised: 06/01/2022] [Accepted: 06/22/2022] [Indexed: 01/05/2023] Open
Abstract
Hepatic encephalopathy (HE) is a complication of cirrhosis that benefits from early diagnosis and treatment. We aimed to characterize speech patterns of individuals with HE to investigate its potential to diagnose and monitor HE. This was a single-center prospective cohort study that included participants with cirrhosis with HE (minimal HE [MHE] and overt HE [OHE]), cirrhosis without HE, and participants without liver disease. Audio recordings of reading, sentence repetition, and picture description tasks were obtained from these groups. Two certified speech-language pathologists assessed speech rate (words per minute) and articulatory precision. An overall severity metric was derived from these measures. Cross-sectional analyses were performed using nonparametric Wilcoxon statistics to evaluate group differences. Change over time in speech measures was analyzed descriptively for individuals with HE. The study included 43 total participants. Speech results differed by task, but the overall pattern showed slower speech rate and less precise articulation in participants with OHE compared to other groups. When speech rate and precision ratings were combined into a single speech severity metric, the impairment of participants with OHE was more severe than all other groups, and MHE had greater speech impairment than non-liver disease controls. As OHE improved clinically, participants showed notable improvement in speech rate. Participants with OHE demonstrated impaired speech rate, precision, and speech severity compared with non-liver disease and non-HE cirrhosis. Participants with MHE had less pronounced impairments. Speech parameters improved as HE clinically improved. Conclusion: These data identify speech patterns that could improve HE diagnosis, grading, and remote monitoring.
Affiliation(s)
- Andrew M. Moon
  - Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA
- Hannah P. Kim
  - Division of Gastroenterology, Hepatology, and Nutrition, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Sarah Cook
  - Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA
- Renee T. Blanchard
  - Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA
- Katarina L. Haley
  - Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Adam Jacks
  - Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Jennifer S. Shafer
  - Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Michael W. Fried
  - Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA
|
14
|
Berger SE, Baria AT. Assessing Pain Research: A Narrative Review of Emerging Pain Methods, Their Technosocial Implications, and Opportunities for Multidisciplinary Approaches. Front Pain Res (Lausanne) 2022; 3:896276. [PMID: 35721658 PMCID: PMC9201034 DOI: 10.3389/fpain.2022.896276] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/14/2022] [Accepted: 05/12/2022] [Indexed: 11/13/2022] Open
Abstract
Pain research traverses many disciplines and methodologies. Yet, despite our understanding and field-wide acceptance of the multifactorial essence of pain as a sensory perception, emotional experience, and biopsychosocial condition, pain scientists and practitioners often remain siloed within their domain expertise and associated techniques. The context in which the field finds itself today (with increasing reliance on digital technologies, an ongoing pandemic, and continued disparities in pain care) requires new collaborations and different approaches to measuring pain. Here, we review the state-of-the-art in human pain research, summarizing emerging practices and cutting-edge techniques across multiple methods and technologies. For each, we outline foreseeable technosocial considerations, reflecting on implications for standards of care, pain management, research, and societal impact. Through overviewing alternative data sources and varied ways of measuring pain and by reflecting on the concerns, limitations, and challenges facing the field, we hope to create critical dialogues, inspire more collaborations, and foster new ideas for future pain research methods.
Affiliation(s)
- Sara E. Berger
  - Responsible and Inclusive Technologies Research, Exploratory Sciences Division, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, United States
|
15
|
Zhao Q, Fan HZ, Li YL, Liu L, Wu YX, Zhao YL, Tian ZX, Wang ZR, Tan YL, Tan SP. Vocal Acoustic Features as Potential Biomarkers for Identifying/Diagnosing Depression: A Cross-Sectional Study. Front Psychiatry 2022; 13:815678. [PMID: 35573349 PMCID: PMC9095973 DOI: 10.3389/fpsyt.2022.815678] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/15/2021] [Accepted: 03/30/2022] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND At present, there is no established biomarker for the diagnosis of depression. Meanwhile, studies show that acoustic features convey emotional information. Therefore, this study explored differences in acoustic characteristics between depressed patients and healthy individuals to investigate whether these characteristics can identify depression. METHODS Participants included 71 patients diagnosed with depression from a regional hospital in Beijing, China, and 62 normal controls from within the greater community. We assessed the clinical symptoms of depression of all participants using the Hamilton Depression Scale (HAMD), Hamilton Anxiety Scale (HAMA), and Patient Health Questionnaire (PHQ-9), and recorded the voice of each participant as they read positive, neutral, and negative texts. OpenSMILE was used to analyze their voice acoustics and extract acoustic characteristics from the recordings. RESULTS There were significant differences between the depression and control groups in all acoustic characteristics (p < 0.05). Several mel-frequency cepstral coefficients (MFCCs), including MFCC2, MFCC3, MFCC8, and MFCC9, differed significantly between different emotion tasks; MFCC4 and MFCC7 correlated positively with PHQ-9 scores, and correlations were stable in all emotion tasks. The zero-crossing rate in positive emotion correlated positively with HAMA total score and HAMA somatic anxiety score (r = 0.31, r = 0.34, respectively), and MFCC9 of neutral emotion correlated negatively with HAMD anxiety/somatization scores (r = -0.34). Linear regression showed that the MFCC7-negative was predictive on the PHQ-9 score (β = 0.90, p = 0.01) and MFCC9-neutral was predictive on HAMD anxiety/somatization score (β = -0.45, p = 0.049). Logistic regression showed a superior discriminant effect, with a discrimination accuracy of 89.66%. CONCLUSION The acoustic expression of emotion among patients with depression differs from that of normal controls. 
Some acoustic characteristics are related to the severity of depressive symptoms and may be objective biomarkers of depression. A systematic method of assessing vocal acoustic characteristics could provide an accurate and discreet means of screening for depression; this method may be used instead of, or in conjunction with, traditional screening methods, as it is not subject to the limitations associated with self-reported assessments wherein subjects may be inclined to provide socially acceptable responses rather than being truthful.
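One of the acoustic characteristics named in this abstract, the zero-crossing rate, can be illustrated with a minimal sketch. The function below is an assumption for illustration only: it uses the plain textbook definition (the fraction of adjacent sample pairs whose signs differ) on a synthetic tone, and it is not the study's OpenSMILE configuration, whose framing and normalization may differ.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Hypothetical helper for illustration; samples equal to zero are
    treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Synthetic input (not study data): one second of a 100 Hz sine tone
# sampled at 8 kHz. A pure tone crosses zero about twice per period.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]
zcr_tone = zero_crossing_rate(tone)
print(f"zero-crossing rate of the tone: {zcr_tone:.4f}")
```

Noisier or more fricative-heavy signals change sign more often than voiced, low-pitched ones, which is why the rate is commonly used as a coarse descriptor of a speech frame.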
Affiliation(s)
- Qing Zhao
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Hong-Zhen Fan
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Yan-Li Li
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Lei Liu
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Ya-Xue Wu
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Yan-Li Zhao
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Zhan-Xiao Tian
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Zhi-Ren Wang
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Yun-Long Tan
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
- Shu-Ping Tan
  - Peking University HuiLongGuan Clinical Medical School, Beijing Huilongguan Hospital, Beijing, China
|
16
|
Yamada Y, Shinkawa K, Nemoto M, Arai T. Automatic Assessment of Loneliness in Older Adults Using Speech Analysis on Responses to Daily Life Questions. Front Psychiatry 2021; 12:712251. [PMID: 34966297 PMCID: PMC8710612 DOI: 10.3389/fpsyt.2021.712251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2021] [Accepted: 11/19/2021] [Indexed: 11/13/2022] Open
Abstract
Loneliness is a perceived state of social and emotional isolation that has been associated with a wide range of adverse health effects in older adults. Automatically assessing loneliness by passively monitoring daily behaviors could potentially contribute to early detection and intervention for mitigating loneliness. Speech data has been successfully used for inferring changes in emotional states and mental health conditions, but its association with loneliness in older adults remains unexplored. In this study, we developed a tablet-based application and collected speech responses of 57 older adults to daily life questions regarding, for example, one's feelings and future travel plans. From audio data of these speech responses, we automatically extracted speech features characterizing acoustic, prosodic, and linguistic aspects, and investigated their associations with self-rated scores of the UCLA Loneliness Scale. Consequently, we found that with increasing loneliness scores, speech responses tended to have fewer inflections, longer pauses, reduced second formant frequencies, reduced variances of the speech spectrum, more filler words, and fewer positive words. The cross-validation results showed that regression and binary-classification models using speech features could estimate loneliness scores with an R² of 0.57 and detect individuals with high loneliness scores with 95.6% accuracy, respectively. Our study provides the first empirical results suggesting the possibility of using speech data that can be collected in everyday life for the automatic assessments of loneliness in older adults, which could help develop monitoring technologies for early detection and intervention for mitigating loneliness.
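For readers unfamiliar with the regression metric quoted in this abstract, the coefficient of determination can be sketched as follows. This is a generic textbook definition under stated assumptions, not the authors' evaluation code; the function name and the loneliness scores below are hypothetical, and the abstract does not say which variant of the statistic was computed under cross-validation.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Hypothetical helper for illustration; assumes y_true is not constant.
    """
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up scores (not study data): self-rated values and the values a
# speech-based regression model might predict for held-out speakers.
observed = [30.0, 42.0, 35.0, 50.0, 44.0]
predicted = [33.0, 40.0, 37.0, 46.0, 45.0]
print(f"R^2 = {r_squared(observed, predicted):.3f}")
```

A value of 1 means the predictions reproduce the observed scores exactly, while 0 means the model does no better than always predicting the mean score.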
Affiliation(s)
- Miyuki Nemoto
  - Dementia Medical Center, University of Tsukuba Hospital, Tsukuba, Japan
- Tetsuaki Arai
  - Division of Clinical Medicine, Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
|
17
|
Abstract
Late-life depression (LLD) is a major public health concern. Despite the availability of effective treatments for depression, barriers to screening and diagnosis still exist. The use of current standardized depression assessments can lead to underdiagnosis or misdiagnosis due to subjective symptom reporting and the distinct cognitive, psychomotor, and somatic features of LLD. To overcome these limitations, there has been a growing interest in the development of objective measures of depression using artificial intelligence (AI) technologies such as natural language processing (NLP). NLP approaches focus on the analysis of acoustic and linguistic aspects of human language derived from text and speech and can be integrated with machine learning approaches to classify depression and its severity. In this review, we will provide rationale for the use of NLP methods to study depression using speech, summarize previous research using NLP in LLD, compare findings to younger adults with depression and older adults with other clinical conditions, and discuss future directions including the use of complementary AI strategies to fully capture the spectrum of LLD.
Affiliation(s)
- Anthony Yeung
  - Department of Psychiatry, University of Toronto, Toronto, ON, Canada
|
18
|
Little B, Alshabrawy O, Stow D, Ferrier IN, McNaney R, Jackson DG, Ladha K, Ladha C, Ploetz T, Bacardit J, Olivier P, Gallagher P, O'Brien JT. Deep learning-based automated speech detection as a marker of social functioning in late-life depression. Psychol Med 2021; 51:1441-1450. [PMID: 31944174 PMCID: PMC8311821 DOI: 10.1017/s0033291719003994] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/06/2019] [Revised: 10/23/2019] [Accepted: 12/13/2019] [Indexed: 11/24/2022]
Abstract
BACKGROUND Late-life depression (LLD) is associated with poor social functioning. However, previous research uses bias-prone self-report scales to measure social functioning and a more objective measure is lacking. We tested a novel wearable device to measure speech that participants encounter as an indicator of social interaction. METHODS Twenty nine participants with LLD and 29 age-matched controls wore a wrist-worn device continuously for seven days, which recorded their acoustic environment. Acoustic data were automatically analysed using deep learning models that had been developed and validated on an independent speech dataset. Total speech activity and the proportion of speech produced by the device wearer were both detected whilst maintaining participants' privacy. Participants underwent a neuropsychological test battery and clinical and self-report scales to measure severity of depression, general and social functioning. RESULTS Compared to controls, participants with LLD showed poorer self-reported social and general functioning. Total speech activity was much lower for participants with LLD than controls, with no overlap between groups. The proportion of speech produced by the participants was smaller for LLD than controls. In LLD, both speech measures correlated with attention and psychomotor speed performance but not with depression severity or self-reported social functioning. CONCLUSIONS Using this device, LLD was associated with lower levels of speech than controls and speech activity was related to psychomotor retardation. We have demonstrated that speech activity measured by wearable technology differentiated LLD from controls with high precision and, in this study, provided an objective measure of an aspect of real-world social functioning in LLD.
Affiliation(s)
- Bethany Little
  - Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
- Ossama Alshabrawy
  - Interdisciplinary Computing and Complex BioSystems (ICOS) group, School of Computing, Newcastle University, Newcastle upon Tyne, UK
  - Faculty of Science, Damietta University, New Damietta, Egypt
- Daniel Stow
  - Institute of Health and Society, Newcastle University, Newcastle upon Tyne, UK
- I. Nicol Ferrier
  - Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
- Daniel G. Jackson
  - Open Lab, School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Karim Ladha
  - Open Lab, School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Thomas Ploetz
  - School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA, USA
- Jaume Bacardit
  - Interdisciplinary Computing and Complex BioSystems (ICOS) group, School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Patrick Olivier
  - Faculty of Information Technology, Monash University, Melbourne, Australia
- Peter Gallagher
  - Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
- John T. O'Brien
  - Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
  - Department of Psychiatry, University of Cambridge, Cambridge, UK
|
19
|
Di Y, Wang J, Li W, Zhu T. Using i-vectors from voice features to identify major depressive disorder. J Affect Disord 2021; 288:161-166. [PMID: 33895418 DOI: 10.1016/j.jad.2021.04.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/23/2021] [Revised: 03/27/2021] [Accepted: 04/02/2021] [Indexed: 11/28/2022]
Abstract
BACKGROUND Machine-learning methods using acoustic features in the diagnosis of major depressive disorder (MDD) have insufficient evidence from large-scale samples and clinical trials. This study aimed to evaluate the effectiveness of the promising i-vector method on a large sample of women with recurrent MDD diagnosed clinically, examine its robustness, and provide an explicit acoustic explanation of the i-vectors. METHODS We collected utterances edited from clinical interview speech records of 785 depressed and 1,023 healthy individuals. Then, we extracted Mel-frequency cepstral coefficient (MFCC) features and MFCC i-vectors from their utterances. To examine the effectiveness of i-vectors, we compared the performance of binary logistic regression between MFCC i-vectors and MFCC features and tested its robustness on different utterance durations. We also determined the correlation between MFCC features and MFCC i-vectors to analyze the acoustic meaning of i-vectors. RESULTS The i-vectors improved 7% and 14% of area under the curve (AUC) for MFCC features using different utterances. When the duration is > 40 s, the classification results are stabilized. The i-vectors are consistently correlated to the maximum, minimum, and deviations of MFCC features (either positively or negatively). LIMITATIONS This study included only women. CONCLUSIONS The i-vectors can improve 14% of the AUC on a large-scale clinical sample. This system is robust to utterance duration > 40 s. This study provides a foundation for exploring the clinical application of voice features in the diagnosis of MDD.
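The comparison in this abstract is expressed in area under the ROC curve (AUC). As a minimal sketch of what that number measures, AUC can be computed as the probability that a randomly chosen depressed speaker receives a higher classifier score than a randomly chosen healthy speaker, with ties counted as one half. The function name and all scores below are made up for illustration and are not taken from the study.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Hypothetical helper for illustration; ties contribute 0.5. Uses a
    simple O(n * m) pairwise comparison, which is fine for small samples.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up classifier outputs for two feature sets on the same speakers.
baseline_pos, baseline_neg = [0.9, 0.8, 0.4], [0.7, 0.3, 0.2]
improved_pos, improved_neg = [0.9, 0.8, 0.75], [0.7, 0.3, 0.2]

auc_baseline = auc_from_scores(baseline_pos, baseline_neg)
auc_improved = auc_from_scores(improved_pos, improved_neg)
print(f"baseline AUC = {auc_baseline:.3f}, improved AUC = {auc_improved:.3f}")
```

Because AUC depends only on how the two groups are ranked relative to each other, it allows two feature sets to be compared without first fixing a decision threshold.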
Affiliation(s)
- Yazheng Di
  - CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
- Jingying Wang
  - School of Optometry, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Weidong Li
  - Shanghai Jiao Tong University, Shanghai 200240, China
- Tingshao Zhu
  - CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
|
20
|
Albuquerque L, Valente ARS, Teixeira A, Figueiredo D, Sa-Couto P, Oliveira C. Association between acoustic speech features and non-severe levels of anxiety and depression symptoms across lifespan. PLoS One 2021; 16:e0248842. [PMID: 33831018 PMCID: PMC8031302 DOI: 10.1371/journal.pone.0248842] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/09/2020] [Accepted: 03/07/2021] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND Several studies have investigated the acoustic effects of diagnosed anxiety and depression. Anxiety and depression are not characteristics of the typical aging process, but minimal or mild symptoms can appear and evolve with age. However, the knowledge about the association between speech and anxiety or depression is scarce for minimal/mild symptoms, typical of healthy aging. As longevity and aging are still a new phenomenon worldwide, posing also several clinical challenges, it is important to improve our understanding of non-severe mood symptoms' impact on acoustic features across lifetime. The purpose of this study was to determine if variations in acoustic measures of voice are associated with non-severe anxiety or depression symptoms in adult population across lifetime. METHODS Two different speech tasks (reading vowels in disyllabic words and describing a picture) were produced by 112 individuals aged 35-97. To assess anxiety and depression symptoms, the Hospital Anxiety Depression Scale (HADS) was used. The association between the segmental and suprasegmental acoustic parameters and HADS scores were analyzed using the linear multiple regression technique. RESULTS The number of participants with presence of anxiety or depression symptoms is low (>7: 26.8% and 10.7%, respectively) and non-severe (HADS-A: 5.4 ± 2.9 and HADS-D: 4.2 ± 2.7, respectively). Adults with higher anxiety symptoms did not present significant relationships associated with the acoustic parameters studied. Adults with increased depressive symptoms presented higher vowel duration, longer total pause duration and short total speech duration. Finally, age presented a positive and significant effect only for depressive symptoms, showing that older participants tend to have more depressive symptoms. CONCLUSIONS Non-severe depression symptoms can be related to some acoustic parameters and age. 
Depression symptoms can be explained by acoustic parameters even among individuals without severe symptom levels.
Affiliation(s)
- Luciana Albuquerque
  - Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
  - Center of Health Technology and Services Research, University of Aveiro, Aveiro, Portugal
  - Department of Electronics Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
  - Department of Education and Psychology, University of Aveiro, Aveiro, Portugal
- Ana Rita S. Valente
  - Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
  - Department of Electronics Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- António Teixeira
  - Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
  - Department of Electronics Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- Daniela Figueiredo
  - Center of Health Technology and Services Research, University of Aveiro, Aveiro, Portugal
  - School of Health Science, University of Aveiro, Aveiro, Portugal
- Pedro Sa-Couto
  - Center for Research and Development in Mathematics and Applications, University of Aveiro, Aveiro, Portugal
  - Department of Mathematics, University of Aveiro, Aveiro, Portugal
- Catarina Oliveira
  - Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
  - School of Health Science, University of Aveiro, Aveiro, Portugal
|
21
|
Suleman R, Tucker BV, Dursun SM, Demas ML. The Neurostimulation of the Brain in Depression Trial: Protocol for a Randomized Controlled Trial of Transcranial Direct Current Stimulation in Treatment-Resistant Depression. JMIR Res Protoc 2021; 10:e22805. [PMID: 33729165 PMCID: PMC8088846 DOI: 10.2196/22805] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/28/2020] [Revised: 01/20/2021] [Accepted: 02/18/2021] [Indexed: 01/28/2023] Open
Abstract
Background Major depressive disorder (MDD) is the second highest cause of disability worldwide. Standard treatments for MDD include medicine and talk therapy; however, approximately 1 in 5 Canadians fail to respond to these approaches and must consider alternatives. Transcranial direct current stimulation (tDCS) is a safe, noninvasive method that uses electrical stimulation to change the activation pattern of different brain regions. By targeting those regions known to be affected in MDD, tDCS may be useful in ameliorating treatment-resistant depression. Objective The objective of the Neurostimulation of the Brain in Depression trial is to compare the effectiveness of active versus sham tDCS in treating patients with ultraresistant MDD. The primary outcome will be the improvement in depressive symptoms, as measured by the change on the Montgomery-Asberg Depression Rating Scale. Secondary outcomes will include changes in the Quick Inventory of Depressive Symptomatology Scale (subjective assessment), the World Health Organization Disability Assessment Schedule 2.0 (functional assessment), and the Screen for Cognitive Impairment in Psychiatry (cognitive assessment). Adverse events will be captured using the Young Mania Rating Scale; tDCS Adverse Events Questionnaire; Frequency, Intensity, and Burden of Side Effects Rating Scale; and Patient-Rated Inventory of Side Effects Scale. A parallel component of the study will involve assaying for baseline language function and the effect of treatment on language using an exploratory acoustic and semantic corpus analysis on recorded interviews. Participant accuracy and response latency on an auditory lexical decision task will also be evaluated. Methods We will recruit inpatients and outpatients in the city of Edmonton, Alberta, and will deliver the study interventions at the Grey Nuns and University of Alberta Hospitals. Written informed consent will be obtained from all participants before enrollment.
Eligible participants will be randomly assigned, in a double-blinded fashion, to receive active or sham tDCS, and they will continue receiving their usual pharmacotherapy and psychotherapy throughout the trial. In both groups, participants will receive 30 weekday stimulation sessions, each session being 30 minutes in length, with the anode over the left dorsolateral prefrontal cortex and the cathode over the right. Participants in the active group will be stimulated at 2 mA throughout, whereas the sham group will receive only a brief period of stimulation to mimic skin sensations felt in the active group. Measurements will be conducted at regular points throughout the trial and 30 days after trial completion. Results The trial has been approved by the University of Alberta Research Ethics Board and is scheduled to commence in June 2021. The target sample size is 60 participants. Conclusions This is a protocol for a multicenter, double-blinded, randomized controlled superiority trial comparing active versus sham tDCS in patients with treatment-resistant MDD. Trial Registration ClinicalTrials.gov NCT04159012; http://clinicaltrials.gov/ct2/show/NCT04159012. International Registered Report Identifier (IRRID) PRR1-10.2196/22805
Affiliation(s)
- Raheem Suleman
  - Department of Psychiatry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada
- Benjamin V Tucker
  - Department of Linguistics, Faculty of Arts, University of Alberta, Edmonton, AB, Canada
- Serdar M Dursun
  - Department of Psychiatry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada; Grey Nuns Community Hospital, Edmonton, AB, Canada
- Michael L Demas
  - Department of Psychiatry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada; Grey Nuns Community Hospital, Edmonton, AB, Canada
|
22
|
Tucker BV, Ford C, Hedges S. Speech aging: Production and perception. Wiley Interdiscip Rev Cogn Sci 2021; 12:e1557. [PMID: 33651922 DOI: 10.1002/wcs.1557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2020] [Revised: 12/18/2020] [Accepted: 02/05/2021] [Indexed: 11/06/2022]
Abstract
In this overview we describe literature on how speech production and speech perception change in healthy or normal aging across the adult lifespan. In the production section, we review acoustic characteristics that have been investigated as potentially distinguishing younger and older adults. In the speech perception section, studies concerning speaker age estimation and those investigating older listeners' perception are addressed. Our discussion focuses on major themes and other fruitful areas for future research. This article is categorized under: Linguistics > Language in Mind and Brain; Linguistics > Linguistic Theory; Psychology > Development and Aging.
Affiliation(s)
- Benjamin V Tucker
  - Department of Linguistics, University of Alberta, Edmonton, Alberta, Canada
- Catherine Ford
  - Department of Linguistics, University of Alberta, Edmonton, Alberta, Canada
- Stephanie Hedges
  - Department of Linguistics, University of Alberta, Edmonton, Alberta, Canada
|
23
|
Gaur S, Satapathy S, Kaushik R, Sikary AK, Behera C. A multifaceted expression study of audio-visual suicide notes. Asian J Psychiatr 2020; 54:102297. [PMID: 32674067 DOI: 10.1016/j.ajp.2020.102297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2020] [Revised: 07/04/2020] [Accepted: 07/06/2020] [Indexed: 11/29/2022]
Affiliation(s)
- Sanya Gaur
  - Department of Forensic Medicine, All India Institute of Medical Sciences, New Delhi, India.
- Sujata Satapathy
  - Department of Psychiatry, All India Institute of Medical Sciences, New Delhi, 110029, India.
- Ruchika Kaushik
  - Department of Forensic Medicine, All India Institute of Medical Sciences, New Delhi, 110029, India.
- Asit Kumar Sikary
  - Department of Forensic Medicine, ESI Medical College, Faridabad, Haryana, India.
- Chittaranjan Behera
  - Department of Forensic Medicine, All India Institute of Medical Sciences, New Delhi, 110029, India.
|
24
|
Yamamoto M, Takamiya A, Sawada K, Yoshimura M, Kitazawa M, Liang KC, Fujita T, Mimura M, Kishimoto T. Using speech recognition technology to investigate the association between timing-related speech features and depression severity. PLoS One 2020; 15:e0238726. [PMID: 32915846 PMCID: PMC7485753 DOI: 10.1371/journal.pone.0238726] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/15/2020] [Accepted: 08/21/2020] [Indexed: 11/18/2022]
Abstract
BACKGROUND There are no reliable and validated objective biomarkers for the assessment of depression severity. We aimed to investigate the association between depression severity and timing-related speech features using speech recognition technology. METHOD Patients with major depressive disorder (MDD), those with bipolar disorder (BP), and healthy controls (HC) were asked to engage in a non-structured interview with research psychologists. Using automated speech recognition technology, we measured three timing-related speech features: speech rate, pause time, and response time. The severity of depression was assessed using the Hamilton Depression Rating Scale 17-item version (HAMD-17). We conducted the current study to answer the following questions: 1) Are there differences in speech features among MDD, BP, and HC? 2) Do speech features correlate with depression severity? 3) Do changes in speech features correlate with within-subject changes in depression severity? RESULTS We collected 1058 data sets from 241 individuals for the study (97 MDD, 68 BP, and 76 HC). There were significant differences in speech features among groups; depressed patients showed slower speech rate, longer pause time, and longer response time than HC. All timing-related speech features showed significant associations with HAMD-17 total scores. Longitudinal changes in speech rate correlated with changes in HAMD-17 total scores. CONCLUSIONS Depressed individuals showed longer response time, longer pause time, and slower speech rate than healthy individuals, all of which were suggestive of psychomotor retardation. Our study suggests that speech features could be used as objective biomarkers for the assessment of depression severity.
Affiliation(s)
- Mao Yamamoto
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Akihiro Takamiya
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Kyosuke Sawada
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Michitaka Yoshimura
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Momoko Kitazawa
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Kuo-ching Liang
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Takanori Fujita
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Masaru Mimura
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Taishiro Kishimoto
  - Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
|
25
|
Zhang L, Duvvuri R, Chandra KKL, Nguyen T, Ghomi RH. Automated voice biomarkers for depression symptoms using an online cross-sectional data collection initiative. Depress Anxiety 2020; 37:657-669. [PMID: 32383335 DOI: 10.1002/da.23020] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 07/15/2019] [Revised: 03/05/2020] [Accepted: 03/27/2020] [Indexed: 01/02/2023]
Abstract
IMPORTANCE Depression is an illness affecting a large percentage of the world's population throughout the lifetime. To date, there is no available biomarker for depression detection, and tracking of symptoms relies on patient self-report. OBJECTIVE To explore and validate features extracted from recorded voice samples of depressed subjects as digital biomarkers for suicidality, psychomotor disturbance, and depression severity. DESIGN We conducted a cross-sectional study over the course of 12 months using a frequently visited web form version of the PHQ9 hosted by Mental Health America (MHA) to ask subjects for anonymous voice samples via a separate web form hosted by NeuroLex Laboratories. Subjects were asked to provide demographics, answers to the PHQ9, and two voice samples. SETTING Online only. PARTICIPANTS Users of the MHA website. MAIN OUTCOMES AND MEASURES Performance of statistical models using extracted voice features to predict psychomotor disturbance, suicidality, and depression severity as indicated by the PHQ9. RESULTS Voice features extracted from recorded audio of depressed subjects were able to predict PHQ9 question 9 and total scores with an area under the curve of 0.821 and a mean absolute error of 4.7, respectively. Prediction of psychomotor disturbance was less powerful, with an area under the curve of 0.61. CONCLUSION AND RELEVANCE Automated voice analysis using short recordings of patient speech may be used to augment depression screening and symptom management.
Affiliation(s)
- Larry Zhang
  - Neurology, University of Washington, Seattle, Washington
- Radhika Duvvuri
  - Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland
- Reza H Ghomi
  - Neurology, University of Washington, Seattle, Washington; NeuroLex Laboratories, Newnan, Georgia
|
26
|
Antosik-Wójcińska AZ, Dominiak M, Chojnacka M, Kaczmarek-Majer K, Opara KR, Radziszewska W, Olwert A, Święcicki Ł. Smartphone as a monitoring tool for bipolar disorder: a systematic review including data analysis, machine learning algorithms and predictive modelling. Int J Med Inform 2020; 138:104131. [DOI: 10.1016/j.ijmedinf.2020.104131] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/10/2019] [Revised: 03/15/2020] [Accepted: 03/22/2020] [Indexed: 01/06/2023]
|
27
|
Abstract
BACKGROUND Abnormalities in vocal expression during a depressed episode have frequently been reported in people with depression, but less is known about whether these abnormalities only exist in special situations. In addition, the impacts of irrelevant demographic variables on voice were uncontrolled in previous studies. Therefore, this study compares the vocal differences between depressed and healthy people under various situations, with irrelevant variables treated as covariates. METHODS To examine whether the vocal abnormalities in people with depression only exist in special situations, this study compared the vocal differences between healthy people and patients with unipolar depression in 12 situations (speech scenarios). Positive, negative and neutral voice expressions between depressed and healthy people were compared in four tasks. Multivariate analysis of covariance (MANCOVA) was used for evaluating the main effects of the variable group (depressed vs. healthy) on acoustic features. The significance of acoustic features was evaluated by both statistical significance and magnitude of effect size. RESULTS The results of multivariate analysis of covariance showed that significant differences between the two groups were observed in all 12 speech scenarios. Although significant acoustic features were not the same in different scenarios, we found that three acoustic features (loudness, MFCC5 and MFCC7) were consistently different between people with and without depression, with large effect magnitude. CONCLUSIONS Vocal differences between depressed and healthy people exist in 12 scenarios. Acoustic features including loudness, MFCC5 and MFCC7 have the potential to be indicators for identifying depression via voice analysis. These findings support that depressed people's voices include both situation-specific and cross-situational patterns of acoustic features.
Affiliation(s)
- Jingying Wang
  - Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Lei Zhang
  - Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Tianli Liu
  - Institute of Population Research, Peking University, Beijing, China
- Wei Pan
  - Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Bin Hu
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
- Tingshao Zhu
  - Institute of Psychology, Chinese Academy of Sciences, Beijing, China
|
28
|
Marmar CR, Brown AD, Qian M, Laska E, Siegel C, Li M, Abu-Amara D, Tsiartas A, Richey C, Smith J, Knoth B, Vergyri D. Speech-based markers for posttraumatic stress disorder in US veterans. Depress Anxiety 2019; 36:607-616. [PMID: 31006959 PMCID: PMC6602854 DOI: 10.1002/da.22890] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 12/04/2018] [Revised: 02/14/2019] [Accepted: 03/08/2019] [Indexed: 01/01/2023]
Abstract
BACKGROUND The diagnosis of posttraumatic stress disorder (PTSD) is usually based on clinical interviews or self-report measures. Both approaches are subject to under- and over-reporting of symptoms. An objective test is lacking. We have developed a classifier of PTSD based on objective speech-marker features that discriminate PTSD cases from controls. METHODS Speech samples were obtained from warzone-exposed veterans, 52 cases with PTSD and 77 controls, assessed with the Clinician-Administered PTSD Scale. Individuals with major depressive disorder (MDD) were excluded. Audio recordings of clinical interviews were used to obtain 40,526 speech features which were input to a random forest (RF) algorithm. RESULTS The selected RF used 18 speech features and the receiver operating characteristic curve had an area under the curve (AUC) of 0.954. At a probability of PTSD cut point of 0.423, Youden's index was 0.787, and overall correct classification rate was 89.1%. The probability of PTSD was higher for markers that indicated slower, more monotonous speech, less change in tonality, and less activation. Depression symptoms, alcohol use disorder, and TBI did not meet statistical tests to be considered confounders. CONCLUSIONS This study demonstrates that a speech-based algorithm can objectively differentiate PTSD cases from controls. The RF classifier had a high AUC. Further validation in an independent sample and appraisal of the classifier to identify those with MDD only compared with those with PTSD comorbid with MDD is required.
Affiliation(s)
- Charles R. Marmar
  - Department of Psychiatry, New York University School of Medicine, New York, New York; Steven and Alexandra Cohen Veterans Center for the Study of Post-Traumatic Stress and Traumatic Brain Injury, New York, New York. Corresponding author: Charles R. Marmar, MD, Department of Psychiatry, New York University School of Medicine, 1 Park Avenue, New York, NY 10016
- Adam D. Brown
  - Department of Psychiatry, New York University School of Medicine, New York, New York; Steven and Alexandra Cohen Veterans Center for the Study of Post-Traumatic Stress and Traumatic Brain Injury, New York, New York; Department of Psychology, New School for Social Research, New York, New York
- Meng Qian
  - Department of Psychiatry, New York University School of Medicine, New York, New York; Steven and Alexandra Cohen Veterans Center for the Study of Post-Traumatic Stress and Traumatic Brain Injury, New York, New York
- Eugene Laska
  - Department of Psychiatry, New York University School of Medicine, New York, New York; Steven and Alexandra Cohen Veterans Center for the Study of Post-Traumatic Stress and Traumatic Brain Injury, New York, New York
- Carole Siegel
  - Department of Psychiatry, New York University School of Medicine, New York, New York; Steven and Alexandra Cohen Veterans Center for the Study of Post-Traumatic Stress and Traumatic Brain Injury, New York, New York
- Meng Li
  - Department of Psychiatry, New York University School of Medicine, New York, New York; Steven and Alexandra Cohen Veterans Center for the Study of Post-Traumatic Stress and Traumatic Brain Injury, New York, New York
- Duna Abu-Amara
  - Department of Psychiatry, New York University School of Medicine, New York, New York; Steven and Alexandra Cohen Veterans Center for the Study of Post-Traumatic Stress and Traumatic Brain Injury, New York, New York
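The Marmar et al. abstract reports Youden's index and an overall correct classification rate at a single probability cut point. As a minimal sketch of how those two quantities relate to a confusion matrix, the Python below computes sensitivity, specificity, Youden's J (sensitivity + specificity - 1) and accuracy. The function name `youden_summary` and the four counts are hypothetical: the abstract does not report a confusion matrix, and the counts were chosen only because they sum to 52 cases and 77 controls and happen to be consistent with the reported summary figures.

```python
def youden_summary(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Summarise one cut point of a binary classifier.

    tp/fn count cases (true positives, false negatives);
    tn/fp count controls (true negatives, false positives).
    """
    sensitivity = tp / (tp + fn)   # share of cases correctly flagged
    specificity = tn / (tn + fp)   # share of controls correctly cleared
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_j": sensitivity + specificity - 1.0,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts (NOT taken from the paper): 52 cases, 77 controls.
summary = youden_summary(tp=47, fn=5, tn=68, fp=9)
print(round(summary["youden_j"], 3), round(summary["accuracy"], 3))
```

With these illustrative counts the sketch yields J of about 0.787 and accuracy of about 0.891; other confusion matrices could produce similar values, so this should not be read as the study's actual table.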
|
29
|
Verfaillie SCJ, Witteman J, Slot RER, Pruis IJ, Vermaat LEW, Prins ND, Schiller NO, van de Wiel M, Scheltens P, van Berckel BNM, van der Flier WM, Sikkes SAM. High amyloid burden is associated with fewer specific words during spontaneous speech in individuals with subjective cognitive decline. Neuropsychologia 2019; 131:184-92. [PMID: 31075283 DOI: 10.1016/j.neuropsychologia.2019.05.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 08/30/2018] [Revised: 05/03/2019] [Accepted: 05/06/2019] [Indexed: 12/29/2022]
Abstract
Self-perceived word-finding difficulties are common in aging individuals as well as in Alzheimer's Disease (AD). Language and speech deficits are difficult to objectify with neuropsychological assessments. We therefore aimed to investigate whether amyloid, an early AD pathological hallmark, is associated with speech-derived semantic complexity. We included 63 individuals with subjective cognitive decline (age 64 ± 8, MMSE 29 ± 1), with amyloid status (positron emission tomography [PET] scans n = 59, or Aβ1-42 cerebrospinal fluid [CSF] n = 4). Spontaneous speech was recorded using three open-ended tasks (description of cookie theft picture, abstract painting and a regular Sunday) and transcribed verbatim; subsequently, linguistic parameters were extracted using T-scan computational software, including specific words (content words, frequent, concrete and abstract nouns, and fillers), lexical complexity (lemma frequency, Type-Token-Ratio) and syntactic complexity (Developmental Level scale). Nineteen individuals (30%) had high levels of amyloid burden, and there were no differences between groups on conventional neuropsychological tests. Using multinomial regression with linguistic parameters (in tertiles), we found that high amyloid burden is associated with fewer concrete nouns (ORmiddle (95%CI): 7.6 (1.4-41.2), ORlowest: 6.7 (1.2-37.1)) and content words (ORlowest: 6.3 (1.0-38.1)). In addition, we found an interaction for education between high amyloid burden and more abstract nouns. In conclusion, high amyloid burden was modestly associated with fewer specific words, but not with syntactic complexity, lexical complexity or conventional neuropsychological tests, suggesting that subtle spontaneous speech deficits might occur in preclinical AD.
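One of the lexical complexity measures named in the Verfaillie et al. abstract is the Type-Token Ratio: the number of distinct word forms (types) divided by the total number of words (tokens) in a transcript. The sketch below is a simplified Python illustration only. The study used T-scan software on its own transcripts; the regex tokenizer, the lowercasing step and the function name `type_token_ratio` are assumptions made here for illustration.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct word forms divided by total words (0.0 for empty input)."""
    # Assumed tokenizer: runs of Unicode letters, optionally with one
    # internal apostrophe (e.g. "don't"); case is ignored.
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Illustrative sentence, not study data: 9 tokens, 7 distinct types.
ttr = type_token_ratio("The boy takes the cookie and the girl laughs")
print(f"{ttr:.3f}")
```

Because raw TTR falls as transcripts get longer, comparisons are usually made on samples of similar length or with length-corrected variants; the abstract does not say which variant was used.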
|
30
|
Rana R, Latif S, Gururajan R, Gray A, Mackenzie G, Humphris G, Dunn J. Automated screening for distress: A perspective for the future. Eur J Cancer Care (Engl) 2019; 28:e13033. [DOI: 10.1111/ecc.13033] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/09/2019] [Revised: 02/05/2019] [Accepted: 02/18/2019] [Indexed: 01/13/2023]
Affiliation(s)
- Rajib Rana
  - University of Southern Queensland, Springfield, Queensland, Australia
- Siddique Latif
  - University of Southern Queensland, Springfield, Queensland, Australia
- Raj Gururajan
  - University of Southern Queensland, Springfield, Queensland, Australia
- Anthony Gray
  - University of Southern Queensland, Springfield, Queensland, Australia
- Jeff Dunn
  - University of Southern Queensland, Springfield, Queensland, Australia
  - Griffith University, Brisbane, Queensland, Australia
  - University of Technology Sydney, Sydney, New South Wales, Australia
|
31
|
Samareh A, Jin Y, Wang Z, Chang X, Huang S. Detect depression from communication: how computer vision, signal processing, and sentiment analysis join forces. ACTA ACUST UNITED AC 2018. [DOI: 10.1080/24725579.2018.1496494] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
Affiliation(s)
- Aven Samareh
  - Industrial & Systems Engineering Department, University of Washington, Seattle, Washington, USA
- Yan Jin
  - Research Engineer, JD.com, Inc., San Francisco, California, USA
- Zhangyang Wang
  - Department of Computer Science and Engineering, Texas A&M University, College Station, Texas, USA
- Xiangyu Chang
  - School of Management, Xi’an Jiaotong University, Shaanxi, P.R. China
- Shuai Huang
  - Industrial & Systems Engineering Department, University of Washington, Seattle, Washington, USA
|
32
|
Fiquer JT, Moreno RA, Brunoni AR, Barros VB, Fernandes F, Gorenstein C. What is the nonverbal communication of depression? Assessing expressive differences between depressive patients and healthy volunteers during clinical interviews. J Affect Disord 2018; 238:636-644. [PMID: 29957481 DOI: 10.1016/j.jad.2018.05.071] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/16/2017] [Revised: 05/05/2018] [Accepted: 05/28/2018] [Indexed: 11/24/2022]
Abstract
BACKGROUND It is unclear if individuals with Major Depressive Disorder (MDD) present different nonverbal behavior (NVB) compared with healthy individuals, and also if depression treatments affect NVB. In this study, we compared the NVB of MDD subjects and healthy controls. We also verified how MDD subjects' NVB is affected by depression severity and acute treatments. METHODS We evaluated 100 MDD outpatients and 83 controls. We used a 21-category ethogram to assess the frequency of positive and negative NVB at baseline. MDD subjects were also assessed after eight weeks of treatment (pharmacotherapy or neuromodulation). We used the Mann-Whitney test to compare the NVB of MDD subjects and controls; beta regression models to verify associations between MDD severity and NVB; the Wilcoxon signed-rank test to verify changes in NVB after treatment; and logistic regression models to verify NVB associated with treatment response according to the Hamilton depression rating scale. RESULTS Compared with controls, MDD subjects presented higher levels of six negative NVB (shrug, head and lips down, adaptive hand gestures, frown and cry) and lower levels of two positive NVB (eye contact and smile). MDD subjects' NVB was not associated with depression severity, and did not significantly change after depression treatment. Treatment responders showed more interpersonal proximity at baseline than non-responders. LIMITATIONS Our ethogram had no measure of behavior duration, and we had a short follow-up period. CONCLUSIONS MDD subjects have more negative and less positive social NVB than controls. Their nonverbal behavior remained stable after clinical response to acute depression treatments.
Affiliation(s)
- Juliana Teixeira Fiquer
  - Laboratory of Medical Investigation (LIM 23), Department and Institute of Psychiatry, University of São Paulo Medical School, São Paulo, Brazil.
- Ricardo Alberto Moreno
  - Mood Disorders Unit (GRUDA), Department and Institute of Psychiatry, University of São Paulo Medical School, São Paulo, Brazil
- Andre R Brunoni
  - Service of Interdisciplinary Neuromodulation, Laboratory of Neurosciences (LIM-27) and National Institute of Biomarkers in Psychiatry (INBioN), Department and Institute of Psychiatry, University of São Paulo Medical School, São Paulo, Brazil; Department of Psychiatry and Psychotherapy, Ludwig-Maximilians-University, Munich, Germany
- Vivian Boschesi Barros
  - University of São Paulo School of Public Health, University of São Paulo, São Paulo, Brazil
- Fernando Fernandes
  - Mood Disorders Unit (GRUDA), Department and Institute of Psychiatry, University of São Paulo Medical School, São Paulo, Brazil
- Clarice Gorenstein
  - Laboratory of Medical Investigation (LIM 23), Department and Institute of Psychiatry, University of São Paulo Medical School, São Paulo, Brazil; Department of Pharmacology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, Brazil.
|
33
|
Jiang H, Hu B, Liu Z, Wang G, Zhang L, Li X, Kang H. Detecting Depression Using an Ensemble Logistic Regression Model Based on Multiple Speech Features. Comput Math Methods Med 2018; 2018:6508319. [PMID: 30344616 DOI: 10.1155/2018/6508319] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 05/14/2018] [Accepted: 08/28/2018] [Indexed: 11/18/2022]
Abstract
Early intervention for depression is very important to ease the disease burden, but current diagnostic methods are still limited. This study investigated automatic depressed speech classification in a sample of 170 native Chinese subjects (85 healthy controls and 85 depressed patients). The classification performances of prosodic, spectral, and glottal speech features were analyzed in recognition of depression. We proposed an ensemble logistic regression model for detecting depression (ELRDD) in speech. The logistic regression, which was superior in recognition of depression, was selected as the base classifier. This ensemble model extracted many speech features from different aspects and ensured diversity of the base classifier. ELRDD provided better classification results than the other compared classifiers. A technique for identifying depression based on ELRDD, ELRDD-E, was here suggested and tested. It offered encouraging outcomes, revealing a high accuracy level of 75.00% for females and 81.82% for males, as well as an advantageous sensitivity/specificity ratio of 79.25%/70.59% for females and 78.13%/85.29% for males.
|
34
|
Abstract
Depression is one of the most common psychiatric disorders worldwide, with over 350 million people affected. Current methods to screen for and assess depression depend almost entirely on clinical interviews and self-report scales. While useful, such measures lack objective, systematic, and efficient ways of incorporating behavioral observations that are strong indicators of depression presence and severity. Using dynamics of facial and head movement and vocalization, we trained classifiers to detect three levels of depression severity. Participants were a community sample diagnosed with major depressive disorder. They were recorded in clinical interviews (Hamilton Rating Scale for Depression, HRSD) at seven-week intervals over a period of 21 weeks. At each interview, they were scored by the HRSD as moderately to severely depressed, mildly depressed, or remitted. Logistic regression classifiers using leave-one-participant-out validation were compared for facial movement, head movement, and vocal prosody individually and in combination. Accuracy of depression severity measurement from facial movement dynamics was higher than that for head movement dynamics, and each was substantially higher than that for vocal prosody. Accuracy using all three modalities combined only marginally exceeded that of face and head combined. These findings suggest that automatic detection of depression severity from behavioral indicators in patients is feasible and that multimodal measures afford the most powerful detection.
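The abstract above evaluates its classifiers with leave-one-participant-out validation: because each participant contributes several interviews, all sessions from one person are held out together so that no individual appears in both the training and the test set. The Python below is a minimal, library-free sketch of that splitting scheme only; the function name `leave_one_participant_out` and the toy participant labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_one_participant_out(
    participant_ids: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_id, train_indices, test_indices), one fold per participant."""
    # Group session indices by the participant they belong to.
    by_participant: Dict[Hashable, List[int]] = defaultdict(list)
    for index, pid in enumerate(participant_ids):
        by_participant[pid].append(index)

    # Each fold tests on every session of one participant and trains on the rest.
    for held_out, test_idx in by_participant.items():
        train_idx = [i for i, pid in enumerate(participant_ids) if pid != held_out]
        yield held_out, train_idx, test_idx


# Toy example: six recorded sessions from three hypothetical participants.
sessions = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_participant_out(sessions))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Splitting by session instead of by participant would let a model recognise a speaker rather than a severity level, which is why grouped splits are the usual choice for repeated-measures data of this kind.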
|
35
|
Zhang J, Pan Z, Gui C, Xue T, Lin Y, Zhu J, Cui D. Analysis on speech signal features of manic patients. J Psychiatr Res 2018; 98:59-63. [PMID: 29291581 DOI: 10.1016/j.jpsychires.2017.12.012] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 05/06/2017] [Revised: 12/15/2017] [Accepted: 12/18/2017] [Indexed: 10/18/2022]
Abstract
Given the lack of effective biological markers for early diagnosis of bipolar mania, and the tendency for voice fluctuation during transition between mood states, this study aimed to investigate the speech features of manic patients to identify a potential set of biomarkers for diagnosis of bipolar mania. Thirty manic patients and 30 healthy controls were recruited, and their corresponding speech features were collected during natural dialogue using the Automatic Voice Collecting System. The Bech-Rafaelsen Mania Rating Scale (BRMS) and Clinical impression rating scale (CGI) were used to assess illness. The speech features were compared between two groups: mood group (mania vs remission) and bipolar group (manic patients vs healthy individuals). We found that the characteristic speech signals differed between mood groups and bipolar groups. The fourth formant (F4) and Linear Prediction Coefficient (LPC) differed significantly (P < .05) when patients transitioned from the manic to the remission state. The first formant (F1), the second formant (F2), and LPC (P < .05) also played key roles in distinguishing between patients and healthy individuals. In addition, there was a significant correlation between LPC and BRMS, indicating that LPC may play an important role in diagnosis of bipolar mania. In this study we traced speech features of bipolar mania during natural dialogue (conversation), which is an accessible approach in clinical practice. Such specific indicators may serve as promising biomarkers benefiting the diagnosis and clinical therapeutic evaluation of bipolar mania.
Affiliation(s)
- Jing Zhang
  - Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Jiading Mental Health Center, Shanghai, China
- Zhongde Pan
  - Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Chao Gui
  - Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China
- Ting Xue
  - Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yezhe Lin
  - Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jie Zhu
  - Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China.
- Donghong Cui
  - Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Brain Science and Technology Research Center, Shanghai Jiao Tong University, China.
|
36
|
Sonnenschein AR, Hofmann SG, Ziegelmayer T, Lutz W. Linguistic analysis of patients with mood and anxiety disorders during cognitive behavioral therapy. Cogn Behav Ther 2018; 47:315-327. [DOI: 10.1080/16506073.2017.1419505]
Affiliation(s)
- Anke R. Sonnenschein
- Department of Psychology, University of Trier, Trier, Universitätsring 15, 54296, Germany
- Stefan G. Hofmann
- Department of Psychology, Boston University, 648 Beacon Street, Boston, MA, 02215, USA
- Tobias Ziegelmayer
- Department of Psychology, University of Trier, Trier, Universitätsring 15, 54296, Germany
- Wolfgang Lutz
- Department of Psychology, University of Trier, Trier, Universitätsring 15, 54296, Germany
|
37
|
Taguchi T, Tachikawa H, Nemoto K, Suzuki M, Nagano T, Tachibana R, Nishimura M, Arai T. Major depressive disorder discrimination using vocal acoustic features. J Affect Disord 2018; 225:214-220. [PMID: 28841483 DOI: 10.1016/j.jad.2017.08.038] [Received: 01/15/2017] [Revised: 06/01/2017] [Accepted: 08/14/2017]
Abstract
BACKGROUND The voice carries various information produced by vibrations of the vocal cords and the vocal tract. Although many studies have reported a relationship between depression and vocal acoustic features, including mel-frequency cepstrum coefficients (MFCCs), which are applied in speech recognition, few studies have shown that acoustic features allow discrimination of patients with depressive disorder. Vocal acoustic features, as biomarkers of depression, could support the differential diagnosis of patients in a depressive state. As a step toward differential diagnosis of depression, this preliminary study examined whether vocal acoustic features could discriminate between depressive patients and healthy controls. METHODS Subjects were 36 patients who met the criteria for major depressive disorder and 36 healthy controls with no current or past psychiatric disorders. Their voices were recorded while they read out digits before and after a verbal fluency task. Voices were analyzed using openSMILE. The extracted acoustic features, including MFCCs, were used for group comparison and discriminant analysis between patients and controls. RESULTS The second dimension of MFCC (MFCC 2) was significantly different between groups and allowed discrimination between patients and controls with a sensitivity of 77.8% and a specificity of 86.1%. The difference in MFCC 2 between the two groups reflected an energy difference in the frequency band around 2000-3000 Hz. CONCLUSIONS MFCC 2 was significantly different between depressive patients and controls. This feature could be a useful biomarker to detect major depressive disorder. LIMITATIONS The sample size was relatively small. Psychotropics could have a confounding effect on voice.
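As a reading aid, the reported sensitivity and specificity can be reproduced from a 2×2 confusion table. The sketch below is only illustrative: the counts (28 of 36 patients and 31 of 36 controls correctly classified) are an assumption inferred from the reported percentages and group sizes, not figures stated in the abstract, and the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-table counts."""
    sensitivity = tp / (tp + fn)  # share of patients correctly flagged
    specificity = tn / (tn + fp)  # share of controls correctly cleared
    return sensitivity, specificity


# Assumed counts, chosen to match 36 patients and 36 controls.
sens, spec = sensitivity_specificity(tp=28, fn=8, tn=31, fp=5)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints: sensitivity = 77.8%, specificity = 86.1%
```

With groups of 36, each misclassified subject shifts either figure by about 2.8 percentage points, which is one way to see why the abstract lists sample size as a limitation.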
Affiliation(s)
- Takaya Taguchi
- Department of Psychiatry, Graduate School of Comprehensive Human Sciences, University of Tsukuba, Japan; University of Tsukuba Hospital, Japan
- Hirokazu Tachikawa
- Department of Psychiatry, Graduate School of Comprehensive Human Sciences, University of Tsukuba, Japan; Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Japan
- Kiyotaka Nemoto
- Department of Psychiatry, Graduate School of Comprehensive Human Sciences, University of Tsukuba, Japan; Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Japan
- Masafumi Nishimura
- Graduate School of Integrated Science and Technology, Shizuoka University, Japan
- Tetsuaki Arai
- Department of Psychiatry, Graduate School of Comprehensive Human Sciences, University of Tsukuba, Japan; Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Japan
|
38
|
Dauphin B, Halverson S, Pouliot S, Slowik L. Listening to a patient: An exploratory experimental investigation into the effects of vocalization and therapist gender on interpreting clinical material. Bull Menninger Clin 2017; 82:19-45. [PMID: 29120668 DOI: 10.1521/bumc_2017_81_10]
Abstract
Carefully listening to the patient is of paramount importance for psychoanalysis and psychoanalytic psychotherapy. The present study explored whether patient vocalization as well as the gender of the analyst play significant roles in clinical listening. Fifty-one psychoanalysts and psychoanalytic therapists were randomly assigned to listen to one of two dramatized psychoanalytic sessions. The content of the sessions was the same for both versions, but the sessions were dramatized differently. Some differences emerged between the versions, especially on ratings of reality testing, impulse control, pressured speech, patient was confusing, and awareness of imagery. Furthermore, differences emerged between male and female analysts in terms of ratings of intervention strategies and countertransference reactions to the patient material. Session version and gender affect different ratings. Implications of the findings are discussed as is the utility of using more ecologically valid material in conducting empirical research into clinical judgment.
|
39
|
Liu Z, Hu B, Li X, Liu F, Wang G, Yang J. Detecting Depression in Speech Under Different Speaking Styles and Emotional Valences. Brain Inform 2017. [DOI: 10.1007/978-3-319-70772-3_25] Open
|
40
|
Faurholt-Jepsen M, Busk J, Frost M, Vinberg M, Christensen EM, Winther O, Bardram JE, Kessing LV. Voice analysis as an objective state marker in bipolar disorder. Transl Psychiatry 2016; 6:e856. [PMID: 27434490 PMCID: PMC5545710 DOI: 10.1038/tp.2016.123] [Received: 01/25/2016] [Revised: 04/04/2016] [Accepted: 05/05/2016] Open
Abstract
Changes in speech have been suggested as sensitive and valid measures of depression and mania in bipolar disorder. The present study aimed at investigating (1) voice features collected during phone calls as objective markers of affective states in bipolar disorder and (2) if combining voice features with automatically generated objective smartphone data on behavioral activities (for example, number of text messages and phone calls per day) and electronic self-monitored data (mood) on illness activity would increase the accuracy as a marker of affective states. Using smartphones, voice features, automatically generated objective smartphone data on behavioral activities and electronic self-monitored data were collected from 28 outpatients with bipolar disorder in naturalistic settings on a daily basis during a period of 12 weeks. Depressive and manic symptoms were assessed using the Hamilton Depression Rating Scale 17-item and the Young Mania Rating Scale, respectively, by a researcher blinded to smartphone data. Data were analyzed using random forest algorithms. Affective states were classified using voice features extracted during everyday life phone calls. Voice features were found to be more accurate, sensitive and specific in the classification of manic or mixed states with an area under the curve (AUC)=0.89 compared with an AUC=0.78 for the classification of depressive states. Combining voice features with automatically generated objective smartphone data on behavioral activities and electronic self-monitored data increased the accuracy, sensitivity and specificity of classification of affective states slightly. Voice features collected in naturalistic settings using smartphones may be used as objective state markers in patients with bipolar disorder.
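For readers unfamiliar with the AUC figures quoted above, the following minimal sketch shows one standard way to compute an area under the ROC curve: as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The scores and the function name are hypothetical, and this is not the study's random forest pipeline.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: P(positive score > negative score), ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, for illustration only.
manic_scores = [0.9, 0.8, 0.7, 0.4]
euthymic_scores = [0.6, 0.3, 0.2, 0.1]
print(auc(manic_scores, euthymic_scores))  # prints 0.9375
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported 0.89 and 0.78 both sit well above chance.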
Affiliation(s)
- M Faurholt-Jepsen
- Psychiatric Center Copenhagen, Rigshospitalet, Blegdamsvej 9, DK-2100 Copenhagen, Denmark
- J Busk
- DTU Compute, Technical University of Denmark (DTU), Lyngby, Denmark
- M Frost
- The Pervasive Interaction Laboratory, IT University of Copenhagen, Copenhagen, Denmark
- M Vinberg
- Psychiatric Center Copenhagen, Rigshospitalet, Copenhagen, Denmark
- E M Christensen
- Psychiatric Center Copenhagen, Rigshospitalet, Copenhagen, Denmark
- O Winther
- DTU Compute, Technical University of Denmark (DTU), Lyngby, Denmark
- J E Bardram
- DTU Compute, Technical University of Denmark (DTU), Lyngby, Denmark
- L V Kessing
- Psychiatric Center Copenhagen, Rigshospitalet, Copenhagen, Denmark
|
41
|
Hogenelst K, Sarampalis A, Leander NP, Müller BCN, Schoevers RA, aan het Rot M. The effects of acute tryptophan depletion on speech and behavioural mimicry in individuals at familial risk for depression. J Psychopharmacol 2016; 30:303-11. [PMID: 26755543 DOI: 10.1177/0269881115625156]
Abstract
Major depressive disorder (MDD) has been associated with abnormalities in speech and behavioural mimicry. These abnormalities may contribute to the impairments in interpersonal functioning that are often seen in MDD patients. MDD has also been associated with disturbances in the brain serotonin system, but the extent to which serotonin regulates speech and behavioural mimicry remains unclear. In a randomized, double-blind, crossover study, we induced acute tryptophan depletion (ATD) in individuals with or without a family history of MDD. Five hours afterwards, participants engaged in two behavioural-mimicry experiments in which speech and behaviour were recorded. ATD reduced the time participants waited before speaking, which might indicate increased impulsivity. However, ATD did not significantly alter speech otherwise, nor did it affect mimicry. This suggests that a brief lowering of brain serotonin has limited effects on verbal and non-verbal social behaviour. The null findings may be due to low test sensitivity, but they otherwise suggest that low serotonin has little effect on social interaction quality in never-depressed individuals. It remains possible that recovered MDD patients are more strongly affected.
Affiliation(s)
- Koen Hogenelst
- University of Groningen, Department of Psychology, Groningen, the Netherlands; University of Groningen, School of Behavioral and Cognitive Neurosciences, Groningen, the Netherlands
- Anastasios Sarampalis
- University of Groningen, Department of Psychology, Groningen, the Netherlands; University of Groningen, School of Behavioral and Cognitive Neurosciences, Groningen, the Netherlands
- N Pontus Leander
- University of Groningen, Department of Psychology, Groningen, the Netherlands
- Barbara C N Müller
- Behavioural Science Institute, Radboud University Nijmegen, Department of Social and Cultural Psychology, Nijmegen, the Netherlands
- Robert A Schoevers
- University of Groningen, University Medical Centre Groningen, Department of Psychiatry, Groningen, the Netherlands
- Marije aan het Rot
- University of Groningen, Department of Psychology, Groningen, the Netherlands; University of Groningen, School of Behavioral and Cognitive Neurosciences, Groningen, the Netherlands
|
42
|
Abstract
Objective: The aim of this study was to explore the life stories of depressive adolescents and compare them with non-clinical adolescents' life stories. Methods: For this purpose, we compared 20 life stories of hospitalized adolescents suffering from a major depressive episode with 40 life stories of adolescents attending school, divided into two groups: 20 non-depressed and 20 depressed adolescents. Results: Life stories differed as a function of psychopathology. Depressed hospitalized adolescents spoke about their disease and defined themselves by their depression. The depressed adolescents in school concentrated on schooling and school achievements, while the non-depressed group defined themselves by their family, friends and inclusion in a peer group. Conclusion: These analyses allowed us to highlight specific themes mentioned by each of the three groups of adolescents. Although life stories are personal and unique, analysis of such stories allows us to better understand the daily reality of depressive adolescents and the relationships between the life events they experience, daily stressors, depression and how they construct their personal history.
Affiliation(s)
- Aurore Boulard
- Département de psychologies et cliniques des systèmes humains, University of Liège, Belgium
|
43
|
Ackermann H, Hage SR, Ziegler W. Phylogenetic reorganization of the basal ganglia: A necessary, but not the only, bridge over a primate Rubicon of acoustic communication. Behav Brain Sci 2014; 37:577-604. [DOI: 10.1017/s0140525x1400003x]
Abstract
In this response to commentaries, we revisit the two main arguments of our target article. Based on data drawn from a variety of research areas – vocal behavior in nonhuman primates, speech physiology and pathology, neurobiology of basal ganglia functions, motor skill learning, paleoanthropological concepts – the target article, first, suggests a two-stage model of the evolution of the crucial motor prerequisites of spoken language within the hominin lineage: (1) monosynaptic refinement of the projections of motor cortex to brainstem nuclei steering laryngeal muscles, and (2) subsequent “vocal-laryngeal elaboration” of cortico-basal ganglia circuits, driven by human-specific FOXP2 mutations. Second, as concerns the ontogenetic development of verbal communication, age-dependent interactions between the basal ganglia and their cortical targets are assumed to contribute to the time course of the acquisition of articulate speech. Whereas such a phylogenetic reorganization of cortico-striatal circuits must be considered a necessary prerequisite for ontogenetic speech acquisition, the 30 commentaries – addressing the whole range of data sources referred to – point at several further aspects of acoustic communication which have to be added to or integrated with the presented model. For example, the relationships between vocal tract movement sequencing – the focus of the target article – and rhythmical structures of movement organization, the connections between speech motor control and the central-auditory and central-visual systems, the impact of social factors upon the development of vocal behavior (in nonhuman primates and in our species), and the interactions of ontogenetic speech acquisition – based upon FOXP2-driven structural changes at the level of the basal ganglia – with preceding subvocal stages of acoustic communication as well as higher-order (cognitive) dimensions of phonological development.
Most importantly, thus, several promising future research directions unfold from these contributions – accessible to clinical studies and functional imaging in our species as well as experimental investigations in nonhuman primates.
|
44
|
Scherer S, Hammal Z, Yang Y, Morency LP, Cohn JF. Dyadic Behavior Analysis in Depression Severity Assessment Interviews. Proc ACM Int Conf Multimodal Interact 2014; 2014:112-119. [PMID: 28345076 PMCID: PMC5365085 DOI: 10.1145/2663204.2663238]
Abstract
Previous literature suggests that depression impacts vocal timing of both participants and clinical interviewers but is mixed with respect to acoustic features. To investigate further, 57 middle-aged adults (men and women) with Major Depressive Disorder and their clinical interviewers (all women) were studied. Participants were interviewed for depression severity on up to four occasions over a 21-week period using the Hamilton Rating Scale for Depression (HRSD), which is a criterion measure for depression severity in clinical trials. Acoustic features were extracted for both participants and interviewers using the COVAREP toolbox. Missing data occurred due to missed appointments, technical problems, or insufficient vocal samples. Data from 36 participants and their interviewers met criteria and were included in the analysis comparing high and low depression severity. Participants' acoustic features varied between men and women, as expected, but did not vary with depression severity. For interviewers, acoustic characteristics varied strongly with the severity of the interviewee's depression. Accommodation - the tendency of interactants to adapt their communicative behavior to each other - between interviewers and interviewees was inversely related to depression severity. These findings suggest that interviewers modify their acoustic features in response to depression severity, and that depression severity strongly impacts interpersonal accommodation.
Affiliation(s)
- Stefan Scherer
- USC Institute for Creative Technologies, 12015 Waterfront Dr. Playa Vista, CA
- Zakia Hammal
- Carnegie Mellon University, 5000 Fifth Avenue, Pittsburgh, PA
- Ying Yang
- University of Pittsburgh, Biomedical Science Tower, Pittsburgh, PA 15213
- Jeffrey F Cohn
- University of Pittsburgh, 210 S. Bouquet St. Pittsburgh, PA 15260
|
45
|
Asgari M, Shafran I, Sheeber LB. Inferring clinical depression from speech and spoken utterances. IEEE Int Workshop Mach Learn Signal Process 2014; 2014:10.1109/mlsp.2014.6958856. [PMID: 33288990 PMCID: PMC7719299 DOI: 10.1109/mlsp.2014.6958856]
Abstract
In this paper, we investigate the problem of detecting depression from recordings of subjects' speech using speech processing and machine learning. There has been considerable interest in this problem in recent years due to the potential for developing objective assessments from real-world behaviors, which may provide valuable supplementary clinical information or may be useful in screening. The cues for depression may be present in "what is said" (content) and "how it is said" (prosody). Given the limited amounts of text data, even in this relatively large study, it is difficult to employ the standard method of learning models from n-gram features. Instead, we learn models using word representations in an alternative feature space of valence and arousal. This is akin to embedding words into a real vector space, albeit with manual ratings instead of those learned with deep neural networks [1]. For extracting prosody, we employ standard feature extractors such as those implemented in openSMILE and compare them with features extracted from harmonic models that we have been developing in recent years. Our experiments show that features from our harmonic model detect depression from spoken utterances better than the alternatives. The context features provide additional improvements to achieve an accuracy of about 74%, sufficient to be useful in screening applications.
Affiliation(s)
- Meysam Asgari
- Center for Spoken Language Understanding, Oregon Health & Science University, Portland, Oregon
- Izhak Shafran
- Center for Spoken Language Understanding, Oregon Health & Science University, Portland, Oregon
|
46
|
Abstract
The aim of the study was to determine the impact of body height on speaking fundamental frequency (SF0) while controlling for as many influencing factors as possible, such as habits, biophysical conditions, medication, diseases, and others. Fifty-eight females were analyzed during spontaneous speech (i.e. explaining driving directions or a cooking recipe) of at least 60 seconds at comfortable pitch and loudness. The subjects showed a moderate, negative, and significant correlation between body height and SF0 (r = -0.40, P = 0.002). With r² = 0.16, a reasonable portion (16%) of the variance in SF0 is explained by the variance in body height. In comparison with other factors for which a correlation with SF0 has been mentioned in the literature (hypothyroidism, hemodialysis, auditory-maleness after female-to-male transsexualism, body weight, body mass index, and body fat), body height accounted for the largest proportion of the variance in SF0 in females. It is therefore possible to validate body height as a factor to account for in clinical F0 measurement.
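The step from the correlation coefficient to the "16% of variance explained" figure is simply squaring r. A minimal check, using only the value reported in the abstract (the variable names are illustrative):

```python
# Reported Pearson correlation between body height and SF0.
r = -0.40

# Coefficient of determination: share of SF0 variance associated with height.
r_squared = r ** 2
print(f"r^2 = {r_squared:.2f}")  # prints: r^2 = 0.16
```

The sign of r is lost on squaring, so the 16% figure says nothing about direction; the negative correlation itself indicates that taller speakers tended to have a lower SF0.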
Affiliation(s)
- Ben Barsties
- Faculty of Health Care, HU University of Applied Sciences Utrecht, Utrecht, The Netherlands; Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Rudi Verfaillie
- Department of Speech-Language Therapy, Zuyd University College, Heerlen, The Netherlands
- Peter Dicks
- Vocational School, University Hospital Aix-la-Chapelle, Aachen, Germany
- Youri Maryn
- Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Department of Otorhinolaryngology and Head & Neck Surgery, Speech-Language Pathology and Audiology, Sint-Jan General Hospital, Bruges, Belgium; Faculty of Education, Health & Social Work, University College Ghent, Ghent, Belgium
|
47
|
Bennabi D, Vandel P, Papaxanthis C, Pozzo T, Haffen E. Psychomotor retardation in depression: a systematic review of diagnostic, pathophysiologic, and therapeutic implications. Biomed Res Int 2013; 2013:158746. [PMID: 24286073 DOI: 10.1155/2013/158746]
Abstract
Psychomotor retardation is a central feature of depression which includes motor and cognitive impairments. Effective management may be useful to improve the classification of depressive subtypes and treatment selection, as well as prediction of outcome in patients with depression. The aim of this paper was to review the current status of knowledge regarding psychomotor retardation in depression, in order to clarify its role in the diagnostic management of mood disorders. Retardation modifies all the actions of the individual, including motility, mental activity, and speech. Objective assessments can highlight the diagnostic importance of psychomotor retardation, especially in melancholic and bipolar depression. Psychomotor retardation is also related to depression severity and therapeutic change and could be considered a good criterion for the prediction of therapeutic effect. The neurobiological process underlying the inhibition of activity includes functional deficits in the prefrontal cortex and abnormalities in dopamine neurotransmission. Future investigations of psychomotor retardation should help improve the understanding of the pathophysiological mechanisms underlying mood disorders and contribute to improving their therapeutic management.
|
49
|
Oreg S, Sverdlik N. Source Personality and Persuasiveness: Big Five Predispositions to Being Persuasive and the Role of Message Involvement. J Pers 2013; 82:250-64. [DOI: 10.1111/jopy.12049]
|
50
|
Abstract
To investigate the relation between vocal prosody and change in depression severity over time, 57 participants from a clinical trial for treatment of depression were evaluated at seven-week intervals using a semi-structured clinical interview for depression severity (Hamilton Rating Scale for Depression: HRSD). All participants met criteria for Major Depressive Disorder at week 1. Using both perceptual judgments by naive listeners and quantitative analyses of vocal timing and fundamental frequency, three hypotheses were tested: 1) Naive listeners can perceive the severity of depression from vocal recordings of depressed participants and interviewers. 2) Quantitative features of vocal prosody in depressed participants reveal change in symptom severity over the course of depression. And 3) Interpersonal effects occur as well; such that vocal prosody in interviewers shows corresponding effects. These hypotheses were strongly supported. Together, participants' and interviewers' vocal prosody accounted for about 60% of variation in depression scores, and detected ordinal range of depression severity (low, mild, and moderate-to-severe) in 69% of cases (kappa = 0.53). These findings suggest that analysis of vocal prosody could be a powerful tool to assist in depression screening and monitoring over the course of depressive disorder and recovery.
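The kappa statistic quoted above corrects raw agreement for agreement expected by chance. The sketch below states the standard Cohen's kappa formula and, under the assumption that the 69% figure is the raw observed agreement behind the reported kappa, back-calculates the chance agreement that would be implied. That assumption and both function names are illustrative, not taken from the paper.

```python
def cohens_kappa(p_observed, p_expected):
    """Cohen's kappa: agreement beyond chance, scaled by the maximum possible."""
    return (p_observed - p_expected) / (1.0 - p_expected)


def implied_chance_agreement(p_observed, kappa):
    """Solve the kappa formula for the chance-agreement term."""
    return (p_observed - kappa) / (1.0 - kappa)


# Assumption: 0.69 is the observed agreement underlying kappa = 0.53.
p_e = implied_chance_agreement(0.69, 0.53)
print(f"implied chance agreement = {p_e:.2f}")
print(f"kappa check = {cohens_kappa(0.69, p_e):.2f}")
```

Under that assumption the implied chance agreement is about one third, which is what roughly balanced three-way severity classes would give.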
Affiliation(s)
- Ying Yang
- Rehabilitation and Neural Engineering Laboratory, University of Pittsburgh
|