1
Xu X, Li J, Zhu Z, Zhao L, Wang H, Song C, Chen Y, Zhao Q, Yang J, Pei Y. A Comprehensive Review on Synergy of Multi-Modal Data and AI Technologies in Medical Diagnosis. Bioengineering (Basel) 2024; 11:219. [PMID: 38534493] [DOI: 10.3390/bioengineering11030219] [Received: 12/29/2023] [Revised: 02/15/2024] [Accepted: 02/21/2024] [Open Access]
Abstract
Disease diagnosis represents a critical and arduous endeavor within the medical field. Artificial intelligence (AI) techniques, spanning from machine learning and deep learning to large model paradigms, stand poised to significantly augment physicians in rendering more evidence-based decisions, thus presenting a pioneering solution for clinical practice. The amalgamation of diverse medical data modalities (e.g., image, text, speech, genetic data, physiological signals) is imperative for comprehensive disease analysis, and has become a topic of burgeoning interest among both researchers and clinicians in recent years. Hence, there exists a pressing need to synthesize the latest strides in multi-modal data and AI technologies in the realm of medical diagnosis. In this paper, we narrow our focus to five specific disorders (Alzheimer's disease, breast cancer, depression, heart disease, epilepsy), elucidating advanced endeavors in their diagnosis and treatment through the lens of artificial intelligence. Our survey not only delineates detailed diagnostic methodologies across varying modalities but also underscores commonly utilized public datasets, the intricacies of feature engineering, prevalent classification models, and envisaged challenges for future endeavors. In essence, our research endeavors to contribute to the advancement of diagnostic methodologies, furnishing invaluable insights for clinical decision making.
Affiliation(s)
- Xi Xu
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Jianqiang Li
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Zhichao Zhu
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Linna Zhao
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Huina Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Changwei Song
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Yining Chen
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Qing Zhao
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Jijiang Yang
- Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China
- Yan Pei
- School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Japan
2
Huang XS, Liao YL, Zhang WJ, Zhang L. [A research on depression recognition based on voice pre-training model]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi (Journal of Biomedical Engineering) 2024; 41:9-16. [PMID: 38403599] [PMCID: PMC10894747] [DOI: 10.7507/1001-5515.202304008] [Received: 04/05/2023] [Revised: 11/24/2023]
Abstract
To address the increasing number of patients with depression, this paper proposes an artificial intelligence method for effectively identifying depression from voice signals, with the aim of improving the efficiency of diagnosis and treatment. First, the wav2vec 2.0 pre-trained model is fine-tuned to encode and contextualize speech, thereby obtaining high-quality voice features. The model is applied to the publicly available Distress Analysis Interview Corpus - Wizard of Oz (DAIC-WOZ) dataset. The results demonstrate a precision of 93.96%, a recall of 94.87%, and an F1 score of 94.41% for the binary depression-recognition task, with an overall classification accuracy of 96.48%. For the four-class task evaluating depression severity, the precision is above 92.59%, the recall above 92.89%, and the F1 score above 93.12% for every class, with an overall classification accuracy of 94.80%. The findings indicate that the proposed method effectively enhances classification accuracy in scenarios with limited data, performing strongly in both depression identification and severity evaluation. In the future, this method has the potential to serve as a valuable supportive tool for depression diagnosis.
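The precision, recall, F1, and accuracy figures reported above all follow from a confusion matrix. A minimal sketch of how they are computed (the labels below are toy values, not the paper's data):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Precision, recall, F1, and accuracy for the positive (depressed) class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = float(np.mean(y_true == y_pred))
    return precision, recall, f1, accuracy

# Toy labels for 10 utterances: one false positive and one false negative.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
p, r, f1, acc = binary_metrics(y_true, y_pred)  # each 0.8 here
```

Note that overall accuracy and per-class F1 can diverge when classes are imbalanced, which is why the abstract reports both.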
Affiliation(s)
- Xiangsheng Huang
- School of Biomedical Engineering, South-Central Minzu University, Wuhan 430074, P. R. China
- Yilong Liao
- School of Biomedical Engineering, South-Central Minzu University, Wuhan 430074, P. R. China
- Wenjin Zhang
- School of Biomedical Engineering, South-Central Minzu University, Wuhan 430074, P. R. China
- Li Zhang
- School of Biomedical Engineering, South-Central Minzu University, Wuhan 430074, P. R. China
3
Mao K, Wu Y, Chen J. A systematic review on automated clinical depression diagnosis. NPJ Mental Health Research 2023; 2:20. [PMID: 38609509] [PMCID: PMC10955993] [DOI: 10.1038/s44184-023-00040-z] [Received: 03/10/2023] [Accepted: 09/27/2023]
Abstract
Assessing mental health disorders and determining treatment can be difficult for a number of reasons, including access to healthcare providers. Assessments and treatments may not be continuous and can be limited by the unpredictable nature of psychiatric symptoms. Machine-learning models using data collected in a clinical setting can improve diagnosis and treatment. Studies have used speech, text, and facial expression analysis to identify depression. Still, more research is needed to address challenges such as the need for multimodality machine-learning models for clinical use. We conducted a review of studies from the past decade that utilized speech, text, and facial expression analysis to detect depression, as defined by the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline. We provide information on the number of participants, techniques used to assess clinical outcomes, speech-eliciting tasks, machine-learning algorithms, metrics, and other important discoveries for each study. A total of 544 studies were examined, 264 of which satisfied the inclusion criteria. A database has been created containing the query results and a summary of how different features are used to detect depression. While machine learning shows its potential to enhance mental health disorder evaluations, some obstacles must be overcome, especially the requirement for more transparent machine-learning models for clinical purposes. Considering the variety of datasets, feature extraction techniques, and metrics used in this field, guidelines have been provided to collect data and train machine-learning models to guarantee reproducibility and generalizability across different contexts.
Affiliation(s)
- Kaining Mao
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6G 2R3, Canada
- Yuqi Wu
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6G 2R3, Canada
- Jie Chen
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6G 2R3, Canada
4
Yang W, Liu J, Cao P, Zhu R, Wang Y, Liu JK, Wang F, Zhang X. Attention guided learnable time-domain filterbanks for speech depression detection. Neural Netw 2023; 165:135-149. [PMID: 37285730] [DOI: 10.1016/j.neunet.2023.05.041] [Received: 12/25/2022] [Revised: 05/13/2023] [Accepted: 05/20/2023]
Abstract
Depression, a global mental health problem, lacks effective screening methods that could aid early detection and treatment. This paper aims to facilitate the large-scale screening of depression by focusing on the speech depression detection (SDD) task. Currently, direct modeling on the raw signal yields a large number of parameters, and existing deep learning-based SDD models mainly use fixed Mel-scale spectral features as input. However, these features are not designed for depression detection, and the manual settings limit the exploration of fine-grained feature representations. In this paper, we learn effective representations of the raw signals from an interpretable perspective. Specifically, we present a joint learning framework with attention-guided learnable time-domain filterbanks for depression classification (DALF), which combines a depression filterbank feature learning (DFBL) module with a multi-scale spectral attention learning (MSSA) module. DFBL produces biologically meaningful acoustic features by employing learnable time-domain filters, and MSSA guides the learnable filters to better retain useful frequency sub-bands. We collect a new dataset, the Neutral Reading-based Audio Corpus (NRAC), to facilitate research in depression analysis, and we evaluate the performance of DALF on the NRAC and the public DAIC-WOZ datasets. The experimental results demonstrate that our method outperforms state-of-the-art SDD methods with an F1 of 78.4% on the DAIC-WOZ dataset. In particular, DALF achieves F1 scores of 87.3% and 81.7% on the two parts of the NRAC dataset. By analyzing the filter coefficients, we find that the most important frequency range identified by our method is 600-700 Hz, which corresponds to the Mandarin vowels /e/ and /ê/ and can be considered an effective biomarker for the SDD task. Taken together, our DALF model provides a promising approach to depression detection.
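DALF's filters are learned end-to-end, which cannot be reproduced in a few lines. As an illustration of the underlying idea of a time-domain filterbank, the sketch below builds fixed windowed-sinc band-pass filters (an assumption in the spirit of sinc-parameterized filter learning, not the paper's implementation) and shows how energy in a band such as the reported 600-700 Hz range is isolated:

```python
import numpy as np

def sinc_bandpass(low_hz, high_hz, sr, taps=511):
    """Windowed-sinc band-pass FIR filter (difference of two low-passes)."""
    t = np.arange(taps) - (taps - 1) / 2
    def lowpass(fc):
        h = 2 * fc / sr * np.sinc(2 * fc / sr * t)
        return h * np.hamming(taps)
    return lowpass(high_hz) - lowpass(low_hz)

sr = 16000
# Nine 100 Hz-wide bands covering 100-1000 Hz, including the 600-700 Hz
# band highlighted by the filter-coefficient analysis above.
bands = [(100 * k, 100 * (k + 1)) for k in range(1, 10)]
bank = np.stack([sinc_bandpass(lo, hi, sr) for lo, hi in bands])

# A 650 Hz tone should concentrate its energy in the 600-700 Hz channel.
x = np.sin(2 * np.pi * 650 * np.arange(sr) / sr)
energies = np.array([np.sum(np.convolve(x, h, mode="same") ** 2) for h in bank])
best_band = bands[int(np.argmax(energies))]
```

In a learnable variant, the cutoff frequencies (here fixed at 100 Hz steps) would be trainable parameters updated by backpropagation.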
Affiliation(s)
- Wenju Yang
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110819, Liaoning, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, 110819, Liaoning, China
- Jiankang Liu
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110819, Liaoning, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, 110819, Liaoning, China
- Peng Cao
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110819, Liaoning, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, 110819, Liaoning, China
- Rongxin Zhu
- Early Intervention Unit, Department of Psychiatry, Affiliated Nanjing Brain Hospital, Nanjing Medical University, Nanjing, 210096, China
- Yang Wang
- Early Intervention Unit, Department of Psychiatry, Affiliated Nanjing Brain Hospital, Nanjing Medical University, Nanjing, 210096, China
- Jian K Liu
- School of Computing, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Fei Wang
- Early Intervention Unit, Department of Psychiatry, Affiliated Nanjing Brain Hospital, Nanjing Medical University, Nanjing, 210096, China
- Xizhe Zhang
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, China
5
Pan W, Deng F, Wang X, Hang B, Zhou W, Zhu T. Exploring the ability of vocal biomarkers in distinguishing depression from bipolar disorder, schizophrenia, and healthy controls. Front Psychiatry 2023; 14:1079448. [PMID: 37575564] [PMCID: PMC10415910] [DOI: 10.3389/fpsyt.2023.1079448] [Received: 10/25/2022] [Accepted: 06/30/2023] [Open Access]
Abstract
Background Vocal features have been exploited to distinguish depression from healthy controls. While there have been some claims of success, the degree to which changes in vocal features are specific to depression has not been systematically studied. Hence, we examined the performance of vocal features in differentiating depression from bipolar disorder (BD), schizophrenia, and healthy controls, as well as in pairwise classifications for the three disorders. Methods We sampled 32 bipolar disorder patients, 106 depression patients, 114 healthy controls, and 20 schizophrenia patients. We extracted i-vectors from Mel-frequency cepstral coefficients (MFCCs) and built logistic regression models with ridge regularization and 5-fold cross-validation on the training set, then applied the models to the test set. There were seven classification tasks: any disorder versus healthy controls; depression versus healthy controls; BD versus healthy controls; schizophrenia versus healthy controls; depression versus BD; depression versus schizophrenia; and BD versus schizophrenia. Results The area under the curve (AUC) score for classifying depression and bipolar disorder was 0.5 (F-score = 0.44). For the other comparisons, the AUC scores ranged from 0.75 to 0.92, and the F-scores ranged from 0.73 to 0.91. The model performance (AUC) for classifying depression and bipolar disorder was significantly worse than that for classifying bipolar disorder and schizophrenia (corrected p < 0.05), while there were no significant differences among the remaining pairwise comparisons of the seven classification tasks. Conclusion Vocal features showed discriminatory potential in classifying depression against healthy controls, as well as between depression and other mental disorders. Future research should systematically examine the mechanisms by which voice features distinguish depression from other mental disorders and develop more sophisticated machine learning models so that voice analysis can better assist clinical diagnosis.
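The i-vector extraction itself requires a trained universal background model, which is beyond a short sketch. As a simplified stand-in (simulated MFCC summary statistics in place of i-vectors), the classifier side of the pipeline described above, ridge-regularized (L2) logistic regression with 5-fold cross-validation scored by AUC, might look like:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Simulated per-utterance feature vectors (e.g., mean and std of 20 MFCCs)
# standing in for the i-vectors used in the study.
n_samples = 150
y = rng.integers(0, 2, size=n_samples)           # 0 = control, 1 = patient
X = rng.normal(size=(n_samples, 40)) + 0.3 * y[:, None]

# Ridge regularization is the L2 penalty (scikit-learn's default);
# C controls its inverse strength.
clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
auc_scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
mean_auc = float(auc_scores.mean())
```

An AUC near 0.5, as the study found for depression versus BD, means the classifier ranks a random positive above a random negative no better than chance.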
Affiliation(s)
- Wei Pan
- Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China
- School of Psychology, Central China Normal University, Wuhan, China
- Key Laboratory of Human Development and Mental Health of Hubei Province, Wuhan, China
- Fusong Deng
- Wuhan Wuchang Hospital, Wuchang Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Xianbin Wang
- Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China
- School of Psychology, Central China Normal University, Wuhan, China
- Key Laboratory of Human Development and Mental Health of Hubei Province, Wuhan, China
- Bowen Hang
- Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China
- School of Psychology, Central China Normal University, Wuhan, China
- Key Laboratory of Human Development and Mental Health of Hubei Province, Wuhan, China
- Wenwei Zhou
- Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China
- School of Psychology, Central China Normal University, Wuhan, China
- Key Laboratory of Human Development and Mental Health of Hubei Province, Wuhan, China
- Tingshao Zhu
- Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
6
Tang SX, Hänsel K, Cong Y, Nikzad AH, Mehta A, Cho S, Berretta S, Behbehani L, Pradhan S, John M, Liberman MY. Latent Factors of Language Disturbance and Relationships to Quantitative Speech Features. Schizophr Bull 2023; 49:S93-S103. [PMID: 36946530] [PMCID: PMC10031730] [DOI: 10.1093/schbul/sbac145]
Abstract
BACKGROUND AND HYPOTHESIS Quantitative acoustic and textual measures derived from speech ("speech features") may provide valuable biomarkers for psychiatric disorders, particularly schizophrenia spectrum disorders (SSD). We sought to identify cross-diagnostic latent factors for speech disturbance with relevance for SSD and computational modeling. STUDY DESIGN Clinical ratings for speech disturbance were generated across 14 items for a cross-diagnostic sample (N = 334), including SSD (n = 90). Speech features were quantified using an automated pipeline for brief recorded samples of free speech. Factor models for the clinical ratings were generated using exploratory factor analysis, then tested with confirmatory factor analysis in the cross-diagnostic and SSD groups. The relationships between factor scores and computational speech features were examined for 202 of the participants. STUDY RESULTS We found a 3-factor model with a good fit in the cross-diagnostic group and an acceptable fit for the SSD subsample. The model identifies an impaired expressivity factor and 2 interrelated disorganized factors for inefficient and incoherent speech. Incoherent speech was specific to psychosis groups, while inefficient speech and impaired expressivity showed intermediate effects in people with nonpsychotic disorders. Each of the 3 factors had significant and distinct relationships with speech features, which differed for the cross-diagnostic vs SSD groups. CONCLUSIONS We report a cross-diagnostic 3-factor model for speech disturbance which is supported by good statistical measures, intuitive, applicable to SSD, and relatable to linguistic theories. It provides a valuable framework for understanding speech disturbance and appropriate targets for modeling with quantitative speech features.
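A minimal sketch of the factor-analysis step, using scikit-learn's FactorAnalysis on simulated clinical ratings with the study's dimensions (334 participants, 14 items, 3 factors); the data here are random, so the loadings are illustrative only:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)

# Simulate 334 participants x 14 speech-disturbance ratings generated
# from 3 latent factors plus item noise (dimensions as in the study).
n, n_items, n_factors = 334, 14, 3
latent = rng.normal(size=(n, n_factors))
loadings = rng.normal(size=(n_factors, n_items))
ratings = latent @ loadings + rng.normal(scale=0.5, size=(n, n_items))

fa = FactorAnalysis(n_components=n_factors, random_state=0)
factor_scores = fa.fit_transform(ratings)  # per-participant factor scores
item_loadings = fa.components_             # (n_factors, n_items) loadings
```

In the study, per-participant factor scores like these were then correlated with the automatically extracted speech features.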
Affiliation(s)
- Sunny X Tang
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Glen Oaks, USA
- Katrin Hänsel
- Department of Laboratory Medicine, Yale University, New Haven, USA
- Yan Cong
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Glen Oaks, USA
- Amir H Nikzad
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Glen Oaks, USA
- Aarush Mehta
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Glen Oaks, USA
- Sunghye Cho
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia, USA
- Sarah Berretta
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Glen Oaks, USA
- Leily Behbehani
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Glen Oaks, USA
- Sameer Pradhan
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia, USA
- Majnu John
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Glen Oaks, USA
- Mark Y Liberman
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia, USA
7
Du M, Liu S, Wang T, Zhang W, Ke Y, Chen L, Ming D. Depression recognition using a proposed speech chain model fusing speech production and perception features. J Affect Disord 2023; 323:299-308. [PMID: 36462607] [DOI: 10.1016/j.jad.2022.11.060] [Received: 07/15/2022] [Revised: 10/22/2022] [Accepted: 11/20/2022]
Abstract
BACKGROUND The increasing number of patients with depression puts great pressure on clinical diagnosis. Audio-based diagnosis is a helpful auxiliary tool for early mass screening. However, current methods consider only speech perception features, ignoring patients' vocal tract changes, which may partly account for their poor recognition performance. METHODS This work proposes a novel machine speech chain model for depression recognition (MSCDR) that can capture text-independent depressive speech representations from the speaker's mouth to the listener's ear to improve recognition performance. In the proposed MSCDR, linear predictive coding (LPC) and Mel-frequency cepstral coefficient (MFCC) features are extracted to describe the processes of speech production and speech perception, respectively. Then, a one-dimensional convolutional neural network and a long short-term memory network sequentially capture intra- and inter-segment dynamic depressive features for classification. RESULTS We tested the MSCDR on two public datasets with different languages and paradigms, namely, the Distress Analysis Interview Corpus-Wizard of Oz and the Multi-modal Open Dataset for Mental-disorder Analysis. The accuracy of the MSCDR on the two datasets was 0.77 and 0.86, and the average F1 scores were 0.75 and 0.86, respectively, better than existing methods. This improvement reveals the complementarity of speech production and perception features in carrying depressive information. LIMITATIONS The sample size was relatively small, which may limit clinical translation to some extent. CONCLUSION This experiment demonstrates the good generalization ability and superiority of the proposed MSCDR and suggests that vocal tract changes in patients with depression deserve attention in audio-based depression diagnosis.
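Of the two feature families fused by the MSCDR, LPC is the production-side one: it fits an all-pole model of the vocal tract. A self-contained sketch of LPC via the autocorrelation (Levinson-Durbin) method, checked on a synthetic second-order autoregressive signal whose coefficients are known:

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients [1, a1, ..., ap] via the Levinson-Durbin recursion."""
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[1:i][::-1])
        k = -acc / err
        prev = a.copy()
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        err *= 1.0 - k * k
    return a

# Synthetic AR(2) "vocal tract" signal: x[n] = 0.75 x[n-1] - 0.5 x[n-2] + e[n].
rng = np.random.default_rng(3)
e = rng.normal(size=20000)
x = np.zeros(20000)
for n in range(2, 20000):
    x[n] = 0.75 * x[n - 1] - 0.5 * x[n - 2] + e[n]

a = lpc(x[2000:], order=2)  # drop the transient, then fit
# The whitening filter should recover a1 ~ -0.75 and a2 ~ 0.5.
```

In the paper's pipeline these per-frame coefficients (alongside MFCCs) would feed the 1D CNN; here they are only verified against the known generator.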
Affiliation(s)
- Minghao Du
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Shuang Liu
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Tao Wang
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Wenquan Zhang
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Yufeng Ke
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Long Chen
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Dong Ming
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China; Lab of Neural Engineering & Rehabilitation, Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin, China
8
Eysenbach G, Jang EH, Lee SH, Choi KY, Park JG, Shin HC. Automatic Depression Detection Using Smartphone-Based Text-Dependent Speech Signals: Deep Convolutional Neural Network Approach. J Med Internet Res 2023; 25:e34474. [PMID: 36696160] [PMCID: PMC9909514] [DOI: 10.2196/34474] [Received: 10/25/2021] [Revised: 05/20/2022] [Accepted: 12/18/2022] [Open Access]
Abstract
BACKGROUND Automatic diagnosis of depression based on speech can complement mental health treatment methods in the future. Previous studies have reported that acoustic properties can be used to identify depression. However, few studies have attempted a large-scale differential diagnosis of patients with depressive disorders using acoustic characteristics of non-English speakers. OBJECTIVE This study proposes a framework for automatic depression detection using large-scale acoustic characteristics based on the Korean language. METHODS We recruited 153 patients who met the criteria for major depressive disorder and 165 healthy controls without current or past mental illness. Participants' voices were recorded on a smartphone while they read predefined text-based sentences. Three approaches were evaluated and compared for detecting depression from the text-dependent read-speech data: conventional machine learning models based on acoustic features; a proposed model that trains and classifies log-Mel spectrograms with a deep convolutional neural network (CNN) with a relatively small number of parameters; and models that train and classify log-Mel spectrograms with well-known pretrained networks. RESULTS Depression was automatically detected from the acoustic characteristics of the read sentences using the proposed CNN model, which achieved a highest accuracy of 78.14% on the speech data. Our results show that the deep-learned acoustic characteristics lead to better performance than the conventional approach and the pretrained models. CONCLUSIONS Checking the mood of patients with major depressive disorder and detecting the consistency of objective descriptions are very important research topics. This study suggests that the analysis of speech data recorded while reading text-dependent sentences could help predict depression status automatically by capturing the characteristics of depression. Our method is smartphone based, easily accessible, and can contribute to the automatic identification of depressive states.
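A log-Mel spectrogram like the CNN input described above can be computed with plain NumPy; the frame, FFT, and mel-band settings below are illustrative assumptions, not the study's configuration:

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters mapping an FFT power spectrum to mel bands."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    return fb

sr, n_fft, hop, n_mels = 16000, 512, 160, 40
x = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)  # 1 s synthetic tone

# Frame, window, FFT, mel-project, log-compress: (frames, n_mels).
frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::hop]
power = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
```

The resulting time-by-mel matrix is what a 2D CNN would consume as a single-channel image.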
Affiliation(s)
- Eun Hye Jang
- Medical Information Research Section, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea
- Seung-Hwan Lee
- Clinical Emotion and Cognition Research Laboratory, Inje University, Goyang, Republic of Korea
- Department of Psychiatry, Inje University Ilsan Paik Hospital, Goyang, Republic of Korea
- Bwave Inc, Goyang, Republic of Korea
- Kwang-Yeon Choi
- Department of Psychiatry, College of Medicine, Chungnam National University, Daejeon, Republic of Korea
- Jeon Gue Park
- Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea
- Tutorus Labs Inc, Seoul, Republic of Korea
- Hyun-Chool Shin
- Department of Electronics Engineering, Soongsil University, Seoul, Republic of Korea
9
Koops S, Brederoo SG, de Boer JN, Nadema FG, Voppel AE, Sommer IE. Speech as a Biomarker for Depression. CNS & Neurological Disorders - Drug Targets 2023; 22:152-160. [PMID: 34961469] [DOI: 10.2174/1871527320666211213125847] [Received: 03/10/2021] [Revised: 10/10/2021] [Accepted: 10/10/2021]
Abstract
BACKGROUND Depression is a debilitating disorder that at present lacks a reliable biomarker to aid in diagnosis and early detection. Recent advances in computational analytic approaches have opened up new avenues in developing such a biomarker by taking advantage of the wealth of information that can be extracted from a person's speech. OBJECTIVE The current review provides an overview of the latest findings in the rapidly evolving field of computational language analysis for the detection of depression. We cover a wide range of both acoustic and content-related linguistic features, data types (i.e., spoken and written language), and data sources (i.e., lab settings, social media, and smartphone-based). We put special focus on the current methodological advances with regard to feature extraction and computational modeling techniques. Furthermore, we pay attention to potential hurdles in the implementation of automatic speech analysis. CONCLUSION Depressive speech is characterized by several anomalies, such as lower speech rate, less pitch variability and more self-referential speech. With current computational modeling techniques, such features can be used to detect depression with an accuracy of up to 91%. The performance of the models is optimized when machine learning techniques are implemented that suit the type and amount of data. Recent studies now work towards further optimization and generalizability of the computational language models to detect depression. Finally, privacy and ethical issues are of paramount importance to be addressed when automatic speech analysis techniques are further implemented in, for example, smartphones. Altogether, computational speech analysis is well underway towards becoming an effective diagnostic aid for depression.
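Two of the vocal anomalies noted in this review, lower pitch level and reduced pitch variability, can be quantified with a simple autocorrelation-based F0 estimator. The drifting-pitch signal below is purely synthetic and illustrative:

```python
import numpy as np

def estimate_f0(frame, sr, f_min=75.0, f_max=400.0):
    """Estimate F0 of a voiced frame from its autocorrelation peak."""
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / f_max), int(sr / f_min)  # lag search range
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

sr, frame_len = 16000, 640  # 40 ms frames
# Synthetic "voice" whose pitch drifts from 180 to 220 Hz over 50 frames.
true_f0 = np.linspace(180.0, 220.0, 50)
frames = [np.sin(2 * np.pi * f * np.arange(frame_len) / sr) for f in true_f0]
estimates = np.array([estimate_f0(fr, sr) for fr in frames])

mean_pitch = float(estimates.mean())
pitch_variability = float(estimates.std())  # reduced in depressed speech
```

Real systems add voicing detection and smoothing across frames, but the per-recording summary statistics (mean, standard deviation of F0) are typical of the features fed into the classifiers this review surveys.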
Affiliation(s)
- Sanne Koops
- Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
- Sanne G Brederoo
- Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
- University Center for Psychiatry, University Medical Center Groningen, Groningen, The Netherlands
- Janna N de Boer
- Department of Psychiatry, University Medical Center Utrecht, Utrecht University & Brain Center Rudolf Magnus, Utrecht, The Netherlands
- Femke G Nadema
- Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
- Alban E Voppel
- Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
- Iris E Sommer
- Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
10
Liu D, Liu B, Lin T, Liu G, Yang G, Qi D, Qiu Y, Lu Y, Yuan Q, Shuai SC, Li X, Liu O, Tang X, Shuai J, Cao Y, Lin H. Measuring depression severity based on facial expression and body movement using deep convolutional neural network. Front Psychiatry 2022; 13:1017064. [PMID: 36620657] [PMCID: PMC9810804] [DOI: 10.3389/fpsyt.2022.1017064] [Received: 08/22/2022] [Accepted: 12/02/2022] [Open Access]
Abstract
INTRODUCTION Real-time evaluation of the severity of depressive symptoms is of great significance for the diagnosis and treatment of patients with major depressive disorder (MDD). In clinical practice, evaluation relies mainly on psychological scales and doctor-patient interviews, which are time-consuming and labor-intensive, and the accuracy of the results depends largely on the clinician's subjective judgment. With the development of artificial intelligence (AI), machine learning methods are increasingly used to diagnose depression from appearance characteristics. Most previous research focused on single-modal data; in recent years, however, many studies have shown that multi-modal data yield better prediction performance. This study aimed to develop a measure of depression severity from expression and action features and to assess its validity among patients with MDD. METHODS We proposed a multi-modal deep convolutional neural network (CNN) to evaluate the severity of depressive symptoms in real time, based on the detection of patients' facial expressions and body movements in videos captured by ordinary cameras. We established a behavioral depression degree (BDD) metric, which combines expression entropy and action entropy to measure the depression severity of MDD patients. RESULTS We found that information extracted from different modes, when integrated in appropriate proportions, can significantly improve the accuracy of the evaluation, which had not been reported in previous studies. The method achieved over 74% Pearson correlation between BDD and the self-rating depression scale (SDS), self-rating anxiety scale (SAS), and Hamilton depression scale (HAMD). In addition, we tracked BDD in patients at different stages of a course of treatment, and the observed changes agreed with the scale-based evaluations.
DISCUSSION The BDD effectively measures the current state of a patient's depression and its trend from expression and action features. Our model may provide an automatic auxiliary tool for the diagnosis and treatment of MDD.
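The BDD metric above combines expression entropy and action entropy, but the abstract does not give the exact formula. The sketch below is therefore a hypothetical illustration: Shannon entropy over categorical expression and action labels, mixed by an assumed weight `w_expr`.

```python
import numpy as np

def shannon_entropy(labels):
    """Shannon entropy (bits) of a sequence of categorical labels,
    e.g. per-frame facial expression or body-action classes."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def bdd_score(expressions, actions, w_expr=0.5):
    """Hypothetical BDD: a weighted sum of expression entropy and action
    entropy. The paper's actual weighting is not given; w_expr is an
    assumption for illustration only."""
    return (w_expr * shannon_entropy(expressions)
            + (1 - w_expr) * shannon_entropy(actions))
```

With equal weights, two uniformly distributed binary label streams give a BDD of 1.0 bit; lower entropy in either stream (flatter affect, less movement variety) lowers the score.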
Affiliation(s)
- Dongdong Liu
- Department of Physics, Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, China
- Bowen Liu
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Department of Psychiatry, Baoan Mental Health Center, Shenzhen Baoan Center for Chronic Disease Control, Shenzhen, China
- Tao Lin
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
- Guangya Liu
- Integrated Chinese and Western Therapy of Depression Ward, Hunan Brain Hospital, Changsha, China
- Guoyu Yang
- Department of Physics, Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, China
- Dezhen Qi
- Department of Physics, Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, China
- Ye Qiu
- Department of Physics, Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, China
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
- Yuer Lu
- Department of Physics, Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, China
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
- Qinmei Yuan
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Stella C. Shuai
- Department of Biological Sciences, Northwestern University, Evanston, IL, United States
- Xiang Li
- Department of Physics, Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, China
- Ou Liu
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
- Xiangdong Tang
- Sleep Medicine Center, Mental Health Center, Department of Respiratory and Critical Care Medicine, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Jianwei Shuai
- Department of Physics, Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, China
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
- Yuping Cao
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Hai Lin
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
11.
Exploration of Despair Eccentricities Based on Scale Metrics with Feature Sampling Using a Deep Learning Algorithm. Diagnostics (Basel) 2022; 12:2844. [PMID: 36428903] [PMCID: PMC9689169] [DOI: 10.3390/diagnostics12112844]
Abstract
Many people worldwide struggle with depression as a result of the coronavirus pandemic, which has adversely affected mental health without warning. Even though most individuals are still protected, it is crucial to check for post-coronavirus symptoms if someone is feeling lethargic. The recommended approach is designed to identify post-coronavirus symptoms and attacks present in the human body. When a harmful virus spreads inside a human body, post-diagnosis symptoms are considerably more dangerous, and if they are not recognised at an early stage, the risks increase. Additionally, if the post-symptoms are severe and go untreated, they may harm one's mental health. To prevent someone from succumbing to depression, audio prediction technology is employed to recognise symptoms and potentially dangerous signs. Various vocal characteristics are combined with machine-learning algorithms to determine each person's mental state. A separate device that detects audio attribute outputs is designed to evaluate the effectiveness of the suggested technique; compared to the previous method, the performance metric is better by roughly 67%.
12.
Othmani A, Zeghina AO, Muzammel M. A Model of Normality Inspired Deep Learning Framework for Depression Relapse Prediction Using Audiovisual Data. Comput Methods Programs Biomed 2022; 226:107132. [PMID: 36183638] [DOI: 10.1016/j.cmpb.2022.107132]
Abstract
BACKGROUND Depression (major depressive disorder) is one of the most common mental illnesses; according to the World Health Organization, more than 300 million people worldwide are affected. A first depressive episode may resolve through spontaneous remission within 6 to 12 months. Depression has been shown to affect speech production and facial expressions. Although numerous studies have addressed depression recognition from audiovisual cues, depression relapse from audiovisual cues has not been studied. METHODS In this paper, we propose a deep learning-based approach for depression recognition and depression relapse prediction using audiovisual data. For versatility and reusability, the approach is based on a Model of Normality framework, in which depression relapse is defined by the closeness of a subject's audiovisual patterns after a symptom-free period to the audiovisual patterns of depressed subjects. A Model of Normality is a distance-based anomaly detection approach that computes a distance between the deep audiovisual encoding of a test sample and a representation learned from audiovisual encodings of anomaly-free data. RESULTS The proposed approach shows very promising results, with an accuracy of 87.4% and an F1-score of 82.3% for relapse/depression prediction using a Leave-One-Subject-Out training strategy on the DAIC-WOZ dataset. CONCLUSION The proposed Model of Normality framework is accurate in detecting depression and predicting depression relapse. A prospective monitoring system is proposed for assisting depressed patients. The framework is easily extensible, and other modalities will be integrated in future work.
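The Model of Normality idea, scoring a test encoding by its distance from a representation learned on anomaly-free data, can be sketched in a few lines. The centroid model, the Euclidean distance, and the `threshold` below are assumptions for illustration; the paper's deep audiovisual encoder is not reproduced here.

```python
import numpy as np

def fit_normal_model(encodings):
    """Learn a representation of 'normality' from anomaly-free encodings.
    Here it is simply the centroid of the embedding cloud (an assumption;
    the paper does not commit to this particular distance model)."""
    return np.asarray(encodings).mean(axis=0)

def normality_distance(sample, centroid):
    """Euclidean distance of a test encoding from the normal centroid."""
    return float(np.linalg.norm(np.asarray(sample) - centroid))

def flag_relapse(sample, centroid, threshold):
    """Flag relapse when the encoding drifts away from normality, i.e.
    far from the centroid (threshold is hypothetical, to be calibrated)."""
    return normality_distance(sample, centroid) > threshold
```

In practice the encodings would come from the audiovisual network's penultimate layer, and the threshold would be set on a validation split.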
Affiliation(s)
- Alice Othmani
- Université Paris-Est Créteil (UPEC), LISSI, Vitry sur Seine 94400, France
- Muhammad Muzammel
- Université Paris-Est Créteil (UPEC), LISSI, Vitry sur Seine 94400, France
13.
Cui Y, Li Z, Liu L, Zhang J, Liu J. Privacy-preserving Speech-based Depression Diagnosis via Federated Learning. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:1371-1374. [PMID: 36085955] [DOI: 10.1109/embc48229.2022.9871861]
Abstract
Mental health disorders such as depression affect a large and growing population worldwide and may cause severe emotional, behavioral, and physical health problems if left untreated. Because depression affects a patient's speech characteristics, recent studies have proposed deep-learning-powered speech analysis models for depression diagnosis, which typically require centralized learning on the collected voice data. However, centralized training that stores data at a server raises the risk of severe voice data breaches, and people may be unwilling to share their speech data with third parties due to privacy concerns. To address these issues, in this paper we demonstrate for the first time that speech-based depression diagnosis models can be trained in a privacy-preserving way using federated learning (FL), which enables collaborative model training while keeping the private speech data decentralized on clients' devices. To ensure the model's robustness under attack, we also integrate FL defenses into the system, such as norm bounding, differential privacy, and secure aggregation. Extensive experiments under various FL settings on the DAIC-WOZ dataset show that our FL model achieves high performance without sacrificing much utility compared with centralized-learning approaches while preserving users' speech data privacy. Clinical Relevance: the experiments were conducted on publicly available clinical datasets; no humans or animals were involved.
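The federated setup with norm bounding can be sketched as one aggregation round: clients send weight deltas (their speech data never leaves the device), and the server clips and averages them. This is a minimal illustration, not the paper's system; the speech model itself, the differential-privacy noise, and secure aggregation are omitted or simplified.

```python
import numpy as np

def clip_update(update, max_norm):
    """Norm bounding: scale a client update so its L2 norm <= max_norm,
    limiting the influence any single (possibly malicious) client has."""
    norm = np.linalg.norm(update)
    return update if norm <= max_norm else update * (max_norm / norm)

def federated_round(global_weights, client_updates, max_norm=1.0):
    """One FedAvg-style round: average the clipped client deltas and apply
    them to the global model. Gaussian noise for differential privacy
    could be added to the mean before applying (omitted here)."""
    clipped = [clip_update(np.asarray(u, dtype=float), max_norm)
               for u in client_updates]
    return np.asarray(global_weights, dtype=float) + np.mean(clipped, axis=0)
```

Each round, only model deltas cross the network, which is what keeps the raw voice recordings decentralized.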
14.
Saad E, Sadiq S, Jamil R, Rustam F, Mehmood A, Choi GS, Ashraf I. Novel extreme regression-voting classifier to predict death risk in vaccinated people using VAERS data. PLoS One 2022; 17:e0270327. [PMID: 35767542] [PMCID: PMC9242465] [DOI: 10.1371/journal.pone.0270327]
Abstract
COVID-19 vaccination raised serious concerns among the public, who were beset by rumors regarding resulting illness, adverse reactions, and death. Such rumors endanger the campaign against COVID-19 and should be addressed promptly. One prospective solution is to use machine learning-based models to predict the death risk for vaccinated people and clarify public perceptions of that risk. This study focuses on predicting the death risk for vaccinated people after a second dose, for two reasons: first, to build consensus among people to get vaccinated; second, to reduce fear regarding vaccines. The study utilizes the COVID-19 VAERS dataset, which records adverse events after COVID-19 vaccination as 'recovered', 'not recovered', and 'survived'. To obtain better prediction results, a novel voting classifier, the extreme regression-voting classifier (ER-VC), is introduced. ER-VC ensembles an extra trees classifier and logistic regression using a soft voting criterion. To avoid overfitting and improve results, two data balancing techniques, synthetic minority oversampling (SMOTE) and adaptive synthetic sampling (ADASYN), were applied, and three feature extraction techniques, term frequency-inverse document frequency (TF-IDF), bag of words (BoW), and global vectors (GloVe), were compared. Both machine learning and deep learning models were deployed in the experiments. Results reveal that the proposed model combined with TF-IDF shows robust results, with 0.85 accuracy when trained on the SMOTE-balanced dataset. In line with this, validation of the proposed voting classifier on binary classification shows state-of-the-art results with 0.98 accuracy. The results show that machine learning models can predict death risk with high accuracy and can assist the authorities in taking timely measures.
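A minimal sketch of the soft-voting idea behind ER-VC, assuming scikit-learn: an extra-trees classifier and logistic regression whose predicted class probabilities are averaged. Synthetic data stands in for the VAERS features; the paper's SMOTE/ADASYN balancing and TF-IDF extraction are omitted here.

```python
# Sketch only: illustrative data, not VAERS records.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Soft voting: average the two estimators' class probabilities, then argmax.
ervc = VotingClassifier(
    estimators=[("et", ExtraTreesClassifier(n_estimators=100, random_state=0)),
                ("lr", LogisticRegression(max_iter=1000))],
    voting="soft")
ervc.fit(X_tr, y_tr)
acc = ervc.score(X_te, y_te)
```

On real VAERS text, the `X` matrix would come from a vectorizer such as TF-IDF, and the training split would first be rebalanced with SMOTE or ADASYN.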
Affiliation(s)
- Eysha Saad
- Department of Computer Science, Khawaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
- Saima Sadiq
- Department of Computer Science, Khawaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
- Ramish Jamil
- Department of Computer Science, Khawaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
- Furqan Rustam
- Department of Software Engineering, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan
- Arif Mehmood
- Department of Computer Science & Information Technology, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
- Gyu Sang Choi
- Information and Communication Engineering, Yeungnam University, Gyeongsan, Korea
- Imran Ashraf
- Information and Communication Engineering, Yeungnam University, Gyeongsan, Korea
15.
Wu P, Wang R, Lin H, Zhang F, Tu J, Sun M. Automatic depression recognition by intelligent speech signal processing: A systematic survey. CAAI Trans Intell Technol 2022. [DOI: 10.1049/cit2.12113]
Affiliation(s)
- Pingping Wu
- Jiangsu Key Laboratory of Public Project Audit, School of Engineering Audit, Nanjing Audit University, Nanjing, China
- Ruihao Wang
- School of Information Engineering, Nanjing Audit University, Nanjing, China
- Han Lin
- Jiangsu Key Laboratory of Public Project Audit, School of Engineering Audit, Nanjing Audit University, Nanjing, China
- Fanlong Zhang
- School of Information Engineering, Nanjing Audit University, Nanjing, China
- Juan Tu
- Key Laboratory of Modern Acoustics (MOE), School of Physics, Nanjing University, Nanjing, China
- Miao Sun
- Faculty of Electrical Engineering, Mathematics & Computer Science, Delft University of Technology, Delft, The Netherlands
16.
Wang K, Zhou YE, Xu JX, Yang G. RETRACTED ARTICLE: Reviewing big data based mental health education process for promoting education system. Curr Psychol 2022. [DOI: 10.1007/s12144-020-00800-6]
17.
Abstract
BACKGROUND Depression is one of the most prevalent mental disorders, affecting millions of individuals today. Its symptoms are heterogeneous and often coincide with other disorders such as bipolar disorder, Parkinson's disease, and schizophrenia. It is a serious mental illness that may lead to other health problems if left untreated. Currently, identifying individuals with depression relies entirely on the clinician's expertise. To assist clinicians in identifying the characteristics of depression and classifying depressed people, researchers in this field have incorporated different data modalities and machine learning techniques. This study aims to answer key questions about publication trends, data modalities, machine learning models, dataset usage, pre-processing techniques, and feature extraction and selection techniques, to guide future research on depression diagnosis. METHODS This systematic review was conducted using articles from two major databases: IEEE Xplore and PubMed. Studies from 2011 to April 2021 were retrieved, yielding 590 articles (53 from IEEE Xplore and 537 from PubMed). Articles satisfying the defined inclusion criteria were investigated further. RESULTS A total of 135 articles were identified and analysed. High growth in the number of publications has been observed in recent years, along with significant diversity in data modalities and machine learning classifiers. fMRI data with an SVM classifier was the most popular choice among researchers. In most studies, data scarcity and small sample sizes, particularly for neuroimaging data, are major concerns. The use of identical data pre-processing tools for similar data modalities is common. This study also provides a statistical analysis of the current framework with respect to modality, machine learning classifier, sample size, and accuracy, applying one-way ANOVA and the Tukey-Kramer test. CONCLUSION The results indicate that an effective fusion of machine learning techniques with a suitable data modality has a promising future for assisting clinicians in automatic depression diagnosis.
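The one-way ANOVA comparison the review describes is available directly in SciPy. The accuracy values below are illustrative stand-ins, not the review's data, and the Tukey-Kramer post hoc step is only noted in a comment.

```python
# Compare reported accuracies grouped by data modality (illustrative values).
from scipy.stats import f_oneway

fmri_acc = [0.84, 0.88, 0.81, 0.90]
eeg_acc = [0.76, 0.79, 0.74, 0.80]
speech_acc = [0.70, 0.73, 0.69, 0.75]

# H0: all modality groups share the same mean accuracy.
f_stat, p_value = f_oneway(fmri_acc, eeg_acc, speech_acc)
# A small p-value rejects H0; a Tukey-Kramer post hoc test would then
# locate which pairs of modalities actually differ.
```

With groups this well separated, the F statistic is large and the p-value small, which is the pattern that would justify a post hoc pairwise comparison.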
Affiliation(s)
- Sweta Bhadra
- Department of CS & IT, Cotton University, Guwahati, India
18.
Almaghrabi SA, Thewlis D, Thwaites S, Rogasch NC, Lau S, Clark SR, Baumert M. The reproducibility of bio-acoustic features is associated with sample duration, speech task and gender. IEEE Trans Neural Syst Rehabil Eng 2022; 30:167-175. [PMID: 35038295] [DOI: 10.1109/tnsre.2022.3143117]
Abstract
Bio-acoustic properties of speech show evolving value in analyzing psychiatric illnesses. Obtaining a sufficient speech sample length to quantify these properties is essential, but the impact of sample duration on the stability of bio-acoustic features has not been systematically explored. We aimed to evaluate the reproducibility of bio-acoustic features against changes in speech duration and task. We extracted source, spectral, formant, and prosodic features in 185 English-speaking adults (98 women, 87 men) for reading-a-story and counting tasks. Using intraclass correlation coefficients, we compared features at 25% of the total sample duration of the reading task to those obtained from non-overlapping, randomly selected sub-samples shortened to 75%, 50%, and 25% of the total duration. We also compared features extracted from entire recordings to those measured at 25% and 50% of the duration, and compared features extracted from the reading-a-story task to the counting task. Our results show that the number of reproducible features (out of 125) decreased stepwise as duration was reduced. Spectral shape, pitch, and formants reached excellent reproducibility. Mel-frequency cepstral coefficients (MFCCs), loudness, and zero-crossing rate achieved excellent reproducibility only at longer durations. Reproducibility of source features, MFCC derivatives, and voicing probability (VP) was poor. Significant gender differences existed in jitter, MFCC first derivatives, spectral skewness, pitch, VP, and formants. Around 97% of features in both genders were not reproducible across speech tasks, partly due to the short duration of the counting task. In conclusion, bio-acoustic features are less reproducible in shorter samples and are affected by gender.
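Reproducibility via intraclass correlation can be sketched with a minimal two-way, single-measure ICC(3,1); the abstract does not state which ICC form the authors used, so this variant is an assumption for illustration.

```python
import numpy as np

def icc_3_1(data):
    """Two-way mixed, single-measure ICC(3,1) for an (n_subjects, k)
    matrix, e.g. one bio-acoustic feature measured at k sample durations.
    Values near 1 mean the feature is reproducible across conditions."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand = data.mean()
    ss_rows = k * ((data.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((data.mean(axis=0) - grand) ** 2).sum()   # between conditions
    ss_total = ((data - grand) ** 2).sum()
    msr = ss_rows / (n - 1)                                  # subject mean square
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual
    return (msr - mse) / (msr + (k - 1) * mse)
```

A feature whose value per subject is identical across durations scores exactly 1; duration-sensitive features score lower, which is how the stepwise drop in reproducible features would be quantified.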
19.
Rejaibi E, Komaty A, Meriaudeau F, Agrebi S, Othmani A. MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103107]
20.
Ye J, Yu Y, Wang Q, Li W, Liang H, Zheng Y, Fu G. Multi-modal depression detection based on emotional audio and evaluation text. J Affect Disord 2021; 295:904-913. [PMID: 34706461] [DOI: 10.1016/j.jad.2021.08.090]
Abstract
BACKGROUND Early detection of depression is very important for patient treatment. Given the inefficiency of current screening methods, depression identification technology is a complex research problem with practical value. METHODS We propose a new experimental method for depression detection based on audio and text; 160 Chinese subjects were investigated. Notably, we propose a text-reading experiment designed to make subjects' emotions change rapidly, hereafter called the Segmental Emotional Speech Experiment (SESE). We extract 384-dimensional low-level audio features to find the differences between emotional changes in SESE. We also propose a multi-modal fusion method based on DeepSpectrum features and word vector features to detect depression using deep learning. RESULTS Our experiments show that SESE improves the recognition accuracy of depression and reveals differences in low-level audio features; results were verified across case and control groups, gender, and age. The multi-modal fusion model achieves an accuracy of 0.912 and an F1 score of 0.906. CONCLUSIONS Our contribution is twofold. First, we propose and verify SESE, which offers a new experimental design for follow-up researchers. Second, we propose a new, efficient multi-modal depression recognition model.
Affiliation(s)
- Jiayu Ye
- School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
- Yanhong Yu
- College of Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Jinan 250355, China
- Qingxiang Wang
- School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
- Wentao Li
- School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
- Hu Liang
- School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
- Gang Fu
- School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
21.
Stasak B, Huang Z, Epps J, Joachim D. Depression Classification Using n-Gram Speech Errors from Manual and Automatic Stroop Color Test Transcripts. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:1631-1635. [PMID: 34891598] [DOI: 10.1109/embc46164.2021.9629881]
Abstract
While the psychological Stroop color test has frequently been used to analyze response delays in temporal cognitive processing, minimal research has examined the differences in incorrect/correct verbal response patterns between healthy controls and clinically depressed populations. Further, speech error features that emphasize sequential Stroop test responses have been unexplored for automatic depression classification. In this study, which uses speech recorded via a smart device, an analysis of n-gram error sequence distributions shows that participants with clinical depression produce more Stroop color test errors, especially sequential errors, than healthy controls. Using n-gram error features derived from multisession manual transcripts, experiments show that trigram error features generate up to 95% depression classification accuracy, whereas an acoustic feature baseline achieves only around 75%. Moreover, n-gram error features using ASR transcripts produced up to 90% depression classification accuracy.
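Extracting n-gram error features from a correct/error-coded Stroop response sequence is straightforward to sketch; the 'C'/'E' coding below is illustrative, not the paper's exact feature definition.

```python
from collections import Counter

def error_ngrams(responses, n=3):
    """Count n-grams over a Stroop response sequence coded as 'C' (correct)
    or 'E' (error). Sequential errors show up as 'E'-heavy n-grams such as
    'EEC' or 'CEE', which is the pattern depressed speakers produced more."""
    grams = ["".join(responses[i:i + n]) for i in range(len(responses) - n + 1)]
    return Counter(grams)

# Example session: two isolated errors, one of them starting an error run.
feats = error_ngrams(list("CCECEECC"), n=3)
```

The resulting counts (optionally normalized per session) would form the trigram feature vector fed to a classifier.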
22.
Klangpornkun N, Ruangritchai M, Munthuli A, Onsuwan C, Jaisin K, Pattanaseri K, Lortrakul J, Thanakulakkarachai P, Anansiripinyo T, Amornlaksananon A, Laohawee S, Tantibundhit C. Classification of Depression and Other Psychiatric Conditions Using Speech Features Extracted from a Thai Psychiatric and Verbal Screening Test. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:651-656. [PMID: 34891377] [DOI: 10.1109/embc46164.2021.9629571]
Abstract
Depression is a common and serious mental illness that negatively affects daily functioning. To prevent progression to severe or long-term consequences, early diagnosis is crucial. We developed an automated speech feature analysis application for depression and other psychiatric disorders derived from a Thai psychiatric and verbal screening test. The screening test includes the Thai versions of the Patient Health Questionnaire-9 (PHQ-9) and Hamilton Depression Rating Scale (HAM-D), plus 32 additional emotion-induced questions. A case-control study was conducted on speech features from 66 participants: 27 had depression (DP), 12 had other psychiatric disorders (OP), and 27 were normal controls (NC). Five-fold cross-validation across six settings of five classifiers, combining PHQ-9 and HAM-D scores with speech features, was examined. The multilayer perceptron (MLP) classifier performed best, yielding 83.33% sensitivity, 91.67% specificity, and 83.33% accuracy, with negative-emotional questions most effective for classification. The automated speech feature analysis showed promising results for screening patients with depression or other psychiatric disorders, and the application is accessible through a smartphone, making it a feasible and intuitive setup for low-resource countries such as Thailand.
23.
Muzammel M, Salam H, Othmani A. End-to-end multimodal clinical depression recognition using deep neural networks: A comparative analysis. Comput Methods Programs Biomed 2021; 211:106433. [PMID: 34614452] [DOI: 10.1016/j.cmpb.2021.106433]
Abstract
BACKGROUND AND OBJECTIVE Major depressive disorder is a highly prevalent and disabling mental health condition. Numerous studies have explored multimodal fusion systems combining visual, audio, and textual features via deep learning architectures for clinical depression recognition, yet no comparative analysis of multimodal depression analysis has been proposed in the literature. METHODS In this paper, an up-to-date literature overview of multimodal depression recognition is presented, and an extensive comparative analysis of deep learning architectures for depression recognition is performed. First, audio feature-based Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks are studied. Then, early-level and model-level fusion of deep audio features with visual and textual features through LSTM and CNN architectures is investigated. RESULTS The performance of the proposed architectures is tested using a hold-out strategy on the DAIC-WOZ dataset (80% training, 10% validation, 10% test) for binary and severity-level depression recognition. The experiments demonstrated that: (1) LSTM-based audio features perform slightly better than CNN-based ones, with an accuracy of 66.25% versus 65.60% for binary depression classes; (2) model-level fusion of deep audio and visual features using an LSTM network performed best, with an accuracy of 77.16%, a precision of 53% for the depressed class, and a precision of 83% for the non-depressed class. This network obtained a normalized root mean square error (RMSE) of 0.15 for depression severity prediction. Using a Leave-One-Subject-Out strategy, it achieved an accuracy of 95.38% for binary depression detection and a normalized RMSE of 0.1476 for severity prediction. Our best-performing architecture outperforms all state-of-the-art approaches on the DAIC-WOZ dataset.
CONCLUSIONS The results show that the proposed LSTM-based architectures surpass the CNN-based ones, allowing temporal dynamics of multimodal features to be learned. Furthermore, model-level fusion of audio and visual features using an LSTM network leads to the best performance. Our best-performing architecture successfully detects depression from a speech segment of less than 8 seconds, with an average prediction time under 6 ms, making it suitable for real-world clinical applications.
Affiliation(s)
- Muhammad Muzammel
- Université Paris-Est Créteil (UPEC), LISSI, Vitry sur Seine 94400, France
- Hanan Salam
- New York University, SMART Lab, Saadiyat Island, Abu Dhabi
- Alice Othmani
- Université Paris-Est Créteil (UPEC), LISSI, Vitry sur Seine 94400, France
24.
Zhao Y, Liang Z, Du J, Zhang L, Liu C, Zhao L. Multi-Head Attention-Based Long Short-Term Memory for Depression Detection From Speech. Front Neurorobot 2021; 15:684037. [PMID: 34512301] [PMCID: PMC8426553] [DOI: 10.3389/fnbot.2021.684037]
Abstract
Depression is a mental disorder that threatens people's health and normal life, so an effective way to detect it is essential. However, research on depression detection has mainly focused on combining parallel features from audio, video, and text for performance enhancement, without making full use of the inherent information in speech. To focus on the more emotionally salient regions of depressed speech, in this research we propose a multi-head time-dimension attention-based long short-term memory (LSTM) model. We first extract frame-level features to preserve the original temporal relationships of a speech sequence and analyze how they differ between depressed and healthy speech. We then study the performance of various features and use a modified feature set as the input to the LSTM layer. Instead of using the traditional LSTM output directly, multi-head time-dimension attention projects the output into different subspaces to obtain the time information most relevant to depression detection. The experimental results show the proposed model yields improvements of 2.3% and 10.3% over the LSTM model on the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) and the Multi-modal Open Dataset for Mental-disorder Analysis (MODMA) corpus, respectively.
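Multi-head attention over the time dimension of LSTM outputs can be sketched as follows. The random per-head scoring vectors stand in for learned parameters, so this illustrates only the mechanism (per-head softmax weighting over frames, concatenated contexts), not the trained model.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_time_attention(h, n_heads=4, seed=0):
    """Attention over the time dimension of LSTM outputs h with shape
    (T, d): each head scores every frame with its own projection (random
    here, learned in practice) and returns a weighted sum over frames."""
    rng = np.random.default_rng(seed)
    T, d = h.shape
    heads = []
    for _ in range(n_heads):
        w = rng.standard_normal(d)   # per-head frame-scoring vector
        alpha = softmax(h @ w)       # attention weights over the T frames
        heads.append(alpha @ h)      # (d,) context vector for this head
    return np.concatenate(heads)     # (n_heads * d,) utterance representation

context = multi_head_time_attention(
    np.random.default_rng(1).standard_normal((50, 8)))
```

Each head's softmax concentrates weight on different frames, so the concatenated vector can emphasize several emotionally salient regions of the utterance at once.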
Affiliation(s)
- Yan Zhao: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, China
- Zhenlin Liang: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, China
- Jing Du: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, China
- Li Zhang: Computational Intelligence Group, Northumbria University, Newcastle upon Tyne, United Kingdom; National Subsea Centre, Robert Gordon University, Aberdeen, United Kingdom
- Chengyu Liu: School of Instrument Science and Engineering, Southeast University, Nanjing, China
- Li Zhao: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, China

25
Little B, Alshabrawy O, Stow D, Ferrier IN, McNaney R, Jackson DG, Ladha K, Ladha C, Ploetz T, Bacardit J, Olivier P, Gallagher P, O'Brien JT. Deep learning-based automated speech detection as a marker of social functioning in late-life depression. Psychol Med 2021; 51:1441-1450. [PMID: 31944174 PMCID: PMC8311821 DOI: 10.1017/s0033291719003994]
Abstract
BACKGROUND Late-life depression (LLD) is associated with poor social functioning. However, previous research has relied on bias-prone self-report scales to measure social functioning, and a more objective measure is lacking. We tested a novel wearable device that measures the speech participants encounter as an indicator of social interaction. METHODS Twenty-nine participants with LLD and 29 age-matched controls wore a wrist-worn device continuously for seven days, which recorded their acoustic environment. Acoustic data were automatically analysed using deep learning models that had been developed and validated on an independent speech dataset. Total speech activity and the proportion of speech produced by the device wearer were both detected while maintaining participants' privacy. Participants underwent a neuropsychological test battery, and clinical and self-report scales measured depression severity and general and social functioning. RESULTS Compared to controls, participants with LLD showed poorer self-reported social and general functioning. Total speech activity was much lower for participants with LLD than for controls, with no overlap between groups. The proportion of speech produced by the participants was smaller for LLD than for controls. In LLD, both speech measures correlated with attention and psychomotor speed performance but not with depression severity or self-reported social functioning. CONCLUSIONS Using this device, LLD was associated with lower levels of speech than controls, and speech activity was related to psychomotor retardation. We have demonstrated that speech activity measured by wearable technology differentiated LLD from controls with high precision and, in this study, provided an objective measure of an aspect of real-world social functioning in LLD.
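The two outcome measures above — total speech activity and the proportion of speech produced by the wearer — reduce to simple statistics once a model has labelled each audio frame. The sketch below assumes a hypothetical per-frame labelling (0 = no speech, 1 = another speaker, 2 = wearer); the actual study's deep learning front end and frame coding are not specified here.

```python
import numpy as np

def speech_metrics(frame_labels):
    """Given per-frame codes over a recording period
    (0 = no speech, 1 = speech from another person, 2 = wearer's own speech),
    return (total speech activity, wearer's share of detected speech)."""
    frames = np.asarray(frame_labels)
    speech = frames > 0
    total_activity = speech.mean()                       # fraction of time any speech occurs
    wearer_share = (frames == 2).sum() / max(speech.sum(), 1)
    return total_activity, wearer_share

# ten illustrative frames: five contain speech, three of those from the wearer
labels = [0, 0, 1, 2, 2, 1, 0, 2, 0, 0]
activity, share = speech_metrics(labels)
```

Working from frame labels rather than raw audio is also what lets such a pipeline preserve privacy: only aggregate activity statistics need to leave the device.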
Affiliation(s)
- Bethany Little: Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
- Ossama Alshabrawy: Interdisciplinary Computing and Complex BioSystems (ICOS) group, School of Computing, Newcastle University, Newcastle upon Tyne, UK; Faculty of Science, Damietta University, New Damietta, Egypt
- Daniel Stow: Institute of Health and Society, Newcastle University, Newcastle upon Tyne, UK
- I. Nicol Ferrier: Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
- Daniel G. Jackson: Open Lab, School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Karim Ladha: Open Lab, School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Thomas Ploetz: School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA, USA
- Jaume Bacardit: Interdisciplinary Computing and Complex BioSystems (ICOS) group, School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Patrick Olivier: Faculty of Information Technology, Monash University, Melbourne, Australia
- Peter Gallagher: Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
- John T. O'Brien: Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK; Department of Psychiatry, University of Cambridge, Cambridge, UK

26
Muzammel M, Salam H, Hoffmann Y, Chetouani M, Othmani A. AudVowelConsNet: A phoneme-level based deep CNN architecture for clinical depression diagnosis. Machine Learning with Applications 2020. [DOI: 10.1016/j.mlwa.2020.100005]
27
A Game Theory-Based Model for Predicting Depression due to Frustration in Competitive Environments. Computational and Mathematical Methods in Medicine 2020; 2020:3573267. [PMID: 32565879 PMCID: PMC7290902 DOI: 10.1155/2020/3573267]
Abstract
A computational model based on game theory is proposed here to forecast the prevalence of depression caused by frustration in a competitive environment. The model comprises a spatially structured game in which individuals are socially connected. This game, equivalent to the well-known prisoner's dilemma, represents the payoffs individuals can receive in the labor market, whether or not they have invested in a formal academic education. An individual is assumed to become depressed when the difference between the average payoff earned by their neighbors in the game and their own payoff surpasses a critical threshold, which can differ for men and women. The transition to depression thus depends on two thresholds, whose values are tuned so that the model accurately predicts the percentage of individuals who become depressed due to a frustrating payoff. Here, this tuning uses data on young adults living in the United Kingdom in 2014-2016.
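The frustration criterion at the heart of this model — flip to depressed when the neighbors' average payoff exceeds your own by more than a threshold — is straightforward to sketch. The ring network, payoffs, and threshold value below are illustrative toys, not the paper's calibrated parameters, and the sketch omits the prisoner's-dilemma payoff dynamics that would generate the payoffs.

```python
import numpy as np

def depression_fraction(payoffs, adjacency, threshold):
    """Fraction of individuals who become depressed: individual i flips when
    the mean payoff of i's neighbors exceeds i's own payoff by more than
    `threshold` (the frustration criterion)."""
    n = len(payoffs)
    depressed = 0
    for i in range(n):
        neighbors = np.flatnonzero(adjacency[i])
        if neighbors.size == 0:
            continue  # isolated individuals feel no comparative frustration
        if payoffs[neighbors].mean() - payoffs[i] > threshold:
            depressed += 1
    return depressed / n

# toy ring of 6 socially connected individuals (each linked to 2 neighbors)
eye = np.eye(6, dtype=int)
A = np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)
p = np.array([3.0, 1.0, 3.0, 3.0, 1.0, 3.0])  # illustrative labor-market payoffs
frac = depression_fraction(p, A, threshold=1.5)
```

In the paper's setup, two such thresholds (one per sex) are tuned until the predicted fraction matches the observed prevalence in the UK data.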
28
Kaonga NN, Morgan J. Common themes and emerging trends for the use of technology to support mental health and psychosocial well-being in limited resource settings: A review of the literature. Psychiatry Res 2019; 281:112594. [PMID: 31605874 DOI: 10.1016/j.psychres.2019.112594]
Abstract
There are significant disparities in access to mental health care. With the burgeoning of technologies for health, digital tools have been leveraged within mental health and psychosocial support programming (eMental health). A literature review was conducted to understand and identify how eMental health has been used in resource-limited settings. PubMed, Ovid Medline and Web of Science were searched. Six hundred and thirty full-text articles were identified and assessed for eligibility; of those, 67 met the inclusion criteria and were analyzed. The most common mental health use cases were depression (n = 25) and general mental health and well-being (n = 21). Roughly one-third used a website or Internet-enabled intervention (n = 23) and nearly one-third used an SMS intervention (n = 22). Technology was applied to enhance service delivery (n = 32), behavior change communication (n = 26) and data collection (n = 8), and specifically dealt with adherence (n = 7), ecological momentary assessments (n = 7), well-being promotion (n = 5), education (n = 8), telemedicine (n = 28), machine learning (n = 5) and games (n = 2). Wearables, predictive analytics, robots and virtual reality were identified as promising emerging areas. eMental health interventions that leverage low-tech tools can introduce, strengthen and expand mental health and psychosocial support services and can be a starting point for future, advanced tools.
Affiliation(s)
- Nadi Nina Kaonga: HealthEnabled, Cape Town, South Africa; Tufts University School of Medicine, Boston, MA, United States; Maine Medical Center, Portland, ME, United States
- Jonathan Morgan: Regional Psychosocial Support Initiative (REPSSI), Cape Town, South Africa

29
Abstract
Depression is a common and serious medical illness that affects the way we think, feel and act. Although relatively mild in its initial stages, it can cause serious problems if it goes undetected until a later stage. Advances in technology now make it possible to detect signs of depression, and various machine learning algorithms have been applied to identify its contributing factors. A person's speech is markedly affected by depression, and various vocal features can be used to classify it.