1
Xu X, An F, Wu S, Wang H, Kang Q, Wang Y, Zhu T, Zhang B, Huang W, Liu X, Wang X. Affective norms for 501 Chinese words from three emotional dimensions rated by depressive disorder patients. Front Psychiatry 2024; 15:1309501. [PMID: 38469031 PMCID: PMC10925686 DOI: 10.3389/fpsyt.2024.1309501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Accepted: 02/07/2024] [Indexed: 03/13/2024]
Abstract
Introduction: Emotional words are often used as stimulus material to explore the cognitive and emotional characteristics of individuals with depressive disorder, yet the affective ratings of these words mostly come from normal individuals. Given that individuals with depressive disorder exhibit a negative cognitive bias, their depressive state could influence how they rate affective words. To enhance the validity of the stimulus material, we specifically recruited patients with depression to provide these ratings. Methods: This study provides subjective ratings for 501 Chinese affective words: 167 negative words selected from depressive disorder patients' Sina Weibo blogs, and 167 neutral words and 167 positive words selected from the Chinese Affective Word System. The norms are based on assessments made by 91 patients with depressive disorder and 92 normal individuals, collected with a paper-and-pencil questionnaire on a 9-point scale. Results: In both groups, the ratings show high reliability and validity. We identified group differences in three dimensions (valence, arousal, and self-relevance): the depression group rated negative words higher, but positive and neutral words lower, than the normal control group. Conclusion: Emotional state affected individuals' perception of words. To some extent, this database expands the available ratings and provides a reference for exploring norms for individuals with different emotional states.
Affiliation(s)
- Xinyue Xu
  - Department of Military Medical Psychology, Air Force Medical University, Xi'an, China
  - Department of Clinical Psychology, Dongguan Seventh People’s Hospital, Dongguan, China
- Fei An
  - Department of Military Medical Psychology, Air Force Medical University, Xi'an, China
- Shengjun Wu
  - Department of Military Medical Psychology, Air Force Medical University, Xi'an, China
- Hui Wang
  - Department of Military Medical Psychology, Air Force Medical University, Xi'an, China
- Qi Kang
  - Center for Psychological Crisis Intervention, the 904th Hospital of the Joint Logistics Support Unit, Changzhou, China
- Ying Wang
  - Department of Psychosomatic Medicine, Xi’an International Medical Center, Xi'an, China
- Ting Zhu
  - Xinfeng Psychiatric Hospital, Xi’an Ninth Hospital, Xi'an, China
- Bing Zhang
  - Department of Medical Psychology, the 984th Hospital of the Joint Logistics Support Unit, Beijing, China
- Wei Huang
  - Department of Psychiatry, the 923th Hospital of the Joint Logistics Support Unit, Nanning, China
- Xufeng Liu
  - Department of Military Medical Psychology, Air Force Medical University, Xi'an, China
- Xiuchao Wang
  - Department of Military Medical Psychology, Air Force Medical University, Xi'an, China
2
Xu C, Wongpakaran N, Wongpakaran T, Siriwittayakorn T, Wedding D, Varnado P. Syntactic Errors in Older Adults with Depression. Medicina (Kaunas, Lithuania) 2023; 59:2133. [PMID: 38138236 PMCID: PMC10744892 DOI: 10.3390/medicina59122133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/03/2023] [Accepted: 12/05/2023] [Indexed: 12/24/2023]
Abstract
Background and Objectives: This study investigated the differences in syntactic errors in older individuals with and without major depressive disorder and cognitive function disparities between groups. We also explored the correlation between syntax scores and depression severity. Materials and Methods: Forty-four participants, assessed for dementia with the Mini-Cog, completed the 15-item Geriatric Depression Scale (TGDS-15) and specific language tests. Following a single-anonymized procedure, clinical psychologists rated the tests and syntax scores. Results: The results showed that the depressive disorders group had lower syntax scores than the non-depressed group, primarily on specific subtests. Additionally, cognitive test scores were generally lower among the depressed group. A significant relationship between depression severity and syntax scores was observed (r = -0.426, 95% CI = -0.639, -0.143). Conclusions: In conclusion, major depressive disorder is associated with reduced syntactic abilities, particularly in specific tests. However, the relatively modest sample size limited the sensitivity of this association. This study also considered the potential influence of cultural factors. Unique linguistic characteristics in the study's context were also addressed and considered as potential contributors to the observed findings.
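The abstract reports a correlation with a 95% confidence interval but does not say how the interval was computed or how many participants entered the correlation. As a rough illustration only, the sketch below applies the standard Fisher z-transformation and assumes all 44 participants were included; the function name `fisher_ci` is hypothetical. Under those assumptions the interval comes out at roughly -0.64 to -0.15, close to but not identical with the reported one, so the authors may have used a different method or sample.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via Fisher's z-transformation.

    Assumption: n is the number of paired observations (44 is taken from the
    abstract's participant count; the analysed sample may have been smaller).
    """
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    lower = math.tanh(z - z_crit * se)  # back-transform the bounds to the r scale
    upper = math.tanh(z + z_crit * se)
    return lower, upper

lower, upper = fisher_ci(-0.426, 44)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Because the interval excludes zero under these assumptions, the sketch agrees with the abstract's description of the relationship as significant.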
Affiliation(s)
- Chengjie Xu
  - Master of Science Program in Mental Health, Multidisciplinary and Interdisciplinary School, Chiang Mai University, Chiang Mai 50200, Thailand; (C.X.); (T.W.); (T.S.); (D.W.)
- Nahathai Wongpakaran
  - Master of Science Program in Mental Health, Multidisciplinary and Interdisciplinary School, Chiang Mai University, Chiang Mai 50200, Thailand; (C.X.); (T.W.); (T.S.); (D.W.)
  - Department of Psychiatry, Faculty of Medicine, Chiang Mai University, 110 Intawaroros Rd., T. Sriphum, A. Muang, Chiang Mai 50200, Thailand;
- Tinakon Wongpakaran
  - Master of Science Program in Mental Health, Multidisciplinary and Interdisciplinary School, Chiang Mai University, Chiang Mai 50200, Thailand; (C.X.); (T.W.); (T.S.); (D.W.)
  - Department of Psychiatry, Faculty of Medicine, Chiang Mai University, 110 Intawaroros Rd., T. Sriphum, A. Muang, Chiang Mai 50200, Thailand;
- Teeranoot Siriwittayakorn
  - Master of Science Program in Mental Health, Multidisciplinary and Interdisciplinary School, Chiang Mai University, Chiang Mai 50200, Thailand; (C.X.); (T.W.); (T.S.); (D.W.)
  - Department of English, Faculty of Humanities, Chiang Mai University, 239, Huay Kaew Road, Muang District, Chiang Mai 50200, Thailand
- Danny Wedding
  - Master of Science Program in Mental Health, Multidisciplinary and Interdisciplinary School, Chiang Mai University, Chiang Mai 50200, Thailand; (C.X.); (T.W.); (T.S.); (D.W.)
  - School of Humanistics and Clinical Psychology, Saybrook University, Oakland, CA 91103, USA
- Pairada Varnado
  - Department of Psychiatry, Faculty of Medicine, Chiang Mai University, 110 Intawaroros Rd., T. Sriphum, A. Muang, Chiang Mai 50200, Thailand;
3
Shi J, Khoo Z. Words for the hearts: a corpus study of metaphors in online depression communities. Front Psychol 2023; 14:1227123. [PMID: 37829080 PMCID: PMC10566633 DOI: 10.3389/fpsyg.2023.1227123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 08/18/2023] [Indexed: 10/14/2023]
Abstract
Purpose/significance Humans understand, think, and express themselves through metaphors. The current paper emphasizes the importance of identifying the metaphorical language used in online health communities (OHC) to understand how users frame and make sense of their experiences, which can boost the effectiveness of counseling and interventions for this population. Methods/process We used a web crawler to obtain a corpus of an online depression community. We introduced a three-stage procedure for metaphor identification in a Chinese Corpus: (1) combine MIPVU to identify metaphorical expressions (ME) bottom-up and formulate preliminary working hypotheses; (2) collect more ME top-down in the corpus by performing semantic domain analysis on identified ME; and (3) analyze ME and categorize conceptual metaphors using a reference list. In this way, we have gained a greater understanding of how depression sufferers conceptualize their experience metaphorically in an under-represented language in the literature (Chinese) of a new genre (online health community). Results/conclusion Main conceptual metaphors for depression are classified into PERSONAL LIFE, INTERPERSONAL RELATIONSHIP, TIME, and CYBERCULTURE metaphors. Identifying depression metaphors in the Chinese corpus pinpoints the sociocultural environment people with depression are experiencing: lack of offline support, social stigmatization, and substitutability of offline support with online support. We confirm a number of depression metaphors found in other languages, providing a theoretical basis for researching, identifying, and treating depression in multilingual settings. Our study also identifies new metaphors with source-target connections based on embodied, sociocultural, and idiosyncratic levels. From these three levels, we analyze metaphor research's theoretical and practical implications, finding ways to emphasize its inherent cross-disciplinarity meaningfully.
Affiliation(s)
- Jiayi Shi
  - School of Foreign Studies, Xi’an Jiaotong University, Xi’an, China
- Zhaowei Khoo
  - School of Mathematical and Computer Sciences, Heriot-Watt University, Putrajaya, Malaysia
4
Marszałek M, Miązek A, Roczniewska M. Promotion and prevention regulatory focus LIWC dictionary. Polish adaptation and validation. PLoS One 2023; 18:e0288726. [PMID: 37471322 PMCID: PMC10358899 DOI: 10.1371/journal.pone.0288726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Accepted: 07/04/2023] [Indexed: 07/22/2023]
Abstract
This article describes the adaptation and validation of a Polish version of the regulatory focus (RF) Linguistic Inquiry and Word Count (LIWC) dictionary. RF theory proposes that there are two types of self-regulation: promotion (focus on gains, growth, and ideals) and prevention (focus on losses, security, and oughts). Apart from self-report questionnaires, one method to measure RF includes a linguistic analysis. LIWC counts the frequency of words from relevant categories and presents the output as a percentage of all words used in a writing sample. RF LIWC contains two categories: promotion (e.g., achieve, ideal) and prevention (e.g., afraid, fail). To test the psychometric properties of our Polish adaptation of the RF LIWC instrument, we performed three studies. In Study 1 (N = 10), experts in RF theory rated the extent to which each dictionary entry was related to promotion and prevention foci. Results showed that words from the promotion category were rated as more promotion than prevention-related, and the pattern was reversed for words from the prevention category. In Study 2 (N = 130) we examined the divergent validity of the instrument by experimentally manipulating RF and testing the writing patterns. When a promotion focus was activated, individuals wrote more words from the promotion than prevention category, and the pattern was reversed in the prevention group. Study 3 (N = 414) investigated whether the promotion and prevention scores obtained through RF LIWC are linked with results obtained using a self-report questionnaire that measures chronic RF. Promotion scores from RF LIWC correlated positively with chronic promotion RF and prevention scores from RF LIWC correlated positively with chronic prevention RF. These preliminary findings provide initial support for the validity of the Polish adaptation of the RF LIWC.
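The counting step described in the abstract (category word frequency reported as a percentage of all words in a sample) can be sketched as follows. This is a minimal illustration, not the LIWC software or the authors' dictionary: `RF_DICTIONARY` and `category_percentages` are hypothetical names, the four entries are just the English examples quoted in the abstract, and the wildcard stems that real LIWC dictionaries support are omitted.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary for illustration only: the entries are the
# English examples quoted in the abstract, not the Polish adaptation.
RF_DICTIONARY = {
    "promotion": {"achieve", "ideal"},
    "prevention": {"afraid", "fail"},
}

def category_percentages(text, dictionary):
    """Return each category's share of all words in `text`, in percent."""
    # Tokenize into lowercase runs of letters (Unicode-aware, so Polish
    # diacritics would be kept inside words).
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return {category: 0.0 for category in dictionary}
    counts = Counter()
    for word in words:
        for category, entries in dictionary.items():
            if word in entries:
                counts[category] += 1
    return {c: 100.0 * counts[c] / len(words) for c in dictionary}

sample = "I want to achieve my ideal job and I am not afraid."
scores = category_percentages(sample, RF_DICTIONARY)
print(scores)
```

The sample has 12 words, two of them in the promotion set and one in the prevention set, so the promotion score is twice the prevention score.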
Affiliation(s)
- Magdalena Marszałek
  - Institute of Psychology, SWPS University of Social Sciences and Humanities, Warsaw, Poland
- Amadeusz Miązek
  - Department of International Finance, Poznań University of Economics and Business, Poznań, Poland
- Marta Roczniewska
  - Institute of Psychology, SWPS University of Social Sciences and Humanities, Warsaw, Poland
  - Department of Learning, Informatics, Management and Ethics, Karolinska Institutet, Stockholm, Sweden
5
Ryu J, Heisig S, McLaughlin C, Katz M, Mayberg HS, Gu X. A natural language processing approach reveals first-person pronoun usage and non-fluency as markers of therapeutic alliance in psychotherapy. iScience 2023; 26:106860. [PMID: 37255661 PMCID: PMC10225921 DOI: 10.1016/j.isci.2023.106860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 04/18/2023] [Accepted: 05/08/2023] [Indexed: 06/01/2023]
Abstract
It remains elusive what language markers derived from psychotherapy sessions are indicative of therapeutic alliance, limiting our capacity to assess and provide feedback on the trusting quality of the patient-clinician relationship. To address this critical knowledge gap, we leveraged feature extraction methods from natural language processing (NLP), a subfield of artificial intelligence, to quantify pronoun and non-fluency language markers that are relevant for communicative and emotional aspects of therapeutic relationships. From twenty-eight transcripts of non-manualized psychotherapy sessions recorded in outpatient clinics, we identified therapists' first-person pronoun usage frequency and patients' speech transition marking relaxed interaction style as potential metrics of alliance. Behavioral data from patients who played an economic game that measures social exchange (i.e. trust game) suggested that therapists' first-person pronoun usage may influence alliance ratings through their diminished trusting behavior toward therapists. Together, this work supports that communicative language features in patient-therapist dialogues could be markers of alliance.
Affiliation(s)
- Jihan Ryu
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Stephen Heisig
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Caroline McLaughlin
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Katz
  - Clinical Psychology Doctoral Program, School of Health Professions and Nursing, Long Island University - CW Post Campus, Greenvale, NY, USA
- Helen S. Mayberg
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Xiaosi Gu
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
6
Pool-Cen J, Carlos-Martínez H, Hernández-Chan G, Sánchez-Siordia O. Detection of Depression-Related Tweets in Mexico Using Crosslingual Schemes and Knowledge Distillation. Healthcare (Basel) 2023; 11:1057. [PMID: 37046984 PMCID: PMC10094126 DOI: 10.3390/healthcare11071057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 03/18/2023] [Accepted: 03/20/2023] [Indexed: 04/08/2023]
Abstract
Mental health problems are among the many ills that afflict the world’s population. Early diagnosis and medical care are public health problems addressed from various perspectives. Among the mental illnesses that most afflict the population is depression; its early diagnosis is vitally important, as it can trigger more severe problems, such as suicidal ideation. Due to the lack of homogeneity in current diagnostic tools, the community has focused on using AI tools for timely diagnosis. Unfortunately, there is a lack of data that allows the use of AI tools for the Spanish language. Our work uses a cross-lingual scheme to address this issue, allowing us to identify Spanish and English texts. The experiments demonstrated the methodology’s effectiveness with an F1-score of 0.95. With this methodology, we propose a method to solve a classification problem for depression tweets (or short texts) by reusing English language databases to generate a classification model for languages with insufficient data, such as Spanish. We also validated the information obtained with public data to analyze the behavior of depression in Mexico during the COVID-19 pandemic. Our results show that the use of these methodologies can serve as support, not only in the diagnosis of depression, but also in the construction of different language databases that allow the creation of more efficient diagnostic tools.
Affiliation(s)
- Jorge Pool-Cen
  - Geospatial Information Sciences Research Center, Mexico City 14240, Mexico
- Hugo Carlos-Martínez
  - Geospatial Information Sciences Research Center, Mexico City 14240, Mexico
  - IxM CONACyT, Mexico City 14240, Mexico
  - Laboratorio Nacional de Geointeligencia (GeoInt), Mexico City 14240, Mexico
- Gandhi Hernández-Chan
  - Geospatial Information Sciences Research Center, Mexico City 14240, Mexico
  - IxM CONACyT, Mexico City 14240, Mexico
  - Laboratorio Nacional de Geointeligencia (GeoInt), Mexico City 14240, Mexico
- Oscar Sánchez-Siordia
  - Geospatial Information Sciences Research Center, Mexico City 14240, Mexico
  - Laboratorio Nacional de Geointeligencia (GeoInt), Mexico City 14240, Mexico
7
Shi J, Khoo Z. Online health community for change: Analysis of self-disclosure and social networks of users with depression. Front Psychol 2023; 14:1092884. [PMID: 37057164 PMCID: PMC10088863 DOI: 10.3389/fpsyg.2023.1092884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 02/23/2023] [Indexed: 03/30/2023]
Abstract
Background: A key research question with theoretical and practical implications is to investigate the various conditions by which social network sites (SNS) may either enhance or interfere with mental well-being, given the omnipresence of SNS and their dual effects on well-being. Method/process: We study SNS’ effects on well-being by accounting for users’ personal (i.e., self-disclosure) and situational (i.e., social networks) attributes, using a mixed design of content analysis and social network analysis. Result/conclusion: We compare users’ within-person changes in self-disclosure and social networks in two phases (over half a year), drawing on Weibo Depression SuperTalk, an online community for depression, and find: ① Several network attributes strengthen social support, including network connectivity, global efficiency, degree centralization, hubs of communities, and reciprocal interactions. ② Users’ self-disclosure attributes reflect positive changes in mental well-being and increased attachment to the community. ③ Correlations exist between users’ topological and self-disclosure attributes. ④ A Poisson regression model extracts self-disclosure attributes that may affect users’ received social support, including the writing length, number of active days, informal words, adverbs, negative emotion words, biological process words, and first-person singular forms. Innovation: We combine social network analysis with content analysis, highlighting the need to understand SNS’ effects on well-being by accounting for users’ self-disclosure (content) and communication partners (social networks). Implication/contribution: Authentic user data helps to avoid recall bias commonly found in self-reported data. A longitudinal within-person analysis of SNS’ effects on well-being is helpful for policymakers in public health intervention, community managers for group organizations, and users in online community engagement.
Affiliation(s)
- Jiayi Shi
  - School of Foreign Studies, Xi’an Jiaotong University, Xi’an, Shaanxi, China
  - *Correspondence: Jiayi Shi,
- Zhaowei Khoo
  - School of Mathematical and Computer Sciences, Heriot-Watt University, Putrajaya, Malaysia
8
Surveillance of communicable diseases using social media: A systematic review. PLoS One 2023; 18:e0282101. [PMID: 36827297 PMCID: PMC9956027 DOI: 10.1371/journal.pone.0282101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Accepted: 02/07/2023] [Indexed: 02/25/2023]
Abstract
BACKGROUND Communicable diseases pose a severe threat to public health and economic growth. The traditional methods that are used for public health surveillance, however, involve many drawbacks, such as being labor intensive to operate and resulting in a lag between data collection and reporting. To effectively address the limitations of these traditional methods and to mitigate the adverse effects of these diseases, a proactive and real-time public health surveillance system is needed. Previous studies have indicated the usefulness of performing text mining on social media. OBJECTIVE To conduct a systematic review of the literature that used textual content published to social media for the purpose of the surveillance and prediction of communicable diseases. METHODOLOGY Broad search queries were formulated and performed in four databases. Both journal articles and conference materials were included. The quality of the studies, operationalized as reliability and validity, was assessed. This qualitative systematic review was guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. RESULTS Twenty-three publications were included in this systematic review. All studies reported positive results for using textual social media content to surveil communicable diseases. Most studies used Twitter as a source for these data. Influenza was studied most frequently, while other communicable diseases received far less attention. Journal articles had a higher quality (reliability and validity) than conference papers. However, studies often failed to provide important information about procedures and implementation. CONCLUSION Text mining of health-related content published on social media can serve as a novel and powerful tool for the automated, real-time, and remote monitoring of public health and for the surveillance and prediction of communicable diseases in particular. This tool can address limitations related to traditional surveillance methods, and it has the potential to supplement traditional methods for public health surveillance.
9
Santos WRD, de Oliveira RL, Paraboni I. SetembroBR: a social media corpus for depression and anxiety disorder prediction. Lang Resour Eval 2023. [DOI: 10.1007/s10579-022-09633-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
10
Lyu S, Ren X, Du Y, Zhao N. Detecting depression of Chinese microblog users via text analysis: Combining Linguistic Inquiry Word Count (LIWC) with culture and suicide related lexicons. Front Psychiatry 2023; 14:1121583. [PMID: 36846219 PMCID: PMC9947407 DOI: 10.3389/fpsyt.2023.1121583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 01/26/2023] [Indexed: 02/11/2023]
Abstract
INTRODUCTION In recent years, research has used psycholinguistic features in public discourse, networking behaviors on social media and profile information to train models for depression detection. However, the most widely adopted approach for the extraction of psycholinguistic features is to use the Linguistic Inquiry and Word Count (LIWC) dictionary and various affective lexicons. Other features related to cultural factors and suicide risk have not been explored. Moreover, the use of social networking behavioral features and profile features would limit the generalizability of the model. Therefore, our study aimed at building a prediction model of depression for text-only social media data through a wider range of possible linguistic features related to depression, and at illuminating the relationship between linguistic expression and depression. METHODS We collected 789 users' depression scores as well as their past posts on Weibo, and extracted a total of 117 lexical features via Simplified Chinese Linguistic Inquiry Word Count, Chinese Suicide Dictionary, Chinese Version of Moral Foundations Dictionary, Chinese Version of Moral Motivation Dictionary, and Chinese Individualism/Collectivism Dictionary. RESULTS Results showed that all the dictionaries contributed to the prediction. The best performing model was a linear regression, for which the Pearson correlation coefficient between predicted values and self-reported values was 0.33, the R-squared was 0.10, and the split-half reliability was 0.75. DISCUSSION This study not only developed a predictive model applicable to text-only social media data, but also demonstrated the importance of taking cultural psychological factors and suicide-related expressions into consideration in the calculation of word frequency. Our research provided a more comprehensive understanding of how lexicons related to cultural psychology and suicide risk were associated with depression, and could contribute to the recognition of depression.
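The two fit statistics reported in the results, the Pearson correlation between predicted and self-reported scores and R-squared, are related but not interchangeable. The sketch below computes both from scratch on made-up numbers; the data, `pearson_r`, and `r_squared` are illustrative assumptions, and the abstract does not say how the authors computed either value. For any set of predictions, R-squared defined as 1 - SS_res/SS_tot cannot exceed the squared correlation, so a reported R-squared of 0.10 alongside r = 0.33 (0.33² ≈ 0.109) is consistent with that bound.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up depression scores and model predictions, for illustration only.
observed = [10.0, 14.0, 9.0, 20.0, 17.0, 12.0]
predicted = [12.0, 13.0, 12.5, 15.0, 14.0, 13.5]

r = pearson_r(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, R^2 = {r2:.3f}")
```

The gap between r² and R² grows when predictions are biased or badly scaled, which is one reason to report both.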
Affiliation(s)
- Sihua Lyu
  - CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Xiaopeng Ren
  - CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Yihua Du
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
- Nan Zhao
  - CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
11
Koops S, Brederoo SG, de Boer JN, Nadema FG, Voppel AE, Sommer IE. Speech as a Biomarker for Depression. CNS & Neurological Disorders Drug Targets 2023; 22:152-160. [PMID: 34961469 DOI: 10.2174/1871527320666211213125847] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 03/10/2021] [Revised: 10/10/2021] [Accepted: 10/10/2021] [Indexed: 01/01/2023]
Abstract
BACKGROUND Depression is a debilitating disorder that at present lacks a reliable biomarker to aid in diagnosis and early detection. Recent advances in computational analytic approaches have opened up new avenues in developing such a biomarker by taking advantage of the wealth of information that can be extracted from a person's speech. OBJECTIVE The current review provides an overview of the latest findings in the rapidly evolving field of computational language analysis for the detection of depression. We cover a wide range of both acoustic and content-related linguistic features, data types (i.e., spoken and written language), and data sources (i.e., lab settings, social media, and smartphone-based). We put special focus on the current methodological advances with regard to feature extraction and computational modeling techniques. Furthermore, we pay attention to potential hurdles in the implementation of automatic speech analysis. CONCLUSION Depressive speech is characterized by several anomalies, such as lower speech rate, less pitch variability and more self-referential speech. With current computational modeling techniques, such features can be used to detect depression with an accuracy of up to 91%. The performance of the models is optimized when machine learning techniques are implemented that suit the type and amount of data. Recent studies now work towards further optimization and generalizability of the computational language models to detect depression. Finally, privacy and ethical issues are of paramount importance to be addressed when automatic speech analysis techniques are further implemented in, for example, smartphones. Altogether, computational speech analysis is well underway towards becoming an effective diagnostic aid for depression.
Affiliation(s)
- Sanne Koops
  - Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
- Sanne G Brederoo
  - Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
  - University Center for Psychiatry, University Medical Center Groningen, Groningen, The Netherlands
- Janna N de Boer
  - Department of Psychiatry, University Medical Center Utrecht, Utrecht University & Brain Center Rudolf Magnus, Utrecht, The Netherlands
- Femke G Nadema
  - Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
- Alban E Voppel
  - Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
- Iris E Sommer
  - Department of Biomedical Sciences of Cells & Systems, Cognitive Neurosciences, University of Groningen, University Medical Center Groningen (UMCG), Groningen, The Netherlands
12
Tejaswini V, Babu KS, Sahoo B. Depression Detection from Social Media Text Analysis using Natural Language Processing Techniques and Hybrid Deep Learning Model. ACM T ASIAN LOW-RESO 2022. [DOI: 10.1145/3569580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
Depression is a kind of emotion that negatively impacts people's daily lives. The number of people suffering from such long-term feelings is increasing every year across the globe. Depressed patients may engage in self-harm behaviors, which occasionally result in suicide. Many psychiatrists struggle to identify the presence of mental illness or negative emotion early enough to provide a better course of treatment before patients reach a critical stage. One of the most challenging problems is detecting depression in people at the earliest possible stage. Researchers are using Natural Language Processing (NLP) techniques to analyze text content uploaded on social media, which helps to design approaches for detecting depression. This work analyses numerous prior studies that used learning techniques to identify depression. The existing methods lack model representations good enough to detect depression from text with high accuracy. The present work addresses these problems by creating a new hybrid deep learning neural network design with better text representations called "Fasttext Convolution Neural Network with Long Short-Term Memory (FCL)." In addition, this work utilizes the advantage of NLP to simplify the text analysis during the model development. The FCL model comprises fasttext embedding for better text representation considering out-of-vocabulary (OOV) with semantic information, a convolution neural network (CNN) architecture to extract global information, and Long Short-Term Memory (LSTM) architecture to extract local features with dependencies. The present work was implemented on real-world datasets utilized in the literature. The proposed technique provides better results than the state-of-the-art to detect depression with high accuracy.
Affiliation(s)
- Vankayala Tejaswini
- Computer Science and Engineering, National Institute of Technology Rourkela, Odisha, India
- Korra Sathya Babu
- Computer Science and Engineering, Indian Institute of Information Technology Design and Manufacturing, Kurnool, Andhra Pradesh, India
- Bibhudatta Sahoo
- Computer Science and Engineering, National Institute of Technology Rourkela, Odisha, India
|
13
|
Pan W, Han Y, Li J, Zhang E, He B. The positive energy of netizens: development and application of fine-grained sentiment lexicon and emotional intensity model. CURRENT PSYCHOLOGY 2022; 42:1-18. [PMID: 36345548 PMCID: PMC9630060 DOI: 10.1007/s12144-022-03876-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/10/2022] [Indexed: 11/06/2022]
Abstract
The outbreak of COVID-19 has led to a global health crisis and caused huge emotional swings. However, the positive emotional expressions, like self-confidence, optimism, and praise, that appear in Chinese social networks are rarely explored by researchers. This study aims to analyze the characteristics of netizens' positive energy expressions and the impact of node events on public emotional expression during the COVID-19 pandemic. First, a total of 6,525,249 Chinese texts posted by Sina Weibo users were randomly selected through textual data cleaning and word segmentation for corpus construction. A fine-grained sentiment lexicon that contained POSITIVE ENERGY was built using Word2Vec technology; this lexicon was later used to conduct sentiment category analysis on original posts. Next, through manual labeling and multi-classification machine learning model construction, four mainstream machine learning algorithms were selected to train the emotional intensity model. Finally, the lexicon and optimized emotional intensity model were used to analyze the emotional expressions of Chinese netizens. The results show that POSITIVE ENERGY expression accounted for 40.97% during the COVID-19 pandemic. Over the course of time, POSITIVE ENERGY emotions were displayed at the highest levels and SURPRISES the lowest. The analysis results of the node events showed after the outbreak was confirmed officially, the expressions of POSITIVE ENERGY and FEAR increased simultaneously. After the initial victory in pandemic prevention and control, the expression of POSITIVE ENERGY and SAD reached a peak, while the increase of SAD was the most prominent. The fine-grained sentiment lexicon, which includes a POSITIVE ENERGY category, demonstrated reliable algorithm performance and can be used for sentiment classification of Chinese Internet context. 
We also found many POSITIVE ENERGY expressions on Chinese online social platforms, which were shown to be significantly affected by node events of different natures.
Affiliation(s)
- Wenhao Pan
- School of Public Administration, South China University of Technology, Guangzhou, China
- Yingying Han
- School of Public Administration, South China University of Technology, Guangzhou, China
- Jinjin Li
- School of Psychology, Guizhou Normal University, Guiyang, China
- Bikai He
- Department of Intelligent Engineering, Guiyang Institute of Information Science and Technology, Guiyang, China
|
14
|
Abu-Taieh EM, AlHadid I, Masa’deh R, Alkhawaldeh RS, Khwaldeh S, Alrowwad A. Factors Affecting the Use of Social Networks and Its Effect on Anxiety and Depression among Parents and Their Children: Predictors Using ML, SEM and Extended TAM. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph192113764. [PMID: 36360644 PMCID: PMC9656283 DOI: 10.3390/ijerph192113764] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/30/2022] [Revised: 10/14/2022] [Accepted: 10/17/2022] [Indexed: 05/12/2023]
Abstract
Previous research has found support for depression and anxiety associated with social networks. However, little research has explored parents' depression and anxiety constructs as mediators that may account for children's depression and anxiety. The purpose of this paper is to test the influence of different factors on children's depression and anxiety, extending from parents' anxiety and depression in Jordan. The authors recruited 857 parents to complete relevant web survey measures with constructs and items and a model based on different research models TAM and extended with trust, analyzed using SEM, CFA with SPSS and AMOS, and ML methods, using the triangulation method to validate the results and help predict future applications. The authors found support for the structural model whereby behavioral intention to use social media influences the parent's anxiety and depression which correlate to their offspring's anxiety and depression. Behavioral intention to use social media can be enticed by enjoyment, trust, ease of use, usefulness, and social influences. This study is unique in exploring rumination in the context of the relationship between parent-child anxiety and depression due to the use of social networks.
Affiliation(s)
- Evon M. Abu-Taieh
- Department of Computer Information Systems, Faculty of Information Technology and Systems, The University of Jordan, Aqaba 77110, Jordan
- Issam AlHadid
- Department Information Technology, Faculty of Information Technology and Systems, The University of Jordan, Aqaba 77110, Jordan
- Ra’ed Masa’deh
- Department of Management Information Systems, School of Business, The University of Jordan, Amman 77110, Jordan
- Rami S. Alkhawaldeh
- Department of Computer Information Systems, Faculty of Information Technology and Systems, The University of Jordan, Aqaba 77110, Jordan
- Sufian Khwaldeh
- Department Information Technology, Faculty of Information Technology and Systems, The University of Jordan, Aqaba 77110, Jordan
- Department Information Technology, Faculty of Information Technology and Systems, University of Fujairah, Fujairah P.O. Box 2202, United Arab Emirates
- Ala’aldin Alrowwad
- Department of Business Management, School of Business, The University of Jordan, Aqaba 77110, Jordan
|
15
|
Chen L, Jeong J, Simpkins B, Ferrara E. Exploring ADHD Users’ Behavior on Twitter: A Comparative Analysis of Tweet Content and User Interactions (Preprint). J Med Internet Res 2022; 25:e43439. [PMID: 37195757 DOI: 10.2196/43439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 04/04/2023] [Accepted: 04/05/2023] [Indexed: 04/08/2023] Open
Abstract
BACKGROUND With the widespread use of social media, people share their real-time thoughts and feelings via interactions on these platforms, including those revolving around mental health problems. This can provide a new opportunity for researchers to collect health-related data to study and analyze mental disorders. However, as one of the most common mental disorders, there are few studies regarding the manifestations of attention-deficit/hyperactivity disorder (ADHD) on social media. OBJECTIVE This study aims to examine and identify the different behavioral patterns and interactions of users with ADHD on Twitter through the text content and metadata of their posted tweets. METHODS First, we built 2 data sets: an ADHD user data set containing 3135 users who explicitly reported having ADHD on Twitter and a control data set made up of 3223 randomly selected Twitter users without ADHD. All historical tweets of users in both data sets were collected. We applied mixed methods in this study. We performed Top2Vec topic modeling to extract topics frequently mentioned by users with ADHD and those without ADHD and used thematic analysis to further compare the differences in contents that were discussed by the 2 groups under these topics. We used a distillBERT sentiment analysis model to calculate the sentiment scores for the emotion categories and compared the sentiment intensity and frequency. Finally, we extracted users' posting time, tweet categories, and the number of followers and followings from the metadata of tweets and compared the statistical distribution of these features between ADHD and non-ADHD groups. RESULTS In contrast to the control group of the non-ADHD data set, users with ADHD tweeted about the inability to concentrate and manage time, sleep disturbance, and drug abuse. Users with ADHD felt confusion and annoyance more frequently, while they felt less excitement, caring, and curiosity (all P<.001). 
Users with ADHD were more sensitive to emotions and felt more intense feelings of nervousness, sadness, confusion, anger, and amusement (all P<.001). As for the posting characteristics, compared with controls, users with ADHD were more active in posting tweets (P=.04), especially at night between midnight and 6 AM (P<.001); posting more tweets with original content (P<.001); and following fewer people on Twitter (P<.001). CONCLUSIONS This study revealed how users with ADHD behave and interact differently on Twitter compared with those without ADHD. On the basis of these differences, researchers, psychiatrists, and clinicians can use Twitter as a potentially powerful platform to monitor and study people with ADHD, provide additional health care support to them, improve the diagnostic criteria of ADHD, and design complementary tools for automatic ADHD detection.
|
16
|
Sheoran H, Srivastava P. Self-Reported Depression Is Associated With Aberration in Emotional Reactivity and Emotional Concept Coding. Front Psychol 2022; 13:814234. [PMID: 35814123 PMCID: PMC9267768 DOI: 10.3389/fpsyg.2022.814234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Accepted: 04/06/2022] [Indexed: 12/02/2022] Open
Abstract
Cognitive impairment, alterations in mood, emotion dysregulation are just a few of the consequences of depression. Despite depression being reported as the most common mental disorder worldwide, examining depression or risks of depression is still challenging. Emotional reactivity has been observed to predict the risk of depression, but the results have been mixed for negative emotional reactivity (NER). To better understand the emotional response conflict, we asked our participants to describe their feeling in meaningful sentences alongside reporting their reactions to the emotionally evocative words. We presented a word on the screen and asked participants to perform two tasks, rate their feeling after reading the word using the self-assessment manikin (SAM) scale, and describe their feeling using the property generation task. The emotional content was analyzed using a novel machine-learning algorithm approach. We performed these two tasks in blocks and randomized their order across participants. Beck Depression Inventory (BDI) was used to categorize participants into self-reported non-depressed (ND) and depressed (D) groups. Compared to the ND, the D group reported reduced positive emotional reactivity when presented with extremely pleasant words regardless of their arousal levels. However, no significant difference was observed between the D and ND groups for negative emotional reactivity. In contrast, we observed increased sadness and inclination toward low negative context from descriptive content by the D compared to the ND group. The positive content analyses showed mixed results. The contrasting results between the emotional reactivity and emotional content analyses demand further examination between cohorts of self-reported depressive symptoms, no-symptoms, and MDD patients to better examine the risks of depression and help design early interventions.
Affiliation(s)
- Priyanka Srivastava
- Perception and Cognition Research Group, Cognitive Science Lab, Kohli Center on Intelligent Systems, International Institute of Information Technology, Hyderabad, India
|
17
|
Zarate D, Stavropoulos V, Ball M, de Sena Collier G, Jacobson NC. Exploring the digital footprint of depression: a PRISMA systematic literature review of the empirical evidence. BMC Psychiatry 2022; 22:421. [PMID: 35733121 PMCID: PMC9214685 DOI: 10.1186/s12888-022-04013-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/20/2021] [Accepted: 05/17/2022] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND This PRISMA systematic literature review examined the use of digital data collection methods (including ecological momentary assessment [EMA], experience sampling method [ESM], digital biomarkers, passive sensing, mobile sensing, ambulatory assessment, and time-series analysis), emphasizing on digital phenotyping (DP) to study depression. DP is defined as the use of digital data to profile health information objectively. AIMS Four distinct yet interrelated goals underpin this study: (a) to identify empirical research examining the use of DP to study depression; (b) to describe the different methods and technology employed; (c) to integrate the evidence regarding the efficacy of digital data in the examination, diagnosis, and monitoring of depression and (d) to clarify DP definitions and digital mental health records terminology. RESULTS Overall, 118 studies were assessed as eligible. Considering the terms employed, "EMA", "ESM", and "DP" were the most predominant. A variety of DP data sources were reported, including voice, language, keyboard typing kinematics, mobile phone calls and texts, geocoded activity, actigraphy sensor-related recordings (i.e., steps, sleep, circadian rhythm), and self-reported apps' information. Reviewed studies employed subjectively and objectively recorded digital data in combination with interviews and psychometric scales. CONCLUSIONS Findings suggest links between a person's digital records and depression. Future research recommendations include (a) deriving consensus regarding the DP definition and (b) expanding the literature to consider a person's broader contextual and developmental circumstances in relation to their digital data/records.
Affiliation(s)
- Daniel Zarate
- Institute for Health and Sport, Victoria University, Melbourne, Australia
- Vasileios Stavropoulos
- Institute for Health and Sport, Victoria University, Melbourne, Australia; Department of Psychology, University of Athens, Athens, Greece
- Michelle Ball
- Institute for Health and Sport, Victoria University, Melbourne, Australia
- Gabriel de Sena Collier
- Institute for Health and Sport, Victoria University, Melbourne, Australia
- Nicholas C. Jacobson
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Hanover, USA; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, USA; Department of Psychiatry, Geisel School of Medicine, Dartmouth College, Hanover, USA; Quantitative Biomedical Sciences Program, Dartmouth College, Hanover, USA
|
18
|
Kelley SW, Mhaonaigh CN, Burke L, Whelan R, Gillan CM. Machine learning of language use on Twitter reveals weak and non-specific predictions. NPJ Digit Med 2022; 5:35. [PMID: 35338248 PMCID: PMC8956571 DOI: 10.1038/s41746-022-00576-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/23/2021] [Accepted: 02/11/2022] [Indexed: 11/30/2022] Open
Abstract
Depressed individuals use language differently than healthy controls and it has been proposed that social media posts can be used to identify depression. Much of the evidence behind this claim relies on indirect measures of mental health and few studies have tested if these language features are specific to depression versus other aspects of mental health. We analysed the Tweets of 1006 participants who completed questionnaires assessing symptoms of depression and 8 other mental health conditions. Daily Tweets were subjected to textual analysis and the resulting linguistic features were used to train an Elastic Net model on depression severity, using nested cross-validation. We then tested performance in a held-out test set (30%), comparing predictions of depression versus 8 other aspects of mental health. The depression-trained model had modest out-of-sample predictive performance, explaining 2.5% of variance in depression symptoms (R² = 0.025, r = 0.16). The performance of this model was as good or superior when used to identify other aspects of mental health: schizotypy, social anxiety, eating disorders, generalised anxiety, above chance for obsessive-compulsive disorder, apathy, but not significant for alcohol abuse or impulsivity. Machine learning analysis of social media data, when trained on well-validated clinical instruments, could not make meaningful individualised predictions regarding users’ mental health. Furthermore, language use associated with depression was non-specific, having similar performance in predicting other mental health problems.
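As a rough sanity check on the figures in the abstract above, the reported correlation can be squared to see how much variance it implies. This is a minimal sketch, assuming the variance explained is approximated by the squared correlation between predicted and observed scores; that need not hold exactly for an out-of-sample R², and the variable names are illustrative only.

```python
# Reported out-of-sample correlation between predicted and observed
# depression scores (value taken from the abstract above).
r = 0.16

# Assumption: variance explained is approximated by the squared correlation.
r_squared = r ** 2
variance_explained_pct = 100 * r_squared

print(f"r^2 = {r_squared:.4f} (about {variance_explained_pct:.1f}% of variance)")
```

The squared value is close to the reported R² of 0.025; the small gap is within what rounding r to two decimals would allow.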
Affiliation(s)
- Sean W Kelley
- School of Psychology, Trinity College Dublin, Dublin, Ireland; Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
- Louise Burke
- School of Psychology, Trinity College Dublin, Dublin, Ireland
- Robert Whelan
- School of Psychology, Trinity College Dublin, Dublin, Ireland; Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland; Global Brain Health Institute, Trinity College Dublin, Dublin, Ireland
- Claire M Gillan
- School of Psychology, Trinity College Dublin, Dublin, Ireland; Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland; Global Brain Health Institute, Trinity College Dublin, Dublin, Ireland
|
19
|
Using language in social media posts to study the network dynamics of depression longitudinally. Nat Commun 2022; 13:870. [PMID: 35169166 PMCID: PMC8847554 DOI: 10.1038/s41467-022-28513-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/04/2021] [Accepted: 01/21/2022] [Indexed: 12/13/2022] Open
Abstract
Network theory of mental illness posits that causal interactions between symptoms give rise to mental health disorders. Increasing evidence suggests that depression network connectivity may be a risk factor for transitioning and sustaining a depressive state. Here we analysed social media (Twitter) data from 946 participants who retrospectively self-reported the dates of any depressive episodes in the past 12 months and current depressive symptom severity. We construct personalised, within-subject, networks based on depression-related linguistic features. We show an association existed between current depression severity and 8 out of 9 text features examined. Individuals with greater depression severity had higher overall network connectivity between depression-relevant linguistic features than those with lesser severity. We observed within-subject changes in overall network connectivity associated with the dates of a self-reported depressive episode. The connectivity within personalized networks of depression-associated linguistic features may change dynamically with changes in current depression symptoms.
|
20
|
Salas-Zárate R, Alor-Hernández G, Salas-Zárate MDP, Paredes-Valverde MA, Bustos-López M, Sánchez-Cervantes JL. Detecting Depression Signs on Social Media: A Systematic Literature Review. Healthcare (Basel) 2022; 10:healthcare10020291. [PMID: 35206905 PMCID: PMC8871802 DOI: 10.3390/healthcare10020291] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/28/2021] [Revised: 01/21/2022] [Accepted: 01/29/2022] [Indexed: 01/14/2023] Open
Abstract
Among mental health diseases, depression is one of the most severe, as it often leads to suicide; due to this, it is important to identify and summarize existing evidence concerning depression sign detection research on social media using the data provided by users. This review examines aspects of primary studies exploring depression detection from social media submissions (from 2016 to mid-2021). The search for primary studies was conducted in five digital libraries: ACM Digital Library, IEEE Xplore Digital Library, SpringerLink, Science Direct, and PubMed, as well as on the search engine Google Scholar to broaden the results. Extracting and synthesizing the data from each paper was the main activity of this work. Thirty-four primary studies were analyzed and evaluated. Twitter was the most studied social media for depression sign detection. Word embedding was the most prominent linguistic feature extraction method. Support vector machine (SVM) was the most used machine-learning algorithm. Similarly, the most popular computing tool was from Python libraries. Finally, cross-validation (CV) was the most common statistical analysis method used to evaluate the results obtained. Using social media along with computing tools and classification methods contributes to current efforts in public healthcare to detect signs of depression from sources close to patients.
Affiliation(s)
- Rafael Salas-Zárate
- Tecnológico Nacional de México/I. T. Orizaba, Av. Oriente 9 No. 852, Col. Emiliano Zapata, Orizaba 94320, Veracruz, Mexico
- Giner Alor-Hernández
- Tecnológico Nacional de México/I. T. Orizaba, Av. Oriente 9 No. 852, Col. Emiliano Zapata, Orizaba 94320, Veracruz, Mexico
- Correspondence: Tel.: +52-(272)-725-7056
- María del Pilar Salas-Zárate
- Tecnológico Nacional de México/I.T.S. Teziutlán, Fracción I y II S/N, Aire Libre, Teziutlán 73960, Puebla, Mexico
- Mario Andrés Paredes-Valverde
- Tecnológico Nacional de México/I.T.S. Teziutlán, Fracción I y II S/N, Aire Libre, Teziutlán 73960, Puebla, Mexico
- Maritza Bustos-López
- Centro de Investigación en Inteligencia Artificial/Universidad Veracruzana, Sebastián Camacho 5, Zona Centro, Centro, Xalapa-Enríquez 91000, Veracruz, Mexico
- José Luis Sánchez-Cervantes
- CONACYT-Tecnológico Nacional de México/I. T. Orizaba, Av. Oriente 9 No. 852, Col. Emiliano Zapata, Orizaba 94320, Veracruz, Mexico
|
21
|
Using Machine Learning for Pharmacovigilance: A Systematic Review. Pharmaceutics 2022; 14:pharmaceutics14020266. [PMID: 35213998 PMCID: PMC8924891 DOI: 10.3390/pharmaceutics14020266] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/17/2021] [Revised: 01/13/2022] [Accepted: 01/21/2022] [Indexed: 02/04/2023] Open
Abstract
Pharmacovigilance is a science that involves the ongoing monitoring of adverse drug reactions to existing medicines. Traditional approaches in this field can be expensive and time-consuming. The application of natural language processing (NLP) to analyze user-generated content is hypothesized as an effective supplemental source of evidence. In this systematic review, a broad and multi-disciplinary literature search was conducted involving four databases. A total of 5318 publications were initially found. Studies were considered relevant if they reported on the application of NLP to understand user-generated text for pharmacovigilance. A total of 16 relevant publications were included in this systematic review. All studies were evaluated to have medium reliability and validity. For all types of drugs, 14 publications reported positive findings with respect to the identification of adverse drug reactions, providing consistent evidence that natural language processing can be used effectively and accurately on user-generated textual content that was published to the Internet to identify adverse drug reactions for the purpose of pharmacovigilance. The evidence presented in this review suggests that the analysis of textual data has the potential to complement the traditional system of pharmacovigilance.
|
22
|
Liu J, Shi M. A Hybrid Feature Selection and Ensemble Approach to Identify Depressed Users in Online Social Media. Front Psychol 2022; 12:802821. [PMID: 35115990 PMCID: PMC8803736 DOI: 10.3389/fpsyg.2021.802821] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/27/2021] [Accepted: 12/23/2021] [Indexed: 11/13/2022] Open
Abstract
Depression has become one of the most common mental illnesses, and the widespread use of social media provides new ideas for detecting various mental illnesses. The purpose of this study is to use machine learning technology to detect users of depressive patients based on user-shared content and posting behaviors in social media. At present, the existing research mostly uses a single detection method, and the unbalanced class distribution often leads to a low recognition rate. In addition, a large number of irrelevant or redundant features in high-dimensional data sets interfere with the accuracy of recognition. To solve this problem, this paper proposes a hybrid feature selection and stacking ensemble strategy for depression user detection. First, recursive elimination method and extremely randomized trees method are used to calculate feature importance and mutual information value, calculate feature weight vector, and select the optimal feature subset according to the feature weight. Second, naive bayes, k-nearest neighbor, regularized logistic regression and support vector machine are used as base learners, and a simple logistic regression algorithm is used as a combination strategy to build a stacking model. Experimental results show that compared with other machine learning algorithms, the proposed hybrid method, which integrates feature selection and ensemble, has a higher accuracy of 90.27% in identifying online patients. We believe this study will help develop new methods to identify depressed people in social networks, providing guidance for future research.
|
23
|
Razia Sulthana A., Jaithunbi A. K., Harikrishnan H, Varadarajan V. Sentiment Analysis on Movie Reviews Dataset Using Support Vector Machines and Ensemble Learning. INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND WEB ENGINEERING 2022. [DOI: 10.4018/ijitwe.311428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
The internet makes it easier for people to connect to each other and has become a platform to express ideas and share information with the world. The growth of the internet has indirectly led to the development of social networking sites. The reviews posted by people on these sites implies their opinion, and analysis over reviews is required to understand their intent. In this paper, natural language processing technique and machine learning algorithms are applied to classify the text data. The contributions of the proposed approach are three-fold: 1) chi square selector is applied to select the k-best features, 2) support vector machines is executed to classify the reviews (hyperparameters of the SVM classifier are tuned using GridSearch approach), and 3) bagging algorithm is applied with the base classifier over the newly built SVM classifier. The number of base classifiers of the bagging algorithm is varied accordingly. The results of the proposed approach are compared to the similar existing work, and hence, it is found to achieve better results as compared to the existing systems.
|
24
|
Han J, Feng Y, Li N, Feng L, Xiao L, Zhu X, Wang G. Correlation Between Word Frequency and 17 Items of Hamilton Scale in Major Depressive Disorder. Front Psychiatry 2022; 13:902873. [PMID: 35592381 PMCID: PMC9110653 DOI: 10.3389/fpsyt.2022.902873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 04/14/2022] [Indexed: 12/03/2022] Open
Abstract
OBJECTIVE To explore the correlation between word frequency and 17 items of the Hamilton Depression Scale (HAMD-17) in assessing the severity of depression in clinical interviews. METHODS This study included 70 patients with major depressive disorder (MDD) who were hospitalized in the Beijing Anding Hospital. Clinicians interviewed eligible patients, collected general information, disease symptoms, duration, and scored patients with HAMD-17. The words used by the patients during the interview were classified and extracted according to the HowNet sentiment dictionary, including positive evaluation words, positive emotional words, negative evaluation words, negative emotional words. Symptom severity was grouped according to the HAMD-17 score: mild depressive symptoms is 8-17 points, moderate depressive symptoms is 18-24 points and severe depressive symptoms is >24 points. Analysis of Variance (ANOVA) was used to analyze the four categories of words among the groups, and partial correlation analysis was used to analyze the correlation between the four categories of word frequencies based on HowNet sentiment dictionary and the HAMD-17 scale to evaluate the total score. Receiver operating characteristic (ROC) curves were used to determine meaningful cut-off values. RESULTS There was a significant difference in negative evaluation words between groups (p = 0.032). After controlling for gender, age and years of education, the HAMD-17 total score was correlated with negative evaluation words (p = 0.009, r = 0.319) and negative emotional words (p = 0.027, r = 0.272), as the severity of depressive symptoms increased, the number of negative evaluation and negative emotional words in clinical interviews increased. Negative evaluation words distinguished patients with mild and moderate-severe depression. The area under the curve is 0.693 (p = 0.006) when the cut-off value is 8.48, the Youden index was 0.41, the sensitivity was 55.2%, and the specificity was 85.4%. 
CONCLUSION In the clinical interview with MDD patients, the number of word frequencies based on HowNet sentiment dictionary may be beneficial in evaluating the severity of depressive symptoms.
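The Youden index quoted in the abstract above can be checked against the reported sensitivity and specificity. This is a minimal sketch using the standard definition J = sensitivity + specificity - 1; the function name `youden_index` is a hypothetical helper introduced here for illustration and is not taken from the cited study.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Values reported in the abstract at the 8.48 cut-off.
j = youden_index(sensitivity=0.552, specificity=0.854)

print(f"Youden index: {j:.2f}")
```

Under this definition the reported sensitivity (55.2%) and specificity (85.4%) give a value that rounds to the stated 0.41.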
Collapse
Affiliation(s)
- Jiali Han
- The National Clinical Research Center for Mental Disorders and Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital and the Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China
| | - Yuan Feng
- The National Clinical Research Center for Mental Disorders and Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital and the Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China
- Nanxi Li
- The National Clinical Research Center for Mental Disorders and Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital and the Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China
- Lei Feng
- The National Clinical Research Center for Mental Disorders and Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital and the Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China
- Le Xiao
- The National Clinical Research Center for Mental Disorders and Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital and the Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China
- Xuequan Zhu
- The National Clinical Research Center for Mental Disorders and Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital and the Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China
- Gang Wang
- The National Clinical Research Center for Mental Disorders and Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital and the Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China
25
Cuerda C, Zornoza A, Gallud JA, Tesoriero R, Ayuso DR. Deep learning assisted cognitive diagnosis for the D-Riska application. Soft comput 2021. [DOI: 10.1007/s00500-021-06510-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
In this article, we present a system that extends the Acquired Brain Injury (ABI) diagnostic application known as D-Riska with an artificial intelligence module that supports the diagnosis of ABI, enabling therapists to evaluate patients in an assisted way. The application collects the data from patients' diagnostic tests and, thanks to a multi-class convolutional neural network (CNN) classifier, makes predictions that facilitate the diagnosis and the final score obtained by the patient in the test. To find the best solution to this problem, different classifiers are used to compare the performance of the proposed model based on various classification metrics. The proposed CNN classifier makes predictions with 93% accuracy, 94% precision, 91% recall, and a 92% F1-score.
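As a rough consistency check on these metrics, F1 is the harmonic mean of precision and recall. The sketch below assumes the reported figures can be treated as a single precision/recall pair; for a multi-class model the published values may be averaged across classes, in which case this identity holds only approximately. The function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.94, 0.91)
print(round(f1, 2))  # prints 0.92
```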
26
Wongkoblap A, Vadillo MA, Curcin V. Deep Learning With Anaphora Resolution for the Detection of Tweeters With Depression: Algorithm Development and Validation Study. JMIR Ment Health 2021; 8:e19824. [PMID: 34383688 PMCID: PMC8380581 DOI: 10.2196/19824] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/03/2020] [Revised: 09/02/2020] [Accepted: 03/31/2021] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND Mental health problems are widely recognized as a major public health challenge worldwide. This concern highlights the need to develop effective tools for detecting mental health disorders in the population. Social networks are a promising source of data wherein patients publish rich personal information that can be mined to extract valuable psychological cues; however, these data come with their own set of challenges, such as the need to disambiguate between statements about oneself and third parties. Traditionally, natural language processing techniques for social media have looked at text classifiers and user classification models separately, hence presenting a challenge for researchers who want to combine text sentiment and user sentiment analysis. OBJECTIVE The objective of this study is to develop a predictive model that can detect users with depression from Twitter posts and instantly identify textual content associated with mental health topics. The model can also address the problem of anaphoric resolution and highlight anaphoric interpretations. METHODS We retrieved the data set from Twitter by using a regular expression or stream of real-time tweets comprising 3682 users, of which 1983 self-declared their depression and 1699 declared no depression. Two multiple instance learning models were developed (one with and one without an anaphoric resolution encoder) to identify users with depression and highlight posts related to the mental health of the author. Several previously published models were applied to our data set, and their performance was compared with that of our models. RESULTS The maximum accuracy, F1 score, and area under the curve of our anaphoric resolution model were 92%, 92%, and 90%, respectively. The model outperformed alternative predictive models, which ranged from classical machine learning models to deep learning models.
CONCLUSIONS Our model with anaphoric resolution shows promising results when compared with other predictive models and provides valuable insights into textual content that is relevant to the mental health of the tweeter.
Affiliation(s)
- Akkapon Wongkoblap
- Department of Informatics, King's College London, London, United Kingdom.,DIGITECH, Suranaree University of Technology, Nakhon Ratchasima, Thailand.,School of Information Technology, Suranaree University of Technology, Nakhon Ratchasima, Thailand
- Miguel A Vadillo
- School of Population Health and Environmental Sciences, King's College London, London, United Kingdom.,Departamento de Psicología Básica, Universidad Autónoma de Madrid, Madrid, Spain
- Vasa Curcin
- Department of Informatics, King's College London, London, United Kingdom.,School of Population Health and Environmental Sciences, King's College London, London, United Kingdom
27
Zhou J, Zogan H, Yang S, Jameel S, Xu G, Chen F. Detecting Community Depression Dynamics Due to COVID-19 Pandemic in Australia. IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS 2021; 8:982-991. [PMID: 37982038 PMCID: PMC8545002 DOI: 10.1109/tcss.2020.3047604] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/03/2020] [Revised: 12/07/2020] [Accepted: 12/20/2020] [Indexed: 11/21/2023]
Abstract
The recent Coronavirus Infectious Disease 2019 (COVID-19) pandemic has caused an unprecedented impact across the globe. We have also witnessed millions of people with increased mental health issues, such as depression, stress, worry, fear, disgust, sadness, and anxiety, which have become one of the major public health concerns during this severe health crisis. Depression can cause serious emotional, behavioral, and physical health problems with significant consequences, both personal and social costs included. This article studies community depression dynamics due to the COVID-19 pandemic through user-generated content on Twitter. A new approach based on multimodal features from tweets and term frequency-inverse document frequency (TF-IDF) is proposed to build depression classification models. Multimodal features capture depression cues from emotion, topic, and domain-specific perspectives. We study the problem using recently scraped tweets from Twitter users emanating from the state of New South Wales in Australia. Our novel classification model is capable of extracting depression polarities that may be affected by COVID-19 and related events during the COVID-19 period. The results found that people became more depressed after the outbreak of COVID-19. The measures implemented by the government, such as the state lockdown, also increased depression levels.
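To make the TF-IDF weighting mentioned in this abstract concrete, here is a minimal Python sketch of the plain textbook variant (term frequency times the log of inverse document frequency). It is not the authors' pipeline: the abstract does not specify their exact weighting or smoothing, and the toy posts and the function name `tf_idf` are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy posts, invented for illustration only.
posts = [
    "feeling tired and alone today".split(),
    "lockdown again feeling alone".split(),
    "sunny walk today".split(),
]
scores = tf_idf(posts)
```

In this toy corpus, "tired" occurs in one post and "feeling" in two, so "tired" receives the higher weight in the first post. A term that occurs in every document would receive a weight of zero under this unsmoothed variant.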
Affiliation(s)
- Jianlong Zhou
- Data Science Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Hamad Zogan
- Advanced Analytics Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Shuiqiao Yang
- Data Science Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Shoaib Jameel
- School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, U.K.
- Guandong Xu
- Advanced Analytics Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Fang Chen
- Data Science Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia
28
Ramírez-Cifuentes D, Freire A, Baeza-Yates R, Sanz Lamora N, Álvarez A, González-Rodríguez A, Lozano Rochel M, Llobet Vives R, Velazquez DA, Gonfaus JM, Gonzàlez J. Characterization of Anorexia Nervosa on Social Media: Textual, Visual, Relational, Behavioral, and Demographical Analysis. J Med Internet Res 2021; 23:e25925. [PMID: 34283033 PMCID: PMC8335610 DOI: 10.2196/25925] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2020] [Revised: 03/10/2021] [Accepted: 05/04/2021] [Indexed: 01/29/2023] Open
Abstract
BACKGROUND Eating disorders are psychological conditions characterized by unhealthy eating habits. Anorexia nervosa (AN) is defined as the belief of being overweight despite being dangerously underweight. The psychological signs involve emotional and behavioral issues. There is evidence that signs and symptoms can manifest on social media, wherein both harmful and beneficial content is shared daily. OBJECTIVE This study aims to characterize Spanish-speaking users showing anorexia signs on Twitter through the extraction and inference of behavioral, demographical, relational, and multimodal data. By using the transtheoretical model of health behavior change, we focus on characterizing and comparing users at the different stages of the model for overcoming AN, including treatment and full recovery periods. METHODS We analyzed the writings, posting patterns, social relationships, and images shared by Twitter users who underwent different stages of anorexia nervosa and compared the differences among users going through each stage of the illness and users in the control group (ie, users without AN). We also analyzed the topics of interest of their followees (ie, users followed by study participants). We used a clustering approach to distinguish users at an early phase of the illness (precontemplation) from those that recognize that their behavior is problematic (contemplation) and generated models for the detection of tweets and images related to AN. We considered two types of control users: focused control users, which are those that use terms related to anorexia, and random control users. RESULTS We found significant differences between users at each stage of the recovery process (P<.001) and control groups. Users with AN tweeted more frequently at night, with a median sleep time tweets ratio (STTR) of 0.05, than random control users (STTR=0.04) and focused control users (STTR=0.03). Pictures were relevant for the characterization of users.
Focused and random control users were characterized by the use of text in their profile pictures. We also found a strong polarization between focused control users and users in the first stages of the disorder. There was a strong correlation among the shared interests between users with AN and their followees (ρ=0.96). In addition, the interests of recovered users and users in treatment were more highly correlated to those corresponding to the focused control group (ρ=0.87 for both) than those of AN users (ρ=0.67), suggesting a shift in users' interest during the recovery process. CONCLUSIONS We mapped the signs of AN to social media context. These results support the findings of previous studies that focused on other languages and involved a deep analysis of the topics of interest of users at each phase of the disorder. The features and patterns identified provide a basis for the development of detection tools and recommender systems.
Affiliation(s)
- Diana Ramírez-Cifuentes
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Ana Freire
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- UPF Barcelona School of Management, Barcelona, Spain
- Ricardo Baeza-Yates
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Institute for Experiential AI, Northeastern University, Boston, MA, United States
- Nadia Sanz Lamora
- Department of Mental Health, Centro de Investigación Biomédica en Red de Salud Mental, Parc Tauli University Hospital, Sabadell, Spain
- Aida Álvarez
- Department of Mental Health, Centro de Investigación Biomédica en Red de Salud Mental, Parc Tauli University Hospital, Sabadell, Spain
- Parc Tauli Research and Innovation Institute, Sabadell, Spain
- Autonomous University of Barcelona, Bellaterra, Spain
- Alexandre González-Rodríguez
- Department of Mental Health, Centro de Investigación Biomédica en Red de Salud Mental, Parc Tauli University Hospital, Sabadell, Spain
- Parc Tauli Research and Innovation Institute, Sabadell, Spain
- Autonomous University of Barcelona, Bellaterra, Spain
- Jordi Gonzàlez
- Computer Vision Center, Universitat Autonoma de Barcelona, Bellaterra, Spain
29
Cohrdes C, Yenikent S, Wu J, Ghanem B, Franco-Salvador M, Vogelgesang F. Indications of Depressive Symptoms During the COVID-19 Pandemic in Germany: Comparison of National Survey and Twitter Data. JMIR Ment Health 2021; 8:e27140. [PMID: 34142973 PMCID: PMC8216331 DOI: 10.2196/27140] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/12/2021] [Revised: 04/25/2021] [Accepted: 04/29/2021] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND The current COVID-19 pandemic is associated with extensive individual and societal challenges, including challenges to both physical and mental health. To date, the development of mental health problems such as depressive symptoms accompanying population-based federal distancing measures is largely unknown, and opportunities for rapid, effective, and valid monitoring are currently a relevant matter of investigation. OBJECTIVE In this study, we aim to investigate, first, the temporal progression of depressive symptoms during the COVID-19 pandemic and, second, the consistency of the results from tweets and survey-based self-reports of depressive symptoms within the same time period. METHODS Based on a cross-sectional population survey of 9011 German adolescents and adults (n=4659, 51.7% female; age groups from 15 to 50 years and older) and a sample of 88,900 tweets (n=74,587, 83.9% female; age groups from 10 to 50 years and older), we investigated five depressive symptoms (eg, depressed mood and energy loss) using items from the Patient Health Questionnaire (PHQ-8) before, during, and after relaxation of the first German social contact ban from January to July 2020. RESULTS On average, feelings of worthlessness were the least frequently reported symptom (survey: n=1011, 13.9%; Twitter: n=5103, 5.7%) and fatigue or loss of energy was the most frequently reported depressive symptom (survey: n=4472, 51.6%; Twitter: n=31,005, 34.9%) among both the survey and Twitter respondents. Young adult women and people living in federal districts with high COVID-19 infection rates were at an increased risk for depressive symptoms. The comparison of the survey and Twitter data before and after the first contact ban showed that German adolescents and adults had a significant decrease in feelings of fatigue and energy loss over time. 
The temporal progression of depressive symptoms showed high correspondence between both data sources (ρ=0.76-0.93; P<.001), except for diminished interest and depressed mood, which showed a steady increase even after the relaxation of the contact ban among the Twitter respondents but not among the survey respondents. CONCLUSIONS Overall, the results indicate relatively small differences in depressive symptoms associated with social distancing measures during the COVID-19 pandemic and highlight the need to differentiate between positive (eg, energy level) and negative (eg, depressed mood) associations and variations over time. The results also underscore previous suggestions of Twitter data's potential to help identify hot spots of declining and improving public mental health and thereby help provide early intervention measures, especially for young and middle-aged adults. Further efforts are needed to investigate the long-term consequences of recurring lockdown phases and to address the limitations of social media data such as Twitter data to establish real-time public mental surveillance approaches.
Affiliation(s)
- Caroline Cohrdes
- Mental Health Research Unit, Department of Epidemiology and Health Monitoring, Robert Koch Institute, Berlin, Germany
- Jiawen Wu
- Symanto Research GmbH & Co KG, Nuernberg, Germany
- Bilal Ghanem
- Symanto Research GmbH & Co KG, Nuernberg, Germany
- Felicitas Vogelgesang
- Mental Health Research Unit, Department of Epidemiology and Health Monitoring, Robert Koch Institute, Berlin, Germany
30
Population attitudes toward contraceptive methods over time on a social media platform. Am J Obstet Gynecol 2021; 224:597.e1-597.e14. [PMID: 33309562 DOI: 10.1016/j.ajog.2020.11.042] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2020] [Revised: 11/03/2020] [Accepted: 11/26/2020] [Indexed: 02/03/2023]
Abstract
BACKGROUND Contraceptive method choice is often strongly influenced by the experiences and opinions of one's social network. Although social media, including Twitter, increasingly influences reproductive-age individuals, discussion of contraception in this setting has yet to be characterized. Natural language processing, a type of machine learning in which computers analyze natural language data, enables this analysis. OBJECTIVE This study aimed to illuminate temporal trends in attitudes toward long- and short-acting reversible contraceptive methods in tweets between 2006 and 2019 and establish social media platforms as alternate data sources for large-scale sentiment analysis on contraception. STUDY DESIGN We studied English-language tweets mentioning reversible prescription contraceptive methods between March 2006 (founding of Twitter) and December 2019. Tweets mentioning contraception were extracted using search terms, including generic or brand names, colloquial names, and abbreviations. We characterized and performed sentiment analysis on tweets. We used Mann-Kendall nonparametric tests to assess temporal trends in the overall number and the number of positive, negative, and neutral tweets referring to each method. The code to reproduce this analysis is available at https://github.com/hms-dbmi/contraceptionOnTwitter. RESULTS We extracted 838,739 tweets mentioning at least 1 contraceptive method. The annual number of contraception-related tweets increased considerably over the study period. The intrauterine device was the most commonly referenced method (45.9%). Long-acting methods were mentioned more often than short-acting ones (58% vs 42%), and the annual proportion of long-acting reversible contraception-related tweets increased over time. In sentiment analysis of tweets mentioning a single contraceptive method (n=665,064), the greatest proportion of all tweets was negative (65,339 of 160,713 tweets with at least 95% confident sentiment, or 40.66%). 
Tweets mentioning long-acting methods were nearly twice as likely to be positive compared with tweets mentioning short-acting methods (19.65% vs 10.21%; P<.002). CONCLUSION Recognizing the influence of social networks on contraceptive decision making, social media platforms may be useful in the collection and dissemination of information about contraception.
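The Mann-Kendall trend test named in this abstract can be sketched in a few lines of Python. This is a simplified version that assumes no tied values (so no tie correction of the variance) and uses the normal approximation with continuity correction; it is not the code from the authors' repository, and the yearly counts below are invented for illustration.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p) for the Mann-Kendall test, without tie correction."""
    n = len(series)
    # S sums the signs of all pairwise differences x[j] - x[i] for i < j.
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


# Hypothetical yearly tweet counts, invented for illustration only.
yearly_counts = [3, 5, 8, 13, 21, 34, 55, 89]
s, z, p = mann_kendall(yearly_counts)
```

For a strictly increasing series of eight values, every pairwise difference is positive, so S equals 28 and the p-value is small, indicating an upward trend.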
31
Kelly DL, Spaderna M, Hodzic V, Coppersmith G, Chen S, Resnik P. Can language use in social media help in the treatment of severe mental illness? CURRENT RESEARCH IN PSYCHIATRY 2021; 1:1-4. [PMID: 34532718 PMCID: PMC8442995 DOI: 10.46439/psychiatry.1.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]
Affiliation(s)
- Deanna L. Kelly
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
- Max Spaderna
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
- Vedrana Hodzic
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
- Shuo Chen
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
32
Ávila-Tomás JF, Mayer-Pujadas MA, Quesada-Varela VJ. [Artificial intelligence and its applications in medicine II: Current importance and practical applications]. Aten Primaria 2021; 53:81-88. [PMID: 32571595 PMCID: PMC7752970 DOI: 10.1016/j.aprim.2020.04.014] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2020] [Accepted: 04/22/2020] [Indexed: 12/16/2022] Open
Abstract
Technology and medicine have followed parallel paths over recent decades. Technological advances are changing the concept of health, and health needs are influencing the development of technology. Artificial intelligence (AI) is made up of a series of sufficiently trained logical algorithms from which machines are capable of making decisions for specific cases based on general rules. This technology has applications in the diagnosis and follow-up of patients, with an individualized prognostic evaluation of each. Furthermore, if we combine this technology with robotics, we can create intelligent machines that make more efficient diagnostic proposals in their work. AI will therefore be present in our daily work through machines or computer programs that, in a way more or less transparent to the user, will become a daily reality in health processes. Health professionals have to know this technology, with its advantages and disadvantages, because it will be an integral part of our work. In these two articles we intend to give a basic overview of this technology adapted to doctors, with a review of its history and evolution, its real applications at present, and a vision of a future in which AI and Big Data will shape the personalized medicine that will characterize the 21st century.
Affiliation(s)
- Jose Francisco Ávila-Tomás
- Medicina de Familia y Comunitaria, Centro de Salud Santa Isabel, Madrid, España; Medicina Preventiva y Salud Pública, Universidad Rey Juan Carlos, Móstoles, Madrid, España; Estrutura Organizativa de Xestión Integrada (EOXI), Vigo, Pontevedra, España.
- Miguel Angel Mayer-Pujadas
- Medicina de Familia y Comunitaria, Research Programme on Biomedical Informatics (GRIB), Instituto Hospital del Mar de Investigaciones Médicas, Barcelona, España; Universitat Pompeu Fabra, Barcelona, España; Miembro del Grupo de Trabajo de Innovación Tecnológica y Sistemas de Información de la semFYC
- Victor Julio Quesada-Varela
- Medicina de Familia y Comunitaria, Centro de Salud de A Guarda, A Guarda, Pontevedra, España; Estrutura Organizativa de Xestión Integrada (EOXI), Vigo, Pontevedra, España; Miembro del Grupo de Trabajo de Innovación Tecnológica y Sistemas de Información de la semFYC
33
Leis A, Ronzano F, Mayer MA, Furlong LI, Sanz F. Evaluating Behavioral and Linguistic Changes During Drug Treatment for Depression Using Tweets in Spanish: Pairwise Comparison Study. J Med Internet Res 2020; 22:e20920. [PMID: 33337338 PMCID: PMC7775819 DOI: 10.2196/20920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2020] [Revised: 09/01/2020] [Accepted: 11/12/2020] [Indexed: 11/13/2022] Open
Abstract
Background Depressive disorders are the most common mental illnesses, and they constitute the leading cause of disability worldwide. Selective serotonin reuptake inhibitors (SSRIs) are the most commonly prescribed drugs for the treatment of depressive disorders. Some people share information about their experiences with antidepressants on social media platforms such as Twitter. Analysis of the messages posted by Twitter users under SSRI treatment can yield useful information on how these antidepressants affect users’ behavior. Objective This study aims to compare the behavioral and linguistic characteristics of the tweets posted while users were likely to be under SSRI treatment, in comparison to the tweets posted by the same users when they were less likely to be taking this medication. Methods In the first step, the timelines of Twitter users mentioning SSRI antidepressants in their tweets were selected using a list of 128 generic and brand names of SSRIs. In the second step, two datasets of tweets were created, the in-treatment dataset (made up of the tweets posted throughout the 30 days after mentioning an SSRI) and the unknown-treatment dataset (made up of tweets posted more than 90 days before or more than 90 days after any tweet mentioning an SSRI). For each user, the changes in behavioral and linguistic features between the tweets classified in these two datasets were analyzed. 186 users and their timelines with 668,842 tweets were finally included in the study. Results The number of tweets generated per day by the users when they were in treatment was higher than it was when they were in the unknown-treatment period (P=.001). When the users were in treatment, the mean percentage of tweets posted during the daytime (from 8 AM to midnight) increased in comparison to the unknown-treatment period (P=.002). The number of characters and words per tweet was higher when the users were in treatment (P=.03 and P=.02, respectively). 
Regarding linguistic features, the percentage of pronouns that were first-person singular was higher when users were in treatment (P=.008). Conclusions Behavioral and linguistic changes have been detected when users with depression are taking antidepressant medication. These features can provide interesting insights for monitoring the evolution of this disease, as well as offering additional information related to treatment adherence. This information may be especially useful in patients who are receiving long-term treatments such as people suffering from depression.
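The two datasets described in the Methods can be expressed as a simple labeling rule over tweet timestamps. The Python sketch below is one possible reading of the abstract, not the authors' code: the function name `label_tweet`, the label strings, and the handling of boundary cases (exactly 30 or 90 days) are assumptions.

```python
from datetime import datetime, timedelta

IN_TREATMENT_WINDOW = timedelta(days=30)
UNKNOWN_TREATMENT_GAP = timedelta(days=90)


def label_tweet(tweet_time, mention_times):
    """Label a tweet 'in_treatment', 'unknown_treatment', or 'excluded'."""
    # In treatment: posted within the 30 days after some SSRI mention.
    if any(timedelta(0) <= tweet_time - m <= IN_TREATMENT_WINDOW for m in mention_times):
        return "in_treatment"
    # Unknown treatment: more than 90 days before or after every SSRI mention.
    if all(abs(tweet_time - m) > UNKNOWN_TREATMENT_GAP for m in mention_times):
        return "unknown_treatment"
    # Everything else falls between the two windows and is left out.
    return "excluded"


# Hypothetical timeline with a single SSRI mention, invented for illustration.
mentions = [datetime(2019, 6, 1)]
labels = [
    label_tweet(datetime(2019, 6, 10), mentions),
    label_tweet(datetime(2019, 1, 1), mentions),
    label_tweet(datetime(2019, 8, 1), mentions),
]
```

Here the tweet posted 9 days after the mention is labeled as in treatment, the one posted about five months earlier as unknown treatment, and the one posted 61 days after the mention is excluded from both datasets.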
Affiliation(s)
- Angela Leis
- Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute, Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona, Spain
- Francesco Ronzano
- Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute, Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona, Spain
- Miguel Angel Mayer
- Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute, Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona, Spain
- Laura I Furlong
- Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute, Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona, Spain
- Ferran Sanz
- Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute, Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona, Spain
34
Kelly DL, Spaderna M, Hodzic V, Nair S, Kitchen C, Werkheiser AE, Powell MM, Liu F, Coppersmith G, Chen S, Resnik P. Blinded Clinical Ratings of Social Media Data are Correlated with In-Person Clinical Ratings in Participants Diagnosed with Either Depression, Schizophrenia, or Healthy Controls. Psychiatry Res 2020; 294:113496. [PMID: 33065372 DOI: 10.1016/j.psychres.2020.113496] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/13/2020] [Accepted: 10/01/2020] [Indexed: 12/16/2022]
Abstract
This study investigates clinically valid signals about psychiatric symptoms in social media data, by rating severity of psychiatric symptoms in donated, de-identified Facebook posts and comparing to in-person clinical assessments. Participants with schizophrenia (N=8), depression (N=7), or who were healthy controls (N=8) also consented to the collection of their Facebook activity from three months before the in-person assessments to six weeks after this evaluation. Depressive symptoms were assessed in-person using the Montgomery-Åsberg Depression Rating Scale (MADRS), psychotic symptoms were assessed using the Brief Psychiatric Rating Scale (BPRS), and global functioning was assessed using the Community Assessment of Psychotic Experiences (CAPE-42). Independent raters (psychiatrists, non-psychiatrist mental health clinicians, and two staff members) rated depression, psychosis, and global functioning symptoms from the social media activity of deidentified participants. The correlations between in-person clinical ratings and blinded ratings based on social media data were evaluated. Significant correlations (and trends for significance in the mixed model controlling for multiple raters) were found for psychotic symptoms, global symptom ratings and depressive symptoms. Results like these, indicating the presence of clinically valid signal in social media, are an important step toward developing computational tools that could assist clinicians by providing additional data outside the context of clinical encounters.
Affiliation(s)
- Deanna L Kelly
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA.
- Max Spaderna
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
- Vedrana Hodzic
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
- Suraj Nair
- University of Maryland College Park, Department of Computer Science and Institute for Advanced Computer Studies, College Park, MD, USA
- Christopher Kitchen
- Center for Population Health IT, Johns Hopkins School of Public Health, Baltimore, MD, USA
- Anne E Werkheiser
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA; Department of Psychology, Georgia State University, USA
- Fang Liu
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
- Shuo Chen
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
- Philip Resnik
- University of Maryland College Park, Department of Linguistics and Institute for Advanced Computer Studies, College Park, MD, USA
35
Garcia-Rudolph A, Saurí J, Cegarra B, Bernabeu Guitart M. Discovering the Context of People With Disabilities: Semantic Categorization Test and Environmental Factors Mapping of Word Embeddings from Reddit. JMIR Med Inform 2020; 8:e17903. [PMID: 33216006 PMCID: PMC7718084 DOI: 10.2196/17903] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2020] [Revised: 04/17/2020] [Accepted: 04/19/2020] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND The World Health Organization's International Classification of Functioning Disability and Health (ICF) conceptualizes disability not solely as a problem that resides in the individual, but as a health experience that occurs in a context. Word embeddings build on the idea that words that occur in similar contexts tend to have similar meanings. In spite of both sharing "context" as a key component, word embeddings have been scarcely applied in disability. In this work, we propose social media (particularly, Reddit) to link them. OBJECTIVE The objective of our study is to train a model for generating word associations using a small dataset (a subreddit on disability) able to retrieve meaningful content. This content will be formally validated and applied to the discovery of related terms in the corpus of the disability subreddit that represent the physical, social, and attitudinal environment (as defined by a formal framework like the ICF) of people with disabilities. METHODS Reddit data were collected from pushshift.io with the pushshiftr R package as a wrapper. A word2vec model was trained with the wordVectors R package using the disability subreddit comments, and a preliminary validation was performed using a subset of Mikolov analogies. We used Van Overschelde's updated and expanded version of the Battig and Montague norms to perform a semantic categories test. Silhouette coefficients were calculated using cosine distance from the wordVectors R package. For each of the 5 ICF environmental factors (EF), we selected representative subcategories addressing different aspects of daily living (ADLs); then, for each subcategory, we identified specific terms extracted from their formal ICF definition and ran the word2vec model to generate their nearest semantic terms, validating the obtained nearest semantic terms using public evidence. 
Finally, we applied the model to a specific subcategory of an EF involved in a relevant use case in the field of rehabilitation. RESULTS We analyzed 96,314 comments posted between February 2009 and December 2019, by 10,411 Redditors. We trained word2vec and identified more than 30 analogies (eg, breakfast - 8 am + 8 pm = dinner). The semantic categorization test showed promising results over 60 categories; for example, s(A relative)=0.562, s(A sport)=0.475 provided remarkable explanations for low s values. We mapped the representative subcategories of all EF chapters and obtained the closest terms for each, which we confirmed with publications. This allowed immediate access (≤ 2 seconds) to the terms related to ADLs, ranging from apps "to know accessibility before you go" to adapted sports (boccia). For example, for the support and relationships EF subcategory, the closest term discovered by our model was "resilience," recently regarded as a key feature of rehabilitation, not yet having one unified definition. Our model discovered 10 closest terms, which we validated with publications, contributing to the "resilience" definition. CONCLUSIONS This study opens up interesting opportunities for the exploration and discovery of the use of a word2vec model that has been trained with a small disability dataset, leading to immediate, accurate, and often unknown (for authors, in many cases) terms related to ADLs within the ICF framework.
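The analogy test mentioned in this abstract (breakfast - 8 am + 8 pm = dinner) relies on vector arithmetic followed by a nearest-neighbour search under cosine similarity. The following is a minimal sketch of that idea only. The study itself used the wordVectors R package on Reddit comments; the three-dimensional vectors below are hand-made toy values, and the names `TOY_VECTORS`, `cosine_similarity`, and `analogy` are hypothetical.

```python
from math import sqrt

# Hypothetical toy embeddings with axes (meal, clock time, evening).
# Real word2vec vectors are learned from a corpus and have many more dimensions.
TOY_VECTORS = {
    "breakfast": (1.0, 0.0, -1.0),
    "lunch": (1.0, 0.0, 0.0),
    "dinner": (1.0, 0.0, 1.0),
    "8am": (0.0, 1.0, -1.0),
    "noon": (0.0, 1.0, 0.0),
    "8pm": (0.0, 1.0, 1.0),
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = tuple(
        x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = [word for word in vectors if word not in {a, b, c}]
    return max(candidates, key=lambda w: cosine_similarity(vectors[w], target))


# With these toy vectors the query lands exactly on the "dinner" vector.
print(analogy("breakfast", "8am", "8pm"))  # prints "dinner"
```

The query words are excluded from the candidates because the input vectors themselves often lie closest to the target, which is the usual convention for this kind of analogy evaluation.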
Affiliation(s)
- Alejandro Garcia-Rudolph
- Institut Guttmann Hospital de Neurorehabilitacio, Badalona, Spain
- Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
- Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Spain
| | - Joan Saurí
- Institut Guttmann Hospital de Neurorehabilitacio, Badalona, Spain
- Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
- Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Spain
| | - Blanca Cegarra
- Institut Guttmann Hospital de Neurorehabilitacio, Badalona, Spain
- Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
- Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Spain
- Universitat de Barcelona, Barcelona, Spain
| | - Montserrat Bernabeu Guitart
- Institut Guttmann Hospital de Neurorehabilitacio, Badalona, Spain
- Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
- Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Spain
| |
|
36
|
Ramírez-Cifuentes D, Freire A, Baeza-Yates R, Puntí J, Medina-Bravo P, Velazquez DA, Gonfaus JM, Gonzàlez J. Detection of Suicidal Ideation on Social Media: Multimodal, Relational, and Behavioral Analysis. J Med Internet Res 2020; 22:e17758. [PMID: 32673256 PMCID: PMC7381053 DOI: 10.2196/17758] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/15/2020] [Revised: 03/28/2020] [Accepted: 03/28/2020] [Indexed: 12/13/2022] Open
Abstract
Background Suicide risk assessment usually involves an interaction between doctors and patients. However, a significant number of people with mental disorders receive no treatment for their condition due to the limited access to mental health care facilities; the reduced availability of clinicians; the lack of awareness; and stigma, neglect, and discrimination surrounding mental disorders. In contrast, internet access and social media usage have increased significantly, providing experts and patients with a means of communication that may contribute to the development of methods to detect mental health issues among social media users. Objective This paper aimed to describe an approach for the suicide risk assessment of Spanish-speaking users on social media. We aimed to explore behavioral, relational, and multimodal data extracted from multiple social platforms and develop machine learning models to detect users at risk. Methods We characterized users based on their writings, posting patterns, relations with other users, and images posted. We also evaluated statistical and deep learning approaches to handle multimodal data for the detection of users with signs of suicidal ideation (suicidal ideation risk group). Our methods were evaluated over a dataset of 252 users annotated by clinicians. To evaluate the performance of our models, we distinguished 2 control groups: users who make use of suicide-related vocabulary (focused control group) and generic random users (generic control group). Results We identified significant statistical differences between the textual and behavioral attributes of each of the control groups compared with the suicidal ideation risk group. At a 95% CI, when comparing the suicidal ideation risk group and the focused control group, the number of friends (P=.04) and median tweet length (P=.04) were significantly different. The median number of friends for a focused control user (median 578.5) was higher than that for a user at risk (median 372.0). Similarly, the median tweet length was higher for focused control users, with 16 words against 13 words of suicidal ideation risk users. Our findings also show that the combination of textual, visual, relational, and behavioral data outperforms the accuracy of using each modality separately. We defined text-based baseline models based on bag of words and word embeddings, which were outperformed by our models, obtaining an increase in accuracy of up to 8% when distinguishing users at risk from both types of control users. Conclusions The types of attributes analyzed are significant for detecting users at risk, and their combination outperforms the results provided by generic, exclusively text-based baseline models. After evaluating the contribution of image-based predictive models, we believe that our results can be improved by enhancing the models based on textual and relational features. These methods can be extended and applied to different use cases related to other mental disorders.
Affiliation(s)
- Diana Ramírez-Cifuentes
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
| | - Ana Freire
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
| | - Ricardo Baeza-Yates
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
| | - Joaquim Puntí
- Hospital de Día de Adolescentes, Servicio de Salud Mental, Consorci Corporació Sanitària Parc Taulí, Sabadell, Spain; Departamento de Psicología Clínica y de la Salud, Universitat Autònoma de Barcelona, Barcelona, Spain
| | | | | | | | - Jordi Gonzàlez
- Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain
| |
|
37
|
Patra BG, Kar R, Roberts K, Wu H. Mental Health Severity Detection from Psychological Forum Data using Domain-Specific Unlabelled Data. AMIA Joint Summits on Translational Science Proceedings 2020; 2020:487-496. [PMID: 32477670 PMCID: PMC7233051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Mental health has become a growing concern in the medical field, yet remains difficult to study due to both privacy concerns and the lack of objectively quantifiable measurements (e.g., lab tests, physical exams). Instead, the data that is available for mental health is largely based on subjective accounts of a patient's experience, and thus typically is expressed exclusively in text. An important source of such data comes from online sources and directly from the patient, including many forms of social media. In this work, we utilize the datasets provided by the CLPsych shared tasks in 2016 and 2017, derived from online forum posts of ReachOut which have been manually classified according to mental health severity. We implemented an automated severity labeling system using different machine and deep learning algorithms. Our approach combines both supervised and semi-supervised embedding methods using corpora from ReachOut (both labeled and unlabelled) and WebMD (unlabelled). Metadata, syntactic, semantic, and embedding features were used to classify the posts into four categories (green, amber, red, and crisis). The developed systems outperformed other state-of-the-art systems developed on the ReachOut dataset and obtained the maximum micro-averaged F-scores of 0.86 and 0.80 for CLPsych 2016 and 2017 test datasets, respectively, using the above features.
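The micro-averaged F-score reported in this abstract pools true positives, false positives, and false negatives across labels before computing precision and recall. The following is a minimal sketch of that calculation for the four severity categories. The gold and predicted labels are invented for illustration, the function name `micro_f1` is hypothetical, and the abstract does not state whether the authors averaged over all four labels or a subset, so the `labels` parameter is left open to either choice.

```python
def micro_f1(gold, predicted, labels):
    """Micro-averaged F1 over `labels`: pool TP/FP/FN counts, then compute F1."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = fn = 0
    for label in labels:
        for g, p in zip(gold, predicted):
            if p == label and g == label:
                tp += 1  # predicted this label and it was correct
            elif p == label:
                fp += 1  # predicted this label but gold was something else
            elif g == label:
                fn += 1  # gold was this label but it was missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold annotations and system output for six forum posts.
gold = ["green", "amber", "red", "crisis", "green", "amber"]
predicted = ["green", "amber", "amber", "crisis", "green", "green"]

all_labels = ["green", "amber", "red", "crisis"]
flagged_labels = ["amber", "red", "crisis"]  # assumption: posts needing attention

print(round(micro_f1(gold, predicted, all_labels), 4))      # 0.6667
print(round(micro_f1(gold, predicted, flagged_labels), 4))  # 0.5714
```

When every post has exactly one label and all labels are included, the micro-averaged F1 equals plain accuracy (4 of 6 correct here); restricting the label set changes the score because errors involving the excluded label are counted only from one side.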
Affiliation(s)
- Braja Gopal Patra
- Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
| | - Reshma Kar
- Department of Electronics and Telecommunication Engineering, Jadavpur University, Kolkata, India
| | - Kirk Roberts
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
| | - Hulin Wu
- Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
| |
|
38
|
Wang J, Deng H, Liu B, Hu A, Liang J, Fan L, Zheng X, Wang T, Lei J. Systematic Evaluation of Research Progress on Natural Language Processing in Medicine Over the Past 20 Years: Bibliometric Study on PubMed. J Med Internet Res 2020; 22:e16816. [PMID: 32012074 PMCID: PMC7005695 DOI: 10.2196/16816] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 10/28/2019] [Revised: 12/05/2019] [Accepted: 12/15/2019] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Natural language processing (NLP) is an important traditional field in computer science, but its application in medical research has faced many challenges. With the extensive digitalization of medical information globally and increasing importance of understanding and mining big data in the medical field, NLP is becoming more crucial. OBJECTIVE The goal of the research was to perform a systematic review on the use of NLP in medical research with the aim of understanding the global progress on NLP research outcomes, content, methods, and study groups involved. METHODS A systematic review was conducted using the PubMed database as a search platform. All published studies on the application of NLP in medicine (except biomedicine) during the 20 years between 1999 and 2018 were retrieved. The data obtained from these published studies were cleaned and structured. Excel (Microsoft Corp) and VOSviewer (Nees Jan van Eck and Ludo Waltman) were used to perform bibliometric analysis of publication trends, author orders, countries, institutions, collaboration relationships, research hot spots, diseases studied, and research methods. RESULTS A total of 3498 articles were obtained during initial screening, and 2336 articles were found to meet the study criteria after manual screening. The number of publications increased every year, with a significant growth after 2012 (number of publications ranged from 148 to a maximum of 302 annually). The United States has occupied the leading position since the inception of the field, with the largest number of articles published. The United States contributed to 63.01% (1472/2336) of all publications, followed by France (5.44%, 127/2336) and the United Kingdom (3.51%, 82/2336). The author with the largest number of articles published was Hongfang Liu (70), while Stéphane Meystre (17) and Hua Xu (33) published the largest number of articles as the first and corresponding authors. 
Among first authors' affiliated institutions, Columbia University published the largest number of articles, accounting for 4.54% (106/2336) of the total. Specifically, approximately one-fifth (17.68%, 413/2336) of the articles involved research on specific diseases, and the subject areas primarily focused on mental illness (16.46%, 68/413), breast cancer (5.81%, 24/413), and pneumonia (4.12%, 17/413). CONCLUSIONS NLP is in a period of robust development in the medical field, with an average of approximately 100 publications annually. Electronic medical records were the most used research materials, but social media such as Twitter have become important research materials since 2015. Cancer (24.94%, 103/413) was the most common subject area in NLP-assisted medical research on diseases, with breast cancers (23.30%, 24/103) and lung cancers (14.56%, 15/103) accounting for the highest proportions of studies. Columbia University and the researchers trained there were the most active and prolific research forces on NLP in the medical field.
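The percentages in this abstract use different denominators (all 2336 included articles, the 413 disease-specific articles, or the 103 cancer articles), which is easy to misread. The following is a small sketch that recomputes a few of the shares from the counts given in the abstract; the helper name `share` is hypothetical.

```python
def share(part, total):
    """Percentage of `total` represented by `part`, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * part / total, 2)


# Counts as reported in the abstract.
TOTAL_ARTICLES = 2336
DISEASE_ARTICLES = 413
CANCER_ARTICLES = 103

print(share(1472, TOTAL_ARTICLES))              # United States: 63.01
print(share(DISEASE_ARTICLES, TOTAL_ARTICLES))  # disease-specific: 17.68
print(share(68, DISEASE_ARTICLES))              # mental illness: 16.46
print(share(CANCER_ARTICLES, DISEASE_ARTICLES)) # all cancers: 24.94
print(share(24, CANCER_ARTICLES))               # breast cancer among cancers: 23.3
```

Breast cancer therefore appears twice with different values: 5.81% of the 413 disease-specific articles and 23.30% of the 103 cancer articles, both derived from the same count of 24.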
Affiliation(s)
- Jing Wang
- School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
| | - Huan Deng
- School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
| | - Bangtao Liu
- School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
| | - Anbin Hu
- School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
| | - Jun Liang
- IT Center, Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Lingye Fan
- Affiliated Hospital, Southwest Medical University, Luzhou, China
| | - Xu Zheng
- Center for Medical Informatics, Peking University, Beijing, China
| | - Tong Wang
- School of Public Health, Jilin University, Jilin, China
| | - Jianbo Lei
- School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China; Center for Medical Informatics, Peking University, Beijing, China; Institute of Medical Technology, Health Science Center, Peking University, Beijing, China
| |
|