1. Zainal NH, Eckhardt R, Rackoff GN, Fitzsimmons-Craft EE, Rojas-Ashe E, Barr Taylor C, Funk B, Eisenberg D, Wilfley DE, Newman MG. Capitalizing on natural language processing (NLP) to automate the evaluation of coach implementation fidelity in guided digital cognitive-behavioral therapy (GdCBT). Psychol Med 2025; 55:e106. PMID: 40170669. DOI: 10.1017/S0033291725000340.
Abstract
Background: As the use of guided digitally delivered cognitive-behavioral therapy (GdCBT) grows, pragmatic analytic tools are needed to evaluate coaches' implementation fidelity.
Aims: We evaluated how natural language processing (NLP) and machine learning (ML) methods might automate the monitoring of coaches' implementation fidelity to GdCBT delivered as part of a randomized controlled trial.
Method: Coaches served as guides to 6-month GdCBT with 3,381 assigned users with or at risk for anxiety, depression, or eating disorders. CBT-trained and supervised human coders used a rubric to rate the implementation fidelity of 13,529 coach-to-user messages. NLP methods abstracted data from text-based coach-to-user messages, and 11 ML models predicting coach implementation fidelity were evaluated.
Results: Inter-rater agreement among human coders was excellent (intra-class correlation coefficient = .980-.992). Coaches achieved behavioral targets at the start of GdCBT and maintained strong fidelity throughout most subsequent messages. Coaches also avoided prohibited actions (e.g. reinforcing users' avoidance). Sentiment analyses generally indicated a higher frequency of coach-delivered positive than negative sentiment words and predicted coach implementation fidelity with acceptable performance metrics (e.g. area under the receiver operating characteristic curve [AUC] = 74.48%). The final best-performing ML algorithms, which included a more comprehensive set of NLP features, performed well (e.g. AUC = 76.06%).
Conclusions: NLP and ML tools could help clinical supervisors automate the monitoring of coaches' implementation fidelity to GdCBT. These tools could maximize the allocation of scarce resources by reducing the personnel time needed to measure fidelity, potentially freeing up more time for high-quality clinical care.
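For readers who want to prototype this kind of fidelity evaluation, here is a minimal Python sketch: a text classifier over coach messages scored with AUC. The toy messages and labels are invented for illustration; this is not the study's pipeline or data (the study reports AUC of roughly 76% for its best models).

```python
# A minimal sketch, not the authors' pipeline: train and score a binary
# message-fidelity classifier with TF-IDF features and report AUC.
# The messages and labels below are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

messages = [
    "Great work finishing the exposure exercise this week.",
    "It is fine to skip the task if it feels too hard.",   # reinforces avoidance
    "Let's review the thought record you completed.",
    "You can avoid the situation until you feel ready.",   # prohibited action
] * 25  # repeat the toy examples so the split has enough samples
labels = [1, 0, 1, 0] * 25  # 1 = adherent to the fidelity rubric

X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.3, random_state=0, stratify=labels
)
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]
print(f"AUC = {roc_auc_score(y_test, probs):.2%}")
```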
Affiliation(s)
- Nur Hani Zainal: Department of Psychology, National University of Singapore (NUS), Singapore
- Regina Eckhardt: Technical University of Munich, TUM School of Life Sciences, Freising, Germany
- Gavin N Rackoff: Department of Psychology, The Pennsylvania State University, University Park, PA, USA
- Elsa Rojas-Ashe: Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
- Craig Barr Taylor: Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA; Department of Psychology, Palo Alto University, Palo Alto, CA, USA
- Burkhardt Funk: Department of Information Systems and Data Science, Leuphana University Lüneburg, Lüneburg, Germany
- Daniel Eisenberg: Fielding School of Public Health, University of California at Los Angeles, Los Angeles, CA, USA
- Denise E Wilfley: Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Michelle G Newman: Department of Psychology, The Pennsylvania State University, University Park, PA, USA
2. Plisiecki H, Lenartowicz P, Flakus M, Pokropek A. High risk of political bias in black box emotion inference models. Sci Rep 2025; 15:6028. PMID: 39972000. PMCID: PMC11840103. DOI: 10.1038/s41598-025-86766-6.
Abstract
This paper investigates the presence of political bias in emotion inference models used for sentiment analysis (SA). Machine learning models often reflect biases in their training data, impacting the validity of their outcomes. While previous research has highlighted gender and race biases, our study focuses on political bias-an underexplored, pervasive issue that can skew the interpretation of text data across many studies. We audit a Polish sentiment analysis model developed in our lab for bias. By analyzing valence predictions for names and sentences involving Polish politicians, we uncovered systematic differences influenced by political affiliations. Our findings suggest that annotations by human raters propagate political biases into the model's predictions. To prove it, we pruned the training dataset of texts mentioning these politicians and observed a reduction in bias, though not its complete elimination. Given the significant implications of political bias in SA, our study emphasizes caution in employing these models for social science research. We recommend a critical examination of SA results and propose using lexicon-based systems as an ideologically neutral alternative. This paper underscores the necessity for ongoing scrutiny and methodological adjustments to ensure the reliability of the use of machine learning in academic and applied contexts.
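The name-swap audit described here can be sketched in a few lines. In the sketch below, `score_valence` is a hypothetical stand-in for the model under audit (the authors audited their own Polish-language model), and the names are invented:

```python
# A minimal sketch of a template-based name-swap bias audit, not the
# authors' code: insert names from two political groups into neutral
# templates and compare the model's mean predicted valence per group.
from statistics import mean
from scipy.stats import ttest_ind

def score_valence(text: str) -> float:
    # Placeholder scorer; substitute the real model's valence prediction.
    return (sum(map(ord, text)) % 100) / 100.0

templates = [
    "{} gave a speech in parliament today.",
    "{} answered questions from journalists.",
]
party_a = ["Adam Nowak", "Anna Kowalska"]        # invented names, party A
party_b = ["Piotr Wisniewski", "Ewa Zielinska"]  # invented names, party B

scores_a = [score_valence(t.format(n)) for t in templates for n in party_a]
scores_b = [score_valence(t.format(n)) for t in templates for n in party_b]
stat, p = ttest_ind(scores_a, scores_b)
print(f"mean valence A={mean(scores_a):.2f}, B={mean(scores_b):.2f}, p={p:.2f}")
```

A systematic gap between the two group means on otherwise identical sentences is the bias signal the paper describes.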
Affiliation(s)
- Hubert Plisiecki: Institute of Psychology, Polish Academy of Sciences, Warsaw, Poland; Stowarzyszenie na rzecz Otwartej Nauki (Society for Open Science), Warsaw, Poland; Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland
- Paweł Lenartowicz: Stowarzyszenie na rzecz Otwartej Nauki (Society for Open Science), Warsaw, Poland
- Maria Flakus: Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland
- Artur Pokropek: Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland
3. Eberhardt ST, Schaffrath J, Moggia D, Schwartz B, Jaehde M, Rubel JA, Baur T, André E, Lutz W. Decoding emotions: Exploring the validity of sentiment analysis in psychotherapy. Psychother Res 2025; 35:174-189. PMID: 38415369. DOI: 10.1080/10503307.2024.2322522.
Abstract
Objective: Given the importance of emotions in psychotherapy, valid measures are essential for research and practice. Because emotions are expressed at different levels, multimodal measurements are needed for a nuanced assessment. Natural language processing (NLP) could augment the measurement of emotions. This study explores the validity of sentiment analysis in psychotherapy transcripts.
Method: We used a transformer-based NLP algorithm to analyze sentiments in 85 transcripts from 35 patients. Construct and criterion validity were evaluated against self- and therapist reports and process and outcome measures via correlational, multitrait-multimethod, and multilevel analyses.
Results: The results provide preliminary support for the validity of the sentiments. For example, sentiments were significantly related to self- and therapist reports of emotions in the same session. Sentiments correlated significantly with in-session processes (e.g., coping experiences), and an increase in positive sentiments over the course of therapy predicted better outcomes after treatment termination.
Discussion: Sentiment analysis could serve as a valid approach to assessing the emotional tone of psychotherapy sessions and may contribute to the multimodal measurement of emotions. Future research could combine sentiment analysis with automatic emotion recognition in facial expressions and vocal cues via the Nonverbal Behavior Analyzer (NOVA). Limitations (e.g., an exploratory design with numerous tests) and opportunities are discussed.
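As a generic illustration of transformer-based sentiment scoring of transcript segments, a minimal sketch using the Hugging Face pipeline API follows. The study analyzed German psychotherapy transcripts with its own transformer-based algorithm; the default English model loaded here is only a stand-in:

```python
# A minimal sketch, not the study's model: score transcript segments with
# an off-the-shelf transformer sentiment classifier.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # downloads a default model
segments = [
    "I finally managed to talk to my boss about the workload.",
    "The whole week felt heavy and I could not get out of bed.",
]
for seg, result in zip(segments, sentiment(segments)):
    print(f"{result['label']:>8} ({result['score']:.2f})  {seg}")
```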
4. Gandy LM, Ivanitskaya LV, Bacon LL, Bizri-Baryak R. Public Health Discussions on Social Media: Evaluating Automated Sentiment Analysis Methods. JMIR Form Res 2025; 9:e57395. PMID: 39773420. PMCID: PMC11784633. DOI: 10.2196/57395.
Abstract
Background: Sentiment analysis is one of the most widely used methods for mining and examining text. Social media researchers need guidance on choosing between manual and automated sentiment analysis methods.
Objective: Popular sentiment analysis tools based on natural language processing (NLP; VADER [Valence Aware Dictionary for Sentiment Reasoning], TEXT2DATA [T2D], and Linguistic Inquiry and Word Count [LIWC-22]) and a large language model (ChatGPT 4.0) were compared with manually coded sentiment scores, as applied to the analysis of YouTube comments on videos discussing the opioid epidemic. The sentiment analysis methods were also examined for ease of programming, monetary cost, and other practical considerations.
Methods: Evaluation methods included descriptive statistics, receiver operating characteristic (ROC) curve analysis, confusion matrices, Cohen κ, accuracy, specificity, precision, sensitivity (recall), F1-score, and the Matthews correlation coefficient (MCC). An inductive, iterative approach to content analysis of the data was used to obtain manual sentiment codes.
Results: A subset of comments was analyzed by a second coder, producing good agreement between the two coders' judgments (κ=0.734). YouTube social media about the opioid crisis had many more negative comments (4286/4871, 88%) than positive comments (79/662, 12%), making it possible to evaluate the performance of sentiment analysis models on an unbalanced dataset. The tone summary measure from LIWC-22 performed better than the other tools for estimating the prevalence of negative versus positive sentiment. According to the ROC curve analysis, VADER was best at classifying manually coded negative comments. A comparison of Cohen κ values indicated that the NLP tools (VADER, followed by LIWC's tone and T2D) showed only fair agreement with manual coding. In contrast, ChatGPT 4.0 had poor agreement and failed to generate binary sentiment scores in 2 out of 3 attempts. Variations in accuracy, specificity, precision, sensitivity, F1-score, and MCC did not reveal a single superior model. F1-scores were 0.34-0.38 (SD 0.02) for the NLP tools and very low (0.13) for ChatGPT 4.0. None of the MCCs reached a strong correlation level.
Conclusions: Researchers studying negative emotions, public worries, or dissatisfaction on social media face unique challenges in selecting models suitable for unbalanced datasets. We recommend VADER, the only cost-free tool we evaluated, for its excellent discrimination, which can be further improved when comments are at least 100 characters long. If estimating the prevalence of negative comments in an unbalanced dataset is important, we recommend the tone summary measure from LIWC-22. Researchers using T2D should be aware that it may score only a subset of the data and can be more time-consuming and costly than other methods. A general-purpose large language model, ChatGPT 4.0, has yet to surpass the performance of NLP models, at least for unbalanced datasets with highly prevalent (7:1) negative comments.
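The core comparison (tool output versus manual codes, scored with κ and AUC) is easy to reproduce. A minimal sketch with VADER follows; the comments and manual labels are invented placeholders, not the study's data:

```python
# A minimal sketch of the VADER-versus-manual comparison: score comments,
# binarize at VADER's conventional +/-0.05 compound cutoff, and compare
# against human codes with Cohen's kappa and ROC AUC.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from sklearn.metrics import cohen_kappa_score, roc_auc_score

comments = [
    "This epidemic destroyed my family, nobody is helping.",
    "So grateful this channel covers recovery resources.",
    "Doctors keep handing out pills like candy, it's disgusting.",
    "Proud of everyone sharing their stories here.",
]
manual = [0, 1, 0, 1]  # 0 = negative, 1 = positive (human codes)

analyzer = SentimentIntensityAnalyzer()
compound = [analyzer.polarity_scores(c)["compound"] for c in comments]
predicted = [1 if s >= 0.05 else 0 for s in compound]

print("kappa:", cohen_kappa_score(manual, predicted))
print("AUC:  ", roc_auc_score(manual, compound))
```

Using the continuous compound score for AUC and the thresholded score for κ mirrors the two ways the paper evaluates agreement.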
Affiliation(s)
- Lisa M Gandy: Department of Computer Science, College of Sciences and Liberal Arts, Kettering University, Flint, MI, United States
- Lana V Ivanitskaya: Department of Health Administration, The College of Health Professions, Central Michigan University, Mt Pleasant, MI, United States
- Leeza L Bacon: Department of Healthcare Management, Northwood University, Midland, MI, United States
- Rodina Bizri-Baryak: Department of Health Administration, The College of Health Professions, Central Michigan University, Mt Pleasant, MI, United States
5. Huisman SM, Kraiss JT, de Vos JA. Examining a sentiment algorithm on session patient records in an eating disorder treatment setting: a preliminary study. Front Psychiatry 2024; 15:1275236. PMID: 38544849. PMCID: PMC10965787. DOI: 10.3389/fpsyt.2024.1275236.
Abstract
Background: Clinicians collect session therapy notes within patient session records. Session records contain valuable information about patients' treatment progress. Sentiment analysis is a tool to extract emotional tones and states from text input and could be used to evaluate patients' sentiment during treatment over time. This preliminary study investigates the validity of automated sentiment analysis of session patient records in an eating disorder (ED) treatment context against the performance of human raters.
Methods: A total of 460 patient session records from eight participants diagnosed with an ED were evaluated on their overall sentiment by an automated sentiment analysis and, separately, by two human raters. The inter-rater reliability (IRR) between the automated analysis and the human raters, and the IRR among the human raters, was analyzed by calculating the intra-class correlation (ICC) under a continuous interpretation and weighted Cohen's kappa under a categorical interpretation. Furthermore, differences regarding positive and negative matches between the human raters and the automated analysis were examined in closer detail.
Results: The ICC showed moderate automated-human agreement (ICC = 0.55), and the weighted Cohen's kappa showed fair automated-human (κ = 0.29) and substantial human-human agreement (κ = 0.68) for the evaluation of overall sentiment. Furthermore, the automated analysis lacked words specific to an ED context.
Discussion/conclusion: The automated sentiment analysis discerned sentiment from session patient records less well than the human raters and, if this benchmark is considered adequate, cannot yet be used in practice in its current state. Nevertheless, automated sentiment analysis does show potential for extracting sentiment from session records. The automated analysis should be further developed to include context-specific ED words, and a more solid benchmark, such as patients' own mood, should be established against which its performance can be compared.
Affiliation(s)
- Sophie M. Huisman: Department of Psychology, Health and Technology, Centre for eHealth and Wellbeing Research, University of Twente, Enschede, Netherlands
- Jannis T. Kraiss: Department of Psychology, Health and Technology, Centre for eHealth and Wellbeing Research, University of Twente, Enschede, Netherlands
- Jan Alexander de Vos: Department of Research, GGZ Friesland Mental Healthcare Institution, Leeuwarden, Netherlands; Human Concern, Centrum voor Eetstoornissen, Amsterdam, Netherlands
6. Malgaroli M, Hull TD, Zech JM, Althoff T. Natural language processing for mental health interventions: a systematic review and research framework. Transl Psychiatry 2023; 13:309. PMID: 37798296. PMCID: PMC10556019. DOI: 10.1038/s41398-023-02592-2.
Abstract
Neuropsychiatric disorders pose a high societal cost, but their treatment is hindered by a lack of objective outcomes and fidelity metrics. AI technologies, and specifically natural language processing (NLP), have emerged as tools to study mental health interventions (MHI) at the level of their constituent conversations. However, NLP's potential to address clinical and research challenges remains unclear. We therefore conducted a pre-registered systematic review of NLP-MHI studies using PRISMA guidelines (osf.io/s52jh) to evaluate their models and clinical applications, and to identify biases and gaps. Candidate studies (n = 19,756), including peer-reviewed AI conference manuscripts, were collected up to January 2023 through PubMed, PsycINFO, Scopus, Google Scholar, and ArXiv. A total of 102 articles were included, and their computational characteristics (NLP algorithms, audio features, machine learning pipelines, outcome metrics), clinical characteristics (clinical ground truths, study samples, clinical focus), and limitations were investigated. Results indicate a rapid growth of NLP-MHI studies since 2019, characterized by increased sample sizes and use of large language models. Digital health platforms were the largest providers of MHI data. Ground truth for supervised learning models was based on clinician ratings (n = 31), patient self-report (n = 29), and annotations by raters (n = 26). Text-based features contributed more to model accuracy than audio markers. Patients' clinical presentation (n = 34), response to intervention (n = 11), intervention monitoring (n = 20), providers' characteristics (n = 12), relational dynamics (n = 14), and data preparation (n = 4) were the commonly investigated clinical categories. Limitations of the reviewed studies included a lack of linguistic diversity, limited reproducibility, and population bias. A research framework (NLPxMHI) is developed and validated to assist computational and clinical researchers in addressing the remaining gaps in applying NLP to MHI, with the goal of improving clinical utility, data access, and fairness.
Affiliation(s)
- Matteo Malgaroli: Department of Psychiatry, New York University, Grossman School of Medicine, New York, NY, 10016, USA
- James M Zech: Talkspace, New York, NY, 10025, USA; Department of Psychology, Florida State University, Tallahassee, FL, 32306, USA
- Tim Althoff: Department of Computer Science, University of Washington, Seattle, WA, 98195, USA
7. Sharma C, Whittle S, Haghighi PD, Burstein F, Keen HI. Response to 'Correspondence on "Mining social media data to investigate patient perceptions regarding DMARD pharmacotherapy for rheumatoid arthritis"' by Reuter et al. Ann Rheum Dis 2023; 82:e92. PMID: 33593739. DOI: 10.1136/annrheumdis-2020-219815.
Affiliation(s)
- Chanakya Sharma: Rheumatology, Fiona Stanley Hospital, Murdoch, Western Australia, Australia
- Samuel Whittle: Rheumatology, The Queen Elizabeth Hospital, Woodville South, South Australia, Australia
- Frada Burstein: Information Technology, Monash University, Clayton, Victoria, Australia
- Helen Isobel Keen: Medicine and Pharmacology, UWA, Murdoch, Western Australia, Australia
8. Roy B, Das S. Perceptible sentiment analysis of students' WhatsApp group chats in valence, arousal, and dominance space. Social Network Analysis and Mining 2022. DOI: 10.1007/s13278-022-01016-1.
9. West SJ, Thomson ND. Identifying the emotions behind apologies for severe transgressions. Motivation and Emotion 2022. DOI: 10.1007/s11031-022-09993-8.
10. Topitzer M, Kou Y, Kasumba R, Kreniske P. How Differing Audiences Were Associated with User Emotional Expression on a Well-Being App. Human Behavior and Emerging Technologies 2022; 2022:4453980. PMID: 38031588. PMCID: PMC10686580. DOI: 10.1155/2022/4453980.
Abstract
In the last five years there has been an explosion of mobile apps that aim to improve emotional well-being, yet limited research has examined the ways users interact, and specifically write, to develop a therapeutic alliance within these apps. Writing is a developmental practice in which a narrator transforms amorphous thoughts and emotions into expressions, and according to narrative theory, the linguistic characteristics of writing can be understood as a physical manifestation of a narrator's affect. Informed by literacy theorists who have argued convincingly that narrators address different audiences in different ways, we used IBM Watson's Natural Language Processing software (IBM Watson NLP) to examine how users' expression of emotion on a well-being app differed depending on the audience. Our findings demonstrate that audience was strongly associated with the way users expressed emotions in writing. When writing to an explicit audience, users wrote longer narratives with less sadness, less anger, less disgust, less fear, and more joy. These findings have direct relevance for researchers and for well-being app design.
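The analysis pattern (per-entry emotion scores grouped by audience) can be sketched with an open emotion classifier standing in for IBM Watson NLP. The model name below is an assumption chosen for illustration, not what the study used:

```python
# A minimal sketch of emotion scoring grouped by audience. The Hugging Face
# model named here is a stand-in for IBM Watson NLP, and the journal entries
# are invented placeholders.
from transformers import pipeline

emotion = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",  # assumed model
    top_k=None,  # return scores for all emotion labels
)
entries = {
    "explicit_audience": "Dear coach, today I finally felt hopeful again.",
    "no_audience": "Everything is pointless and I am angry all the time.",
}
for audience, text in entries.items():
    scores = {d["label"]: round(d["score"], 2) for d in emotion([text])[0]}
    print(audience, scores)
```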
Affiliation(s)
- Maya Topitzer: Columbia University Mailman School of Public Health, Department of Biostatistics
- Yueming Kou: Columbia University Mailman School of Public Health, Department of Biostatistics
- Robert Kasumba: Washington University in St. Louis, International Center for Child Health and Development, McKelvey School of Engineering
- Philip Kreniske: HIV Center for Clinical and Behavioral Studies, New York State Psychiatric Institute and Columbia University
11. Yang M, Jiang B, Wang Y, Hao T, Liu Y. News Text Mining-Based Business Sentiment Analysis and Its Significance in Economy. Front Psychol 2022; 13:918447. PMID: 35910983. PMCID: PMC9330562. DOI: 10.3389/fpsyg.2022.918447.
Abstract
The purpose of business sentiment analysis is to determine the emotions or attitudes expressed toward a company, its products, services, personnel, or events. Text analysis is the simplest and most developed type of sentiment analysis so far. Text-based business sentiment analysis still faces unresolved challenges: for example, machine learning algorithms struggle to recognize double meanings, jokes, and allusions, and cannot account for regional language differences or non-native speech structures. To address this, an undirected weighted graph is constructed for news topics. The sentences in an article are modeled as nodes, and the normalized sentence similarity is used as the edge weight between nodes, which helps avoid the influence of sentence length on the summary results. In the topic extraction process, keywords are not limited to single words, which improves the readability of the summary. To improve the accuracy of sentiment classification, this work proposes a robust news mining-based business sentiment analysis framework, called BuSeD. It contains two main stages: (1) news collection and preprocessing, and (2) feature extraction and sentiment classification. In the first stage, news is collected using crawler tools, and the resulting dataset is preprocessed to reduce noise. In the second stage, the topics in each article are extracted using traditional topic extraction tools, and a convolutional neural network (CNN)-based text analysis model is designed to analyze the news at the sentence level. We conduct comprehensive experiments to evaluate the performance of BuSeD for sentiment classification. Compared with four classical classification algorithms, the proposed CNN-based classification model of BuSeD achieves the highest F1 scores. We also present a quantitative trading application based on sentiment analysis to validate BuSeD, which indicates that news-based business sentiment analysis has high economic application value.
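The sentence-graph construction described here is straightforward to prototype. In the sketch below, TF-IDF cosine similarity supplies the normalized edge weights, and PageRank stands in for the ranking step (the paper's exact ranking may differ); the sentences are invented:

```python
# A minimal sketch of an undirected weighted sentence graph: sentences as
# nodes, normalized similarity as edge weights, ranked to pick a summary
# sentence.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = [
    "The company reported record quarterly revenue.",
    "Revenue growth was driven by strong overseas demand.",
    "The CEO also announced a new sustainability initiative.",
]
sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))

graph = nx.Graph()
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        if sim[i, j] > 0:
            graph.add_edge(i, j, weight=sim[i, j])

ranks = nx.pagerank(graph, weight="weight")
best = max(ranks, key=ranks.get)
print("top sentence:", sentences[best])
```

Because similarity is normalized, long sentences do not dominate the ranking purely by word count, which is the property the abstract highlights.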
Affiliation(s)
- Binghan Jiang: Faculty of Business and Economics, The University of Hong Kong, Hong Kong SAR, China
12. Working from Home in Italy during COVID-19 Lockdown: A Survey to Assess the Indoor Environmental Quality and Productivity. Buildings 2021. DOI: 10.3390/buildings11120660.
Abstract
Italians were the first European citizens to experience the lockdown due to SARS-CoV-2 in March 2020. Most employees were forced to work from home. People suddenly had to share common living spaces with family members for longer periods of time and convert home spaces into workplaces. This inevitably affected their perception of, satisfaction with, and preferences regarding indoor environmental quality and work productivity. A web-based survey was designed and administered to Italian employees to determine how they perceived the indoor environmental quality of residential spaces when working from home (WFH) and to investigate the relationship between different aspects of users' satisfaction. A total of 330 valid questionnaires were collected and analysed. The article reports the results of analyses conducted using a descriptive approach and predictive models to quantify comfort in living spaces when WFH, focusing on respondents' satisfaction. Most respondents were satisfied with the indoor environmental conditions (89% as the sum of "very satisfied" and "satisfied" responses for thermal comfort, 74% for visual comfort, 68% for acoustic quality, and 81% for indoor air quality), while the layout of the furniture negatively influenced the WFH experience: 45% of the participants expressed an unsatisfactory or neutral opinion. The results of the sentiment analysis confirmed this trend. Among the indoor environmental factors that affect productivity, visual comfort is the most relevant variable. As for the predictive approach using machine learning, the Support Vector Machine classifier performed best in predicting overall satisfaction.
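For the final modeling step, a minimal sketch of a Support Vector Machine classifier predicting overall satisfaction from comfort ratings follows. The features and labels are randomly generated placeholders, not the survey data:

```python
# A minimal sketch: an SVM predicting a binary overall-satisfaction label
# from four invented comfort ratings (thermal, visual, acoustic, air quality).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.integers(1, 6, size=(120, 4))  # 1-5 satisfaction ratings
y = (X.mean(axis=1) + rng.normal(0, 0.5, 120) > 3).astype(int)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```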
13. Box-Steffensmeier JM, Moses L. Meaningful messaging: Sentiment in elite social media communication with the public on the COVID-19 pandemic. Sci Adv 2021; 7(29):eabg2898. PMID: 34261655. PMCID: PMC8279499. DOI: 10.1126/sciadv.abg2898.
Abstract
Elite messaging plays a crucial role in shaping public debate and spreading information. We examine elite political communication during an emergent international crisis to investigate the role of tone in messaging, information spread, and public reaction. By measuring tone in social media messages from members of the U.S. Congress related to the COVID-19 pandemic, we find clear partisan differences and a differential impact of tone on message engagement and information spread. This suggests that even in the midst of an international health crisis, partisanship and emotional rhetoric play a critical part in elite communications and contribute to the attention messages receive. The messaging on COVID-19 is polarized and fractured. The valenced messaging provokes divergence in public engagement, reaction, and information spread. These results have important implications for studies of representation, public opinion, and how government can effectively engage individuals in emergent situations or pivotal moments.
Affiliation(s)
- Janet M Box-Steffensmeier: Department of Political Science, The Ohio State University, 154 N. Oval Mall, Columbus, OH 43210, USA
- Laura Moses: Department of Political Science, The Ohio State University, 154 N. Oval Mall, Columbus, OH 43210, USA
14. Gooding P, Kariotis T. Ethics and Law in Research on Algorithmic and Data-Driven Technology in Mental Health Care: Scoping Review. JMIR Ment Health 2021; 8:e24668. PMID: 34110297. PMCID: PMC8262551. DOI: 10.2196/24668.
Abstract
Background: Uncertainty surrounds the ethical and legal implications of algorithmic and data-driven technologies in the mental health context, including technologies characterized as artificial intelligence, machine learning, deep learning, and other forms of automation.
Objective: This study aims to survey empirical scholarly literature on the application of algorithmic and data-driven technologies in mental health initiatives to identify the legal and ethical issues that have been raised.
Methods: We searched for peer-reviewed empirical studies on the application of algorithmic technologies in mental health care in the Scopus, Embase, and Association for Computing Machinery databases. A total of 1078 relevant peer-reviewed applied studies were identified, which were narrowed to 132 empirical research papers for review based on the selection criteria. Conventional content analysis was undertaken to address our aims, and this was supplemented by a keyword-in-context analysis.
Results: We grouped the findings into the following five categories of technology: social media (53/132, 40.1%), smartphones (37/132, 28%), sensing technology (20/132, 15.1%), chatbots (5/132, 3.8%), and miscellaneous (17/132, 12.9%). Most initiatives were directed toward detection and diagnosis. Most papers discussed privacy, mainly in terms of respecting the privacy of research participants; there was relatively little discussion of privacy beyond this research context. A small number of studies discussed ethics directly (10/132, 7.6%) and indirectly (10/132, 7.6%). Legal issues were not substantively discussed in any studies, although some were mentioned in passing (7/132, 5.3%), such as the rights of user subjects and privacy law compliance.
Conclusions: Ethical and legal issues tend not to be explicitly addressed in empirical studies on algorithmic and data-driven technologies in mental health initiatives. Scholars may have considered ethical or legal matters at the ethics committee or institutional review board stage; if so, this consideration seldom appears in any detail in the published materials of applied research. The very form of peer-reviewed papers reporting applied research in this field may well preclude a substantial focus on ethics and law. Regardless, we identified several concerns, including the near-complete lack of involvement of mental health service users, the scant consideration of algorithmic accountability, and the potential for overmedicalization and techno-solutionism. Most papers were published in the computer science field at the pilot or exploratory stage. Thus, these technologies could be appropriated into practice in rarely acknowledged ways, with serious legal and ethical implications.
Affiliation(s)
- Piers Gooding: Melbourne Law School, University of Melbourne, Melbourne, Australia; Mozilla Foundation, Mountain View, CA, United States
- Timothy Kariotis: Melbourne School of Government, University of Melbourne, Melbourne, Australia
15. Chekroud AM, Bondar J, Delgadillo J, Doherty G, Wasil A, Fokkema M, Cohen Z, Belgrave D, DeRubeis R, Iniesta R, Dwyer D, Choi K. The promise of machine learning in predicting treatment outcomes in psychiatry. World Psychiatry 2021; 20:154-170. PMID: 34002503. PMCID: PMC8129866. DOI: 10.1002/wps.20882.
Abstract
For many years, psychiatrists have tried to understand the factors involved in response to medications or psychotherapies, in order to personalize their treatment choices. There is now a broad and growing interest in the idea that we can develop models to personalize treatment decisions using new statistical approaches from the field of machine learning and applying them to larger volumes of data. In this pursuit, there has been a paradigm shift away from experimental studies that confirm or refute specific hypotheses towards a focus on the overall explanatory power of a predictive model when tested on new, unseen datasets. In this paper, we review key studies using machine learning to predict treatment outcomes in psychiatry, ranging from medications and psychotherapies to digital interventions and neurobiological treatments. Next, we focus on some new sources of data that are being used for the development of predictive models based on machine learning, such as electronic health records, smartphone and social media data, and on the potential utility of data from genetics, electrophysiology, neuroimaging, and cognitive testing. Finally, we discuss how far the field has come towards implementing prediction tools in real-world clinical practice. Relatively few retrospective studies to date include appropriate external validation procedures, and there are even fewer prospective studies testing the clinical feasibility and effectiveness of predictive models. Applications of machine learning in psychiatry face some of the same ethical challenges posed by these techniques in other areas of medicine or computer science, which we discuss here. In short, machine learning is a nascent but important approach to improving the effectiveness of mental health care, and several prospective clinical studies suggest that it may already be working.
Affiliation(s)
- Adam M Chekroud: Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA; Spring Health, New York City, NY, USA
- Jaime Delgadillo: Clinical Psychology Unit, Department of Psychology, University of Sheffield, Sheffield, UK
- Gavin Doherty: School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
- Akash Wasil: Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Marjolein Fokkema: Department of Methods and Statistics, Institute of Psychology, Leiden University, Leiden, The Netherlands
- Zachary Cohen: Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, USA
- Robert DeRubeis: Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Raquel Iniesta: Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neurosciences, King's College London, London, UK
- Dominic Dwyer: Department of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University, Munich, Germany
- Karmel Choi: Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
16. Vaidyam AN, Linggonegoro D, Torous J. Changes to the Psychiatric Chatbot Landscape: A Systematic Review of Conversational Agents in Serious Mental Illness. Can J Psychiatry 2021; 66:339-348. PMID: 33063526. PMCID: PMC8172347. DOI: 10.1177/0706743720966429.
Abstract
Objective: The need for digital tools in mental health is clear, with insufficient access to mental health services. Conversational agents, also known as chatbots or voice assistants, are digital tools capable of holding natural language conversations. Since our last review in 2018, many new conversational agents and studies have emerged, and we aimed to reassess the conversational agent landscape in this updated systematic review.
Methods: A systematic literature search was conducted in January 2020 using the PubMed, Embase, PsycINFO, and Cochrane databases. Studies included were those that involved a conversational agent assessing serious mental illness: major depressive disorder, schizophrenia spectrum disorders, bipolar disorder, or anxiety disorder.
Results: Of the 247 references identified from the selected databases, 7 studies met the inclusion criteria. Overall, experiences with conversational agents were generally positive in regard to diagnostic quality, therapeutic efficacy, or acceptability. There continues to be, however, a lack of standard measures that allow easy comparison of studies in this space. Several populations lacked representation, such as the pediatric population and those with schizophrenia or bipolar disorder. While comparing 2018 to 2020 research offers useful insight into changes and growth, the high degree of heterogeneity between studies in this space makes direct comparison challenging.
Conclusions: This review revealed few but generally positive outcomes regarding conversational agents' diagnostic quality, therapeutic efficacy, and acceptability, which may augment mental health care. Despite the increase in research activity, there continues to be a lack of standard measures for evaluating conversational agents, as well as several neglected populations. We recommend that the standardization of conversational agent studies include patient adherence and engagement, therapeutic efficacy, and clinician perspectives.
Affiliation(s)
- Aditya Nrusimha Vaidyam: Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Danny Linggonegoro: Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- John Torous: Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
17. Robin J, Harrison JE, Kaufman LD, Rudzicz F, Simpson W, Yancheva M. Evaluation of Speech-Based Digital Biomarkers: Review and Recommendations. Digit Biomark 2020; 4:99-108. PMID: 33251474. DOI: 10.1159/000510820.
Abstract
Speech represents a promising novel biomarker, providing a window into brain health, as shown by its disruption in various neurological and psychiatric diseases. As with many novel digital biomarkers, however, rigorous evaluation is currently lacking and is required for these measures to be used effectively and safely. This paper outlines evaluation steps for speech-based digital biomarkers, with examples from the literature, based on the recent V3 framework (Goldsack et al., 2020). The V3 framework describes three components of evaluation for digital biomarkers: verification, analytical validation, and clinical validation. Verification includes assessing the quality of speech recordings and comparing the effects of hardware and recording conditions on the integrity of the recordings. Analytical validation includes checking the accuracy and reliability of data processing and computed measures, including understanding test-retest reliability and demographic variability and comparing measures to reference standards. Clinical validation involves verifying the correspondence of a measure to clinical outcomes, which can include diagnosis, disease progression, or response to treatment. For each of these components, we provide recommendations for the types of evaluation necessary for speech-based biomarkers and review published examples. The examples in this paper focus on speech-based biomarkers, but they can serve as a template for digital biomarker development more generally.
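As a concrete example of one analytical-validation step named above, a minimal sketch of test-retest reliability for a computed speech measure follows. The speech-rate values for ten speakers are invented placeholders:

```python
# A minimal sketch: test-retest reliability of a computed speech measure
# (words per second) across two recording sessions, via Pearson correlation.
from scipy.stats import pearsonr

session_1 = [3.1, 2.8, 4.0, 3.5, 2.2, 3.9, 3.0, 2.6, 3.3, 3.7]  # words/s
session_2 = [3.0, 2.9, 3.8, 3.6, 2.4, 4.0, 2.9, 2.7, 3.1, 3.8]

r, p = pearsonr(session_1, session_2)
print(f"test-retest r = {r:.2f} (p = {p:.3g})")
```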
Affiliation(s)
- John E Harrison: Metis Cognition Ltd., Park House, Kilmington Common, Warminster, United Kingdom; Alzheimer Center, AUmc, Amsterdam, The Netherlands; Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- Frank Rudzicz: Li Ka Shing Knowledge Institute, St Michael's Hospital, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- William Simpson: Winterlight Labs, Toronto, Ontario, Canada; Department of Psychiatry and Behavioural Neuroscience, McMaster University, Hamilton, Ontario, Canada
18. Provoost S, Kleiboer A, Ornelas J, Bosse T, Ruwaard J, Rocha A, Cuijpers P, Riper H. Improving adherence to an online intervention for low mood with a virtual coach: study protocol of a pilot randomized controlled trial. Trials 2020; 21:860. PMID: 33066805. PMCID: PMC7565359. DOI: 10.1186/s13063-020-04777-2.
Abstract
Background: Internet-based cognitive-behavioral therapy (iCBT) is more effective when it is guided by human support than when it is unguided. This may be attributable to higher adherence rates that result from a positive effect of the accompanying support on motivation and on engagement with the intervention. This protocol presents the design of a pilot randomized controlled trial that aims to start bridging the gap between guided and unguided interventions. It will test an intervention that includes automated support delivered by an embodied conversational agent (ECA) in the form of a virtual coach.
Methods/design: The study will employ a pilot two-armed randomized controlled trial design. The primary outcomes of the trial will be (1) the effectiveness of iCBT, as supported by a virtual coach, in terms of improved intervention adherence in comparison with unguided iCBT, and (2) the feasibility of a future, larger-scale trial in terms of recruitment, acceptability, and sample size calculation. Secondary aims will be to assess the virtual coach's effect on motivation, users' perceptions of the virtual coach, and the general feasibility of the intervention as supported by a virtual coach. We will recruit N = 70 participants from the general population who wish to learn how they can improve their mood by using Moodbuster Lite, a 4-week cognitive-behavioral therapy course. Candidates with symptoms of moderate to severe depression will be excluded from study participation. Included participants will be randomized in a 1:1 ratio to either (1) Moodbuster Lite with automated support delivered by a virtual coach or (2) Moodbuster Lite without automated support. Assessments will be taken at baseline and post-study 4 weeks later.
Discussion: The study will assess the preliminary effectiveness of a virtual coach in improving adherence and will determine the feasibility of a larger-scale RCT. It could represent a significant step in bridging the gap between guided and unguided iCBT interventions.
Trial registration: Netherlands Trial Register (NTR) NL8110. Registered on 23 October 2019.
Affiliation(s)
- Simon Provoost: Department of Clinical, Neuro- and Developmental Psychology, Clinical Psychology Section, VU University and Amsterdam Public Health Research Institute, Amsterdam, Netherlands
- Annet Kleiboer: Department of Clinical, Neuro- and Developmental Psychology, Clinical Psychology Section, VU University and Amsterdam Public Health Research Institute, Amsterdam, Netherlands
- José Ornelas: Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
- Tibor Bosse: Behavioural Science Institute, Radboud University, Nijmegen, Netherlands
- Jeroen Ruwaard: Department of Psychiatry, Amsterdam UMC, Location VU University Medical Centre, and Amsterdam Public Health Research Institute, Amsterdam, Netherlands; GGZ inGeest Specialized Mental Health Care, Amsterdam, Netherlands
- Artur Rocha: Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
- Pim Cuijpers: Department of Clinical, Neuro- and Developmental Psychology, Clinical Psychology Section, VU University and Amsterdam Public Health Research Institute, Amsterdam, Netherlands
- Heleen Riper: Department of Clinical, Neuro- and Developmental Psychology, Clinical Psychology Section, VU University and Amsterdam Public Health Research Institute, Amsterdam, Netherlands; Department of Psychiatry, Amsterdam UMC, Location VU University Medical Centre, and Amsterdam Public Health Research Institute, Amsterdam, Netherlands; GGZ inGeest Specialized Mental Health Care, Amsterdam, Netherlands