1
Valdes G, Scholey J, Nano TF, Gennatas ED, Mohindra P, Mohammed N, Zeng J, Kotecha R, Rosen LR, Chang J, Tsai HK, Urbanic JJ, Vargas CE, Yu NY, Ungar LH, Eaton E, Simone CB. Predicting the Effect of Proton Beam Therapy Technology on Pulmonary Toxicities for Patients With Locally Advanced Lung Cancer Enrolled in the Proton Collaborative Group Prospective Clinical Trial. Int J Radiat Oncol Biol Phys 2024; 119:66-77. [PMID: 38000701 DOI: 10.1016/j.ijrobp.2023.11.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 10/27/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023]
Abstract
PURPOSE: This study aimed to predict the probability of grade ≥2 pneumonitis or dyspnea within 12 months of receiving conventionally fractionated or mildly hypofractionated proton beam therapy for locally advanced lung cancer using machine learning.
METHODS AND MATERIALS: Demographic and treatment characteristics were analyzed for 965 consecutive patients treated for lung cancer with conventionally fractionated or mildly hypofractionated (2.2-3 Gy/fraction) proton beam therapy across 12 institutions. Three machine learning models (gradient boosting, additive tree, and logistic regression with lasso regularization) were implemented to predict Common Terminology Criteria for Adverse Events version 4 grade ≥2 pulmonary toxicities using double 10-fold cross-validation for hyperparameter tuning without information leakage. Balanced accuracy and area under the curve were calculated, and 95% confidence intervals were obtained using bootstrap sampling.
RESULTS: The median age of the patients was 70 years (range, 20-97), and they had predominantly stage IIIA or IIIB disease. They received a median dose of 60 Gy in 2 Gy/fraction, and 46.4% received concurrent chemotherapy. In total, 250 (25.9%) had grade ≥2 pulmonary toxicity. The probability of pulmonary toxicity was 0.08 for patients treated with pencil beam scanning and 0.34 for those treated with other techniques (P = 8.97e-13). Use of abdominal compression and breath hold were highly significant predictors of less toxicity (P = 2.88e-08). Higher total radiation delivered dose (P = .0182) and higher average dose to the ipsilateral lung (P = .0035) increased the likelihood of pulmonary toxicities. The gradient boosting model performed the best of the models tested, and when demographic and dosimetric features were combined, the area under the curve and balanced accuracy were 0.75 ± 0.02 and 0.67 ± 0.02, respectively. After analyzing performance versus the number of data points used for training, we observed that accuracy was limited by the number of observations.
CONCLUSIONS: In the largest analysis of prospectively enrolled patients with lung cancer assessing pulmonary toxicities from proton therapy to date, advanced machine learning methods revealed that pencil beam scanning, abdominal compression, and lower normal lung doses can lead to significantly lower probability of developing grade ≥2 pneumonitis or dyspnea.
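The abstract reports balanced accuracy with bootstrap confidence intervals. The following is a minimal sketch of how such an interval can be computed, not the authors' code: the labels and predictions are made-up toy data, the function names are hypothetical, and the percentile bootstrap is an assumption about the resampling scheme.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (tp / pos + tn / neg)


def bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for balanced accuracy (assumed scheme)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yt = [y_true[i] for i in idx]
        yp = [y_pred[i] for i in idx]
        if 0 < sum(yt) < n:  # skip resamples that lost one of the classes
            stats.append(balanced_accuracy(yt, yp))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = grade >=2 toxicity, 0 = no toxicity (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
point = balanced_accuracy(y_true, y_pred)
low, high = bootstrap_ci(y_true, y_pred)
```

Balanced accuracy is a natural choice here because only about a quarter of patients experienced toxicity, so plain accuracy would reward a model that always predicts "no toxicity".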
Affiliation(s)
- Gilmer Valdes
  - Department of Radiation Oncology, University of California, San Francisco, California
- Jessica Scholey
  - Department of Radiation Oncology, University of California, San Francisco, California
- Tomi F Nano
  - Department of Radiation Oncology, University of California, San Francisco, California
- Efstathios D Gennatas
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, California
- Pranshu Mohindra
  - University of Maryland School of Medicine and Maryland Proton Treatment Center, Baltimore, Maryland
- Nasir Mohammed
  - Northwestern Medicine Chicago Proton Center, Warrenville, Illinois
- Jing Zeng
  - University of Washington and Seattle Cancer Care Alliance Proton Therapy Center, Seattle, Washington
- Rupesh Kotecha
  - Department of Radiation Oncology, Miami Cancer Institute, Baptist Health South Florida, Miami, Florida
- Lane R Rosen
  - Willis-Knighton Medical Center, Shreveport, Louisiana
- John Chang
  - Oklahoma Proton Center, Oklahoma City, Oklahoma
- Henry K Tsai
  - New Jersey Procure Proton Therapy Center, Somerset, New Jersey
- James J Urbanic
  - Department of Radiation Oncology, California Protons Therapy Center, San Diego, California
- Carlos E Vargas
  - Department of Radiation Oncology, Mayo Clinic Proton Center, Phoenix, Arizona
- Nathan Y Yu
  - Department of Radiation Oncology, Mayo Clinic Proton Center, Phoenix, Arizona
- Lyle H Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania
- Eric Eaton
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania
- Charles B Simone
  - Department of Radiation Oncology, New York Proton Center, New York, New York
2
Jose R, Wang W, Sherman G, Rosenthal RN, Schwartz HA, Ungar LH, McKay JR. Tapping into alcohol use during COVID: Drinking correlates among bartenders and servers. PLoS One 2024; 19:e0300932. [PMID: 38625926 PMCID: PMC11020438 DOI: 10.1371/journal.pone.0300932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 03/06/2024] [Indexed: 04/18/2024]
Abstract
The COVID pandemic placed a spotlight on alcohol use and the hardships of working within the food and beverage industry, with millions left jobless. Following previous studies that have found elevated rates of alcohol problems among bartenders and servers, here we studied the alcohol use of bartenders and servers who were employed during COVID. From February 12 to June 16, 2021, in the midst of the U.S. COVID national emergency declaration, survey data from 1,010 employed bartenders and servers were analyzed to quantify rates of excessive or hazardous drinking along with regression predictors of alcohol use as assessed by the 10-item Alcohol Use Disorders Identification Test (AUDIT). Findings indicate that more than 2 out of 5 (44%) people surveyed reported moderate or high rates of alcohol problem severity (i.e., AUDIT scores of 8 or higher), a rate 4 to 6 times that of the heavy alcohol use rate reported pre- or mid-pandemic by adults within and outside the industry. Person-level factors (gender, substance use, mood) along with the drinking habits of one's core social group were significantly associated with alcohol use. Bartenders and servers reported surprisingly high rates of alcohol problem severity and experienced risk factors for hazardous drinking at multiple ecological levels. Because bartenders and servers are a highly vulnerable and understudied population, more studies are needed to assess and manage the true toll of alcohol consumption for industry employees.
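The cutoff quoted in the abstract (an AUDIT total of 8 or higher) can be illustrated with a short sketch. It assumes the standard AUDIT scoring of 10 items scored 0 to 4 each (total 0 to 40); the function names and responses are hypothetical toy data, not the study's survey records.

```python
AUDIT_ITEMS = 10
AUDIT_CUTOFF = 8  # "AUDIT scores of 8 or higher", as quoted in the abstract


def audit_total(item_scores):
    """Sum the 10 item scores (each 0-4) into a total between 0 and 40."""
    if len(item_scores) != AUDIT_ITEMS:
        raise ValueError("AUDIT has exactly 10 items")
    if any(score not in (0, 1, 2, 3, 4) for score in item_scores):
        raise ValueError("each AUDIT item is scored 0-4")
    return sum(item_scores)


def share_at_or_above_cutoff(responses, cutoff=AUDIT_CUTOFF):
    """Fraction of respondents whose AUDIT total meets the cutoff."""
    flagged = sum(1 for r in responses if audit_total(r) >= cutoff)
    return flagged / len(responses)


# Hypothetical respondents (illustrative only).
toy_responses = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # total 0
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],  # total 8, exactly at the cutoff
    [4, 3, 2, 1, 0, 0, 0, 0, 2, 0],  # total 12
    [2, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # total 3
]
rate = share_at_or_above_cutoff(toy_responses)
```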
Affiliation(s)
- Rupa Jose
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Weixi Wang
  - Department of Computer Science, Stony Brook University, Stony Brook, New York, United States of America
- Garrick Sherman
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Richard N. Rosenthal
  - Department of Psychiatry, Stony Brook University, Stony Brook, New York, United States of America
- H. Andrew Schwartz
  - Department of Computer Science, Stony Brook University, Stony Brook, New York, United States of America
- Lyle H. Ungar
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- James R. McKay
  - Philadelphia Crescenz Veterans Affairs Medical Center, Philadelphia, Pennsylvania, United States of America
  - Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
3
Rai S, Stade EC, Giorgi S, Francisco A, Ungar LH, Curtis B, Guntuku SC. Key language markers of depression on social media depend on race. Proc Natl Acad Sci U S A 2024; 121:e2319837121. [PMID: 38530887 PMCID: PMC10998627 DOI: 10.1073/pnas.2319837121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 01/31/2024] [Indexed: 03/28/2024]
Abstract
Depression has robust natural language correlates and can increasingly be measured in language using predictive models. However, despite evidence that language use varies as a function of individual demographic features (e.g., age, gender), previous work has not systematically examined whether and how depression's association with language varies by race. We examine how race moderates the relationship between language features (i.e., first-person pronouns and negative emotions) from social media posts and self-reported depression, in a matched sample of Black and White English speakers in the United States. Our findings reveal moderating effects of race: While depression severity predicts I-usage in White individuals, it does not in Black individuals. White individuals use more belongingness and self-deprecation-related negative emotions. Machine learning models trained on similar amounts of data to predict depression severity performed poorly when tested on Black individuals, even when they were trained exclusively using the language of Black individuals. In contrast, analogous models tested on White individuals performed relatively well. Our study reveals surprising race-based differences in the expression of depression in natural language and highlights the need to understand these effects better, especially before language-based models for detecting psychological phenomena are integrated into clinical practice.
Affiliation(s)
- Sunny Rai
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
- Elizabeth C. Stade
  - Institute for Human-Centered Artificial Intelligence, Stanford University, Stanford, CA 94305
- Salvatore Giorgi
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, MD 21224
- Ashley Francisco
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
- Lyle H. Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
- Brenda Curtis
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, MD 21224
- Sharath C. Guntuku
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
  - Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA 19104
4
Stade EC, Stirman SW, Ungar LH, Boland CL, Schwartz HA, Yaden DB, Sedoc J, DeRubeis RJ, Willer R, Eichstaedt JC. Large language models could change the future of behavioral healthcare: a proposal for responsible development and evaluation. Npj Ment Health Res 2024; 3:12. [PMID: 38609507 PMCID: PMC10987499 DOI: 10.1038/s44184-024-00056-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 01/30/2024] [Indexed: 04/14/2024]
Abstract
Large language models (LLMs) such as OpenAI's GPT-4 (which powers ChatGPT) and Google's Gemini, built on artificial intelligence, hold immense potential to support, augment, or even eventually automate psychotherapy. Enthusiasm about such applications is mounting in the field as well as industry. These developments promise to address insufficient mental healthcare system capacity and scale individual access to personalized treatments. However, clinical psychology is an uncommonly high stakes application domain for AI systems, as responsible and evidence-based therapy requires nuanced expertise. This paper provides a roadmap for the ambitious yet responsible application of clinical LLMs in psychotherapy. First, a technical overview of clinical LLMs is presented. Second, the stages of integration of LLMs into psychotherapy are discussed while highlighting parallels to the development of autonomous vehicle technology. Third, potential applications of LLMs in clinical care, training, and research are discussed, highlighting areas of risk given the complex nature of psychotherapy. Fourth, recommendations for the responsible development and evaluation of clinical LLMs are provided, which include centering clinical science, involving robust interdisciplinary collaboration, and attending to issues like assessment, risk detection, transparency, and bias. Lastly, a vision is outlined for how LLMs might enable a new generation of studies of evidence-based interventions at scale, and how these studies may challenge assumptions about psychotherapy.
Affiliation(s)
- Elizabeth C Stade
  - Dissemination and Training Division, National Center for PTSD, VA Palo Alto Health Care System, Palo Alto, CA, USA
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
  - Institute for Human-Centered Artificial Intelligence & Department of Psychology, Stanford University, Stanford, CA, USA
- Shannon Wiltsey Stirman
  - Dissemination and Training Division, National Center for PTSD, VA Palo Alto Health Care System, Palo Alto, CA, USA
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
- Lyle H Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Cody L Boland
  - Dissemination and Training Division, National Center for PTSD, VA Palo Alto Health Care System, Palo Alto, CA, USA
- H Andrew Schwartz
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- David B Yaden
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- João Sedoc
  - Department of Technology, Operations, and Statistics, New York University, New York, NY, USA
- Robert J DeRubeis
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Robb Willer
  - Department of Sociology, Stanford University, Stanford, CA, USA
- Johannes C Eichstaedt
  - Institute for Human-Centered Artificial Intelligence & Department of Psychology, Stanford University, Stanford, CA, USA
5
Stamatis CA, Meyerhoff J, Meng Y, Lin ZCC, Cho YM, Liu T, Karr CJ, Liu T, Curtis BL, Ungar LH, Mohr DC. Differential temporal utility of passively sensed smartphone features for depression and anxiety symptom prediction: a longitudinal cohort study. Npj Ment Health Res 2024; 3:1. [PMID: 38609548 PMCID: PMC10955925 DOI: 10.1038/s44184-023-00041-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/27/2023] [Accepted: 10/19/2023] [Indexed: 04/14/2024]
Abstract
While studies show links between smartphone data and affective symptoms, we lack clarity on the temporal scale, specificity (e.g., to depression vs. anxiety), and person-specific (vs. group-level) nature of these associations. We conducted a large-scale (n = 1013) smartphone-based passive sensing study to identify within- and between-person digital markers of depression and anxiety symptoms over time. Participants (74.6% female; M age = 40.9) downloaded the LifeSense app, which facilitated continuous passive data collection (e.g., GPS, app and device use, communication) across 16 weeks. Hierarchical linear regression models tested the within- and between-person associations of 2-week windows of passively sensed data with depression (PHQ-8) or generalized anxiety (GAD-7). We used a shifting window to understand the time scale at which sensed features relate to mental health symptoms, predicting symptoms 2 weeks in the future (distal prediction), 1 week in the future (medial prediction), and 0 weeks in the future (proximal prediction). Spending more time at home relative to one's average was an early signal of PHQ-8 severity (distal β = 0.219, p = 0.012) and continued to relate to PHQ-8 at medial (β = 0.198, p = 0.022) and proximal (β = 0.183, p = 0.045) windows. In contrast, circadian movement was proximally related to (β = -0.131, p = 0.035) but did not predict (distal β = 0.034, p = 0.577; medial β = -0.089, p = 0.138) PHQ-8. Distinct communication features (i.e., call/text or app-based messaging) related to PHQ-8 and GAD-7. Findings have implications for identifying novel treatment targets, personalizing digital mental health interventions, and enhancing traditional patient-provider interactions. Certain features (e.g., circadian movement) may represent correlates but not true prospective indicators of affective symptoms. 
Conversely, other features like home duration may be such early signals of intra-individual symptom change, indicating the potential utility of prophylactic intervention (e.g., behavioral activation) in response to person-specific increases in these signals.
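The within- versus between-person distinction and the shifting prediction window described in the abstract can be illustrated with a small sketch. It assumes one feature value and one symptom score per person per week, and all names and numbers are hypothetical; the study itself used 2-week windows and hierarchical linear regression, which this sketch does not reproduce.

```python
from statistics import mean


def decompose(series_by_person):
    """Split each person's weekly feature into a between-person part
    (their own average) and within-person deviations from that average."""
    out = {}
    for pid, values in series_by_person.items():
        person_mean = mean(values)
        out[pid] = {
            "between": person_mean,
            "within": [v - person_mean for v in values],
        }
    return out


def lagged_pairs(feature, symptom, lag_weeks):
    """Pair the feature at week t with the symptom score at week t + lag.
    lag 2 ~ distal, lag 1 ~ medial, lag 0 ~ proximal (per the abstract)."""
    return [
        (feature[t], symptom[t + lag_weeks])
        for t in range(len(feature) - lag_weeks)
    ]


# Hypothetical weekly hours at home for two participants.
home_hours = {"p1": [10, 12, 14, 12], "p2": [20, 20, 22, 18]}
parts = decompose(home_hours)

# Hypothetical weekly PHQ-8 scores for participant p1.
phq8_p1 = [4, 5, 7, 9]
distal = lagged_pairs(parts["p1"]["within"], phq8_p1, lag_weeks=2)
proximal = lagged_pairs(parts["p1"]["within"], phq8_p1, lag_weeks=0)
```

Centering each person on their own average is what lets a statement such as "more time at home relative to one's average" be tested separately from differences between people.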
Affiliation(s)
- Caitlin A Stamatis
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Jonah Meyerhoff
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Yixuan Meng
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Zhi Chong Chris Lin
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Young Min Cho
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
- Tony Liu
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
  - Roblox Corporation, San Mateo, CA, USA
- Tingting Liu
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Bethesda, MD, USA
- Brenda L Curtis
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Bethesda, MD, USA
- Lyle H Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
- David C Mohr
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
6
Wu T, Sherman G, Giorgi S, Thanneeru P, Ungar LH, Kamath PS, Simonetto DA, Curtis BL, Shah VH. Smartphone sensor data estimate alcohol craving in a cohort of patients with alcohol-associated liver disease and alcohol use disorder. Hepatol Commun 2023; 7:e0329. [PMID: 38055637 PMCID: PMC10984664 DOI: 10.1097/hc9.0000000000000329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Accepted: 09/22/2023] [Indexed: 12/08/2023]
Abstract
BACKGROUND: Sensors within smartphones, such as accelerometer and location, can describe longitudinal markers of behavior as represented through devices in a method called digital phenotyping. This study aimed to assess the feasibility of digital phenotyping for patients with alcohol-associated liver disease and alcohol use disorder, determine correlations between smartphone data and alcohol craving, and establish power assessment for future studies to prognosticate clinical outcomes.
METHODS: A total of 24 individuals with alcohol-associated liver disease and alcohol use disorder were instructed to download the AWARE application to collect continuous sensor data and complete daily ecological momentary assessments on alcohol craving and mood for up to 30 days. Data from sensor streams were processed into features like accelerometer magnitude, number of calls, and location entropy, which were used for statistical analysis. We used repeated measures correlation for longitudinal data to evaluate associations between sensors and ecological momentary assessments and standard Pearson correlation to evaluate within-individual relationships between sensors and craving.
RESULTS: Alcohol craving significantly correlated with mood obtained from ecological momentary assessments. Across all sensors, features associated with craving were also significantly correlated with all moods (eg, loneliness and stress) except boredom. Individual-level analysis revealed significant relationships between craving and features of location entropy and average accelerometer magnitude.
CONCLUSIONS: Smartphone sensors may serve as markers for alcohol craving and mood in alcohol-associated liver disease and alcohol use disorder. Findings suggest that location-based and accelerometer-based features may be associated with alcohol craving. However, data missingness and low participant retention remain challenges. Future studies are needed for further digital phenotyping of relapse risk and progression of liver disease.
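Location entropy, one of the sensor features named in the abstract, is commonly computed as the Shannon entropy of the share of observations at each distinct place. The abstract does not give the exact definition used, so the sketch below is an assumption: it treats each GPS sample as already labeled with a place cluster and samples as equally spaced in time. The function name and labels are hypothetical.

```python
import math
from collections import Counter


def location_entropy(place_labels):
    """Shannon entropy (in nats) of the distribution of samples over places.
    0 means all samples fall in one place; higher means time is spread
    more evenly across more places."""
    counts = Counter(place_labels)
    total = sum(counts.values())
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy += p * math.log(1 / p)
    return entropy


# Hypothetical day of place-labeled GPS samples (illustrative only).
stayed_home = ["home"] * 8
varied_day = ["home", "home", "work", "work", "gym", "gym", "store", "store"]

h_home = location_entropy(stayed_home)
h_varied = location_entropy(varied_day)
```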
Affiliation(s)
- Tiffany Wu
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
- Garrick Sherman
  - National Institute on Drug Abuse Intramural Research Program, National Institute of Health Baltimore, Maryland, USA
- Salvatore Giorgi
  - National Institute on Drug Abuse Intramural Research Program, National Institute of Health Baltimore, Maryland, USA
- Priya Thanneeru
  - Department of Medicine and Pediatrics, The Brooklyn Hospital Center, Brooklyn, New York, USA
- Lyle H. Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Patrick S. Kamath
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
- Douglas A. Simonetto
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
- Brenda L. Curtis
  - National Institute on Drug Abuse Intramural Research Program, National Institute of Health Baltimore, Maryland, USA
- Vijay H. Shah
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
7
Stamatis CA, Liu T, Meyerhoff J, Meng Y, Cho YM, Karr CJ, Curtis BL, Ungar LH, Mohr DC. Specific associations of passively sensed smartphone data with future symptoms of avoidance, fear, and physiological distress in social anxiety. Internet Interv 2023; 34:100683. [PMID: 37867614 PMCID: PMC10589746 DOI: 10.1016/j.invent.2023.100683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 09/21/2023] [Accepted: 10/11/2023] [Indexed: 10/24/2023]
Abstract
Background: Prior literature links passively sensed information about a person's location, movement, and communication with social anxiety. These findings hold promise for identifying novel treatment targets, informing clinical care, and personalizing digital mental health interventions. However, social anxiety symptoms are heterogeneous; to identify more precise targets and tailor treatments, there is a need for personal sensing studies aimed at understanding differential predictors of the distinct subdomains of social anxiety. Our objective was to conduct a large-scale smartphone-based sensing study of fear, avoidance, and physiological symptoms in the context of trait social anxiety over time.
Methods: Participants (n = 1013; 74.6% female; M age = 40.9) downloaded the LifeSense app, which collected continuous passive data (e.g., GPS, communication, app and device use) over 16 weeks. We tested a series of multilevel linear regression models to understand within- and between-person associations of 2-week windows of passively sensed smartphone data with fear, avoidance, and physiological distress on the self-reported Social Phobia Inventory (SPIN). A shifting sensor lag was applied to examine how smartphone features related to SPIN subdomains 2 weeks in the future (distal prediction), 1 week in the future (medial prediction), and 0 weeks in the future (proximal prediction).
Results: A decrease in time visiting novel places was a strong between-person predictor of social avoidance over time (distal β = -0.886, p = .002; medial β = -0.647, p = .029; proximal β = -0.818, p = .007). Reductions in call- and text-based communications were associated with social avoidance at both the between- (distal β = -0.882, p = .002; medial β = -0.932, p = .001; proximal β = -0.918, p = .001) and within- (distal β = -0.191, p = .046; medial β = -0.213, p = .028) person levels, as well as between-person fear of social situations (distal β = -0.860, p < .001; medial β = -0.892, p < .001; proximal β = -0.886, p < .001) over time. There were fewer significant associations of sensed data with physiological distress. Across the three subscales, smartphone data explained 9-12% of the variance in social anxiety.
Conclusion: Findings have implications for understanding how social anxiety manifests in daily life, and for personalizing treatments. For example, a signal that someone is likely to begin avoiding social situations may suggest a need for alternative types of exposure-based interventions compared to a signal that someone is likely to begin experiencing increased physiological distress. Our results suggest that as a prophylactic means of targeting social avoidance, it may be helpful to deploy interventions involving social exposures in response to decreases in time spent visiting novel places.
Affiliation(s)
- Caitlin A. Stamatis
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, United States of America
- Tingting Liu
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, United States of America
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Bethesda, MD, United States of America
- Jonah Meyerhoff
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, United States of America
- Yixuan Meng
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States of America
- Young Min Cho
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, United States of America
- Chris J. Karr
  - Audacious Software, Chicago, IL, United States of America
- Brenda L. Curtis
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Bethesda, MD, United States of America
- Lyle H. Ungar
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, United States of America
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States of America
- David C. Mohr
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, United States of America
8
Sametoğlu S, Pelt DHM, Eichstaedt JC, Ungar LH, Bartels M. Comparison of wellbeing structures based on survey responses and social media language: A network analysis. Appl Psychol Health Well Being 2023; 15:1555-1582. [PMID: 37161901 DOI: 10.1111/aphw.12451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Accepted: 04/07/2023] [Indexed: 05/11/2023]
Abstract
Wellbeing is predominantly measured through surveys but is increasingly measured by analysing individuals' language on social media platforms using social media text mining (SMTM). To investigate whether the structure of wellbeing is similar across both data collection methods, we compared networks derived from survey items and social media language features collected from the same participants. The dataset was split into an independent exploration (n = 1169) and a final subset (n = 1000). After estimating exploration networks, redundant survey items and language topics were eliminated. Final networks were then estimated using exploratory graph analysis (EGA). The networks of survey items and those from language topics were similar, both consisting of five wellbeing dimensions. The dimensions in the survey- and SMTM-based assessment of wellbeing showed convergent structures congruent with theories of wellbeing. Specific dimensions found in each network reflected the unique aspects of each type of data (survey and social media language). Networks derived from both language features and survey items show similar structures. Survey and SMTM methods may provide complementary methods to understand differences in human wellbeing.
Affiliation(s)
- Selim Sametoğlu
  - Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Amsterdam Public Health Research Institute, Amsterdam University Medical Centers, Amsterdam, The Netherlands
- Dirk H M Pelt
  - Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Amsterdam Public Health Research Institute, Amsterdam University Medical Centers, Amsterdam, The Netherlands
- Johannes C Eichstaedt
  - Department of Psychology, Stanford University, Stanford, California, USA
  - Institute for Human-Centered AI, Stanford University, Stanford, California, USA
- Lyle H Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Meike Bartels
  - Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Amsterdam Public Health Research Institute, Amsterdam University Medical Centers, Amsterdam, The Netherlands
9
Giorgi S, Eichstaedt JC, Preoţiuc-Pietro D, Gardner JR, Schwartz HA, Ungar LH. Filling in the white space: Spatial interpolation with Gaussian processes and social media data. Curr Res Ecol Soc Psychol 2023; 5:100159. [PMID: 38125747 PMCID: PMC10732585 DOI: 10.1016/j.cresp.2023.100159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Full national coverage below the state level is difficult to attain through survey-based data collection. Even the largest survey-based data collections, such as the CDC's Behavioral Risk Factor Surveillance System or the Gallup-Healthways Well-being Index (both with more than 300,000 responses per year), only allow for the estimation of annual averages for about 260 of roughly 3,000 U.S. counties when a threshold of 300 responses per county is used. Using a relatively high threshold of 300 responses gives substantially higher convergent validity (higher correlations with health variables) than lower thresholds but covers a reduced and biased sample of the population. We present principled methods to interpolate spatial estimates and show that including large-scale geotagged social media data can increase interpolation accuracy. In this work, we focus on Gallup-reported life satisfaction, a widely used measure of subjective well-being. We use Gaussian Processes (GP), a formal Bayesian model, to interpolate life satisfaction, which we optimally combine with estimates from low-count data. We interpolate over several spaces (geographic and socioeconomic) and extend these evaluations to the space created by variables encoding language frequencies of approximately 6 million geotagged Twitter users. We find that Twitter language use can serve as a rough aggregate measure of socioeconomic and cultural similarity, and improves upon estimates derived from a wide variety of socioeconomic, demographic, and geographic similarity measures. We show that applying Gaussian Processes to the limited Gallup data allows us to generate estimates for a much larger number of counties while maintaining the same level of convergent validity with external criteria (i.e., N = 1,133 vs. 2,954 counties).
This work suggests that spatial coverage of psychological variables can be reliably extended through Bayesian techniques while maintaining out-of-sample prediction accuracy and that Twitter language adds important information about cultural similarity over and above traditional socio-demographic and geographic similarity measures. Finally, to facilitate the adoption of these methods, we have also open-sourced an online tool that researchers can freely use to interpolate their data across geographies.
Affiliation(s)
- Salvatore Giorgi
  - Department of Computer and Information Science, University of Pennsylvania, United States of America
- Johannes C. Eichstaedt
  - Department of Psychology & Institute for Human-Centered AI, Stanford University, United States of America
- Jacob R. Gardner
  - Department of Computer and Information Science, University of Pennsylvania, United States of America
- H. Andrew Schwartz
  - Department of Computer Science, Stony Brook University, United States of America
- Lyle H. Ungar
  - Department of Computer and Information Science, University of Pennsylvania, United States of America

10
Meyerhoff J, Liu T, Stamatis CA, Liu T, Wang H, Meng Y, Curtis B, Karr CJ, Sherman G, Ungar LH, Mohr DC. Analyzing text message linguistic features: Do people with depression communicate differently with their close and non-close contacts? Behav Res Ther 2023; 166:104342. [PMID: 37269650 PMCID: PMC10330918 DOI: 10.1016/j.brat.2023.104342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Revised: 03/20/2023] [Accepted: 05/26/2023] [Indexed: 06/05/2023]
Abstract
BACKGROUND Relatively little is known about how communication changes as a function of depression severity and interpersonal closeness. We examined the linguistic features of outgoing text messages among individuals with depression and their close- and non-close contacts. METHODS 419 participants were included in this 16-week-long observational study. Participants regularly completed the PHQ-8 and rated subjective closeness to their contacts. Text messages were processed to count frequencies of word usage in the LIWC 2015 libraries. A linear mixed modeling approach was used to estimate linguistic feature scores of outgoing text messages. RESULTS Regardless of closeness, people with higher PHQ-8 scores tended to use more differentiation words. When texting with close contacts, individuals with higher PHQ-8 scores used more first-person singular, filler, sexual, anger, and negative emotion words. When texting with non-close contacts these participants used more conjunctions, tentative, and sadness-related words and fewer first-person plural words. CONCLUSION Word classes used in text messages, when combined with symptom severity and subjective social closeness data, may be indicative of underlying interpersonal processes. These data may hold promise as potential treatment targets to address interpersonal drivers of depression.
Affiliation(s)
- Jonah Meyerhoff
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Tingting Liu
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, MD, USA
- Caitlin A Stamatis
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Tony Liu
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
  - Roblox, San Mateo, CA, USA
- Harry Wang
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Yixuan Meng
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Brenda Curtis
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, MD, USA
- Garrick Sherman
  - National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, MD, USA
- Lyle H Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- David C Mohr
  - Department of Preventive Medicine, Center for Behavioral Intervention Technologies (CBITs), Feinberg School of Medicine, Northwestern University, Chicago, IL, USA

11
Giorgi S, Yaden DB, Eichstaedt JC, Ungar LH, Schwartz HA, Kwarteng A, Curtis B. Predicting U.S. county opioid poisoning mortality from multi-modal social media and psychological self-report data. Sci Rep 2023; 13:9027. [PMID: 37270657 DOI: 10.1038/s41598-023-34468-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/30/2022] [Accepted: 04/30/2023] [Indexed: 06/05/2023]
Abstract
Opioid poisoning mortality is a substantial public health crisis in the United States, with opioids involved in approximately 75% of the nearly 1 million drug related deaths since 1999. Research suggests that the epidemic is driven by both over-prescribing and social and psychological determinants such as economic stability, hopelessness, and isolation. Hindering this research is a lack of measurements of these social and psychological constructs at fine-grained spatial and temporal resolutions. To address this issue, we use a multi-modal data set consisting of natural language from Twitter, psychometric self-reports of depression and well-being, and traditional area-based measures of socio-demographics and health-related risk factors. Unlike previous work using social media data, we do not rely on opioid or substance related keywords to track community poisonings. Instead, we leverage a large, open vocabulary of thousands of words in order to fully characterize communities suffering from opioid poisoning, using a sample of 1.5 billion tweets from 6 million U.S. county mapped Twitter users. Results show that Twitter language predicted opioid poisoning mortality better than factors relating to socio-demographics, access to healthcare, physical pain, and psychological well-being. Additionally, risk factors revealed by the Twitter language analysis included negative emotions, discussions of long work hours, and boredom, whereas protective factors included resilience, travel/leisure, and positive emotions, dovetailing with results from the psychometric self-report data. The results show that natural language from public social media can be used as a surveillance tool for both predicting community opioid poisonings and understanding the dynamic social and psychological nature of the epidemic.
Affiliation(s)
- Salvatore Giorgi
  - National Institute on Drug Abuse, Intramural Research Program, Baltimore, MD, USA
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- David B Yaden
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johannes C Eichstaedt
  - Department of Psychology, Stanford University, Stanford, CA, USA
  - Institute for Human-Centered AI, Stanford University, Stanford, CA, USA
- Lyle H Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- H Andrew Schwartz
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- Amy Kwarteng
  - National Institute on Drug Abuse, Intramural Research Program, Baltimore, MD, USA
- Brenda Curtis
  - National Institute on Drug Abuse, Intramural Research Program, Baltimore, MD, USA

12
Matero M, Giorgi S, Curtis B, Ungar LH, Schwartz HA. Opioid death projections with AI-based forecasts using social media language. NPJ Digit Med 2023; 6:35. [PMID: 36882633 PMCID: PMC9992514 DOI: 10.1038/s41746-023-00776-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/08/2022] [Accepted: 02/13/2023] [Indexed: 03/09/2023]
Abstract
Targeting of location-specific aid for the U.S. opioid epidemic is difficult due to our inability to accurately predict changes in opioid mortality across heterogeneous communities. AI-based language analyses, having recently shown promise in cross-sectional (between-community) well-being assessments, may offer a way to more accurately longitudinally predict community-level overdose mortality. Here, we develop and evaluate TROP (Transformer for Opioid Prediction), a model for community-specific trend projection that uses community-specific social media language along with past opioid-related mortality data to predict future changes in opioid-related deaths. TROP builds on recent advances in sequence modeling, namely transformer networks, to use changes in yearly language on Twitter and past mortality to project the following year's mortality rates by county. Trained over five years and evaluated over the next two years, TROP demonstrated state-of-the-art accuracy in predicting future county-specific opioid trends. A model built using linear auto-regression and traditional socioeconomic data gave 7% error (MAPE) or within 2.93 deaths per 100,000 people on average; our proposed architecture was able to forecast yearly death rates with less than half that error: 3% MAPE and within 1.15 per 100,000 people.
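The two error figures quoted in this abstract, MAPE and the average absolute error in deaths per 100,000 people, can be illustrated with a minimal sketch. The function names and the county rates below are hypothetical and are not code or data from the study; they only show how such metrics are typically computed.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mean_abs_error(actual, predicted):
    """Mean absolute error, in the same units as the inputs."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical yearly death rates per 100,000 people for three counties.
    observed = [12.0, 30.0, 45.0]
    forecast = [11.5, 31.0, 44.0]
    print(f"MAPE: {mape(observed, forecast):.2f}%")
    print(f"MAE:  {mean_abs_error(observed, forecast):.2f} per 100,000")
```

MAPE scales each error by the observed rate, so it is comparable across counties with very different mortality levels, whereas the absolute error stays in deaths per 100,000.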
Affiliation(s)
- Matthew Matero
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- Salvatore Giorgi
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
  - National Institute on Drug Abuse, National Institutes of Health, Baltimore, MD, USA
- Brenda Curtis
  - National Institute on Drug Abuse, National Institutes of Health, Baltimore, MD, USA
- Lyle H Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- H Andrew Schwartz
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, USA

13
Weissman GE, Ungar LH, Halpern SD. Chess Lessons: Harnessing Collective Human Intelligence and Imitation Learning to Support Clinical Decisions. Ann Intern Med 2023; 176:274-275. [PMID: 36716453 DOI: 10.7326/m22-2998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Affiliation(s)
- Gary E Weissman
  - Palliative and Advanced Illness Research (PAIR) Center, and Pulmonary, Allergy, and Critical Care Division, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania (G.E.W., S.D.H.)
- Lyle H Ungar
  - Department of Computer and Information Science and Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania (L.H.U.)
- Scott D Halpern
  - Palliative and Advanced Illness Research (PAIR) Center, and Pulmonary, Allergy, and Critical Care Division, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania (G.E.W., S.D.H.)

14
Southwick DA, Liu ZV, Baldwin C, Quirk AL, Ungar LH, Tsay CJ, Duckworth AL. The trouble with talent: Semantic ambiguity in the workplace. Organizational Behavior and Human Decision Processes 2023. [DOI: 10.1016/j.obhdp.2022.104223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2023]

15
Stamatis CA, Meyerhoff J, Liu T, Sherman G, Wang H, Liu T, Curtis B, Ungar LH, Mohr DC. Prospective associations of text-message-based sentiment with symptoms of depression, generalized anxiety, and social anxiety. Depress Anxiety 2022; 39:794-804. [PMID: 36281621 PMCID: PMC9729432 DOI: 10.1002/da.23286] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/04/2022] [Revised: 09/16/2022] [Accepted: 10/02/2022] [Indexed: 01/27/2023]
Abstract
OBJECTIVE Language patterns may elucidate mechanisms of mental health conditions. To inform underlying theory and risk models, we evaluated prospective associations between in vivo text messaging language and differential symptoms of depression, generalized anxiety, and social anxiety. METHODS Over 16 weeks, we collected outgoing text messages from 335 adults. Using Linguistic Inquiry and Word Count (LIWC), NRC Emotion Lexicon, and previously established depression and stress dictionaries, we evaluated the degree to which language features predict symptoms of depression, generalized anxiety, or social anxiety the following week using hierarchical linear models. To isolate the specificity of language effects, we also controlled for the effects of the two other symptom types. RESULTS We found significant relationships of language features, including personal pronouns, negative emotion, cognitive and biological processes, and informal language, with common mental health conditions, including depression, generalized anxiety, and social anxiety (ps < .05). There was substantial overlap between language features and the three mental health outcomes. However, after controlling for other symptoms in the models, depressive symptoms were uniquely negatively associated with language about anticipation, trust, social processes, and affiliation (βs: -.10 to -.09, ps < .05), whereas generalized anxiety symptoms were positively linked with these same language features (βs: .12-.13, ps < .001). Social anxiety symptoms were uniquely associated with anger, sexual language, and swearing (βs: .12-.13, ps < .05). CONCLUSION Language that confers both common (e.g., personal pronouns and negative emotion) and specific (e.g., affiliation, anticipation, trust, and anger) risk for affective disorders is perceptible in prior week text messages, holding promise for understanding cognitive-behavioral mechanisms and tailoring digital interventions.
Affiliation(s)
- Caitlin A. Stamatis
  - Center for Behavioral Intervention Technologies, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Jonah Meyerhoff
  - Center for Behavioral Intervention Technologies, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Tingting Liu
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, Maryland, USA
- Garrick Sherman
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Harry Wang
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Tony Liu
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Roblox, San Mateo, California, USA
- Brenda Curtis
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, Maryland, USA
- Lyle H. Ungar
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- David C. Mohr
  - Center for Behavioral Intervention Technologies, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

16
Liu T, Ungar LH, Curtis B, Sherman G, Yadeta K, Tay L, Eichstaedt JC, Guntuku SC. Head versus heart: social media reveals differential language of loneliness from depression. Npj Ment Health Res 2022; 1:16. [PMID: 38609477 PMCID: PMC10955894 DOI: 10.1038/s44184-022-00014-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/09/2022] [Accepted: 09/12/2022] [Indexed: 04/14/2024]
Abstract
We study the language differentially associated with loneliness and depression using 3.4-million Facebook posts from 2986 individuals, and uncover the statistical associations of survey-based depression and loneliness with both dictionary-based (Linguistic Inquiry Word Count 2015) and open-vocabulary linguistic features (words, phrases, and topics). Loneliness and depression were found to have highly overlapping language profiles, including sickness, pain, and negative emotions as (cross-sectional) risk factors, and social relationships and activities as protective factors. Compared to depression, the language associated with loneliness reflects a stronger cognitive focus, including more references to cognitive processes (i.e., differentiation and tentative language, thoughts, and the observation of irregularities), and cognitive activities like reading and writing. As might be expected, less lonely users were more likely to reference social relationships (e.g., friends and family, romantic relationships), and use first-person plural pronouns. Our findings suggest that the mechanisms of loneliness include self-oriented cognitive activities (i.e., reading) and an overattention to the interpretation of information in the environment. These data-driven ecological findings suggest interventions for loneliness that target maladaptive social cognitions (e.g., through reframing the perception of social environments), strengthen social relationships, and treat other affective distress (i.e., depression).
Affiliation(s)
- Tingting Liu
  - National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, MD, USA
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
- Lyle H Ungar
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Brenda Curtis
  - National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, MD, USA
- Garrick Sherman
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
- Kenna Yadeta
  - National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), Baltimore, MD, USA
- Louis Tay
  - Department of Psychological Sciences, Purdue University, West Lafayette, IN, USA
- Johannes C Eichstaedt
  - Department of Psychology, Institute for Human-Centered A.I., Stanford University, Stanford, CA, USA
- Sharath Chandra Guntuku
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA

17
Stamatis CA, Meyerhoff J, Liu T, Hou Z, Sherman G, Curtis BL, Ungar LH, Mohr DC. The association of language style matching in text messages with mood and anxiety symptoms. Procedia Comput Sci 2022; 206:151-161. [PMID: 36567869 PMCID: PMC9784681 DOI: 10.1016/j.procs.2022.09.094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2022]
Abstract
Context Impairment in social functioning is a feature and consequence of depression and anxiety disorders. For example, in depression, anhedonia and negative feelings about the self may impact relationships; in anxiety, fear of negative evaluation may interfere with getting close to others. It is unknown whether social impairment associated with depression and anxiety symptoms is reflected in day-to-day language exchanges with others, such as through reduced language style matching (LSM). Methods Over 16 weeks, we collected text message data from 458 adults and evaluated differences in LSM between people with average scores above/below the clinical cutoff for depression, generalized anxiety, and social anxiety in text message conversations. Text message sentiment scores were computed across 73 Linguistic Inquiry and Word Count (LIWC) categories for each participant. T-tests were used to compare LSM across two groups (average scores above/below clinical cutoff) for each of the 3 diagnostic categories (depression, generalized anxiety, social anxiety), and each of the 73 LIWC categories, with correction for multiple comparisons. Results We found reduced LSM of function words (namely, prepositions [t=-2.82, p=.032], articles [t=-5.26, p<.001], and auxiliary verbs [t=-2.64, p=.046]) in people with average scores above the clinical cutoff for generalized anxiety, and reduced LSM of prepositions (t=-4.26, p<.001) and articles (t=-3.39, p=.010) in people with average scores above the clinical cutoff for social anxiety. There were no significant differences in LSM of function words between people with average scores above and below the clinical cutoff for depression. 
Across all symptom categories, elevated affective psychopathology was associated with being more likely to style match on formality, including netspeak (generalized anxiety, t=5.77, p<.001; social anxiety, t=4.14, p<.001; depression, t=3.13, p=.021) and informal language (generalized anxiety, t=6.65, p<.001; social anxiety, t=5.14, p<.001; depression, t=3.20, p=.020). We also observed content-specific LSM differences across the three groups. Conclusions Reduced LSM of function words among patients reporting elevated anxiety symptoms suggests that anxiety-related psychosocial difficulties may be perceptible in subtle cues from day-to-day language. Conversely, the absence of differences in the LSM of function words among people with average scores above and below the clinical cutoff for depression indicates a potentially distinct mechanism of social impairment. Implications Results point to potential markers of psychosocial difficulties in daily conversations, particularly among those experiencing heightened anxiety symptoms. Future studies may consider the degree to which LSM is associated with self-reported psychosocial impairment, with the promise of informing cognitive-behavioral mechanisms and tailoring digital interventions for social skills.
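The abstract does not state how LSM was scored. One commonly used formulation compares, for each word category, the percentage of words that two speakers devote to that category. The sketch below assumes that formulation; the function names, the smoothing constant, and the sample percentages are illustrative assumptions, not details taken from the study.

```python
def lsm_score(pct_a, pct_b, eps=1e-4):
    """Style-matching score for one word category.

    pct_a and pct_b are the percentages of each speaker's words that
    fall in the category. The score approaches 1 when usage is similar
    and approaches 0 when only one speaker uses the category. The small
    eps avoids division by zero when neither speaker uses it.
    """
    if pct_a < 0 or pct_b < 0:
        raise ValueError("percentages must be non-negative")
    return 1.0 - abs(pct_a - pct_b) / (pct_a + pct_b + eps)


def composite_lsm(category_pcts):
    """Average the per-category scores.

    category_pcts maps a category name to a (pct_a, pct_b) pair.
    """
    if not category_pcts:
        raise ValueError("at least one category is required")
    scores = [lsm_score(a, b) for a, b in category_pcts.values()]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical usage percentages for a participant and a contact.
    sample = {
        "articles": (6.2, 4.1),
        "prepositions": (12.5, 11.9),
        "auxiliary_verbs": (8.0, 8.3),
    }
    for name, (a, b) in sample.items():
        print(f"{name}: {lsm_score(a, b):.3f}")
    print(f"composite: {composite_lsm(sample):.3f}")
```

Under this scoring, group comparisons such as the t-tests reported above would operate on per-participant scores for each category.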
Affiliation(s)
- Caitlin A. Stamatis
  - Center for Behavioral Intervention Technologies, Northwestern University Feinberg School of Medicine, 750 N. Lake Shore Dr., 10th Floor Chicago, IL 60611, USA
- Jonah Meyerhoff
  - Center for Behavioral Intervention Technologies, Northwestern University Feinberg School of Medicine, 750 N. Lake Shore Dr., 10th Floor Chicago, IL 60611, USA
- Tingting Liu
  - Positive Psychology Center, University of Pennsylvania, 3701 Market St, Philadelphia, PA 19104, USA
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), 251 Bayview Blvd., Suite 200, Baltimore, MD, 21224, USA
- Zhaoyi Hou
  - Department of Computer and Information Science, University of Pennsylvania, 3330 Walnut St, Philadelphia, PA 19104, USA
- Garrick Sherman
  - Positive Psychology Center, University of Pennsylvania, 3701 Market St, Philadelphia, PA 19104, USA
- Brenda L. Curtis
  - Technology & Translational Research Unit, National Institute on Drug Abuse (NIDA IRP), National Institutes of Health (NIH), 251 Bayview Blvd., Suite 200, Baltimore, MD, 21224, USA
- Lyle H. Ungar
  - Positive Psychology Center, University of Pennsylvania, 3701 Market St, Philadelphia, PA 19104, USA
  - Department of Computer and Information Science, University of Pennsylvania, 3330 Walnut St, Philadelphia, PA 19104, USA
- David C. Mohr
  - Center for Behavioral Intervention Technologies, Northwestern University Feinberg School of Medicine, 750 N. Lake Shore Dr., 10th Floor Chicago, IL 60611, USA

18
Liu T, Giorgi S, Yadeta K, Schwartz HA, Ungar LH, Curtis B. Linguistic predictors from Facebook postings of substance use disorder treatment retention versus discontinuation. Am J Drug Alcohol Abuse 2022; 48:573-585. [PMID: 35853250 PMCID: PMC10231268 DOI: 10.1080/00952990.2022.2091450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Revised: 06/02/2022] [Accepted: 06/15/2022] [Indexed: 01/31/2023]
Abstract
Background: Early indicators of who will remain in - or leave - treatment for substance use disorder (SUD) can drive targeted interventions to support long-term recovery. Objectives: To conduct a comprehensive study of linguistic markers of SUD treatment outcomes, the current study integrated features produced by machine learning models known to have social-psychology relevance. Methods: We extracted and analyzed linguistic features from participants' Facebook posts (N = 206, 39.32% female; 55,415 postings) over the two years before they entered a SUD treatment program. Exploratory features produced by both Linguistic Inquiry and Word Count (LIWC) and Latent Dirichlet Allocation (LDA) topic modeling and the features from theoretical domains of religiosity, affect, and temporal orientation via established AI-based linguistic models were utilized. Results: Patients who stayed in the SUD treatment for over 90 days used more words associated with religion, positive emotions, family, affiliations, and the present, and used more first-person singular pronouns (Cohen's d values: [-0.39, -0.57]). Patients who discontinued their treatment before 90 days discussed more diverse topics, focused on the past, and used more articles (Cohen's d values: [0.44, 0.57]). All ps < .05 with Benjamini-Hochberg False Discovery Rate correction. Conclusions: We confirmed the literature on protective and risk social-psychological factors linking to SUD treatment in language analysis, showing that Facebook language before treatment entry could be used to identify the markers of SUD treatment outcomes. This reflects the importance of taking these linguistic features and markers into consideration when designing and recommending SUD treatment plans.
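The Benjamini-Hochberg correction mentioned in the results can be sketched as the standard step-up procedure below. This is a generic illustration, not the authors' analysis code; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    while controlling the false discovery rate at level alpha."""
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value is at most (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for eight linguistic features.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(pvals))
```

The procedure is a step-up test: a p-value above its own threshold is still rejected if some larger p-value meets its threshold, which is why the loop tracks the largest qualifying rank rather than stopping at the first failure.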
Affiliation(s)
- Tingting Liu
  - Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
- Salvatore Giorgi
  - Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Kenna Yadeta
  - Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA
- H. Andrew Schwartz
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Computer Science, Stony Brook University, NY, USA
- Lyle H. Ungar
  - Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Brenda Curtis
  - Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA

19
Xia CH, Barnett I, Tapera TM, Adebimpe A, Baker JT, Bassett DS, Brotman MA, Calkins ME, Cui Z, Leibenluft E, Linguiti S, Lydon-Staley DM, Martin ML, Moore TM, Murtha K, Piiwaa K, Pines A, Roalf DR, Rush-Goebel S, Wolf DH, Ungar LH, Satterthwaite TD. Mobile footprinting: linking individual distinctiveness in mobility patterns to mood, sleep, and brain functional connectivity. Neuropsychopharmacology 2022; 47:1662-1671. [PMID: 35660803 PMCID: PMC9163291 DOI: 10.1038/s41386-022-01351-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/14/2021] [Revised: 05/18/2022] [Accepted: 05/23/2022] [Indexed: 11/09/2022]
Abstract
Mapping individual differences in behavior is fundamental to personalized neuroscience, but quantifying complex behavior in real world settings remains a challenge. While mobility patterns captured by smartphones have increasingly been linked to a range of psychiatric symptoms, existing research has not specifically examined whether individuals have person-specific mobility patterns. We collected over 3000 days of mobility data from a sample of 41 adolescents and young adults (age 17-30 years, 28 female) with affective instability. We extracted summary mobility metrics from GPS and accelerometer data and used their covariance structures to identify individuals and calculated the individual identification accuracy (i.e., their "footprint distinctiveness"). We found that statistical patterns of smartphone-based mobility features represented unique "footprints" that allow individual identification (p < 0.001). Critically, mobility footprints exhibited varying levels of person-specific distinctiveness (4-99%), which was associated with age and sex. Furthermore, reduced individual footprint distinctiveness was associated with instability in affect (p < 0.05) and circadian patterns (p < 0.05) as measured by environmental momentary assessment. Finally, brain functional connectivity, especially those in the somatomotor network, was linked to individual differences in mobility patterns (p < 0.05). Together, these results suggest that real-world mobility patterns may provide individual-specific signatures relevant for studies of development, sleep, and psychopathology.
Affiliation(s)
- Cedric Huchuan Xia
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Ian Barnett
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Tinashe M Tapera
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Azeez Adebimpe
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Justin T Baker
- McLean Institute for Technology in Psychiatry, McLean Hospital, Belmont, MA, 02478, USA.,Department of Psychiatry, Harvard Medical School, Boston, MA, 02115, USA
| | - Danielle S Bassett
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Department of Electrical & Systems Engineering, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Santa Fe Institute, Santa Fe, NM, 87501, USA
| | - Melissa A Brotman
- National Institute of Mental Health, Intramural Research Program, Bethesda, MD, 20892, USA
| | - Monica E Calkins
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Zaixu Cui
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Ellen Leibenluft
- National Institute of Mental Health, Intramural Research Program, Bethesda, MD, 20892, USA
| | - Sophia Linguiti
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - David M Lydon-Staley
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Annenberg School of Communication, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Leonard Davis Institute for Health Economics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Melissa Lynne Martin
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Tyler M Moore
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Kristin Murtha
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Kayla Piiwaa
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Adam Pines
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - David R Roalf
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Sage Rush-Goebel
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Daniel H Wolf
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Center for Biomedical Image Computation and Analytics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Lyle H Ungar
- Department of Computer and Information Science, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Department of Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Department of Operations, Information and Decisions, Wharton School, Philadelphia, PA, 19104, USA.,Department of Psychology, School of Arts and Sciences, Philadelphia, PA, 19104, USA
| | - Theodore D Satterthwaite
- Penn Lifespan Informatics and Neuroimaging Center, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn/CHOP Lifespan Brain Institute, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Center for Biomedical Image Computation and Analytics, University of Pennsylvania, Philadelphia, PA, 19104, USA.,Penn Statistics in Imaging and Visualization Center, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA
|
20
|
Franco OH, Calkins ME, Giorgi S, Ungar LH, Gur RE, Kohler CG, Tang SX. Feasibility of Mobile Health and Social Media–Based Interventions for Young Adults With Early Psychosis and Clinical Risk for Psychosis: Survey Study. JMIR Form Res 2022; 6:e30230. [PMID: 35802420 PMCID: PMC9308069 DOI: 10.2196/30230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Revised: 03/21/2022] [Accepted: 04/15/2022] [Indexed: 11/28/2022] Open
Abstract
Background Digital technology, the internet, and social media are increasingly investigated as promising means for monitoring symptoms and delivering mental health treatment. These apps and interventions have demonstrated preliminary acceptability and feasibility, but previous reports suggest that access to technology may still be limited among individuals with psychotic disorders relative to the general population. Objective We evaluated and compared access to and use of technology and social media in young adults with psychotic disorders (PD), young adults with clinical risk for psychosis (CR), and psychosis-free youths (PF). Methods Participants were recruited through a coordinated specialty care clinic dedicated to early psychosis as well as ongoing studies. We surveyed 21 PD, 23 CR, and 15 PF participants regarding access to technology and use of social media, specifically Facebook and Twitter. Statistical analyses were conducted in R. Categorical variables were compared among groups using Fisher exact test, continuous variables were compared using 1-way ANOVA, and multiple linear regressions were used to evaluate for covariates. Results Access to technology and social media was similar among PD, CR, and PF participants. Individuals with PD, but not CR, were less likely to post at a weekly or higher frequency compared to PF individuals. We found that decreased active social media posting was unique to psychotic disorders and did not occur with other psychiatric diagnoses or demographic variables. Additionally, variation in age, sex, and White versus non-White race did not affect posting frequency. Conclusions For young people with psychosis spectrum disorders, there appears to be no “technology gap” limiting the implementation of digital and mobile health interventions. Active posting to social media was reduced for individuals with psychosis, which may be related to negative symptoms or impairment in social functioning.
Affiliation(s)
- Olivia H Franco
- Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, United States
| | - Monica E Calkins
- Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, United States
| | - Salvatore Giorgi
- Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
| | - Lyle H Ungar
- Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
| | - Raquel E Gur
- Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, United States
| | - Christian G Kohler
- Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, United States
| | - Sunny X Tang
- Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, United States
- Feinstein Institutes for Medical Research, Northwell Health, Glen Oaks, NY, United States
|
21
|
Giorgi S, Lynn VE, Gupta K, Ahmed F, Matz S, Ungar LH, Schwartz HA. Correcting Sociodemographic Selection Biases for Population Prediction from Social Media. Proc Int AAAI Conf Weblogs Soc Media 2022; 16:228-240. [PMID: 36467573 PMCID: PMC9714525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Social media is increasingly used for large-scale population predictions, such as estimating community health statistics. However, social media users are not typically a representative sample of the intended population, a "selection bias". Within the social sciences, such a bias is typically addressed with restratification techniques, where observations are reweighted according to how under- or over-sampled their socio-demographic groups are. Yet, restratification is rarely evaluated for improving prediction. In this two-part study, we first evaluate standard, "out-of-the-box" restratification techniques, finding they provide no improvement and often even degrade prediction accuracies across four tasks of estimating U.S. county population health statistics from Twitter. The core reasons for degraded performance seem to be tied to their reliance on either sparse or shrunken estimates of each population's socio-demographics. In the second part of our study, we develop and evaluate Robust Poststratification, which consists of three methods to address these problems: (1) estimator redistribution to account for shrinking, as well as (2) adaptive binning and (3) informed smoothing to handle sparse socio-demographic estimates. We show that each of these methods leads to significant improvement in prediction accuracies over the standard restratification approaches. Taken together, Robust Poststratification enables state-of-the-art prediction accuracies, yielding a 53.0% increase in variance explained (R²) in the case of surveyed life satisfaction, and a 17.8% average increase across all tasks.
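The reweighting idea described in this abstract can be illustrated with a minimal sketch. It shows only the standard restratification step (weight each observation by the ratio of its group's population share to its sample share), not the Robust Poststratification method the authors develop. The function names, group labels, and numbers are hypothetical and chosen for illustration.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_share):
    """Weight each observation by population share / sample share of its group.

    Hypothetical helper for illustration; assumes every group in the sample
    also appears in `population_share`.
    """
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return [population_share[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Weighted average of `values` using the given weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy sample: "young" users are over-represented relative to the population.
groups = ["young", "young", "young", "old"]
scores = [1.0, 1.0, 1.0, 5.0]
population_share = {"young": 0.5, "old": 0.5}

weights = poststratification_weights(groups, population_share)
unweighted = sum(scores) / len(scores)
estimate = weighted_mean(scores, weights)
```

With very small groups the sample share in the denominator becomes unstable, which is the kind of sparsity problem the abstract says the standard techniques suffer from.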
|
22
|
Jose R, Matero M, Sherman G, Curtis B, Giorgi S, Schwartz HA, Ungar LH. Using Facebook language to predict and describe excessive alcohol use. Alcohol Clin Exp Res 2022; 46:836-847. [PMID: 35575955 PMCID: PMC9179895 DOI: 10.1111/acer.14807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2021] [Revised: 02/10/2022] [Accepted: 03/10/2022] [Indexed: 11/28/2022]
Abstract
BACKGROUND Assessing risk for excessive alcohol use is important for applications ranging from recruitment into research studies to targeted public health messaging. Social media language provides an ecologically embedded source of information for assessing individuals who may be at risk for harmful drinking. METHODS Using data collected on 3664 respondents from the general population, we examine how accurately language used on social media classifies individuals as at-risk for alcohol problems based on Alcohol Use Disorder Identification Test-Consumption score benchmarks. RESULTS We find that social media language is moderately accurate (area under the curve = 0.75) at identifying individuals at risk for alcohol problems (i.e., hazardous drinking/alcohol use disorders) when used with models based on contextual word embeddings. High-risk alcohol use was predicted by individuals' usage of words related to alcohol, partying, informal expressions, swearing, and anger. Low-risk alcohol use was predicted by individuals' usage of social, affiliative, and faith-based words. CONCLUSIONS The use of social media data to study drinking behavior in the general public is promising and could eventually support primary and secondary prevention efforts among Americans whose at-risk drinking may have otherwise gone "under the radar."
Affiliation(s)
- Rupa Jose
- Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Matthew Matero
- Department of Computer Science, Stony Brook University, Stony Brook, New York, USA
| | - Garrick Sherman
- Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Brenda Curtis
- Technology and Translational Research Unit, National Institute on Drug Abuse, Baltimore, Maryland, USA
| | - Salvatore Giorgi
- Technology and Translational Research Unit, National Institute on Drug Abuse, Baltimore, Maryland, USA.,Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | | | - Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA.,Department of Psychology, Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
23
|
Liu T, Meyerhoff J, Eichstaedt JC, Karr CJ, Kaiser SM, Kording KP, Mohr DC, Ungar LH. The relationship between text message sentiment and self-reported depression. J Affect Disord 2022; 302:7-14. [PMID: 34963643 PMCID: PMC8912980 DOI: 10.1016/j.jad.2021.12.048] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/22/2021] [Revised: 11/15/2021] [Accepted: 12/18/2021] [Indexed: 10/19/2022]
Abstract
BACKGROUND Personal sensing has shown promise for detecting behavioral correlates of depression, but there is little work examining personal sensing of cognitive and affective states. Digital language, particularly through personal text messages, is one source that can measure these markers. METHODS We correlated privacy-preserving sentiment analysis of text messages with self-reported depression symptom severity. We enrolled 219 U.S. adults in a 16-week longitudinal observational study. Participants installed a personal sensing app on their phones, which administered self-report PHQ-8 assessments of their depression severity, collected phone sensor data, and computed anonymized language sentiment scores from their text messages. We also trained machine learning models for predicting end-of-study self-reported depression status using blocks of phone sensor and text features. RESULTS In correlation analyses, we find that degrees of depression, emotional, and personal pronoun language categories correlate most strongly with self-reported depression, validating prior literature. Our classification models, which predict binary depression status, achieve a leave-one-out AUC of 0.72 when only considering text features and 0.76 when combining text with other networked smartphone sensors. LIMITATIONS Participants were recruited from a panel that over-represented women, Caucasians, and individuals with self-reported depression at baseline. As language use differs across demographic factors, generalizability beyond this population may be limited. The study period also coincided with the initial COVID-19 outbreak in the United States, which may have affected smartphone sensor data quality. CONCLUSIONS Effective depression prediction through text message sentiment, especially when combined with other personal sensors, could enable comprehensive mental health monitoring and intervention.
Affiliation(s)
- Tony Liu
- Department of Computer and Information Science, University of Pennsylvania, USA.
| | - Jonah Meyerhoff
- Center for Behavioral Intervention Technologies (CBITs), Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, USA
| | | | | | - Susan M Kaiser
- Center for Behavioral Intervention Technologies (CBITs), Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, USA
| | - Konrad P Kording
- Department of Bioengineering, Department of Neuroscience, University of Pennsylvania, USA
| | - David C Mohr
- Center for Behavioral Intervention Technologies (CBITs), Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, USA
| | - Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, USA
|
24
|
Seltzer EK, Guntuku SC, Lanza AL, Tufts C, Srinivas SK, Klinger EV, Asch DA, Fausti N, Ungar LH, Merchant RM. Patient Experience and Satisfaction in Online Reviews of Obstetric Care: Observational Study. JMIR Form Res 2022; 6:e28379. [PMID: 35357310 PMCID: PMC9015735 DOI: 10.2196/28379] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Revised: 06/29/2021] [Accepted: 12/13/2021] [Indexed: 11/30/2022] Open
Abstract
Background The quality of care in labor and delivery is traditionally measured through the Hospital Consumer Assessment of Healthcare Providers and Systems but less is known about the experiences of care reported by patients and caregivers on online sites that are more easily accessed by the public. Objective The aim of this study was to generate insight into the labor and delivery experience using hospital reviews on Yelp. Methods We identified all Yelp reviews of US hospitals posted online from May 2005 to March 2017. We used a machine learning tool, latent Dirichlet allocation, to identify 100 topics or themes within these reviews and used Pearson r to identify statistically significant correlations between topics and high (5-star) and low (1-star) ratings. Results A total of 1569 hospitals listed in the American Hospital Association directory had at least one Yelp posting, contributing a total of 41,095 Yelp reviews. Among those hospitals, 919 (59%) had at least one Yelp rating for labor and delivery services (median of 9 reviews), contributing a total of 6523 labor and delivery reviews. Reviews concentrated among 5-star (n=2643, 41%) and 1-star reviews (n=1934, 30%). Themes strongly associated with favorable ratings included the following: top-notch care (r=0.45, P<.001), describing staff as comforting (r=0.52, P<.001), the delivery experience (r=0.46, P<.001), modern and clean facilities (r=0.44, P<.001), and hospital food (r=0.38, P<.001). Themes strongly correlated with 1-star labor and delivery reviews included complaints to management (r=0.30, P<.001), a lack of agency among patients (r=0.47, P<.001), and issues with discharging from the hospital (r=0.32, P<.001). Conclusions Online review content about labor and delivery can provide meaningful information about patient satisfaction and experiences. Narratives from these reviews that are not otherwise captured in traditional surveys can direct efforts to improve the experience of obstetrical care.
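The topic-to-rating correlations reported in this abstract are Pearson r values. As a rough illustration of that statistic, the sketch below correlates a per-review topic weight with an indicator for 5-star reviews. The data are invented and the variable names are hypothetical; the study's actual topic weights come from latent Dirichlet allocation, which is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Invented example: weight of a "comforting staff" topic in four reviews,
# and whether each review was rated 5 stars (1) or not (0).
topic_weight = [0.10, 0.40, 0.35, 0.05]
is_five_star = [0, 1, 1, 0]

r = pearson_r(topic_weight, is_five_star)
```

A positive r here means reviews that load more heavily on the topic tend to be 5-star reviews; the function is undefined when either input has zero variance.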
Affiliation(s)
- Emily K Seltzer
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States.,Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, United States
| | - Sharath Chandra Guntuku
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States.,Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
| | - Amy L Lanza
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States
| | - Christopher Tufts
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States
| | - Sindhu K Srinivas
- Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA, United States
| | - Elissa V Klinger
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States.,Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, United States
| | - David A Asch
- Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, United States.,Center for Health Equity Research and Promotion, Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA, United States
| | - Nick Fausti
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States
| | - Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
| | - Raina M Merchant
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States.,Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, United States
|
25
|
Flamholz ZN, Crane-Droesch A, Ungar LH, Weissman GE. Word embeddings trained on published case reports are lightweight, effective for clinical tasks, and free of protected health information. J Biomed Inform 2022; 125:103971. [PMID: 34920127 PMCID: PMC8766939 DOI: 10.1016/j.jbi.2021.103971] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/02/2021] [Revised: 11/22/2021] [Accepted: 12/02/2021] [Indexed: 01/03/2023]
Abstract
OBJECTIVE Quantify tradeoffs in performance, reproducibility, and resource demands across several strategies for developing clinically relevant word embeddings. MATERIALS AND METHODS We trained separate embeddings on all full-text manuscripts in the PubMed Central (PMC) Open Access subset, case reports therein, the English Wikipedia corpus, the Medical Information Mart for Intensive Care (MIMIC) III dataset, and all notes in the University of Pennsylvania Health System (UPHS) electronic health record. We tested embeddings in six clinically relevant tasks including mortality prediction and de-identification, and assessed performance using the scaled Brier score (SBS) and the proportion of notes successfully de-identified, respectively. RESULTS Embeddings from UPHS notes best predicted mortality (SBS 0.30, 95% CI 0.15 to 0.45) while Wikipedia embeddings performed worst (SBS 0.12, 95% CI -0.05 to 0.28). Wikipedia embeddings most consistently (78% of notes) and the full PMC corpus embeddings least consistently (48%) de-identified notes. Across all six tasks, the full PMC corpus demonstrated the most consistent performance, and the Wikipedia corpus the least. Corpus size ranged from 49 million tokens (PMC case reports) to 10 billion (UPHS). DISCUSSION Embeddings trained on published case reports performed at least as well as embeddings trained on other corpora in most tasks, and clinical corpora consistently outperformed non-clinical corpora. No single corpus produced a strictly dominant set of embeddings across all tasks and so the optimal training corpus depends on intended use. CONCLUSION Embeddings trained on published case reports performed comparably on most clinical tasks to embeddings trained on larger corpora. Open access corpora allow training of clinically relevant, effective, and reproducible embeddings.
Affiliation(s)
- Zachary N. Flamholz
- Medical Scientist Training Program, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Andrew Crane-Droesch
- Penn Medicine Predictive Healthcare, University of Pennsylvania Health System, Philadelphia, Pennsylvania, USA,Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Lyle H. Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA,Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Gary E. Weissman
- Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA,Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA,Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, USA,Pulmonary, Allergy, and Critical Care Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
|
26
|
Martin JA, Crane-Droesch A, Lapite FC, Puhl JC, Kmiec TE, Silvestri JA, Ungar LH, Kinosian BP, Himes BE, Hubbard RA, Diamond JM, Ahya V, Sims MW, Halpern SD, Weissman GE. Development and validation of a prediction model for actionable aspects of frailty in the text of clinicians' encounter notes. J Am Med Inform Assoc 2021; 29:109-119. [PMID: 34791302 DOI: 10.1093/jamia/ocab248] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/22/2021] [Revised: 10/16/2021] [Accepted: 10/28/2021] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE Frailty is a prevalent risk factor for adverse outcomes among patients with chronic lung disease. However, identifying frail patients who may benefit from interventions is challenging using standard data sources. We therefore sought to identify phrases in clinical notes in the electronic health record (EHR) that describe actionable frailty syndromes. MATERIALS AND METHODS We used an active learning strategy to select notes from the EHR and annotated each sentence for 4 actionable aspects of frailty: respiratory impairment, musculoskeletal problems, fall risk, and nutritional deficiencies. We compared the performance of regression, tree-based, and neural network models to predict the labels for each sentence. We evaluated performance with the scaled Brier score (SBS), where 1 is perfect and 0 is uninformative, and the positive predictive value (PPV). RESULTS We manually annotated 155 952 sentences from 326 patients. Elastic net regression had the best performance across all 4 frailty aspects (SBS 0.52, 95% confidence interval [CI] 0.49-0.54) followed by random forests (SBS 0.49, 95% CI 0.47-0.51), and multi-task neural networks (SBS 0.39, 95% CI 0.37-0.42). For the elastic net model, the PPV for identifying the presence of respiratory impairment was 54.8% (95% CI 53.3%-56.6%) at a sensitivity of 80%. DISCUSSION Classification models using EHR notes can effectively identify actionable aspects of frailty among patients living with chronic lung disease. Regression performed better than random forest and neural network models. CONCLUSIONS NLP-based models offer promising support to population health management programs that seek to identify and refer community-dwelling patients with frailty for evidence-based interventions.
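The scaled Brier score used in this abstract (1 is perfect, 0 is uninformative) can be sketched as follows. The sketch assumes the common definition in which the reference model predicts the observed prevalence for every case; the abstract does not spell out the formula, so treat that as an assumption. Function names and data are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true, y_prob):
    """Scaled Brier score: 1 - BS / BS_reference.

    Assumption: the reference model predicts the outcome prevalence for
    every case. Undefined when all labels are identical (reference BS = 0).
    """
    prevalence = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [prevalence] * len(y_true))
    return 1.0 - brier_score(y_true, y_prob) / reference


# Invented labels: 1 = sentence describes the frailty aspect, 0 = it does not.
labels = [1, 0, 1, 0]

sbs_perfect = scaled_brier_score(labels, [1.0, 0.0, 1.0, 0.0])
sbs_uninformative = scaled_brier_score(labels, [0.5, 0.5, 0.5, 0.5])
sbs_partial = scaled_brier_score(labels, [0.8, 0.3, 0.6, 0.2])
```

Under this definition a model that does worse than always predicting the prevalence gets a negative score, which is consistent with the negative lower confidence bounds that can appear for weak models.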
Affiliation(s)
- Jacob A Martin
- Division of Cardiology, Department of Medicine, New York University Grossman School of Medicine, New York, New York, USA.,Palliative and Advanced Illness Research (PAIR) Center, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA.,Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Andrew Crane-Droesch
- Palliative and Advanced Illness Research (PAIR) Center, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | | | - Joseph C Puhl
- Palliative and Advanced Illness Research (PAIR) Center, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Tyler E Kmiec
- Palliative and Advanced Illness Research (PAIR) Center, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Jasmine A Silvestri
- Palliative and Advanced Illness Research (PAIR) Center, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania School of Engineering and Applied Science, Philadelphia, Pennsylvania, USA
- Bruce P Kinosian
- Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, USA; Division of Geriatrics, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA; Geriatrics and Extended Care Data Analysis Center, Corporal Michael J Crescenz VA Medical Center, Philadelphia, Pennsylvania, USA
- Blanca E Himes
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Rebecca A Hubbard
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Joshua M Diamond
- Pulmonary, Allergy, and Critical Care Division, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Vivek Ahya
- Pulmonary, Allergy, and Critical Care Division, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Michael W Sims
- Pulmonary, Allergy, and Critical Care Division, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Scott D Halpern
- Palliative and Advanced Illness Research (PAIR) Center, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA; Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, USA; Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA; Pulmonary, Allergy, and Critical Care Division, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Gary E Weissman
- Palliative and Advanced Illness Research (PAIR) Center, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA; Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, USA; Pulmonary, Allergy, and Critical Care Division, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
27
Eichstaedt JC, Kern ML, Yaden DB, Schwartz HA, Giorgi S, Park G, Hagan CA, Tobolsky VA, Smith LK, Buffone A, Iwry J, Seligman MEP, Ungar LH. Closed- and open-vocabulary approaches to text analysis: A review, quantitative comparison, and recommendations. Psychol Methods 2021; 26:398-427. [PMID: 34726465 DOI: 10.1037/met0000349] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 11/08/2022]
Abstract
Technology now makes it possible to understand efficiently and at large scale how people use language to reveal their everyday thoughts, behaviors, and emotions. Written text has been analyzed through both theory-based, closed-vocabulary methods from the social sciences as well as data-driven, open-vocabulary methods from computer science, but these approaches have not been comprehensively compared. To provide guidance on best practices for automatically analyzing written text, this narrative review and quantitative synthesis compares five predominant closed- and open-vocabulary methods: Linguistic Inquiry and Word Count (LIWC), the General Inquirer, DICTION, Latent Dirichlet Allocation, and Differential Language Analysis. We compare the linguistic features associated with gender, age, and personality across the five methods using an existing dataset of Facebook status updates and self-reported survey data from 65,896 users. Results are fairly consistent across methods. The closed-vocabulary approaches efficiently summarize concepts and are helpful for understanding how people think, with LIWC2015 yielding the strongest, most parsimonious results. Open-vocabulary approaches reveal more specific and concrete patterns across a broad range of content domains, better address ambiguous word senses, and are less prone to misinterpretation, suggesting that they are well-suited for capturing the nuances of everyday psychological processes. We detail several errors that can occur in closed-vocabulary analyses, the impact of sample size, number of words per user and number of topics included in open-vocabulary analyses, and implications of different analytical decisions. We conclude with recommendations for researchers, advocating for a complementary approach that combines closed- and open-vocabulary methods. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
Affiliation(s)
- Margaret L Kern
- Melbourne Graduate School of Education, The University of Melbourne
- David B Yaden
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins Medicine
- H A Schwartz
- Department of Computer Science, Stony Brook University
- Gregory Park
- Department of Psychology, University of Pennsylvania
- Laura K Smith
- Department of Psychology, University of Pennsylvania
- Jonathan Iwry
- Department of Psychology, University of Pennsylvania
- Lyle H Ungar
- Department of Psychology, University of Pennsylvania
28
Morris MP, Christopher AN, Patel V, Mellia JA, Liu T, Hsu JY, Broach RB, Ungar LH, Fischer JP. Feasibility of Natural Language Processing in Surgery: Sensitivity and Specificity Compared to Manual Extraction. J Am Coll Surg 2021. [DOI: 10.1016/j.jamcollsurg.2021.07.173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
29
Mancheno C, Asch DA, Klinger EV, Goldshear JL, Mitra N, Buttenheim AM, Barg FK, Ungar LH, Yang L, Merchant RM. Effect of Posting on Social Media on Systolic Blood Pressure and Management of Hypertension: A Randomized Controlled Trial. J Am Heart Assoc 2021; 10:e020596. [PMID: 34558301 PMCID: PMC8649152 DOI: 10.1161/jaha.120.020596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
Background Online platforms are used to manage aspects of our lives including health outside clinical settings. Little is known about the effectiveness of using online platforms to manage hypertension. We assessed effects of tweeting/retweeting cardiovascular health content by individuals with poorly controlled hypertension on systolic blood pressure (SBP) and patient activation. Methods and Results We conducted this 2‐arm randomized controlled trial. Eligibility included diagnosis of hypertension; SBP >140 mm Hg; and an existing Twitter account or willingness to create one to follow the study Twitter account. The intervention arm was asked to tweet/retweet health content 2×/week using a specific hashtag for study duration (6 months). The primary outcome was change in SBP; the secondary outcome was point change in Patient Activation Measure (PAM). We remotely recruited and enrolled 611 participants, mean age 52 (SD, 11.7). Mean baseline SBP for the intervention group was 155.8 and for control was 155.6. At 6 months, mean SBP for intervention group was 137.6 and for control was 135.7. Mean change in SBP from baseline to 6 months for the intervention group was −18.5 and for control was −19.8 (P=0.48). Mean PAM at baseline for the intervention group was 70.3 and for control was 72.7. At 6 months, mean PAM scores were 71.1 (intervention) and 75.6 (control). Mean change in PAM score for the intervention group was 0.0 and for control was 3.3 (P=0.12). Conclusions Recruiting and engaging patients and collecting outcome measures remotely are feasible using Twitter. Encouraging patients with poorly controlled hypertension to tweet or retweet health content on Twitter did not improve SBP or PAM score at 6 months. Registration URL: https://www.clinicaltrials.gov. Unique identifier: NCT02622256.
Affiliation(s)
- Christina Mancheno
- Penn Medicine Center for Digital Health University of Pennsylvania Philadelphia PA
- David A Asch
- Penn Medicine Center for Digital Health University of Pennsylvania Philadelphia PA; Center for Health Equity Research and Promotion - Philadelphia Veterans Affairs Medical Center Philadelphia PA; The Wharton School University of Pennsylvania Philadelphia PA
- Elissa V Klinger
- Penn Medicine Center for Digital Health University of Pennsylvania Philadelphia PA
- Jesse L Goldshear
- Penn Medicine Center for Digital Health University of Pennsylvania Philadelphia PA
- Nandita Mitra
- Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Philadelphia PA
- Alison M Buttenheim
- Department of Family and Community Health University of Pennsylvania School of Nursing Philadelphia PA; Center for Health Incentives and Behavioral Economics Perelman School of Medicine University of Pennsylvania Philadelphia PA
- Frances K Barg
- Department of Family and Community Health University of Pennsylvania School of Nursing Philadelphia PA
- Lyle H Ungar
- Penn Medicine Center for Digital Health University of Pennsylvania Philadelphia PA; The Wharton School University of Pennsylvania Philadelphia PA; Department of Computer and Information Science University of Pennsylvania Philadelphia PA
- Lin Yang
- Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Philadelphia PA
- Raina M Merchant
- Penn Medicine Center for Digital Health University of Pennsylvania Philadelphia PA; Department of Emergency Medicine Perelman School of Medicine University of Pennsylvania Philadelphia PA; Center for Health Incentives and Behavioral Economics Perelman School of Medicine University of Pennsylvania Philadelphia PA
30
Giorgi S, Nguyen KL, Eichstaedt JC, Kern ML, Yaden DB, Kosinski M, Seligman MEP, Ungar LH, Schwartz HA, Park G. Regional personality assessment through social media language. J Pers 2021; 90:405-425. [PMID: 34536229 DOI: 10.1111/jopy.12674] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/26/2021] [Revised: 08/26/2021] [Accepted: 09/12/2021] [Indexed: 11/30/2022]
Abstract
OBJECTIVE We explore the personality of counties as assessed through linguistic patterns on social media. Such studies were previously limited by the cost and feasibility of large-scale surveys; however, language-based computational models applied to large social media datasets now allow for large-scale personality assessment. METHOD We applied a language-based assessment of the five factor model of personality to 6,064,267 U.S. Twitter users. We aggregated the Twitter-based personality scores to 2,041 counties and compared to political, economic, social, and health outcomes measured through surveys and by government agencies. RESULTS There was significant personality variation across counties. Openness to experience was higher on the coasts, conscientiousness was uniformly spread, extraversion was higher in southern states, agreeableness was higher in western states, and emotional stability was highest in the south. Across 13 outcomes, language-based personality estimates replicated patterns that have been observed in individual-level and geographic studies. This includes higher Republican vote share in less agreeable counties and increased life satisfaction in more conscientious counties. CONCLUSIONS Results suggest that regions vary in their personality and that these differences can be studied through computational linguistic analysis of social media. Furthermore, these methods may be used to explore other psychological constructs across geographies.
Affiliation(s)
- Salvatore Giorgi
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Khoa Le Nguyen
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Johannes C Eichstaedt
- Department of Psychology, Institute for Human-Centered A.I., Stanford University, Stanford, California, USA
- Margaret L Kern
- Melbourne Graduate School of Education, University of Melbourne, Melbourne, Victoria, Australia
- David B Yaden
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Michal Kosinski
- Graduate School of Business, Stanford University, Stanford, California, USA
- Martin E P Seligman
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- H Andrew Schwartz
- Department of Computer Science, Stony Brook University, Stony Brook, New York, USA
- Gregory Park
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
31
Meyerhoff J, Liu T, Kording KP, Ungar LH, Kaiser SM, Karr CJ, Mohr DC. Evaluation of Changes in Depression, Anxiety, and Social Anxiety Using Smartphone Sensor Features: Longitudinal Cohort Study. J Med Internet Res 2021; 23:e22844. [PMID: 34477562 PMCID: PMC8449302 DOI: 10.2196/22844] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 07/24/2020] [Revised: 10/29/2020] [Accepted: 07/19/2021] [Indexed: 01/19/2023] [Open Access]
Abstract
BACKGROUND The assessment of behaviors related to mental health typically relies on self-report data. Networked sensors embedded in smartphones can measure some behaviors objectively and continuously, with no ongoing effort. OBJECTIVE This study aims to evaluate whether changes in phone sensor-derived behavioral features were associated with subsequent changes in mental health symptoms. METHODS This longitudinal cohort study examined continuously collected phone sensor data and symptom severity data, collected every 3 weeks, over 16 weeks. The participants were recruited through national research registries. Primary outcomes included depression (8-item Patient Health Questionnaire), generalized anxiety (Generalized Anxiety Disorder 7-item scale), and social anxiety (Social Phobia Inventory) severity. Participants were adults who owned Android smartphones. Participants clustered into 4 groups: multiple comorbidities, depression and generalized anxiety, depression and social anxiety, and minimal symptoms. RESULTS A total of 282 participants were aged 19-69 years (mean 38.9, SD 11.9 years), and the majority were female (223/282, 79.1%) and White participants (226/282, 80.1%). Among the multiple comorbidities group, depression changes were preceded by changes in GPS features (Time: r=-0.23, P=.02; Locations: r=-0.36, P<.001), exercise duration (r=0.39; P=.03) and use of active apps (r=-0.31; P<.001). Among the depression and anxiety groups, changes in depression were preceded by changes in GPS features for Locations (r=-0.20; P=.03) and Transitions (r=-0.21; P=.03). Depression changes were not related to subsequent sensor-derived features. The minimal symptoms group showed no significant relationships. There were no associations between sensor-based features and anxiety and minimal associations between sensor-based features and social anxiety. 
CONCLUSIONS Changes in sensor-derived behavioral features are associated with subsequent depression changes, but not vice versa, suggesting a directional relationship in which changes in sensed behaviors are associated with subsequent changes in symptoms.
Affiliation(s)
- Jonah Meyerhoff
- Center for Behavioral Intervention Technologies, Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
- Tony Liu
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
- Konrad P Kording
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, United States
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, United States
- Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
- Susan M Kaiser
- Center for Behavioral Intervention Technologies, Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
- David C Mohr
- Center for Behavioral Intervention Technologies, Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
32
Shah PK, Ginestra JC, Ungar LH, Junker P, Rohrbach JI, Fishman NO, Weissman GE. A Simulated Prospective Evaluation of a Deep Learning Model for Real-Time Prediction of Clinical Deterioration Among Ward Patients. Crit Care Med 2021; 49:1312-1321. [PMID: 33711001 PMCID: PMC8282687 DOI: 10.1097/ccm.0000000000004966] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
OBJECTIVES The National Early Warning Score, Modified Early Warning Score, and quick Sepsis-related Organ Failure Assessment can predict clinical deterioration. These scores exhibit only moderate performance and are often evaluated using aggregated measures over time. A simulated prospective validation strategy that assesses multiple predictions per patient-day would provide the best pragmatic evaluation. We developed a deep recurrent neural network deterioration model and conducted a simulated prospective evaluation. DESIGN Retrospective cohort study. SETTING Four hospitals in Pennsylvania. PATIENTS Inpatient adults discharged between July 1, 2017, and June 30, 2019. INTERVENTIONS None. MEASUREMENTS AND MAIN RESULTS We trained a deep recurrent neural network and logistic regression model using data from electronic health records to predict hourly the 24-hour composite outcome of transfer to ICU or death. We analyzed 146,446 hospitalizations with 16.75 million patient-hours. The hourly event rate was 1.6% (12,842 transfers or deaths, corresponding to 260,295 patient-hours within the predictive horizon). On a hold-out dataset, the deep recurrent neural network achieved an area under the precision-recall curve of 0.042 (95% CI, 0.04-0.043), comparable with logistic regression model (0.043; 95% CI 0.041 to 0.045), and outperformed National Early Warning Score (0.034; 95% CI, 0.032-0.035), Modified Early Warning Score (0.028; 95% CI, 0.027- 0.03), and quick Sepsis-related Organ Failure Assessment (0.021; 95% CI, 0.021-0.022). For a fixed sensitivity of 50%, the deep recurrent neural network achieved a positive predictive value of 3.4% (95% CI, 3.4-3.5) and outperformed logistic regression model (3.1%; 95% CI 3.1-3.2), National Early Warning Score (2.0%; 95% CI, 2.0-2.0), Modified Early Warning Score (1.5%; 95% CI, 1.5-1.5), and quick Sepsis-related Organ Failure Assessment (1.5%; 95% CI, 1.5-1.5). 
CONCLUSIONS Commonly used early warning scores for clinical decompensation, along with a logistic regression model and a deep recurrent neural network model, show very poor performance characteristics when assessed using a simulated prospective validation. None of these models may be suitable for real-time deployment.
Affiliation(s)
- Parth K Shah
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Jennifer C Ginestra
- Palliative and Advanced Illness Research Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA
- Paul Junker
- Clinical Effectiveness and Quality Improvement, Hospital of the University of Pennsylvania, Philadelphia, PA
- Jeff I Rohrbach
- Clinical Effectiveness and Quality Improvement, Hospital of the University of Pennsylvania, Philadelphia, PA
- Neil O Fishman
- Department of Medicine, Hospital of the University of Pennsylvania, Philadelphia, PA
- Gary E Weissman
- Palliative and Advanced Illness Research Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Department of Medicine, Hospital of the University of Pennsylvania, Philadelphia, PA
33
Guntuku SC, Gaulton JS, Seltzer EK, Asch DA, Srinivas SK, Ungar LH, Mancheno C, Klinger EV, Merchant RM. Studying social media language changes associated with pregnancy status, trimester, and parity from medical records. Womens Health (Lond) 2021; 16:1745506520949392. [PMID: 33028170 PMCID: PMC7549071 DOI: 10.1177/1745506520949392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
We sought to evaluate whether there was variability in language used on social media across different time points of pregnancy (before, during, and after pregnancy, as well as by trimester and parity). Consenting patients shared access to their individual Facebook posts and electronic medical records. Random forest models trained on Facebook posts could differentiate first trimester of pregnancy from 3 months before pregnancy (F1 score = .63) and from a random 3-month time period (F1 score = .64). Posts during pregnancy were more likely to include themes about family (β = .22), food craving (β = .14), and date/times (β = .13), while posts 3 months prior to pregnancy included themes about social life (β = .30), sleep (β = .31), and curse words (β = .27), and 3 months post-pregnancy included themes of gratitude (β = .17), health appointments (β = .21), and religiosity (β = .18). Users who were pregnant for the first time were more likely to post about lack of sleep (β = .15), activities of daily living (β = .09), and communication (β = .08) compared with those who were pregnant after having a child, who posted about others’ birthdays (β = .16) and life events (β = .12). A better understanding about social media timelines can provide insight into lifestyle choices that are specific to pregnancy.
Affiliation(s)
- Sharath Chandra Guntuku
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, USA; Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Jessica S Gaulton
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, USA; Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Emily K Seltzer
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, USA; Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, USA
- David A Asch
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, USA; Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, USA
- Sindhu K Srinivas
- Department of Obstetrics and Gynecology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Lyle H Ungar
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, USA; Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA; Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
- Christina Mancheno
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, USA
- Elissa V Klinger
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, USA; Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, USA
- Raina M Merchant
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, USA; Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, USA
34
Guntuku SC, Klinger EV, McCalpin HJ, Ungar LH, Asch DA, Merchant RM. Social media language of healthcare super-utilizers. NPJ Digit Med 2021; 4:55. [PMID: 33767336 PMCID: PMC7994843 DOI: 10.1038/s41746-021-00419-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/22/2020] [Accepted: 02/16/2021] [Indexed: 12/02/2022] [Open Access]
Abstract
An understanding of healthcare super-utilizers' online behaviors could better identify experiences to inform interventions. In this retrospective case-control study, we analyzed patients' social media posts to better understand their day-to-day behaviors and emotions expressed online. Patients included those receiving care in an urban academic emergency department who consented to share access to their historical Facebook posts and electronic health records. Super-utilizers were defined as patients with more than six visits to the Emergency Department (ED) in a year. We compared posts by super-utilizers with a matched group using propensity scoring based on age, gender and Charlson comorbidity index. Super-utilizers were more likely to post about confusion and negativity (D = .65, 95% CI [.38, .95]), self-reflection (D = .63 [.35, .91]), avoidance (D = .62 [.34, .90]), swearing (D = .52 [.24, .79]), sleep (D = .60 [.32, .88]), seeking help and attention (D = .61 [.33, .89]), psychosomatic symptoms (D = .49 [.22, .77]), self-agency (D = .56 [.29, .85]), anger (D = .51 [.24, .79]), stress (D = .46 [.19, .73]), and lonely expressions (D = .44 [.17, .71]). Insights from this study can potentially supplement offline community care services with online social support interventions considering the high engagement of super-utilizers on social media.
Affiliation(s)
- Sharath Chandra Guntuku
- Penn Medicine Center for Digital Health, Philadelphia, PA, USA
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Elissa V Klinger
- Penn Medicine Center for Digital Health, Philadelphia, PA, USA
- Penn Medicine Center for Health Care Innovation, Philadelphia, PA, USA
- Haley J McCalpin
- Penn Medicine Center for Digital Health, Philadelphia, PA, USA
- Penn Medicine Center for Health Care Innovation, Philadelphia, PA, USA
- Lyle H Ungar
- Penn Medicine Center for Digital Health, Philadelphia, PA, USA
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- The Wharton School, University of Pennsylvania, Philadelphia, PA, USA
- David A Asch
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Penn Medicine Center for Health Care Innovation, Philadelphia, PA, USA
- The Wharton School, University of Pennsylvania, Philadelphia, PA, USA
- Cpl Michael J Crescenz VA Medical Center, Philadelphia, PA, USA
- Raina M Merchant
- Penn Medicine Center for Digital Health, Philadelphia, PA, USA
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Penn Medicine Center for Health Care Innovation, Philadelphia, PA, USA
35
Andy AU, Guntuku SC, Adusumalli S, Asch DA, Groeneveld PW, Ungar LH, Merchant RM. Predicting Cardiovascular Risk Using Social Media Data: Performance Evaluation of Machine-Learning Models. JMIR Cardio 2021; 5:e24473. [PMID: 33605888 PMCID: PMC8411430 DOI: 10.2196/24473] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/21/2020] [Revised: 12/14/2020] [Accepted: 01/15/2021] [Indexed: 01/23/2023] [Open Access]
Abstract
Background Current atherosclerotic cardiovascular disease (ASCVD) predictive models have limitations; thus, efforts are underway to improve the discriminatory power of ASCVD models. Objective We sought to evaluate the discriminatory power of social media posts to predict the 10-year risk for ASCVD as compared to that of pooled cohort risk equations (PCEs). Methods We consented patients receiving care in an urban academic emergency department to share access to their Facebook posts and electronic medical records (EMRs). We retrieved Facebook status updates up to 5 years prior to study enrollment for all consenting patients. We identified patients (N=181) without a prior history of coronary heart disease, an ASCVD score in their EMR, and more than 200 words in their Facebook posts. Using Facebook posts from these patients, we applied a machine-learning model to predict 10-year ASCVD risk scores. Using a machine-learning model and a psycholinguistic dictionary, Linguistic Inquiry and Word Count, we evaluated if language from posts alone could predict differences in risk scores and the association of certain words with risk categories, respectively. Results The machine-learning model predicted the 10-year ASCVD risk scores for the categories <5%, 5%-7.4%, 7.5%-9.9%, and ≥10% with area under the curve (AUC) values of 0.78, 0.57, 0.72, and 0.61, respectively. The machine-learning model distinguished between low risk (<10%) and high risk (>10%) with an AUC of 0.69. Additionally, the machine-learning model predicted the ASCVD risk score with Pearson r=0.26. Using Linguistic Inquiry and Word Count, patients with higher ASCVD scores were more likely to use words associated with sadness (r=0.32). Conclusions Language used on social media can provide insights about an individual’s ASCVD risk and inform approaches to risk modification.
Affiliation(s)
- Anietie U Andy
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States
- Sharath C Guntuku
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States; Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, United States
- Srinath Adusumalli
- Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, United States; Division of Cardiovascular Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- David A Asch
- Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, United States; Center for Health Equity Research and Promotion, Corporal Michael J Crescenz VA Medical Center, Philadelphia, PA, United States; Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Peter W Groeneveld
- Center for Health Equity Research and Promotion, Corporal Michael J Crescenz VA Medical Center, Philadelphia, PA, United States; Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Lyle H Ungar
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States
- Raina M Merchant
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States; Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, PA, United States; Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
36
Jaidka K, Guntuku SC, Lee JH, Luo Z, Buffone A, Ungar LH. The rural–urban stress divide: Obtaining geographical insights through Twitter. Computers in Human Behavior 2021. [DOI: 10.1016/j.chb.2020.106544] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 10/23/2022]
37
Giorgi S, Guntuku SC, Eichstaedt JC, Pajot C, Schwartz HA, Ungar LH. Well-Being Depends on Social Comparison: Hierarchical Models of Twitter Language Suggest That Richer Neighbors Make You Less Happy. Proc Int AAAI Conf Weblogs Soc Media 2021; 15:1069-1074. [PMID: 37064998 PMCID: PMC10099468 DOI: 10.1609/icwsm.v15i1.18132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Psychological research has shown that subjective well-being is sensitive to social comparison effects; individuals report decreased happiness when their neighbors earn more than they do. In this work, we use Twitter language to estimate the well-being of users, and model both individual and neighborhood income using hierarchical modeling across counties in the United States (US). We show that language-based estimates from a sample of 5.8 million Twitter users replicate results obtained from large-scale well-being surveys: relatively richer neighbors lead to lower well-being, even when controlling for absolute income. Furthermore, predicting individual-level happiness using hierarchical models (i.e., individuals within their communities) out-predicts standard baselines. We also explore language associated with relative income differences and find that individuals with lower income than their community tend to swear (f*ck, sh*t, b*tch), express anger (pissed, bullsh*t, wtf), hesitation (don't, anymore, idk, confused) and acts of social deviance (weed, blunt, drunk). These results suggest that social comparison robustly affects reported well-being, and that Twitter language analyses can be used to both measure these effects and shed light on their underlying psychological dynamics.
|
38
|
Guntuku SC, Sherman G, Stokes DC, Agarwal AK, Seltzer E, Merchant RM, Ungar LH. Tracking Mental Health and Symptom Mentions on Twitter During COVID-19. J Gen Intern Med 2020; 35:2798-2800. [PMID: 32638321 PMCID: PMC7340749 DOI: 10.1007/s11606-020-05988-8] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 05/15/2020] [Accepted: 06/12/2020] [Indexed: 11/29/2022]
Affiliation(s)
- Sharath Chandra Guntuku
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA.,Center for Digital Health, Penn Medicine, Philadelphia, PA, USA.,Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, USA
| | - Garrick Sherman
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA.,Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
| | - Daniel C Stokes
- Center for Digital Health, Penn Medicine, Philadelphia, PA, USA.,Department of Emergency Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Anish K Agarwal
- Center for Digital Health, Penn Medicine, Philadelphia, PA, USA.,Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, USA.,Department of Emergency Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Emily Seltzer
- Center for Digital Health, Penn Medicine, Philadelphia, PA, USA
| | - Raina M Merchant
- Center for Digital Health, Penn Medicine, Philadelphia, PA, USA.,Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, USA.,Department of Emergency Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA.,Positive Psychology Center, University of Pennsylvania, Philadelphia, PA, USA
| |
|
39
|
Abstract
A rapidly growing literature has attempted to explain Donald Trump's success in the 2016 U.S. presidential election as a result of a wide variety of differences in individual characteristics, attitudes, and social processes. We propose that the economic and psychological processes previously established have in common that they generated or electorally capitalized on unhappiness in the electorate, which emerges as a powerful high-level predictor of the 2016 electoral outcome. Drawing on a large dataset covering over 2 million individual surveys, which we aggregated to the county level, we find that low levels of evaluative, experienced, and eudaemonic subjective well-being (SWB) are strongly predictive of Trump's victory, accounting for an extensive list of demographic, ideological, and socioeconomic covariates and robustness checks. County-level future life evaluation alone correlates with the Trump vote share over Republican baselines at r = -.78 in the raw data, a magnitude rarely seen in the social sciences. We show similar findings when examining the association between individual-level life satisfaction and Trump voting. Low levels of SWB also predict anti-incumbent voting at the 2012 election, both at the county and individual level. The findings suggest that SWB is a powerful high-level marker of (dis)content and that SWB should be routinely considered alongside economic explanations of electoral choice. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
Affiliation(s)
- George Ward
- Sloan School of Management, Massachusetts Institute of Technology
| | | | - Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania
| | | |
|
40
|
Abstract
The COVID-19 outbreak has clear clinical and economic impacts, but also affects behaviors e.g. through social distancing, and may increase stress and anxiety. However, while case numbers are tracked daily, we know little about the psychological effects of the outbreak on individuals in the moment. Here we examine the psychological and behavioral shifts over the initial stages of the outbreak in the United States in an observational longitudinal study. Through GPS phone data we find that homestay is increasing, while being at work dropped precipitously. Using regular real-time experiential surveys we observe an overall increase in stress and mood levels which is similar in size to the weekend vs. weekday differences. As there is a significant difference between weekday and weekend mood and stress levels, this is an important decrease in wellbeing. For some, especially those affected by job loss, the mental health impact is severe.
|
41
|
Jaidka K, Giorgi S, Schwartz HA, Kern ML, Ungar LH, Eichstaedt JC. Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods. Proc Natl Acad Sci U S A 2020; 117:10165-10171. [PMID: 32341156 PMCID: PMC7229753 DOI: 10.1073/pnas.1906364117] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Indexed: 11/18/2022]
Abstract
Researchers and policy makers worldwide are interested in measuring the subjective well-being of populations. When users post on social media, they leave behind digital traces that reflect their thoughts and feelings. Aggregation of such digital traces may make it possible to monitor well-being at large scale. However, social media-based methods need to be robust to regional effects if they are to produce reliable estimates. Using a sample of 1.53 billion geotagged English tweets, we provide a systematic evaluation of word-level and data-driven methods for text analysis for generating well-being estimates for 1,208 US counties. We compared Twitter-based county-level estimates with well-being measurements provided by the Gallup-Sharecare Well-Being Index survey through 1.73 million phone surveys. We find that word-level methods (e.g., Linguistic Inquiry and Word Count [LIWC] 2015 and Language Assessment by Mechanical Turk [LabMT]) yielded inconsistent county-level well-being measurements due to regional, cultural, and socioeconomic differences in language use. However, removing as few as three of the most frequent words led to notable improvements in well-being prediction. Data-driven methods provided robust estimates, approximating the Gallup data at up to r = 0.64. We show that the findings generalized to county socioeconomic and health outcomes and were robust when poststratifying the samples to be more representative of the general US population. Regional well-being estimation from social media data seems to be robust when supervised data-driven methods are used.
Affiliation(s)
- Kokil Jaidka
- Department of Communications and New Media, National University of Singapore, Singapore 117416
- Centre for Trusted Internet and Community, National University of Singapore, Singapore 117416
| | - Salvatore Giorgi
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
| | - H Andrew Schwartz
- Department of Computer Science, Stony Brook University, Stony Brook, NY 11794
| | - Margaret L Kern
- Melbourne Graduate School of Education, The University of Melbourne, Parkville, VIC 3010, Australia
| | - Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
| | - Johannes C Eichstaedt
- Department of Psychology, Stanford University, Stanford, CA 94305
- Institute for Human-Centered Artificial Intelligence, Stanford University, Stanford, CA 94305
| |
|
42
|
Simchon A, Guntuku SC, Simhon R, Ungar LH, Hassin RR, Gilead M. Political depression? A big-data, multimethod investigation of Americans' emotional response to the Trump presidency. J Exp Psychol Gen 2020; 149:2154-2168. [PMID: 32309988 DOI: 10.1037/xge0000767] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/08/2022]
Abstract
Previous studies suggested that the 2016 presidential elections gave rise to pathological levels of election-related distress in liberal Americans; however, it has also been suggested that the public discourse and the professional discourse have increasingly overgeneralized concepts of trauma and psychopathology. In light of this, in the current research, we utilized an array of big data measures and asked whether a political loss in a participatory democracy can indeed lead to psychopathology. We observed that liberals report being more depressed when asked directly about the effects of the election; however, more indirect measures show a short-lived or nonexistent effect. We examined self-report measures of clinical depression with and without a reference to the election (Studies 1A & 1B), analyzed Twitter discourse and measured users' levels of depression using a machine-learning-based model (Study 2), conducted time-series analysis of depression-related search behavior on Google (Study 3), examined the proportion of antidepressants consumption in Medicaid data (Study 4), and analyzed daily surveys of hundreds of thousands of Americans (Study 5), and saw that at the aggregate level, empirical data reject the accounts of "Trump Depression." We discuss possible interpretations of the discrepancies between the direct and indirect measures. The current investigation demonstrates how big-data sources can provide an unprecedented view of the psychological consequences of political events and sheds light on the complex relationship between the political and the personal spheres. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
|
43
|
Luna JM, Chao HH, Shinohara RT, Ungar LH, Cengel KA, Pryma DA, Chinniah C, Berman AT, Katz SI, Kontos D, Simone CB, Diffenderfer ES. Machine learning highlights the deficiency of conventional dosimetric constraints for prevention of high-grade radiation esophagitis in non-small cell lung cancer treated with chemoradiation. Clin Transl Radiat Oncol 2020; 22:69-75. [PMID: 32274426 PMCID: PMC7132156 DOI: 10.1016/j.ctro.2020.03.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/16/2019] [Revised: 03/17/2020] [Accepted: 03/21/2020] [Indexed: 12/23/2022]
Abstract
A large cohort to predict radiation esophagitis in lung cancer patients was used. Modern machine learning models were implemented to predict radiation esophagitis. Previously published predictors of grade ≥ 3 radiation esophagitis may be unreliable.
Background and Purpose Radiation esophagitis is a clinically important toxicity seen with treatment for locally-advanced non-small cell lung cancer. There is considerable disagreement among prior studies in identifying predictors of radiation esophagitis. We apply machine learning algorithms to identify factors contributing to the development of radiation esophagitis to uncover previously unidentified criteria and more robust dosimetric factors. Materials and Methods We used machine learning approaches to identify predictors of grade ≥ 3 radiation esophagitis in a cohort of 202 consecutive locally-advanced non-small cell lung cancer patients treated with definitive chemoradiation from 2008 to 2016. We evaluated 35 clinical features per patient grouped into risk factors, comorbidities, imaging, stage, histology, radiotherapy, chemotherapy and dosimetry. Univariate and multivariate analyses were performed using a panel of 11 machine learning algorithms combined with predictive power assessments. Results All patients were treated to a median dose of 66.6 Gy at 1.8 Gy per fraction using photon (89.6%) and proton (10.4%) beam therapy, most often with concurrent chemotherapy (86.6%). 11.4% of patients developed grade ≥ 3 radiation esophagitis. On univariate analysis, no individual feature was found to predict radiation esophagitis (AUC range 0.45–0.55, p ≥ 0.07). In multivariate analysis, all machine learning algorithms exhibited poor predictive performance (AUC range 0.46–0.56, p ≥ 0.07). Conclusions Contemporary machine learning algorithms applied to our modern, relatively large institutional cohort could not identify any reliable predictors of grade ≥ 3 radiation esophagitis. Additional patients are needed, and novel patient-specific and treatment characteristics should be investigated to develop clinically meaningful methods to mitigate this survival altering toxicity.
Affiliation(s)
- José Marcio Luna
- Department of Radiation Oncology, University of Pennsylvania, Perelman Center for Advanced Medicine, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
| | - Hann-Hsiang Chao
- Department of Radiation Oncology, Hunter Holmes McGuire Veterans Affairs Medical Center, 1201 Broad Rock Blvd, Richmond, VA 23249, United States
| | - Russel T Shinohara
- Department of Biostatistics and Epidemiology, University of Pennsylvania, 423 Guardian Dr, Philadelphia, PA 19104, United States
| | - Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania, 3330 Walnut St, Philadelphia, PA 19104, United States
| | - Keith A Cengel
- Department of Radiation Oncology, University of Pennsylvania, Perelman Center for Advanced Medicine, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
| | - Daniel A Pryma
- Department of Radiology, University of Pennsylvania, 3400 Spruce St, Philadelphia, PA 19104, United States
| | | | - Abigail T Berman
- Department of Radiation Oncology, University of Pennsylvania, Perelman Center for Advanced Medicine, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
| | - Sharyn I Katz
- Department of Radiology, University of Pennsylvania, 3400 Spruce St, Philadelphia, PA 19104, United States
| | - Despina Kontos
- Department of Radiology, University of Pennsylvania, 3400 Spruce St, Philadelphia, PA 19104, United States
| | - Charles B Simone
- Department of Radiation Oncology, New York Proton Center, 225 East 126 St, New York, NY 10035, United States
| | - Eric S Diffenderfer
- Department of Radiation Oncology, University of Pennsylvania, Perelman Center for Advanced Medicine, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
| |
|
44
|
Guntuku SC, Schwartz HA, Kashyap A, Gaulton JS, Stokes DC, Asch DA, Ungar LH, Merchant RM. Variability in Language used on Social Media prior to Hospital Visits. Sci Rep 2020; 10:4346. [PMID: 32165648 PMCID: PMC7067847 DOI: 10.1038/s41598-020-60750-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/22/2019] [Accepted: 02/10/2020] [Indexed: 11/30/2022]
Abstract
Forecasting healthcare utilization has the potential to anticipate care needs, either accelerating needed care or redirecting patients toward care most appropriate to their needs. While prior research has utilized clinical information to forecast readmissions, analyzing digital footprints from social media can inform our understanding of individuals' behaviors, thoughts, and motivations preceding a healthcare visit. We evaluate how language patterns on social media change prior to emergency department (ED) visits and inpatient hospital admissions in this case-crossover study of adult patients visiting a large urban academic hospital system who consented to share access to their history of Facebook statuses and electronic medical records. An ensemble machine learning model forecasted ED visits and inpatient admissions with out-of-sample cross-validated AUCs of 0.64 and 0.70 respectively. Prior to an ED visit, there was a significant increase in depressed language (Cohen's d = 0.238), and a decrease in informal language (d = 0.345). Facebook posts prior to an inpatient admission showed significant increase in expressions of somatic pain (d = 0.267) and decrease in extraverted/social language (d = 0.357). These results are a first step in developing methods to utilize user-generated content to characterize patient care-seeking context which could ultimately enable better allocation of resources and potentially early interventions to reduce unplanned visits.
Affiliation(s)
| | | | | | - Jessica S Gaulton
- University of Pennsylvania, Philadelphia, PA, USA
- Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | | | - David A Asch
- University of Pennsylvania, Philadelphia, PA, USA
- Cpl Michael J Crescenz VA Medical Center, Philadelphia, PA, USA
| | - Lyle H Ungar
- University of Pennsylvania, Philadelphia, PA, USA
| | | |
|
45
|
Agarwal S, Guntuku SC, Robinson OC, Dunn A, Ungar LH. Examining the Phenomenon of Quarter-Life Crisis Through Artificial Intelligence and the Language of Twitter. Front Psychol 2020; 11:341. [PMID: 32210878 PMCID: PMC7068850 DOI: 10.3389/fpsyg.2020.00341] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/04/2019] [Accepted: 02/13/2020] [Indexed: 11/25/2022]
Abstract
Quarter-life crisis (QLC) is a popular term for developmental crisis episodes that occur during early adulthood (18–30). Our aim was to explore what linguistic themes are associated with this phenomenon as discussed on social media. We analyzed 1.5 million tweets written by over 1,400 users from the United Kingdom and United States that referred to QLC, comparing their posts to those used by a control set of users who were matched by age, gender and period of activity. Logistic regression was used to uncover significant associations between words, topics, and sentiments of users and QLC, controlling for demographics. Users who refer to a QLC were found to post more about feeling mixed emotions, feeling stuck, wanting change, career, illness, school, and family. Their language tended to be focused on the future. Of 20 terms selected according to early adult crisis theory, 16 were mentioned by the QLC group more than the control group. The insights from this study could be used by clinicians and coaches to better understand the developmental challenges faced by young adults and how these are portrayed naturalistically in the language of social media.
Affiliation(s)
- Shantenu Agarwal
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
| | - Sharath Chandra Guntuku
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
- *Correspondence: Sharath Chandra Guntuku
| | - Oliver C. Robinson
- Department of Psychology, Social Work and Counselling, University of Greenwich, London, United Kingdom
| | - Abigail Dunn
- Department of Psychology, University of Sussex, Brighton, United Kingdom
| | - Lyle H. Ungar
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
| |
|
46
|
Giorgi S, Yaden DB, Eichstaedt JC, Ashford RD, Buffone AE, Schwartz HA, Ungar LH, Curtis B. Cultural Differences in Tweeting about Drinking Across the US. Int J Environ Res Public Health 2020; 17:1125. [PMID: 32053866 PMCID: PMC7068559 DOI: 10.3390/ijerph17041125] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/01/2020] [Revised: 02/06/2020] [Accepted: 02/08/2020] [Indexed: 11/16/2022]
Abstract
Excessive alcohol use in the US contributes to over 88,000 deaths per year and costs over $250 billion annually. While previous studies have shown that excessive alcohol use can be detected from general patterns of social media engagement, we characterized how drinking-specific language varies across regions and cultures in the US. From a database of 38 billion public tweets, we selected those mentioning “drunk”, found the words and phrases distinctive of drinking posts, and then clustered these into topics and sets of semantically related words. We identified geolocated “drunk” tweets and correlated their language with the prevalence of self-reported excessive alcohol consumption (Behavioral Risk Factor Surveillance System; BRFSS). We then identified linguistic markers associated with excessive drinking in different regions and cultural communities as identified by the American Community Project. “Drunk” tweet frequency (of the 3.3 million geolocated “drunk” tweets) correlated with excessive alcohol consumption at both the county and state levels (r = 0.26 and 0.45, respectively, p < 0.01). Topic analyses revealed that excessive alcohol consumption was most correlated with references to drinking with friends (r = 0.20), family (r = 0.15), and driving under the influence (r = 0.14). Using the American Community Project classification, we found a number of cultural markers of drinking: religious communities had a high frequency of anti-drunk driving tweets, Hispanic centers discussed family members drinking, and college towns discussed sexual behavior. This study shows that Twitter can be used to explore the specific sociocultural contexts in which excessive alcohol use occurs within particular regions and communities. These findings can inform more targeted public health messaging and help to better understand cultural determinants of substance abuse.
Affiliation(s)
- Salvatore Giorgi
- Computer and Information Science Department, University of Pennsylvania, Philadelphia, PA 19104, USA
- National Institutes of Health, National Institute on Drug Abuse, Bethesda, MD 20892, USA
| | - David B. Yaden
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Johannes C. Eichstaedt
- Department of Psychology & Institute for Human-Centered Artificial Intelligence, Stanford University, Stanford, CA 94305, USA
| | - Robert D. Ashford
- Substance Use Disorders Institute, University of the Sciences, Philadelphia, PA 19104, USA
| | - Anneke E.K. Buffone
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - H. Andrew Schwartz
- Department of Computer Science, Stony Brook University, Stony Brook, NY 11794, USA
| | - Lyle H. Ungar
- Computer and Information Science Department, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Brenda Curtis
- National Institutes of Health, National Institute on Drug Abuse, Bethesda, MD 20892, USA
| |
|
47
|
Abstract
Objective: We computationally analyze the language of social media users diagnosed with ADHD to understand what they talk about, and how their language is correlated with users' characteristics such as personality and temporal orientation. Method: We analyzed approximately 1.3 million tweets written by 1,399 Twitter users with self-reported diagnoses of ADHD, comparing their posts with those used by a control set matched by age, gender, and period of activity. Results: Users with ADHD are found to be less agreeable, more open, to post more often, and to use more negations, hedging, and swear words. Posts are suggestive of themes of emotional dysregulation, self-criticism, substance abuse, and exhaustion. A machine learning model can predict which of these Twitter users has ADHD with an out-of-sample AUC of .836. Conclusion: Based on this emerging technology, we offer conjectures about future uses of social media by researchers and clinicians to better understand the naturalistic manifestations and sequelae of ADHD.
|
48
|
Merchant RM, Asch DA, Crutchley P, Ungar LH, Guntuku SC, Eichstaedt JC, Hill S, Padrez K, Smith RJ, Schwartz HA. Evaluating the predictability of medical conditions from social media posts. PLoS One 2019; 14:e0215476. [PMID: 31206534 PMCID: PMC6576767 DOI: 10.1371/journal.pone.0215476] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 12/05/2018] [Accepted: 04/02/2019] [Indexed: 12/11/2022]
Abstract
We studied whether medical conditions across 21 broad categories were predictable from social media content across approximately 20 million words written by 999 consenting patients. Facebook language significantly improved upon the prediction accuracy of demographic variables for 18 of the 21 disease categories; it was particularly effective at predicting diabetes and mental health conditions including anxiety, depression and psychoses. Social media data are a quantifiable link into the otherwise elusive daily lives of patients, providing an avenue for study and assessment of behavioral and environmental disease risk factors. Analogous to the genome, social media data linked to medical diagnoses can be banked with patients’ consent, and an encoding of social media language can be used as markers of disease risk, serve as a screening tool, and elucidate disease epidemiology. In what we believe to be the first report linking electronic medical record data with social media data from consenting patients, we identified that patients’ Facebook status updates can predict many health conditions, suggesting opportunities to use social media data to determine disease onset or exacerbation and to conduct social media-based health interventions.
Affiliation(s)
- Raina M Merchant
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - David A Asch
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,The Center for Health Equity Research and Promotion-Philadelphia Veterans Affairs Medical Center, Philadelphia, Pennsylvania, United States of America.,The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Patrick Crutchley
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Lyle H Ungar
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Sharath C Guntuku
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Johannes C Eichstaedt
- Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Shawndra Hill
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Microsoft Research, New York, New York, United States of America
| | - Kevin Padrez
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Robert J Smith
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - H Andrew Schwartz
- Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.,Department of Computer Science, Stony Brook University, Stony Brook, New York, United States of America
| |
|
49
|
Pang D, Eichstaedt JC, Buffone A, Slaff B, Ruch W, Ungar LH. The language of character strengths: Predicting morally valued traits on social media. J Pers 2019; 88:287-306. [PMID: 31107975 PMCID: PMC7065131 DOI: 10.1111/jopy.12491] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/10/2018] [Revised: 05/07/2019] [Accepted: 05/14/2019] [Indexed: 11/27/2022]
Abstract
OBJECTIVE Social media is increasingly being used to study psychological constructs. This study is the first to use Twitter language to investigate the 24 Values in Action Inventory of Character Strengths, which have been shown to predict important life domains such as well-being. METHOD We use both a top-down closed-vocabulary (Linguistic Inquiry and Word Count) and a data-driven open-vocabulary (Differential Language Analysis) approach to analyze 3,937,768 tweets from 4,423 participants (64.3% female), who answered a 240-item survey on character strengths. RESULTS We present the language profiles of (a) a global positivity factor accounting for 36% of the variances in the strengths, and (b) each of the 24 individual strengths, for which we find largely face-valid language associations. Machine learning models trained on language data to predict character strengths reach out-of-sample prediction accuracies comparable to previous work on personality (median r = 0.28, ranging from 0.13 to 0.51). CONCLUSIONS The findings suggest that Twitter can be used to characterize and predict character strengths. This technique could be used to measure the character strengths of large populations unobtrusively and cost-effectively.
Affiliation(s)
- Dandan Pang
- Department of Work and Organizational Psychology, University of Bern, Bern, Switzerland.,Personality and Assessment, Department of Psychology, University of Zurich, Zurich, Switzerland
- Anneke Buffone
- Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania
- Barry Slaff
- Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania
- Willibald Ruch
- Personality and Assessment, Department of Psychology, University of Zurich, Zurich, Switzerland
- Lyle H Ungar
- Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania.,Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania
|
50
|
Weidman AC, Sun J, Vazire S, Quoidbach J, Ungar LH, Dunn EW. (Not) hearing happiness: Predicting fluctuations in happy mood from acoustic cues using machine learning. Emotion 2019; 20:642-658. [PMID: 30742458 DOI: 10.1037/emo0000571] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
Abstract
Recent popular claims surrounding virtual assistants suggest that computers will soon be able to hear our emotions. Supporting this possibility, promising work has harnessed big data and emergent technologies to automatically predict stable levels of one specific emotion, happiness, at the community (e.g., counties) and trait (i.e., people) levels. Furthermore, research in affective science has shown that nonverbal vocal bursts (e.g., sighs, gasps) and specific acoustic features (e.g., pitch, energy) can differentiate between distinct emotions (e.g., anger, happiness) and that machine-learning algorithms can detect these differences. Yet, to our knowledge, no work has tested whether computers can automatically detect normal, everyday, within-person fluctuations in one emotional state from acoustic analysis. To address this issue in the context of happy mood, across 3 studies (total N = 20,197), we asked participants to repeatedly report their state happy mood and to provide audio recordings (including both direct speech and ambient sounds) from which we extracted acoustic features. Using three different machine learning algorithms (neural networks, random forests, and support vector machines) and two sets of acoustic features, we found that acoustic features yielded minimal predictive insight into happy mood above chance. Neither multilevel modeling analyses nor human coders provided additional insight into state happy mood. These findings suggest that it is not yet possible to automatically assess fluctuations in one emotional state (i.e., happy mood) from acoustic analysis, pointing to a critical future direction for affective scientists interested in acoustic analysis of emotion and automated emotion detection. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
Affiliation(s)
- Jordi Quoidbach
- Department of People Management and Organisation, ESADE Business School
- Lyle H Ungar
- Department of Computer and Information Science, University of Pennsylvania
|