51
Weissman GE, Ungar LH, Harhay MO, Courtright KR, Halpern SD. Construct validity of six sentiment analysis methods in the text of encounter notes of patients with critical illness. J Biomed Inform 2019; 89:114-121. [PMID: 30557683 PMCID: PMC6342660 DOI: 10.1016/j.jbi.2018.12.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/01/2018] [Revised: 12/03/2018] [Accepted: 12/08/2018] [Indexed: 01/27/2023]
Abstract
Sentiment analysis may offer insights into patient outcomes through the subjective expressions made by clinicians in the text of encounter notes. We analyzed the predictive, concurrent, convergent, and content validity of six sentiment methods in a sample of 793,725 multidisciplinary clinical notes among 41,283 hospitalizations associated with an intensive care unit stay. None of these approaches improved early prediction of in-hospital mortality using logistic regression models, but did improve both discrimination and calibration when using random forests. Additionally, positive sentiment measured by the CoreNLP (OR 0.04, 95% CI 0.002-0.55), Pattern (OR 0.09, 95% CI 0.04-0.17), sentimentr (OR 0.37, 95% CI 0.25-0.63), and Opinion (OR 0.25, 95% CI 0.07-0.89) methods were inversely associated with death on the concurrent day after adjustment for demographic characteristics and illness severity. Median daily lexical coverage ranged from 5.4% to 20.1%. While sentiment between all methods was positively correlated, their agreement was weak. Sentiment analysis holds promise for clinical applications but will require a novel domain-specific method applicable to clinical text.
52
Eichstaedt JC, Smith RJ, Merchant RM, Ungar LH, Crutchley P, Preoţiuc-Pietro D, Asch DA, Schwartz HA. Facebook language predicts depression in medical records. Proc Natl Acad Sci U S A 2018; 115:11203-11208. [PMID: 30322910 PMCID: PMC6217418 DOI: 10.1073/pnas.1802331115] [Citation(s) in RCA: 195] [Impact Index Per Article: 32.5] [Indexed: 12/22/2022] Open
Abstract
Depression, the most prevalent mental illness, is underdiagnosed and undertreated, highlighting the need to extend the scope of current screening methods. Here, we use language from Facebook posts of consenting individuals to predict depression recorded in electronic medical records. We accessed the history of Facebook statuses posted by 683 patients visiting a large urban academic emergency department, 114 of whom had a diagnosis of depression in their medical records. Using only the language preceding their first documentation of a diagnosis of depression, we could identify depressed patients with fair accuracy [area under the curve (AUC) = 0.69], approximately matching the accuracy of screening surveys benchmarked against medical records. Restricting Facebook data to only the 6 months immediately preceding the first documented diagnosis of depression yielded a higher prediction accuracy (AUC = 0.72) for those users who had sufficient Facebook data. Significant prediction of future depression status was possible as far as 3 months before its first documentation. We found that language predictors of depression include emotional (sadness), interpersonal (loneliness, hostility), and cognitive (preoccupation with the self, rumination) processes. Unobtrusive depression assessment through social media of consenting individuals may become feasible as a scalable complement to existing screening and monitoring procedures.
53
Valdes G, Chang AJ, Interian Y, Owen K, Jensen ST, Ungar LH, Cunha A, Solberg TD, Hsu IC. Salvage HDR Brachytherapy: Multiple Hypothesis Testing Versus Machine Learning Analysis. Int J Radiat Oncol Biol Phys 2018; 101:694-703. [DOI: 10.1016/j.ijrobp.2018.03.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/20/2017] [Revised: 01/07/2018] [Accepted: 03/06/2018] [Indexed: 11/25/2022]
54
Weissman GE, Hubbard RA, Ungar LH, Harhay MO, Greene CS, Himes BE, Halpern SD. Inclusion of Unstructured Clinical Text Improves Early Prediction of Death or Prolonged ICU Stay. Crit Care Med 2018; 46:1125-1132. [PMID: 29629986 PMCID: PMC6005735 DOI: 10.1097/ccm.0000000000003148] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Indexed: 02/02/2023]
Abstract
OBJECTIVES: Early prediction of undesired outcomes among newly hospitalized patients could improve patient triage and prompt conversations about patients' goals of care. We evaluated the performance of logistic regression, gradient boosting machine, random forest, and elastic net regression models, with and without unstructured clinical text data, to predict a binary composite outcome of in-hospital death or ICU length of stay greater than or equal to 7 days using data from the first 48 hours of hospitalization. DESIGN: Retrospective cohort study with split sampling for model training and testing. SETTING: A single urban academic hospital. PATIENTS: All hospitalized patients who required ICU care at the Beth Israel Deaconess Medical Center in Boston, MA, from 2001 to 2012. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Among eligible 25,947 hospital admissions, we observed 5,504 (21.2%) in which patients died or had ICU length of stay greater than or equal to 7 days. The gradient boosting machine model had the highest discrimination without (area under the receiver operating characteristic curve, 0.83; 95% CI, 0.81-0.84) and with (area under the receiver operating characteristic curve, 0.89; 95% CI, 0.88-0.90) text-derived variables. Both gradient boosting machines and random forests outperformed logistic regression without text data (p < 0.001), whereas all models outperformed logistic regression with text data (p < 0.02). The inclusion of text data increased the discrimination of all four model types (p < 0.001). Among those models using text data, the increasing presence of terms "intubated" and "poor prognosis" were positively associated with mortality and ICU length of stay, whereas the term "extubated" was inversely associated with them. CONCLUSIONS: Variables extracted from unstructured clinical text from the first 48 hours of hospital admission using natural language processing techniques significantly improved the abilities of logistic regression and other machine learning models to predict which patients died or had long ICU stays. Learning health systems may adapt such models using open-source approaches to capture local variation in care patterns.
55
Curtis B, Giorgi S, Buffone AEK, Ungar LH, Ashford RD, Hemmons J, Summers D, Hamilton C, Schwartz HA. Can Twitter be used to predict county excessive alcohol consumption rates? PLoS One 2018; 13:e0194290. [PMID: 29617408 PMCID: PMC5884504 DOI: 10.1371/journal.pone.0194290] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 08/03/2017] [Accepted: 02/28/2018] [Indexed: 01/26/2023] Open
Abstract
Objectives: The current study analyzes a large set of Twitter data from 1,384 US counties to determine whether excessive alcohol consumption rates can be predicted by the words being posted from each county. Methods: Data from over 138 million county-level tweets were analyzed using predictive modeling, differential language analysis, and mediating language analysis. Results: Twitter language data captures cross-sectional patterns of excessive alcohol consumption beyond that of sociodemographic factors (e.g. age, gender, race, income, education), and can be used to accurately predict rates of excessive alcohol consumption. Additionally, mediation analysis found that Twitter topics (e.g. ‘ready gettin leave’) can explain much of the variance associated between socioeconomics and excessive alcohol consumption. Conclusions: Twitter data can be used to predict public health concerns such as excessive drinking. Using mediation analysis in conjunction with predictive modeling allows for a high portion of the variance associated with socioeconomic status to be explained.
56
Guntuku SC, Yaden DB, Kern ML, Ungar LH, Eichstaedt JC. Detecting depression and mental illness on social media: an integrative review. Curr Opin Behav Sci 2017. [DOI: 10.1016/j.cobeha.2017.07.005] [Citation(s) in RCA: 239] [Impact Index Per Article: 34.1] [Indexed: 11/25/2022]
57
Yaden DB, Eichstaedt JC, Kern ML, Smith LK, Buffone A, Stillwell DJ, Kosinski M, Ungar LH, Seligman MEP, Schwartz HA. The Language of Religious Affiliation. Social Psychological and Personality Science 2017. [DOI: 10.1177/1948550617711228] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 11/17/2022]
Abstract
Religious affiliation is an important identifying characteristic for many individuals and relates to numerous life outcomes including health, well-being, policy positions, and cognitive style. Using methods from computational linguistics, we examined language from 12,815 Facebook users in the United States and United Kingdom who indicated their religious affiliation. Religious individuals used more positive emotion words (β = .278, p < .0001) and social themes such as family (β = .242, p < .0001), while nonreligious people expressed more negative emotions like anger (β = −.427, p < .0001) and categories related to cognitive processes, like tentativeness (β = −.153, p < .0001). Nonreligious individuals also used more themes related to the body (β = −.265, p < .0001) and death (β = −.247, p < .0001). The findings offer directions for future research on religious affiliation, specifically in terms of social, emotional, and cognitive differences.
58
Ranard BL, Werner RM, Antanavicius T, Schwartz HA, Smith RJ, Meisel ZF, Asch DA, Ungar LH, Merchant RM. Yelp Reviews Of Hospital Care Can Supplement And Inform Traditional Surveys Of The Patient Experience Of Care. Health Aff (Millwood) 2017; 35:697-705. [PMID: 27044971 DOI: 10.1377/hlthaff.2015.1030] [Citation(s) in RCA: 123] [Impact Index Per Article: 17.6] [Indexed: 11/05/2022]
Abstract
Little is known about how real-time online rating platforms such as Yelp may complement the Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey, which is the US standard for evaluating patients' experiences after hospitalization. We compared the content of Yelp narrative reviews of hospitals to the topics in the HCAHPS survey, called domains in HCAHPS terminology. While the domains included in Yelp reviews covered the majority of HCAHPS domains, Yelp reviews covered an additional twelve domains not found in HCAHPS. The majority of Yelp topics that most strongly correlate with positive or negative reviews are not measured or reported by HCAHPS. The large collection of patient- and caregiver-centered experiences found on Yelp can be analyzed with natural language processing methods, identifying for policy makers the measures of hospital quality that matter most to patients and caregivers. The Yelp measures and analysis can also provide actionable feedback for hospitals.
59
60
Satopää VA, Jensen ST, Pemantle R, Ungar LH. Partial information framework: Model-based aggregation of estimates from diverse information sources. Electron J Stat 2017. [DOI: 10.1214/17-ejs1346] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
61
Markopoulos PM, Aron R, Ungar LH. Product Information Websites: Are They Good for Consumers? J Manage Inform Syst 2016. [DOI: 10.1080/07421222.2016.1243885] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 10/20/2022]
62
Valdes G, Luna JM, Eaton E, Simone CB, Ungar LH, Solberg TD. MediBoost: a Patient Stratification Tool for Interpretable Decision Making in the Era of Precision Medicine. Sci Rep 2016; 6:37854. [PMID: 27901055 PMCID: PMC5129017 DOI: 10.1038/srep37854] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 08/09/2016] [Accepted: 11/02/2016] [Indexed: 11/17/2022] Open
Abstract
Machine learning algorithms that are both interpretable and accurate are essential in applications such as medicine where errors can have a dire consequence. Unfortunately, there is currently a tradeoff between accuracy and interpretability among state-of-the-art methods. Decision trees are interpretable and are therefore used extensively throughout medicine for stratifying patients. Current decision tree algorithms, however, are consistently outperformed in accuracy by other, less-interpretable machine learning models, such as ensemble methods. We present MediBoost, a novel framework for constructing decision trees that retain interpretability while having accuracy similar to ensemble methods, and compare MediBoost’s performance to that of conventional decision trees and ensemble methods on 13 medical classification problems. MediBoost significantly outperformed current decision tree algorithms in 11 out of 13 problems, giving accuracy comparable to ensemble methods. The resulting trees are of the same type as decision trees used throughout clinical practice but have the advantage of improved accuracy. Our algorithm thus gives the best of both worlds: it grows a single, highly interpretable tree that has the high accuracy of ensemble methods.
63
Kern ML, Park G, Eichstaedt JC, Schwartz HA, Sap M, Smith LK, Ungar LH. Gaining insights from social media language: Methodologies and challenges. Psychol Methods 2016; 21:507-525. [PMID: 27505683 DOI: 10.1037/met0000091] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Indexed: 11/08/2022]
Abstract
Language data available through social media provide opportunities to study people at an unprecedented scale. However, little guidance is available to psychologists who want to enter this area of research. Drawing on tools and techniques developed in natural language processing, we first introduce psychologists to social media language research, identifying descriptive and predictive analyses that language data allow. Second, we describe how raw language data can be accessed and quantified for inclusion in subsequent analyses, exploring personality as expressed on Facebook to illustrate. Third, we highlight challenges and issues to be considered, including accessing and processing the data, interpreting effects, and ethical issues. Social media has become a valuable part of social life, and there is much we can learn by bringing together the tools of computer science with the theories and insights of psychology.
64
Ireland ME, Schwartz HA, Chen Q, Ungar LH, Albarracín D. Future-oriented tweets predict lower county-level HIV prevalence in the United States. Health Psychol 2016; 34S:1252-60. [PMID: 26651466 DOI: 10.1037/hea0000279] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Indexed: 11/08/2022]
Abstract
OBJECTIVE: Future orientation promotes health and well-being at the individual level. Computerized text analysis of a dataset encompassing billions of words used across the United States on Twitter tested whether community-level rates of future-oriented messages correlated with lower human immunodeficiency virus (HIV) rates and moderated the association between behavioral risk indicators and HIV. METHOD: Over 150 million tweets mapped to U.S. counties were analyzed using 2 methods of text analysis. First, county-level HIV rates (cases per 100,000) were regressed on aggregate usage of future-oriented language (e.g., will, gonna). A second data-driven method regressed HIV rates on individual words and phrases. RESULTS: Results showed that counties with higher rates of future tense on Twitter had fewer HIV cases, independent of strong structural predictors of HIV such as population density. Future-oriented messages also appeared to buffer health risk: Sexually transmitted infection rates and references to risky behavior on Twitter were associated with higher HIV prevalence in all counties except those with high rates of future orientation. Data-driven analyses likewise showed that words and phrases referencing the future (e.g., tomorrow, would be) correlated with lower HIV prevalence. CONCLUSION: Integrating big data approaches to text analysis and epidemiology with psychological theory may provide an inexpensive, real-time method of anticipating outbreaks of HIV and etiologically similar diseases.
65
Ireland ME, Chen Q, Schwartz HA, Ungar LH, Albarracin D. Action Tweets Linked to Reduced County-Level HIV Prevalence in the United States: Online Messages and Structural Determinants. AIDS Behav 2016; 20:1256-64. [PMID: 26650382 DOI: 10.1007/s10461-015-1252-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Abstract
HIV is uncommon in most US counties but travels quickly through vulnerable communities when it strikes. Tracking behavior through social media may provide an unobtrusive, naturalistic means of predicting HIV outbreaks and understanding the behavioral and psychological factors that increase communities' risk. General action goals, or the motivation to engage in cognitive and motor activity, may support protective health behavior (e.g., using condoms) or encourage activity indiscriminately (e.g., risky sex), resulting in mixed health effects. We explored these opposing hypotheses by regressing county-level HIV prevalence on action language (e.g., work, plan) in over 150 million tweets mapped to US counties. Controlling for demographic and structural predictors of HIV, more active language was associated with lower HIV rates. By leveraging language used on social media to improve existing predictive models of geographic variation in HIV, future targeted HIV-prevention interventions may have a better chance of reaching high-risk communities before outbreaks occur.
66
Park G, Yaden DB, Schwartz HA, Kern ML, Eichstaedt JC, Kosinski M, Stillwell D, Ungar LH, Seligman MEP. Women are Warmer but No Less Assertive than Men: Gender and Language on Facebook. PLoS One 2016; 11:e0155885. [PMID: 27223607 PMCID: PMC4881750 DOI: 10.1371/journal.pone.0155885] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 11/15/2015] [Accepted: 05/05/2016] [Indexed: 11/30/2022] Open
Abstract
Using a large social media dataset and open-vocabulary methods from computational linguistics, we explored differences in language use across gender, affiliation, and assertiveness. In Study 1, we analyzed topics (groups of semantically similar words) across 10 million messages from over 52,000 Facebook users. Most language differed little across gender. However, topics most associated with self-identified female participants included friends, family, and social life, whereas topics most associated with self-identified male participants included swearing, anger, discussion of objects instead of people, and the use of argumentative language. In Study 2, we plotted male- and female-linked language topics along two interpersonal dimensions prevalent in gender research: affiliation and assertiveness. In a sample of over 15,000 Facebook users, we found substantial gender differences in the use of affiliative language and slight differences in assertive language. Language used more by self-identified females was interpersonally warmer, more compassionate, polite, and—contrary to previous findings—slightly more assertive in their language use, whereas language used more by self-identified males was colder, more hostile, and impersonal. Computational linguistic analysis combined with methods to automatically label topics offer means for testing psychological theories unobtrusively at large scale.
67
Park G, Schwartz HA, Sap M, Kern ML, Weingarten E, Eichstaedt JC, Berger J, Stillwell DJ, Kosinski M, Ungar LH, Seligman MEP. Living in the Past, Present, and Future: Measuring Temporal Orientation With Language. J Pers 2016; 85:270-280. [DOI: 10.1111/jopy.12239] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 01/01/2023]
68
Schwartz HA, Sap M, Kern ML, Eichstaedt JC, Kapelner A, Agrawal M, Blanco E, Dziurzynski L, Park G, Stillwell D, Kosinski M, Seligman MEP, Ungar LH. Predicting Individual Well-Being Through the Language of Social Media. Pacific Symposium on Biocomputing 2016; 21:516-527. [PMID: 26776214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
We present the task of predicting individual well-being, as measured by a life satisfaction scale, through the language people use on social media. Well-being, which encompasses much more than emotion and mood, is linked with good mental and physical health. The ability to quickly and accurately assess it can supplement multi-million dollar national surveys as well as promote whole body health. Through crowd-sourced ratings of tweets and Facebook status updates, we create message-level predictive models for multiple components of well-being. However, well-being is ultimately attributed to people, so we perform an additional evaluation at the user-level, finding that a multi-level cascaded model, using both message-level predictions and user-level features, performs best and outperforms popular lexicon-based happiness models. Finally, we suggest that analyses of language go beyond prediction by identifying the language that characterizes well-being.
69
Leung YY, Kuksa PP, Amlie-Wolf A, Valladares O, Ungar LH, Kannan S, Gregory BD, Wang LS. DASHR: database of small human noncoding RNAs. Nucleic Acids Res 2015; 44:D216-22. [PMID: 26553799 PMCID: PMC4702848 DOI: 10.1093/nar/gkv1188] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 08/24/2015] [Accepted: 10/25/2015] [Indexed: 11/20/2022] Open
Abstract
Small non-coding RNAs (sncRNAs) are highly abundant RNAs, typically <100 nucleotides long, that act as key regulators of diverse cellular processes. Although thousands of sncRNA genes are known to exist in the human genome, no single database provides searchable, unified annotation, and expression information for full sncRNA transcripts and mature RNA products derived from these larger RNAs. Here, we present the Database of small human noncoding RNAs (DASHR). DASHR contains the most comprehensive information to date on human sncRNA genes and mature sncRNA products. DASHR provides a simple user interface for researchers to view sequence and secondary structure, compare expression levels, and evidence of specific processing across all sncRNA genes and mature sncRNA products in various human tissues. DASHR annotation and expression data covers all major classes of sncRNAs including microRNAs (miRNAs), Piwi-interacting (piRNAs), small nuclear, nucleolar, cytoplasmic (sn-, sno-, scRNAs, respectively), transfer (tRNAs), and ribosomal RNAs (rRNAs). Currently, DASHR (v1.0) integrates 187 smRNA high-throughput sequencing (smRNA-seq) datasets with over 2.5 billion reads and annotation data from multiple public sources. DASHR contains annotations for ∼48 000 human sncRNA genes and mature sncRNA products, 82% of which are expressed in one or more of the curated tissues. DASHR is available at http://lisanwanglab.org/DASHR.
70
Duckworth AL, Eichstaedt JC, Ungar LH. The Mechanics of Human Achievement. Social and Personality Psychology Compass 2015; 9:359-369. [PMID: 26236393 DOI: 10.1111/spc3.12178] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
Countless studies have addressed why some individuals achieve more than others. Nevertheless, the psychology of achievement lacks a unifying conceptual framework for synthesizing these empirical insights. We propose organizing achievement-related traits by two possible mechanisms of action: Traits that determine the rate at which an individual learns a skill are talent variables and can be distinguished conceptually from traits that determine the effort an individual puts forth. This approach takes inspiration from Newtonian mechanics: achievement is akin to distance traveled, effort to time, skill to speed, and talent to acceleration. A novel prediction from this model is that individual differences in effort (but not talent) influence achievement (but not skill) more substantially over longer (rather than shorter) time intervals. Conceptualizing skill as the multiplicative product of talent and effort, and achievement as the multiplicative product of skill and effort, advances similar, but less formal, propositions by several important earlier thinkers.
71
Eichstaedt JC, Schwartz HA, Kern ML, Park G, Labarthe DR, Merchant RM, Jha S, Agrawal M, Dziurzynski LA, Sap M, Weeg C, Larson EE, Ungar LH, Seligman MEP. Psychological language on Twitter predicts county-level heart disease mortality. Psychol Sci 2015; 26:159-69. [PMID: 25605707 DOI: 10.1177/0956797614557867] [Citation(s) in RCA: 194] [Impact Index Per Article: 21.6] [Indexed: 11/15/2022] Open
Abstract
Hostility and chronic stress are known risk factors for heart disease, but they are costly to assess on a large scale. We used language expressed on Twitter to characterize community-level psychological correlates of age-adjusted mortality from atherosclerotic heart disease (AHD). Language patterns reflecting negative social relationships, disengagement, and negative emotions, especially anger, emerged as risk factors; positive emotions and psychological engagement emerged as protective factors. Most correlations remained significant after controlling for income and education. A cross-sectional regression model based only on Twitter language predicted AHD mortality significantly better than did a model that combined 10 common demographic, socioeconomic, and health risk factors, including smoking, diabetes, hypertension, and obesity. Capturing community psychological characteristics through social media is feasible, and these characteristics are strong markers of cardiovascular mortality at the community level.
72
Park G, Schwartz HA, Eichstaedt JC, Kern ML, Kosinski M, Stillwell DJ, Ungar LH, Seligman MEP. Automatic personality assessment through social media language. J Pers Soc Psychol 2014; 108:934-52. [PMID: 25365036 DOI: 10.1037/pspp0000020] [Citation(s) in RCA: 194] [Impact Index Per Article: 19.4] [Indexed: 11/08/2022]
Abstract
Language use is a psychologically rich, stable individual difference with well-established correlations to personality. We describe a method for assessing personality using an open-vocabulary analysis of language from social media. We compiled the written language from 66,732 Facebook users and their questionnaire-based self-reported Big Five personality traits, and then we built a predictive model of personality based on their language. We used this model to predict the 5 personality factors in a separate sample of 4,824 Facebook users, examining (a) convergence with self-reports of personality at the domain- and facet-level; (b) discriminant validity between predictions of distinct traits; (c) agreement with informant reports of personality; (d) patterns of correlations with external criteria (e.g., number of friends, political attitudes, impulsiveness); and (e) test-retest reliability over 6-month intervals. Results indicated that language-based assessments can constitute valid personality measures: they agreed with self-reports and informant reports of personality, added incremental validity over informant reports, adequately discriminated between traits, exhibited patterns of correlations with external criteria similar to those found with self-reported personality, and were stable over 6-month intervals. Analysis of predictive language can provide rich portraits of the mental life associated with traits. This approach can complement and extend traditional methods, providing researchers with an additional measure that can quickly and cheaply assess large groups of participants with minimal burden.
73
Merchant RM, Ha YP, Wong CA, Schwartz HA, Sap M, Ungar LH, Asch DA. The 2013 US Government Shutdown (#Shutdown) and health: an emerging role for social media. Am J Public Health 2014; 104:2248-50. [PMID: 25322303 DOI: 10.2105/ajph.2014.302118] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/04/2022]
Abstract
In October 2013, multiple United States (US) federal health departments and agencies posted on Twitter, "We're sorry, but we will not be tweeting or responding to @replies during the shutdown. We'll be back as soon as possible!" These "last tweets" and the millions of responses they generated revealed social media's role as a forum for sharing and discussing information rapidly. Social media are now among the few dominant communication channels used today. We used social media to characterize the public discourse and sentiment about the shutdown. The 2013 shutdown represented an opportunity to explore the role social media might play in events that could affect health.
74
Satopää VA, Jensen ST, Mellers BA, Tetlock PE, Ungar LH. Probability aggregation in time-series: Dynamic hierarchical modeling of sparse expert beliefs. Ann Appl Stat 2014. [DOI: 10.1214/14-aoas739] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]
75
Baron J, Mellers BA, Tetlock PE, Stone E, Ungar LH. Two Reasons to Make Aggregated Probability Forecasts More Extreme. Decision Analysis 2014. [DOI: 10.1287/deca.2014.0293] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 11/20/2022]