1
Deiner MS, Deiner NA, Hristidis V, McLeod SD, Doan T, Lietman TM, Porco TC. Use of Large Language Models to Assess the Likelihood of Epidemics From the Content of Tweets: Infodemiology Study. J Med Internet Res 2024; 26:e49139. [PMID: 38427404 PMCID: PMC10943433 DOI: 10.2196/49139] [Received: 05/19/2023] [Revised: 12/20/2023] [Accepted: 01/19/2024] [Indexed: 03/02/2024]
Abstract
BACKGROUND: Previous work suggests that Google searches could be useful in identifying conjunctivitis epidemics. Content-based assessment of social media content may provide additional value in serving as early indicators of conjunctivitis and other systemic infectious diseases.
OBJECTIVE: We investigated whether large language models, specifically GPT-3.5 and GPT-4 (OpenAI), can provide probabilistic assessments of whether social media posts about conjunctivitis could indicate a regional outbreak.
METHODS: A total of 12,194 conjunctivitis-related tweets were obtained using a targeted Boolean search in multiple languages from India, Guam (United States), Martinique (France), the Philippines, American Samoa (United States), Fiji, Costa Rica, Haiti, and the Bahamas, covering the time frame from January 1, 2012, to March 13, 2023. By providing these tweets via prompts to GPT-3.5 and GPT-4, we obtained probabilistic assessments that were validated by 2 human raters. We then calculated Pearson correlations of these time series with tweet volume and the occurrence of known outbreaks in these 9 locations, with time series bootstrap used to compute CIs.
RESULTS: Probabilistic assessments derived from GPT-3.5 showed correlations of 0.60 (95% CI 0.47-0.70) and 0.53 (95% CI 0.40-0.65) with the 2 human raters, with higher results for GPT-4. The weekly averages of GPT-3.5 probabilities showed substantial correlations with weekly tweet volume for 44% (4/9) of the countries, with correlations ranging from 0.10 (95% CI 0.0-0.29) to 0.53 (95% CI 0.39-0.89), with larger correlations for GPT-4. More modest correlations were found for correlation with known epidemics, with substantial correlation only in American Samoa (0.40, 95% CI 0.16-0.81).
CONCLUSIONS: These findings suggest that GPT prompting can efficiently assess the content of social media posts and indicate possible disease outbreaks to a degree of accuracy comparable to that of humans. Furthermore, we found that automated content analysis of tweets is related to tweet volume for conjunctivitis-related posts in some locations and to the occurrence of actual epidemics. Future work may improve the sensitivity and specificity of these methods for disease outbreak detection.
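The correlation step described in the abstract (Pearson correlation between two weekly series, with a time series bootstrap for the CI) can be illustrated with a minimal sketch. The abstract does not say which bootstrap variant was used, so the moving block bootstrap below is an assumption, as are the function names, the block length, and the made-up weekly values; this is not the authors' code.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def block_bootstrap_ci(x, y, block_len=4, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the correlation via a moving block bootstrap.

    Assumption: weeks are resampled in contiguous blocks so that
    short-range autocorrelation is preserved inside each block.
    """
    rng = random.Random(seed)
    n = len(x)
    block_len = min(block_len, n)
    starts = range(n - block_len + 1)
    n_blocks = math.ceil(n / block_len)
    stats = []
    for _ in range(n_boot):
        idx = []
        for _ in range(n_blocks):
            s = rng.choice(starts)
            idx.extend(range(s, s + block_len))
        idx = idx[:n]  # trim so each replicate has the original length
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if not math.isnan(r):
            stats.append(r)
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical weekly mean outbreak probabilities and tweet counts.
    weekly_prob = [0.10, 0.12, 0.30, 0.45, 0.40, 0.20, 0.15, 0.11, 0.35, 0.50, 0.25, 0.10]
    weekly_volume = [3, 4, 9, 15, 12, 6, 5, 3, 11, 18, 7, 4]
    print(pearson(weekly_prob, weekly_volume))
    print(block_bootstrap_ci(weekly_prob, weekly_volume, block_len=3))
```

Resampling blocks rather than single weeks is one common way to respect serial dependence; the block length is a tuning choice that would need to be justified for real data.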
Affiliation(s)
- Michael S Deiner
  - Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
- Natalie A Deiner
  - College of Letters and Science, University of California, Santa Barbara, Santa Barbara, CA, United States
- Vagelis Hristidis
  - Department of Computer Science and Engineering, University of California, Riverside, Riverside, CA, United States
- Stephen D McLeod
  - Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - American Academy of Ophthalmology, San Francisco, CA, United States
- Thuy Doan
  - Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States
- Thomas M Lietman
  - Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States
- Travis C Porco
  - Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States
2
Haupt MR, Chiu M, Chang J, Li Z, Cuomo R, Mackey TK. Detecting nuance in conspiracy discourse: Advancing methods in infodemiology and communication science with machine learning and qualitative content coding. PLoS One 2023; 18:e0295414. [PMID: 38117843 PMCID: PMC10732406 DOI: 10.1371/journal.pone.0295414] [Received: 05/15/2023] [Accepted: 11/21/2023] [Indexed: 12/22/2023]
Abstract
The spread of misinformation and conspiracies has been an ongoing issue since the early stages of the internet era, resulting in the emergence of the field of infodemiology (i.e., information epidemiology), which investigates the transmission of health-related information. Due to the high volume of online misinformation in recent years, there is a need to continue advancing methodologies in order to effectively identify narratives and themes. While machine learning models can be used to detect misinformation and conspiracies, these models are limited in their generalizability to other datasets and misinformation phenomena, and are often unable to detect implicit meanings in text that require contextual knowledge. To rapidly detect evolving conspiracist narratives within high volume online discourse while identifying nuanced themes requiring the comprehension of subtext, this study describes a hybrid methodology that combines natural language processing (i.e., topic modeling and sentiment analysis) with qualitative content coding approaches to characterize conspiracy discourse related to 5G wireless technology and COVID-19 on Twitter (currently known as 'X'). Discourse that focused on correcting 5G conspiracies was also analyzed for comparison. Sentiment analysis shows that conspiracy-related discourse was more likely to use language that was analytic, combative, and past-oriented, that referenced social status, and that expressed negative emotions. Corrections discourse was more likely to use words reflecting cognitive processes, prosocial relations, health-related consequences, and future-oriented language. Inductive coding characterized conspiracist narratives related to global elites, anti-vax sentiment, medical authorities, religious figures, and false correlations between technology advancements and disease outbreaks. Further, the corrections discourse did not address many of the narratives prevalent in conspiracy conversations. This paper aims to further bridge the gap between computational and qualitative methodologies by demonstrating how both approaches can be used in tandem to emphasize the positive aspects of each methodology while minimizing their respective drawbacks.
Affiliation(s)
- Michael Robert Haupt
  - Department of Cognitive Science, University of California San Diego, La Jolla, California, United States of America
  - Global Health Policy & Data Institute, San Diego, California, United States of America
- Michelle Chiu
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, United States of America
- Joseline Chang
  - Rady School of Management, University of California San Diego, La Jolla, California, United States of America
- Zoe Li
  - Global Health Policy & Data Institute, San Diego, California, United States of America
  - S-3 Research, San Diego, California, United States of America
- Raphael Cuomo
  - Department of Anesthesiology, University of California, San Diego School of Medicine, San Diego, California, United States of America
- Tim K. Mackey
  - S-3 Research, San Diego, California, United States of America
  - Global Health Program, Department of Anthropology, University of California, San Diego, California, United States of America
3
Mori Y, Miyatake N, Suzuki H, Mori Y, Okada S, Tanimoto K. Comparison of Impressions of COVID-19 Vaccination and Influenza Vaccination in Japan by Analyzing Social Media Using Text Mining. Vaccines (Basel) 2023; 11:1327. [PMID: 37631895 PMCID: PMC10458112 DOI: 10.3390/vaccines11081327] [Received: 06/17/2023] [Revised: 07/26/2023] [Accepted: 08/03/2023] [Indexed: 08/29/2023]
Abstract
The aim of this study was to compare impressions of COVID-19 vaccination and influenza vaccination in Japan by analyzing social media (Twitter®) using a text-mining method. We obtained 10,000 tweets using the keywords "corona vaccine" and "influenza vaccine" on 15 December 2022 and 19 February 2023. We then counted the number of times each word was used and listed the word frequencies using the text-mining software KH Coder. We also investigated concepts in the data using multi-dimensional scaling (MDS) to identify groups of words that often appeared together or groups of documents that contained the same words. On 15 December 2022, "death" was frequently used in relation to corona vaccine and "severe disease" in relation to influenza vaccine. By 19 February 2023, the word "death" was used less often, "after effect" was newly recognized for corona vaccine, and "severe disease" was no longer used in relation to influenza vaccine. Through this comprehensive analysis of social media data, we observed distinct variations in public perceptions of corona vaccination and influenza vaccination in Japan. These findings provide valuable insights for public health authorities and policymakers to better understand public sentiment and tailor their communication strategies accordingly.
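The word-counting step in this abstract can be sketched in a few lines. The study used KH Coder; the plain-Python counter below is a stand-in for illustration only. It assumes English, pre-translated text split on letters, whereas Japanese tweets would first need morphological tokenization. The function name, stopword list, and sample posts are all hypothetical.

```python
import re
from collections import Counter


def term_frequencies(posts, stopwords=frozenset()):
    """Count how often each lowercase word appears across all posts."""
    counts = Counter()
    for post in posts:
        # Naive tokenizer: runs of ASCII letters and apostrophes only.
        tokens = re.findall(r"[a-z']+", post.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts


if __name__ == "__main__":
    # Made-up example posts, not data from the study.
    sample_posts = [
        "Corona vaccine and death reports",
        "Influenza vaccine prevents severe disease",
        "Worried about the after effect of the corona vaccine",
    ]
    freqs = term_frequencies(sample_posts, stopwords={"the", "and", "of", "about"})
    for word, count in freqs.most_common(5):
        print(word, count)
```

Comparing the resulting counters for two collection dates gives the kind of before-and-after frequency contrast the abstract reports.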
Affiliation(s)
- Yoshiro Mori
  - Department of Hygiene, Faculty of Medicine, Kagawa University, Miki 761-0793, Japan
  - Sakaide City Hospital, Sakaide 762-8550, Japan
- Nobuyuki Miyatake
  - Department of Hygiene, Faculty of Medicine, Kagawa University, Miki 761-0793, Japan
- Hiromi Suzuki
  - Department of Hygiene, Faculty of Medicine, Kagawa University, Miki 761-0793, Japan
- Yuka Mori
  - Institute of Biomedical Sciences, Tokushima University Graduate School, Tokushima 770-8503, Japan
- Setsuo Okada
  - Sakaide City Hospital, Sakaide 762-8550, Japan