1
Rook O, Zwart H, Dogterom M. Public attitudes to potential synthetic cells applications: Pragmatic support and ethical acceptance. PLoS One 2025; 20:e0319337. [PMID: 40014593 DOI: 10.1371/journal.pone.0319337] [Received: 09/16/2024] [Accepted: 01/30/2025] [Indexed: 03/01/2025]
Abstract
Synthetic cells constructed bottom-up represent a novel direction in Synthetic Biology. This approach has the potential to deepen the scientific understanding of life and, in the longer run, to open up new pathways for medical and environmental applications. Mapping preliminary public attitudes towards emerging technologies is an important step to further societal discussion and stakeholder participation. We conducted a vignette survey with nationally representative samples from 13 European countries (Czech Republic, France, Germany, Greece, Hungary, Italy, the Netherlands, Poland, Romania, Spain, Sweden, Turkey, and UK; N = 8,382) to explore public attitudes towards prospective synthetic cell technologies, such as anticancer therapy, CO2 emissions conversion to biofuel, and industrial waste recycling. Using data-driven techniques, we built a decision tree model of the factors affecting participants' attitudes and summarized the prevalent themes behind participants' motivations. Our findings suggest substantial public support for prospective synthetic cell applications in societally beneficial fields, most notably in healthcare.
Affiliation(s)
- Olga Rook
  - Department of Bionanoscience, Kavli Institute of Nanoscience, Delft University of Technology, Delft, the Netherlands
  - Erasmus School of Philosophy, Erasmus University Rotterdam, Rotterdam, the Netherlands
- Hub Zwart
  - Erasmus School of Philosophy, Erasmus University Rotterdam, Rotterdam, the Netherlands
- Marileen Dogterom
  - Department of Bionanoscience, Kavli Institute of Nanoscience, Delft University of Technology, Delft, the Netherlands

2
Kim S, Lee DG. Development and validation of an automated machine for self-injury assessment via young Koreans' natural writings. PLoS One 2025; 20:e0316619. [PMID: 39820936 PMCID: PMC11737660 DOI: 10.1371/journal.pone.0316619] [Received: 07/10/2024] [Accepted: 12/13/2024] [Indexed: 01/19/2025]
Abstract
Self-injury is common in all countries, and 20% of South Korean youths experience self-injury. One of the barriers to assessment and treatment planning is the tendency of young self-injurers to conceal their identities. Following a new stream of research that uses online text data to assess psychological symptoms as they are described in online posts, this study developed a computerized machine that can analyze South Korean self-injurers' writing in assessing their self-injury severity. Based on 16,645 online posts, Study 1 developed a machine called the Korean Self-Injurious Text Reviewer (K-SITR) using Latent Dirichlet Allocation topic modeling and machine learning. The K-SITR's text-assessment results were statistically indistinguishable from those of professional counselors. Study 2 confirmed the validity of the K-SITR through a survey of 47 young Koreans who had experienced self-injury. Results showed that the K-SITR scores converged with participants' self-injury frequency and duration and discriminated from other heterogenous factors. The K-SITR also had incremental validity over two popular self-injury questionnaires. This study provides a new measure that may reduce the tendency of young self-injurers to self-conceal compared to traditional direct-item questionnaires.
Affiliation(s)
- Seoyoung Kim
  - Yonsei University Psychological Science Innovation Research Center, Yonsei University, Seoul, Republic of Korea
- Dong-gwi Lee
  - Department of Psychology, Yonsei University, Seoul, Republic of Korea

3
Walsh J, Cave J, Griffiths F. Combining Topic Modeling, Sentiment Analysis, and Corpus Linguistics to Analyze Unstructured Web-Based Patient Experience Data: Case Study of Modafinil Experiences. J Med Internet Res 2024; 26:e54321. [PMID: 39662896 DOI: 10.2196/54321] [Received: 11/06/2023] [Revised: 06/19/2024] [Accepted: 09/27/2024] [Indexed: 12/13/2024]
Abstract
BACKGROUND Patient experience data from social media offer patient-centered perspectives on disease, treatments, and health service delivery. Current guidelines typically rely on systematic reviews, while qualitative health studies are often seen as anecdotal and nongeneralizable. This study explores combining personal health experiences from multiple sources to create generalizable evidence.
OBJECTIVE The study aims to (1) investigate how combining unsupervised natural language processing (NLP) and corpus linguistics can explore patient perspectives from a large unstructured dataset of modafinil experiences, (2) compare findings with Cochrane meta-analyses on modafinil's effectiveness, and (3) develop a methodology for analyzing such data.
METHODS Using 69,022 posts from 790 sources, we applied a variety of NLP and corpus techniques to analyze the data, including data cleaning techniques to maximize post context, Python for NLP techniques, and Sketch Engine for linguistic analysis. We used multiple topic mining approaches, such as latent Dirichlet allocation, nonnegative matrix factorization, and word-embedding methods. Sentiment analysis used TextBlob and Valence Aware Dictionary and Sentiment Reasoner, while corpus methods included collocation, concordance, and n-gram generation. Previous work had mapped topic mining to themes, such as health conditions, reasons for taking modafinil, symptom impacts, dosage, side effects, effectiveness, and treatment comparisons.
RESULTS Key findings of the study included modafinil use across 166 health conditions, most frequently narcolepsy, multiple sclerosis, attention-deficit disorder, anxiety, sleep apnea, depression, bipolar disorder, chronic fatigue syndrome, fibromyalgia, and chronic disease. Word-embedding topic modeling mapped 70% of posts to predefined themes, while sentiment analysis revealed 65% positive responses, 6% neutral responses, and 28% negative responses. Notably, the perceived effectiveness of modafinil for various conditions strongly contrasts with the findings of existing randomized controlled trials and systematic reviews, which conclude that there is insufficient or low-quality evidence of effectiveness.
CONCLUSIONS This study demonstrated the value of combining NLP with linguistic techniques for analyzing large unstructured text datasets. Despite varying opinions, findings were methodologically consistent and challenged existing clinical evidence. This suggests that patient-generated data could provide valuable insights into treatment outcomes, potentially improving clinical understanding and patient care.
Affiliation(s)
- Julia Walsh
  - Warwick Medical School, University of Warwick, Coventry, United Kingdom
- Jonathan Cave
  - Department of Economics, University of Warwick, Coventry, United Kingdom
- Frances Griffiths
  - Warwick Medical School, University of Warwick, Coventry, United Kingdom
  - Centre for Health Policy, University of the Witwatersrand, Johannesburg, South Africa

4
Rezapour M, Yazdinejad M, Rajabi Kouchi F, Habibi Baghi M, Khorrami Z, Khavanin Zadeh M, Pourbaghi E, Rezapour H. Text mining of hypertension researches in the west Asia region: a 12-year trend analysis. Ren Fail 2024; 46:2337285. [PMID: 38616180 PMCID: PMC11018045 DOI: 10.1080/0886022x.2024.2337285] [Received: 01/08/2024] [Accepted: 03/27/2024] [Indexed: 04/16/2024]
Abstract
More than half of the world's population lives in Asia, and hypertension (HTN) is the most prevalent risk factor found there. Numerous articles have been published about HTN in the Eastern Mediterranean Region (EMRO), and artificial intelligence (AI) methods can analyze these articles and extract the top trends in each country. The present analysis uses latent Dirichlet allocation (LDA), a topic modeling (TM) algorithm in text mining, to obtain the subjective topic-word distribution from 2790 studies across the EMRO. The studies examined cover the last 12 years, and the results of the LDA analyses show that HTN research published in the EMRO discusses changes in BP and the factors affecting it. Among the countries in the region, most of these articles are related to I.R. Iran and Egypt, which show an increasing trend from 2017 to 2018 and reached their highest level in 2021. Meanwhile, Iraq and Lebanon have been conducting research since 2010. The EMRO word cloud illustrates 'BMI', 'mortality', 'age', and 'meal', which represent important indicators, dangerous outcomes of high BP, and gender of HTN patients in EMRO, respectively.
Affiliation(s)
- Mohammad Rezapour
  - Faculty Member of the Iranian Ministry of Science, Research and Technology, Tehran, Iran
- Faezeh Rajabi Kouchi
  - Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran
- Zahra Khorrami
  - Ophthalmic Epidemiology Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Morteza Khavanin Zadeh
  - Hasheminejad Kidney Center, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Elmira Pourbaghi
  - Faculty of Advanced Sciences and Technology, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
- Hassan Rezapour
  - Department of Transportation and Urban Infrastructure Studies, Morgan State University, Baltimore, MD, USA

5
Mouafo PT, Nkengfack H, Tchoffo RN, Nguepi ND, Domguia EN. Examining the effectiveness of dissuasive taxes as a policy tool for reducing tobacco and alcohol consumption in Cameroon: A welfare and microsimulation analysis. Heliyon 2024; 10:e40174. [PMID: 39605828 PMCID: PMC11600084 DOI: 10.1016/j.heliyon.2024.e40174] [Received: 08/04/2024] [Revised: 10/31/2024] [Accepted: 11/05/2024] [Indexed: 11/29/2024]
Abstract
Deterrent taxes are a crucial policy tool for reducing the consumption of harmful products like tobacco and alcohol. However, assessing how dissuasive taxes affect different income groups is important to ensure that their burden is not disproportionately borne by low-income households. This study examines the effectiveness of deterrent taxes as an economic policy tool for reducing tobacco and alcohol consumption in Cameroon. We analyse the impact on household welfare and distributional effects using microsimulation analysis. The data come from the Cameroon Household Living Conditions Survey and the 2022 tax records. Our methodology is based on a dynamic computable general equilibrium (CGE) model enriched with an addiction model. The results indicate that deterrent taxes can significantly reduce the consumption of these harmful products but also have regressive effects on low-income households. In response, we recommend the adoption of a progressive tax structure and the establishment of targeted support programmes to mitigate the negative impact on vulnerable populations.
Affiliation(s)
- Paul Tadzong Mouafo
  - University of Dschang, Centre de Recherche de Management et d’Economie (CERME), Dschang, Cameroon
- Hilaire Nkengfack
  - University of Dschang, Faculty of Economics and Management, Dschang, Cameroon
- Rodrigue Nobosse Tchoffo
  - University of Dschang, Groupe de Recherche en Economie Appliquée et Developpement (GREAD), Dschang, Cameroon
- Nelson Derrick Nguepi
  - Université catholique de Louvain, Economics School of Louvain (Belgium), University of Dschang, Groupe de Recherche en Economie Appliquée et Developpement (GREAD), Dschang, Cameroon
- Edmond Noubissi Domguia
  - University of Dschang, Centre de Recherche de Management et d’Economie (CERME), Dschang, Cameroon

6
Shah SM, Aljawarneh MM, Saleem MA, Jawarneh MS. Mental illness detection through harvesting social media: a comprehensive literature review. PeerJ Comput Sci 2024; 10:e2296. [PMID: 39650445 PMCID: PMC11623008 DOI: 10.7717/peerj-cs.2296] [Received: 03/05/2024] [Accepted: 08/09/2024] [Indexed: 12/11/2024]
Abstract
Mental illness is a common disease that at its extremes leads to personal and societal suffering. A complicated multi-factorial disease, mental illness is influenced by a number of socioeconomic and clinical factors, including individual risk factors. Traditionally, approaches relying on personal interviews and questionnaires have been employed to diagnose mental illness; however, these manual procedures have been found to be frequently prone to errors and unable to reliably identify individuals with mental illness. Fortunately, people with mental illnesses frequently express their ailments on social media, making it possible to identify mental disease more precisely by harvesting their social media posts. This study offers a thorough analysis of how to identify mental illnesses (more specifically, depression) from users' social media data. Along with an explanation of data acquisition, preprocessing, feature extraction, and classification techniques, the most recent published literature is presented to give readers a thorough understanding of the subject. Since the majority of the relevant scientific community has recently focused on using machine learning (ML) and deep learning (DL) models to identify mental illness, the review also focuses on these techniques and presents their details along with a critical analysis. More than 100 DL, ML, and natural language processing (NLP) based models developed for mental illness in the recent past have been reviewed, and their technical contributions and strengths are discussed. Multiple review studies exist; however, few discuss the extensive recent literature together with a complete road map for designing a mental illness detection system using social media data and ML and DL classification methods. The review also details how a dataset may be acquired from social media platforms, how it is preprocessed, and how features are extracted from it for mental illness detection. Hence, we anticipate that this review will help readers learn more and give them a comprehensive road map for identifying mental illnesses using users' social media data.
Affiliation(s)
- Shahid Munir Shah
  - Faculty of Engineering Sciences and Technology, Hamdard University, Karachi, Pakistan
- Muhammad Aamer Saleem
  - Faculty of Engineering Sciences and Technology, Hamdard University, Karachi, Pakistan

7
Thiruganasambandamoorthy V, Probst MA, Poterucha TJ, Sandhu RK, Toarta C, Raj SR, Sheldon R, Rahgozar A, Grant L. Role of Artificial Intelligence in Improving Syncope Management. Can J Cardiol 2024; 40:1852-1864. [PMID: 38838932 DOI: 10.1016/j.cjca.2024.05.027] [Received: 02/01/2024] [Revised: 04/25/2024] [Accepted: 05/01/2024] [Indexed: 06/07/2024]
Abstract
Syncope is common in the general population and a common presenting symptom in acute care settings. Substantial costs are attributed to the care of patients with syncope. Current challenges include differentiating syncope from its mimickers, identifying serious underlying conditions that caused the syncope, and wide variations in current management. Although validated risk tools exist, especially for short-term prognosis, there is inconsistent application, and the current approach does not meet patient needs and expectations. Artificial intelligence (AI) techniques, such as machine learning methods including natural language processing, can potentially address the current challenges in syncope management. Preliminary evidence from published studies indicates that it is possible to accurately differentiate syncope from its mimickers and predict short-term prognosis and hospitalisation. More recently, AI analysis of electrocardiograms has shown promise in detection of serious structural and functional cardiac abnormalities, which has the potential to improve syncope care. Future AI studies have the potential to address current issues in syncope management. AI can automatically prognosticate risk in real time by accessing traditional and nontraditional data. However, steps to mitigate known problems such as generalisability, patient privacy, data protection, and liability will be needed. In the past, AI has had limited impact due to underdeveloped analytical methods, lack of computing power, poor access to powerful computing systems, and limited availability of reliable high-quality data. All impediments except data have been solved. AI will live up to its promise to transform syncope care if the health care system can satisfy AI's requirement for large-scale, robust, accurate, and reliable data.
Affiliation(s)
- Venkatesh Thiruganasambandamoorthy
  - Department of Emergency Medicine, University of Ottawa, Ottawa, Ontario, Canada; Ottawa Hospital Research Institute, The Ottawa Hospital, Ottawa, Ontario, Canada; School of Epidemiology and Public Health, University of Ottawa, Ottawa, Ontario, Canada
- Marc A Probst
  - Department of Emergency Medicine, Columbia University Irving Medical Center, New York, New York, USA
- Timothy J Poterucha
  - Seymour, Paul, and Gloria Milstein Division of Cardiology, Department of Medicine, Columbia University Irving Medical Center, New York, New York, USA
- Roopinder K Sandhu
  - Libin Cardiovascular Institute, Department of Cardiac Sciences, University of Calgary, Calgary, Alberta, Canada
- Cristian Toarta
  - Department of Emergency Medicine, McGill University, Montréal, Québec, Canada; McGill University Health Centre, Montréal, Québec, Canada
- Satish R Raj
  - Libin Cardiovascular Institute, Department of Cardiac Sciences, University of Calgary, Calgary, Alberta, Canada
- Robert Sheldon
  - Libin Cardiovascular Institute, Department of Cardiac Sciences, University of Calgary, Calgary, Alberta, Canada
- Arya Rahgozar
  - Ottawa Hospital Research Institute, The Ottawa Hospital, Ottawa, Ontario, Canada; School of Engineering Design and Teaching Innovation, University of Ottawa, Ottawa, Ontario, Canada
- Lars Grant
  - Department of Emergency Medicine, McGill University, Montréal, Québec, Canada; Lady Davis Research Institute, Montréal, Québec, Canada; Jewish General Hospital, Montréal, Québec, Canada

8
Zeng Q. Enhanced analysis of large-scale news text data using the bidirectional-Kmeans-LSTM-CNN model. PeerJ Comput Sci 2024; 10:e2213. [PMID: 39145200 PMCID: PMC11323039 DOI: 10.7717/peerj-cs.2213] [Received: 05/19/2024] [Accepted: 07/01/2024] [Indexed: 08/16/2024]
Abstract
Traditional methods may be inefficient when processing large-scale data in the field of text mining, often struggling to identify and cluster relevant information accurately and efficiently. Additionally, capturing nuanced sentiment and emotional context within news text is challenging with conventional techniques. To address these issues, this article introduces an improved bidirectional-Kmeans-long short-term memory network-convolutional neural network (BiK-LSTM-CNN) model that incorporates emotional semantic analysis for high-dimensional news text visual extraction and media hotspot mining. The BiK-LSTM-CNN model comprises four modules: news text preprocessing, news text clustering, sentiment semantic analysis, and the BiK-LSTM-CNN model itself. By combining these components, the model effectively identifies common features within the input data, clusters similar news articles, and accurately analyzes the emotional semantics of the text. This comprehensive approach enhances both the accuracy and efficiency of visual extraction and hotspot mining. Experimental results demonstrate that compared to models such as Transformer, AdvLSTM, and NewRNN, BiK-LSTM-CNN achieves improvements in macro accuracy by 0.50%, 0.91%, and 1.34%, respectively. Similarly, macro recall rates increase by 0.51%, 1.24%, and 1.26%, while macro F1 scores improve by 0.52%, 1.23%, and 1.92%. Additionally, the BiK-LSTM-CNN model shows significant improvements in time efficiency, further establishing its potential as a more effective approach for processing and analyzing large-scale text data.
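The abstract reports macro-averaged scores without defining them. As a reading aid, the sketch below shows the standard definition of macro precision, recall, and F1 (per-class scores averaged with equal weight per class). It is not the paper's evaluation code; the function name `macro_scores` and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro_precision, macro_recall, macro_f1).

    Each class contributes equally to the average, regardless of how
    many examples it has. Hypothetical helper, not from the paper.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        prec = tp[label] / predicted if predicted else 0.0
        rec = tp[label] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy sentiment labels (made up for illustration).
y_true = ["pos", "pos", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

Because every class is weighted equally, a small gain on a rare class moves the macro score as much as the same gain on a frequent class, which is worth keeping in mind when comparing sub-percentage-point differences between models.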
Affiliation(s)
- Qingxiang Zeng
  - College of Humanities and Media, Hubei University of Science and Technology, Xianning, Hubei, China

9
Wang L, Feng W, Zhang J, Li T. Fitness or socializing - A multi-dimensional analysis of online fitness communities users. iScience 2024; 27:109753. [PMID: 39040059 PMCID: PMC11261065 DOI: 10.1016/j.isci.2024.109753] [Received: 02/06/2024] [Revised: 04/03/2024] [Accepted: 04/13/2024] [Indexed: 07/24/2024]
Abstract
The digital and social network revolution in the fitness industry will provide consumers with opportunities to achieve their healthy and active lifestyle goals both online and offline. The online fitness communities provide us an ideal context for the health behavior research with behavioral logs and user-generated content. Enhanced user profiles can empower platform operators to implement more tailored recommendations, thereby enhancing the efficiency of precision marketing and fitness promotion. This study aims to accurately construct user profiles for Chinese online fitness community users and provide future health promotion strategies accordingly. We propose a novel approach that integrates explicit behavior logs and implicit user preferences to accurately construct user profiles. Our findings indicate that users primarily prioritize fitness benefits, incentives, and decision-making. Our results demonstrate the relationship between fitness behavior and implicit preferences, suggesting that promoting fitness behavior can be achieved through streamlining decision-making processes, establishing incentive communities, and emphasizing benefits.
Affiliation(s)
- Lei Wang
  - Shandong University, School of Management, No. 27 Shanda South Road, Jinan 250100, Shandong, China
- Wanxuan Feng
  - School of Physical Education, Shandong University, 17923 Jingshi Road, Jinan 250061, Shandong, China
- Jianghua Zhang
  - Shandong University, School of Management, No. 27 Shanda South Road, Jinan 250100, Shandong, China
- Tuojian Li
  - School of Physical Education, Shandong University, 17923 Jingshi Road, Jinan 250061, Shandong, China

10
Chen X, Zhang M, Bu Q, Tan B, Peng P, Zhou Y, Tang Y, Tian X, Deng D. Exploring hot topics and evolutionary paths in the Diagnosis-Related Groups (DRGs) field: a comparative study using LDA modeling. BMC Health Serv Res 2024; 24:756. [PMID: 38907246 PMCID: PMC11191315 DOI: 10.1186/s12913-024-11209-3] [Received: 10/30/2023] [Accepted: 06/17/2024] [Indexed: 06/23/2024]
Abstract
BACKGROUND This study reviews the research status of Diagnosis-related groups (DRGs) payment system in China and globally by analyzing topical issues in this field and exploring the evolutionary trends of DRGs in different developmental stages.
METHODS Abstracts of relevant literature in the field of DRGs were extracted from the China National Knowledge Infrastructure (CNKI) database and the Web of Science (WoS) core database and used as text data. A probabilistic distribution-based Latent Dirichlet Allocation (LDA) topic model was applied to mine the text topics. Topical issues were determined by topic intensity, and the cosine similarity of the topics in adjacent stages was calculated to analyze the topic evolution trend.
RESULTS A total of 6,758 English articles and 3,321 Chinese articles were included. Foreign research on DRGs focuses on grouping optimization, implementation effects, and influencing factors, whereas research topics in China focus on grouping and payment mechanism establishment, medical cost change evaluation, medical quality control, and performance management reform exploration.
CONCLUSIONS Currently, the field of DRGs in China is developing rapidly and attracting deepening research. However, the implementation depth of research in China remains insufficient compared with the in-depth research conducted abroad.
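The METHODS describe comparing topics from adjacent stages by cosine similarity. A minimal sketch of that idea follows, assuming each topic is represented as a word-to-probability mapping taken from an LDA model. The function names, the 0.5 threshold, and the toy distributions are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(topic_a, topic_b):
    """Cosine similarity between two sparse word->weight mappings."""
    shared = set(topic_a) & set(topic_b)
    dot = sum(topic_a[w] * topic_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in topic_a.values()))
    norm_b = math.sqrt(sum(v * v for v in topic_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_topics(earlier_stage, later_stage, threshold=0.5):
    """Map each earlier topic to its most similar later topic.

    A link is kept only if the similarity reaches `threshold`
    (an assumed cut-off). `later_stage` must not be empty.
    """
    links = {}
    for name_a, dist_a in earlier_stage.items():
        best = max(
            later_stage,
            key=lambda name_b: cosine_similarity(dist_a, later_stage[name_b]),
        )
        score = cosine_similarity(dist_a, later_stage[best])
        if score >= threshold:
            links[name_a] = (best, score)
    return links


# Toy topic-word distributions for two adjacent stages (made up).
stage_1 = {"grouping": {"grouping": 0.6, "cost": 0.4}}
stage_2 = {
    "payment": {"payment": 0.7, "cost": 0.3},
    "grouping_opt": {"grouping": 0.5, "optimization": 0.5},
}
links = link_topics(stage_1, stage_2)
```

Under this reading, a topic with a strong link in the next stage is one that persists or evolves, while a topic with no link above the threshold can be interpreted as fading out.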
Affiliation(s)
- Xinrui Chen
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Meng Zhang
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Qingqing Bu
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Bo Tan
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Peng Peng
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Yilin Zhou
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Yuqin Tang
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Xiaoqin Tian
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Dan Deng
  - School of Public Health, Chongqing Medical University, Chongqing, China

11
Jones AL, Shiramizu V, Jones BC. Decoding the language of first impressions: Comparing models of first impressions of faces derived from free-text descriptions and trait ratings. Br J Psychol 2024. [PMID: 38886926 DOI: 10.1111/bjop.12717] [Received: 11/29/2023] [Revised: 05/24/2024] [Accepted: 06/01/2024] [Indexed: 06/20/2024]
Abstract
First impressions formed from facial appearance predict important social outcomes. Existing models of these impressions indicate they are underpinned by dimensions of Valence and Dominance, and are typically derived by applying data reduction methods to explicit ratings of faces for a range of traits. However, this approach is potentially problematic because the trait ratings may not fully capture the dimensions on which people spontaneously assess faces. Here, we used natural language processing to extract 'topics' directly from participants' free-text descriptions (i.e., their first impressions) of 2222 face images. Two topics emerged, reflecting first impressions related to positive emotional valence and warmth (Topic 1) and negative emotional valence and potential threat (Topic 2). Next, we investigated how these topics were related to Valence and Dominance components derived from explicit trait ratings. Collectively, these components explained only ~44% of the variance in the topics extracted from free-text descriptions and suggested that first impressions are underpinned by correlated valence dimensions that subsume the content of existing trait-rating-based models. Natural language offers a promising new avenue for understanding social cognition, and future work can examine the predictive utility of natural language and traditional data-driven models for impressions in varying social contexts.
Affiliation(s)
- Alex L Jones
  - School of Psychology, Swansea University, Swansea, UK
- Victor Shiramizu
  - Department of Psychological Sciences & Health, University of Strathclyde, Glasgow, Scotland
- Benedict C Jones
  - Department of Psychological Sciences & Health, University of Strathclyde, Glasgow, Scotland

12
Muthusami R, Mani Kandan N, Saritha K, Narenthiran B, Nagaprasad N, Ramaswamy K. Investigating topic modeling techniques through evaluation of topics discovered in short texts data across diverse domains. Sci Rep 2024; 14:12003. [PMID: 38796483 PMCID: PMC11127968 DOI: 10.1038/s41598-024-61738-4] [Received: 11/23/2023] [Accepted: 05/09/2024] [Indexed: 05/28/2024]
Abstract
The online channel has affected many facets of an individual's identity, commerce, social policy, and culture, among others. It is therefore critical to discover the topics on which these brief writings are focused and to examine the qualities of these short texts. Another key issue that has been identified is the evaluation of newly discovered topics in terms of topic quality, which includes topic separation and coherence. Topic modeling has been shown to be an outstanding aid in the linguistic interpretation of very short texts. Based on the underlying strategy, topic models are divided into two categories: probabilistic methods and non-probabilistic methods. In this research, short texts are analyzed using topic models, including latent Dirichlet allocation (LDA) for probabilistic topic modeling and non-negative matrix factorization (NMF) for non-probabilistic topic modeling. A novel approach for topic evaluation, based on clustering methods and silhouette analysis, is applied to both models to investigate performance in terms of quality. The experiment results indicate that the proposed evaluation method outperforms on both LDA and NMF.
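The abstract mentions silhouette analysis as a measure of topic separation but does not say how it is computed. The sketch below shows the standard mean silhouette coefficient, under the assumption that each document is a topic-proportion vector and is labelled with its dominant topic. The function names and toy vectors are hypothetical, and this is not the authors' evaluation pipeline.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points.

    points: document vectors (e.g. topic proportions per document).
    labels: cluster or dominant-topic label for each document.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette analysis needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy document-topic vectors for a two-topic model (made up).
docs = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
well_separated = silhouette_score(docs, [0, 0, 1, 1])
badly_assigned = silhouette_score(docs, [0, 1, 0, 1])
```

Scores near 1 indicate compact, well-separated topics, while scores near or below 0 suggest overlapping or misassigned documents, which is why the measure can serve as a model-agnostic check on both LDA and NMF output.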
Affiliation(s)
- R Muthusami
  - Department of Computer Applications, Dr. Mahalingam College of Engineering and Technology, Pollachi, Tamil Nadu, India
- N Mani Kandan
  - Department of Mechanical Engineering, P.A. College of Engineering and Technology, Pollachi, 642002, Tamil Nadu, India
- K Saritha
  - Department of Mathematics, P. A. College of Engineering and Technology, Pollachi, Tamil Nadu, India
- B Narenthiran
  - Department of Mechanical Engineering, Karpagam Academy of Higher Education, Eachanari, Coimbatore, 641021, Tamil Nadu, India
- N Nagaprasad
  - Department of Mechanical Engineering, ULTRA College of Engineering and Technology, Madurai, 625104, Tamil Nadu, India
- Krishnaraj Ramaswamy
  - Centre for Excellence-Indigenous Knowledge, Innovative Technology Transfer and Entrepreneurship, Dambi Dollo University, Dambi Dollo, Ethiopia
  - Department of Mechanical Engineering, College of Engineering and Technology, Dambi Dollo University, Dambi Dollo, Ethiopia

13
Ma W, Wang W. Evolution of renewable energy laws and policies in China. Heliyon 2024; 10:e29712. [PMID: 38681606 PMCID: PMC11046223 DOI: 10.1016/j.heliyon.2024.e29712] [Received: 11/08/2023] [Revised: 04/03/2024] [Accepted: 04/14/2024] [Indexed: 05/01/2024]
Abstract
This study employs Latent Dirichlet Allocation (LDA) topic modelling methodology to analyze documents related to renewable energy laws and policies at the central level in China. The objective is to investigate the development and evolution of renewable energy policies in China and to gain insights into the national-level attitudes towards renewable energy development. The study consists of two phases: initially, renewable energy policy documents undergo keyword analysis using word clouds and keyword co-occurrence network analysis to elucidate the focal areas and their interconnections within the legal and policy texts. Subsequently, after determining the optimal number of topics for modelling based on topic perplexity and consistency results, the text undergoes data cleaning to isolate words with practical significance. These words are then incorporated into the LDA topic model to analyze the distribution and content of potential topics within the policies. Lastly, by linearly segmenting the time frame, changes in topic intensity over time are visually examined using heat maps. The findings indicate that energy policies have consistently prioritized "development" and emphasized the significance of "new energy" in renewable energy policies. Moreover, as renewable energy has progressed, governments and policymakers have come to acknowledge the importance of comprehensive energy planning, transitioning to clean energy sources, and regulating the electricity market. This growing awareness has led to efforts to strengthen policy and regulatory measures to foster renewable energy's sustainable development and utilization. In summary, this study highlights the effectiveness of the LDA topic model in analyzing renewable energy policies, advancing its adoption and furthering research in the field.
Affiliation(s)
- Wenyu Ma
- SINOPEC Geophysical Research Institute Co. Ltd, Nanjing City, China
| | - Wenyu Wang
- City University of Hong Kong, Hong Kong, China
|
14
|
Méndez-Cruz CF, Rodríguez-Herrera J, Varela-Vega A, Mateo-Estrada V, Castillo-Ramírez S. Unsupervised learning and natural language processing highlight research trends in a superbug. Front Artif Intell 2024; 7:1336071. [PMID: 38576460 PMCID: PMC10991725 DOI: 10.3389/frai.2024.1336071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2023] [Accepted: 03/11/2024] [Indexed: 04/06/2024] Open
Abstract
Introduction Antibiotic-resistant Acinetobacter baumannii is a very important nosocomial pathogen worldwide. Thousands of studies have been conducted about this pathogen. However, there has not been any attempt to use all this information to highlight the research trends concerning this pathogen. Methods Here we use unsupervised learning and natural language processing (NLP), two areas of Artificial Intelligence, to analyse the most extensive database of articles created (5,500+ articles, from 851 different journals, published over 3 decades). Results K-means clustering found 113 theme clusters and these were defined with representative terms automatically obtained with topic modelling, summarising different research areas. The biggest clusters, all with over 100 articles, are biased toward multidrug resistance, carbapenem resistance, clinical treatment, and nosocomial infections. However, we also found that some research areas, such as ecology and non-human infections, have received very little attention. This approach allowed us to study research themes over time unveiling those of recent interest, such as the use of Cefiderocol (a recently approved antibiotic) against A. baumannii. Discussion In a broader context, our results show that unsupervised learning, NLP and topic modelling can be used to describe and analyse the research themes for important infectious diseases. This strategy should be very useful to analyse other ESKAPE pathogens or any other pathogens relevant to Public Health.
Affiliation(s)
- Carlos-Francisco Méndez-Cruz
- Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Joel Rodríguez-Herrera
- Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Alfredo Varela-Vega
- Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Valeria Mateo-Estrada
- Programa de Genómica Evolutiva, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Santiago Castillo-Ramírez
- Programa de Genómica Evolutiva, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
|
15
|
Ramamoorthy T, Kulothungan V, Mappillairaju B. Topic modeling and social network analysis approach to explore diabetes discourse on Twitter in India. Front Artif Intell 2024; 7:1329185. [PMID: 38410423 PMCID: PMC10895681 DOI: 10.3389/frai.2024.1329185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2023] [Accepted: 01/22/2024] [Indexed: 02/28/2024] Open
Abstract
Introduction The utilization of social media presents a promising avenue for the prevention and management of diabetes. To effectively cater to the diabetes-related knowledge, support, and intervention needs of the community, it is imperative to attain a deeper understanding of the extent and content of discussions pertaining to this health issue. This study aims to assess and compare various topic modeling techniques to determine the most effective model for identifying the core themes in diabetes-related tweets, the sources responsible for disseminating this information, the reach of these themes, and the influential individuals within the Twitter community in India. Methods Twitter messages from India, dated between 7 November 2022 and 28 February 2023, were collected using the Twitter API. The unsupervised machine learning topic models, namely, Latent Dirichlet Allocation (LDA), non-negative matrix factorization (NMF), BERTopic, and Top2Vec, were compared, and the best-performing model was used to identify common diabetes-related topics. Influential users were identified through social network analysis. Results The NMF model outperformed the LDA model, whereas BERTopic performed better than Top2Vec. Diabetes-related conversations revolved around eight topics, namely, promotion, management, drug and personal story, consequences, risk factors and research, raising awareness and providing support, diet, and opinion and lifestyle changes. The influential nodes identified were mainly health professionals and healthcare organizations. Discussion The study identified important topics of discussion along with health professionals and healthcare organizations involved in sharing diabetes-related information with the public. Collaborations among influential healthcare organizations, health professionals, and the government can foster awareness and prevent noncommunicable diseases.
Affiliation(s)
- Thilagavathi Ramamoorthy
- School of Public Health, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
| | - Vaitheeswaran Kulothungan
- ICMR-National Centre for Disease Informatics and Research, Bengaluru, India
- SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
| | - Bagavandas Mappillairaju
- Centre for Statistics, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
|
16
|
Cahill PT, Ng S, Turkstra LS, Ferro MA, Campbell WN. Exploring the valued outcomes of school-based speech-language therapy services: a sequential iterative design. FRONTIERS IN REHABILITATION SCIENCES 2024; 5:1290800. [PMID: 38313699 PMCID: PMC10834652 DOI: 10.3389/fresc.2024.1290800] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/08/2023] [Accepted: 01/10/2024] [Indexed: 02/06/2024]
Abstract
Background Achieving outcomes that community members value is essential to high-quality, family-centred care. These valued outcomes should inform the production and interpretation of research evidence. To date, outcomes included in studies of service delivery models for speech-language services in schools have been narrowly defined, and do not match the outcomes suggested as important by families, teachers, and children. The most important outcomes of school-based speech-language services have not been directly and systematically investigated. We aimed to address this gap by asking school community members what outcomes were most relevant to evaluating and improving the delivery of speech-language services in schools. Methods A sequential, iterative mixed-method study was conducted using interviews with 14 family members, educators, and speech-language therapists that asked what outcomes or impacts of school-based services they considered most important or valuable. Summative content analysis was used to analyse the data. Structural topic modelling between rounds of qualitative analysis was used to describe both the quality and the quantity of the interview content. School community members' perspectives were compared through estimation of topic proportions within interviews from each member group and through qualitative comparison. Results Structural topic modelling diagnostics and qualitative interpretation of topic output suggested a six-topic solution. This solution was estimated successfully and yielded the following topics: (1) meeting all needs appropriately, (2) teamwork and collaboration, (3) building capacities, (4) supporting individual student needs in context, (5) coordinating care, and finally (6) supporting core educational goals. Families focused on school-based services meeting all needs appropriately and coordinating care, while educators highlighted supporting individual student needs in context. By contrast, speech-language therapists emphasized building capacities and supporting core educational goals. All school community members agreed that current assessment tools and outcome measures were inadequate to capture the most important impacts of school-based services. Conclusions Outcomes identified by school community members as important or valuable were broad, and included individual student outcomes, interpersonal outcomes, and systems-level outcomes. Although these outcomes were discussed by all member groups, each group focused on different outcomes in the interviews, suggesting differences in the prioritization of outcomes. We recommend building consensus regarding the most important outcomes for school-based speech-language services, as well as the prioritization of outcomes for measure development.
Affiliation(s)
- Peter T. Cahill
- School of Rehabilitation Science, McMaster University, Hamilton, ON, Canada
| | - Stella Ng
- Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada
- Centre for Interprofessional Education, University of Toronto, Toronto, ON, Canada
| | - Lyn S. Turkstra
- School of Rehabilitation Science, McMaster University, Hamilton, ON, Canada
| | - Mark A. Ferro
- School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
- CanChild Centre for Childhood Disability Research, Hamilton, ON, Canada
| | - Wenonah N. Campbell
- School of Rehabilitation Science, McMaster University, Hamilton, ON, Canada
- CanChild Centre for Childhood Disability Research, Hamilton, ON, Canada
|
17
|
Bălăeț M, Kurtin DL, Gruia DC, Lerede A, Custovic D, Trender W, Jolly AE, Hellyer PJ, Hampshire A. Mapping the sociodemographic distribution and self-reported justifications for non-compliance with COVID-19 guidelines in the United Kingdom. Front Psychol 2023; 14:1183789. [PMID: 37539003 PMCID: PMC10395087 DOI: 10.3389/fpsyg.2023.1183789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2023] [Accepted: 06/28/2023] [Indexed: 08/05/2023] Open
Abstract
Which population factors have predisposed people to disregard government safety guidelines during the COVID-19 pandemic and what justifications do they give for this non-compliance? To address these questions, we analyse fixed-choice and free-text responses to survey questions about compliance and government handling of the pandemic, collected from tens of thousands of members of the UK public at three 6-monthly timepoints. We report that sceptical opinions about the government and mainstream-media narrative, especially as pertaining to justification for guidelines, significantly predict non-compliance. However, free text topic modelling shows that such opinions are diverse, spanning from scepticism about government competence and self-interest to full-blown conspiracy theories, and covary in prevalence with sociodemographic variables. These results indicate that attempts to counter non-compliance through argument should account for this diversity in peoples' underlying opinions, and inform conversations aimed at bridging the gap between the general public and bodies of authority accordingly.
Affiliation(s)
- Maria Bălăeț
- Department of Brain Sciences, Imperial College London, London, United Kingdom
| | - Danielle L. Kurtin
- Neuromodulation Lab, Department of Psychology, Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
| | - Dragos C. Gruia
- Department of Brain Sciences, Imperial College London, London, United Kingdom
| | - Annalaura Lerede
- Department of Brain Sciences, Imperial College London, London, United Kingdom
- UKRI Centre for Doctoral Training in AI for Healthcare, Department of Computing, Imperial College London, London, United Kingdom
| | - Darije Custovic
- Department of Brain Sciences, Imperial College London, London, United Kingdom
- UK Dementia Research Institute: Care Research & Technology, London, United Kingdom
| | - William Trender
- Department of Brain Sciences, Imperial College London, London, United Kingdom
- Engineering and Physical Sciences Research Council CDT Neurotechnology, Imperial College London, London, United Kingdom
| | - Amy E. Jolly
- NMR Unit, Queen Square Multiple Sclerosis Centre, UCL, Queen Square Institute of Neurology, Department of Neuroinflammation, Faculty of Brain Sciences, University College London, London, United Kingdom
| | - Peter J. Hellyer
- Department of Brain Sciences, Imperial College London, London, United Kingdom
- Centre for Neuroimaging Sciences, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
| | - Adam Hampshire
- Department of Brain Sciences, Imperial College London, London, United Kingdom
|
18
|
Ma C, Qirui C. Spatial-temporal evolution pattern and optimization path of family education policy: An LDA thematic model approach. Heliyon 2023; 9:e17460. [PMID: 37415949 PMCID: PMC10320304 DOI: 10.1016/j.heliyon.2023.e17460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2022] [Revised: 06/10/2023] [Accepted: 06/19/2023] [Indexed: 07/08/2023] Open
Abstract
Family education policy plays a crucial role in modernizing family education. By examining the temporal and spatial evolution of this policy, its inherent logic, constructs, and optimal pathways can be better understood. The study analyzed local family education policy documents, extracting six major themes using the Latent Dirichlet Allocation (LDA) model, and presented them according to the calculated mean theme probability. The themes include parental ability, school security, institutional environment, government support, social coordination, and high-quality development. Parental ability and government support were found to be particularly prominent, suggesting that many local policies focus on enhancing parents' skills for delivering family education and bolstering the government's role in public affairs. This combines the dual responsibilities of being an educational entity and accountable subject in the joint development of family education. Understanding the characteristics and variations in temporal and spatial distribution can enrich family education policy design, fostering the high-quality development of family education initiatives. Based on the findings, the study proposes three optimization paths for policy design: promotion and empowerment (building a multi-cooperative system), regional interconnection (understanding the current state of local policies and leveraging their strengths), and breaking barriers (simultaneously promoting the inclusiveness of family education and brand development). This study emphasizes the need to customize family education policy according to temporal and spatial features and local requirements for maximum effect.
|
19
|
Yuan Z, Hu W. Urban resilience to socioeconomic disruptions during the COVID-19 pandemic: Evidence from China. INTERNATIONAL JOURNAL OF DISASTER RISK REDUCTION : IJDRR 2023; 91:103670. [PMID: 37041883 PMCID: PMC10073087 DOI: 10.1016/j.ijdrr.2023.103670] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/19/2022] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 05/05/2023]
Abstract
The COVID-19 pandemic and the associated restrictions have raised the awareness of building pandemic-resilient cities. Prior studies often evaluated the resilience of one type of urban system while lacking a comparison across various urban subsystems. This study fills this gap by measuring and comparing the adaptive resilience to the pandemic of various urban subsystems in Chinese cities. We propose a novel outcome measurement of the pandemic's socioeconomic impacts on cities, i.e., the citizens' complaints data, and use its temporal changes to measure cities' adaptive resilience to the pandemic. We find a wide range of urban subsystems were severely shocked by the pandemic, including the urban economy, construction-and-housing sector, welfare system, and education system. Different urban subsystems exhibit divergent degrees of adaptive resilience to the pandemic. Using cluster analysis, we also identify three types of cities with different patterns of adaptive resilience: cities whose general economies were the least resilient, cities whose construction-and-housing system was the least resilient, and cities that were mostly affected by restriction measures. Our findings contribute to the understanding of the pandemic's socioeconomic costs and help identify the divergent resilience of different urban subsystems so as to develop targeted policy interventions to improve cities' resilience to the pandemic.
Affiliation(s)
- Zhihang Yuan
- Department of Public and International Affairs, City University of Hong Kong, Kowloon Tong, Hong Kong, China
| | - Wanyang Hu
- Department of Public and International Affairs, City University of Hong Kong, Kowloon Tong, Hong Kong, China
|
20
|
Moy AJ, Withall J, Hobensack M, Yeji Lee R, Levy DR, Rossetti SC, Rosenbloom ST, Johnson K, Cato K. Eliciting Insights From Chat Logs of the 25X5 Symposium to Reduce Documentation Burden: Novel Application of Topic Modeling. J Med Internet Res 2023; 25:e45645. [PMID: 37195741 PMCID: PMC10233429 DOI: 10.2196/45645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2023] [Revised: 03/03/2023] [Accepted: 03/30/2023] [Indexed: 04/03/2023] Open
Abstract
BACKGROUND Addressing clinician documentation burden through "targeted solutions" is a growing priority for many organizations ranging from government and academia to industry. Between January and February 2021, the 25 by 5: Symposium to Reduce Documentation Burden on US Clinicians by 75% (25X5 Symposium) convened across 2 weekly 2-hour sessions among experts and stakeholders to generate actionable goals for reducing clinician documentation over the next 5 years. Throughout this web-based symposium, we passively collected attendees' contributions to a chat functionality, with their knowledge that the content would be deidentified and made publicly available. This presented a novel opportunity to synthesize and understand participants' perceptions and interests from chat messages. We performed a content analysis of 25X5 Symposium chat logs to identify themes about reducing clinician documentation burden. OBJECTIVE The objective of this study was to explore unstructured chat log content from the web-based 25X5 Symposium to elicit latent insights on clinician documentation burden among clinicians, health care leaders, and other stakeholders using topic modeling. METHODS Across the 6 sessions, we captured 1787 messages among 167 unique chat participants cumulatively; 14 were private messages not included in the analysis. We implemented a latent Dirichlet allocation (LDA) topic model on the aggregated dataset to identify clinician documentation burden topics mentioned in the chat logs. Coherence scores and manual examination informed optimal model selection. Next, 5 domain experts independently and qualitatively assigned descriptive labels to model-identified topics and classified them into higher-level categories, which were finalized through a panel consensus. RESULTS We uncovered ten topics using the LDA model: (1) determining data and documentation needs (422/1773, 23.8%); (2) collectively reassessing documentation requirements in electronic health records (EHRs) (252/1773, 14.2%); (3) focusing documentation on patient narrative (162/1773, 9.1%); (4) documentation that adds value (147/1773, 8.3%); (5) regulatory impact on clinician burden (142/1773, 8%); (6) improved EHR user interface and design (128/1773, 7.2%); (7) addressing poor usability (122/1773, 6.9%); (8) sharing 25X5 Symposium resources (122/1773, 6.9%); (9) capturing data related to clinician practice (113/1773, 6.4%); and (10) the role of quality measures and technology in burnout (110/1773, 6.2%). Among these 10 topics, 5 high-level categories emerged: consensus building (821/1773, 46.3%), burden sources (365/1773, 20.6%), EHR design (250/1773, 14.1%), patient-centered care (162/1773, 9.1%), and symposium comments (122/1773, 6.9%). CONCLUSIONS We conducted a topic modeling analysis on 25X5 Symposium multiparticipant chat logs to explore the feasibility of this novel application and elicit additional insights on clinician documentation burden among attendees. Based on the results of our LDA analysis, consensus building, burden sources, EHR design, and patient-centered care may be important themes to consider when addressing clinician documentation burden. Our findings demonstrate the value of topic modeling in discovering topics associated with clinician documentation burden using unstructured textual content. Topic modeling may be a suitable approach to examine latent themes presented in web-based symposium chat logs.
Affiliation(s)
- Amanda J Moy
- Department of Biomedical Informatics, Columbia University, New York, NY, United States
| | - Jennifer Withall
- School of Nursing, Columbia University, New York, NY, United States
| | - Mollie Hobensack
- School of Nursing, Columbia University, New York, NY, United States
| | - Rachel Yeji Lee
- School of Nursing, Columbia University, New York, NY, United States
| | - Deborah R Levy
- School of Medicine, Yale University, New Haven, CT, United States
- Veteran's Affairs Connecticut Health Care System, Pain, Research, Informatics, Multi-morbidities Education Center, West Haven, CT, United States
| | - Sarah C Rossetti
- Department of Biomedical Informatics, Columbia University, New York, NY, United States
- School of Nursing, Columbia University, New York, NY, United States
| | - S Trent Rosenbloom
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, United States
| | - Kevin Johnson
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, United States
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, United States
| | - Kenrick Cato
- School of Nursing, Columbia University, New York, NY, United States
- Department of Emergency Medicine, Columbia University Irving Medical Center, New York, NY, United States
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, United States
|
21
|
Smail B, Aliane H, Abdeldjalil O. Using an explicit query and a topic model for scientific article recommendation. EDUCATION AND INFORMATION TECHNOLOGIES 2023:1-14. [PMID: 37361818 PMCID: PMC10149631 DOI: 10.1007/s10639-023-11817-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/14/2022] [Accepted: 04/13/2023] [Indexed: 06/28/2023]
Abstract
The search for relevant scientific articles is a crucial step in any research project. However, the vast number of articles published and available online in digital databases (Google Scholar, Semantic Scholar, etc.) can make this task tedious and negatively impact a researcher's productivity. This article proposes a new method of recommending scientific articles that takes advantage of content-based filtering. The challenge is to target relevant information that meets a researcher's needs, regardless of their research domain. Our recommendation method is based on semantic exploration using latent factors. Our goal is to achieve an optimal topic model that will serve as the basis for the recommendation process. Our experiments confirm our performance expectations, showing relevance and objectivity in the results.
Affiliation(s)
- Boussaadi Smail
- DTISI, Research Center on Scientific and Technical Information Cerist, Algiers, Algeria
| | - Hassina Aliane
- Director of Information Sciences R&D Laboratory, Head of Natural Language Processing and Digital Content Team, Cerist, Algiers, Algeria
| | - Ouahabi Abdeldjalil
- Polytech Tours, Imaging and Brain, University of Tours, INSERM U930, Tours, France
|
22
|
Laureate CDP, Buntine W, Linger H. A systematic review of the use of topic models for short text social media analysis. Artif Intell Rev 2023:1-33. [PMID: 37362887 PMCID: PMC10150353 DOI: 10.1007/s10462-023-10471-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/14/2023] [Indexed: 06/28/2023]
Abstract
Recently, research on short text topic models has addressed the challenges of social media datasets. These models are typically evaluated using automated measures. However, recent work suggests that these evaluation measures do not inform whether the topics produced can yield meaningful insights for those examining social media data. Efforts to address this issue, including gauging the alignment between automated and human evaluation tasks, are hampered by a lack of knowledge about how researchers use topic models. Further problems could arise if researchers do not construct topic models optimally or use them in a way that exceeds the models' limitations. These scenarios threaten the validity of topic model development and the insights produced by researchers employing topic modelling as a methodology. However, there is currently a lack of information about how and why topic models are used in applied research. As such, we performed a systematic literature review of 189 articles where topic modelling was used for social media analysis to understand how and why topic models are used for social media analysis. Our results suggest that the development of topic models is not aligned with the needs of those who use them for social media analysis. We have found that researchers use topic models sub-optimally. There is a lack of methodological support for researchers to build and interpret topics. We offer a set of recommendations for topic model researchers to address these problems and bridge the gap between development and applied research on short text topic models. Supplementary Information The online version contains supplementary material available at 10.1007/s10462-023-10471-x.
Affiliation(s)
| | - Wray Buntine
- College of Engineering and Computer Science, VinUniversity, Vinhomes Ocean Park, Gia Lam District, Hanoi 10000 Vietnam
| | - Henry Linger
- Faculty of IT, Monash University, Wellington Rd, Clayton, VIC 3800 Australia
|
23
|
Srinivasarao U, Sharaff A. SMS sentiment classification using an evolutionary optimization based fuzzy recurrent neural network. MULTIMEDIA TOOLS AND APPLICATIONS 2023:1-32. [PMID: 37362691 PMCID: PMC10107590 DOI: 10.1007/s11042-023-15206-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/17/2022] [Revised: 09/13/2022] [Accepted: 03/30/2023] [Indexed: 06/28/2023]
Abstract
Sentiment analysis based on inbox message polarity is a challenging task in text mining; this analysis is used to differentiate spam and ham messages in mail. Polarity estimation is mandatory for spam and ham identification, and developing an effective architecture for such classification is an active research topic. To that end, a fuzzy-based recurrent neural network with Harris Hawk optimization (FRNN-HHO) is introduced, which performs post-classification over the classified messages (spam and ham). Previously, the authors tried to classify the spam and ham messages from a collection of SMSs. However, spam messages may sometimes be incorrectly classified into the ham class, and this misclassification may reduce accuracy. The sentiment analysis process is performed over the classified messages to improve classification accuracy. The spam and ham messages in the available data are classified using a Kernel Extreme Learning Machine (KELM) classifier. The experimental evaluation of sentiment analysis and classification is carried out using accuracy, recall, F-measure, precision, RMSE, and MAE. The performance of the proposed architecture is evaluated using three different datasets: SMS, Email, and SpamAssassin. The area under the curve (AUC) of the proposed approach is found to be 0.9699 (SMS dataset), 0.958 (Email dataset), and 0.95 (SpamAssassin dataset).
Affiliation(s)
- Ulligaddala Srinivasarao
- Department of Computer Science and Engineering, National Institute of Technology Raipur, Chhattisgarh, 492010 India
| | - Aakanksha Sharaff
- Department of Computer Science and Engineering, National Institute of Technology Raipur, Chhattisgarh, 492010 India
|
24
|
Cooper LN, Radunsky AP, Hanna JJ, Most ZM, Perl TM, Lehmann CU, Medford RJ. Analyzing an Emerging Pandemic on Twitter: Monkeypox. Open Forum Infect Dis 2023; 10:ofad142. [PMID: 37035497 PMCID: PMC10077829 DOI: 10.1093/ofid/ofad142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2022] [Accepted: 03/15/2023] [Indexed: 03/19/2023] Open
Abstract
Background Social media platforms like Twitter provide important insights into the public's perceptions of global outbreaks like monkeypox. By analyzing tweets, we aimed to identify public knowledge and opinions on the monkeypox virus and related public health issues. Methods We analyzed English-language tweets using the keyword "monkeypox" from 1 May to 23 July 2022. We reported gender, ethnicity, and race of Twitter users and analyzed tweets to identify predominant sentiment and emotions. We performed topic modeling and compared cohorts of users who self-identify as LGBTQ+ (an abbreviation for lesbian, gay, bisexual, transgender, queer, and/or questioning) allies versus users who do not, and cohorts identified as "bots" versus humans. Results A total of 48 330 tweets were written by LGBTQ+ self-identified advocates or allies. The mean sentiment score for all tweets was -0.413 on a -4 to +4 scale. Negative tweets comprised 39% of tweets. The most common emotions expressed were fear and sadness. Topic modeling identified unique topics among the 4 cohorts analyzed. Conclusions The spread of mis- and disinformation about monkeypox was common in our tweet library. Various conspiracy theories about the origins of monkeypox, its relationship to global economic concerns, and homophobic and racial comments were common. Conversely, many other tweets helped to provide information about monkeypox vaccines, disease symptoms, and prevention methods. Discussion of rising monkeypox case numbers globally was also a large aspect of the conversation. Conclusions We demonstrated that Twitter is an effective means of tracking sentiment about public healthcare issues. We gained insight into a subset of people, self-identified LGBTQ+ allies, who were more affected by monkeypox.
Affiliation(s)
- Lauren N Cooper
- Clinical Informatics Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Alexander P Radunsky
- Department of Internal Medicine, Division of Infectious Diseases and Geographic Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - John J Hanna
- Clinical Informatics Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Internal Medicine, Division of Infectious Diseases and Geographic Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Zachary M Most
- Department of Pediatrics, Division of Pediatric Infectious Disease, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Trish M Perl
- Department of Internal Medicine, Division of Infectious Diseases and Geographic Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Christoph U Lehmann
- Clinical Informatics Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Richard J Medford
- Clinical Informatics Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Internal Medicine, Division of Infectious Diseases and Geographic Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| |
|
25
|
Golos AM, Guntuku SC, Piltch-Loeb R, Leininger LJ, Simanek AM, Kumar A, Albrecht SS, Dowd JB, Jones M, Buttenheim AM. Dear Pandemic: A topic modeling analysis of COVID-19 information needs among readers of an online science communication campaign. PLoS One 2023; 18:e0281773. [PMID: 36996093 PMCID: PMC10062627 DOI: 10.1371/journal.pone.0281773] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/18/2022] [Accepted: 02/01/2023] [Indexed: 03/31/2023] Open
Abstract
BACKGROUND The COVID-19 pandemic was accompanied by an "infodemic": an overwhelming excess of accurate, inaccurate, and uncertain information. The social media-based science communication campaign Dear Pandemic was established to address the COVID-19 infodemic, in part by soliciting submissions from readers to an online question box. Our study characterized the information needs of Dear Pandemic's readers by identifying themes and longitudinal trends among question box submissions. METHODS We conducted a retrospective analysis of questions submitted from August 24, 2020, to August 24, 2021. We used Latent Dirichlet Allocation topic modeling to identify 25 topics among the submissions, then used thematic analysis to interpret the topics based on their top words and submissions. We used t-Distributed Stochastic Neighbor Embedding to visualize the relationship between topics, and we used generalized additive models to describe trends in topic prevalence over time. RESULTS We analyzed 3839 submissions, 90% from United States-based readers. We classified the 25 topics into 6 overarching themes: 'Scientific and Medical Basis of COVID-19,' 'COVID-19 Vaccine,' 'COVID-19 Mitigation Strategies,' 'Society and Institutions,' 'Family and Personal Relationships,' and 'Navigating the COVID-19 Infodemic.' Trends in topics about viral variants, vaccination, COVID-19 mitigation strategies, and children aligned with the news cycle and reflected the anticipation of future events. Over time, vaccine-related submissions became increasingly related to those surrounding social interaction. CONCLUSIONS Question box submissions represented distinct themes that varied in prominence over time. Dear Pandemic's readers sought information that would not only clarify novel scientific concepts, but would also be timely and practical to their personal lives.
Our question box format and topic modeling approach offer science communicators a robust methodology for tracking, understanding, and responding to the information needs of online audiences.
Affiliation(s)
- Aleksandra M. Golos
- Department of Family and Community Health, School of Nursing, University of Pennsylvania, Philadelphia, PA, United States of America
| | - Sharath Chandra Guntuku
- Department of Computer and Information Science, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, United States of America
- Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, United States of America
| | - Rachael Piltch-Loeb
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, United States of America
- Emergency Preparedness Research Evaluation and Practice Program, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, United States of America
| | - Lindsey J. Leininger
- Tuck School of Business, Dartmouth College, Hanover, NH, United States of America
| | - Amanda M. Simanek
- Joseph J. Zilber School of Public Health, University of Wisconsin-Milwaukee, Milwaukee, WI, United States of America
| | - Aparna Kumar
- College of Nursing, Thomas Jefferson University, Philadelphia, PA, United States of America
| | - Sandra S. Albrecht
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, United States of America
| | - Jennifer Beam Dowd
- Leverhulme Centre for Demographic Science, University of Oxford, Oxford, United Kingdom
- Department of Sociology, University of Oxford, Oxford, United Kingdom
- Nuffield College, University of Oxford, Oxford, United Kingdom
| | - Malia Jones
- Applied Population Laboratory, Department of Community and Environmental Sociology, College of Agricultural and Life Sciences, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Alison M. Buttenheim
- Department of Family and Community Health, School of Nursing, University of Pennsylvania, Philadelphia, PA, United States of America
- Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, United States of America
- Center for Health Incentives and Behavioral Economics, University of Pennsylvania, Philadelphia, PA, United States of America
| |
|
26
|
Gerber S, O’Hearn M, Cruz SM, Reedy J, Mozaffarian D. Changes in Food Security, Healthfulness, and Access During the Coronavirus Disease 2019 Pandemic: Results From a National United States Survey. Curr Dev Nutr 2023; 7:100060. [PMID: 36937244 PMCID: PMC9968449 DOI: 10.1016/j.cdnut.2023.100060] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/15/2022] [Revised: 02/15/2023] [Accepted: 02/18/2023] [Indexed: 02/27/2023] Open
Abstract
Background Coronavirus disease 2019 (COVID-19) disrupted access to food and adequate nutrition and the types of foods consumed. However, few empirical data exist on the changes in Americans' food and nutrition habits 2 y into the pandemic. Objectives To assess current and altered food choices ∼2 y into the COVID-19 pandemic in the months after historic public pandemic relief. Methods A national sample of 1878 United States adults balanced by age, sex, race/ethnicity, and income completed a one-time, online, semi-quantitative, 44-item questionnaire in Fall 2021 asking about demographics, COVID-19 food choice changes (including free-text), and consumer priorities. This analysis investigates COVID-19 impacts on food security, healthfulness, and access. Results More than 35% of respondents reported improved food security and >45% reported improved food healthfulness compared with prepandemic status. Improvement was reported in more than 30% of Black/African-American and Hispanic/Latinx adults, adults with lower annual income, and female sex, despite over 75% reporting reduced choice of where to eat or buy food. The pandemic offered occasion for many to improve diet, but a similar number expressed that the pandemic destabilized healthy habits. Conclusions Our novel findings suggest that by late 2021, most Americans had improved food security and food choice healthfulness, despite reduced access to food service and retail, although with worsening among a meaningful proportion of Americans as well as heterogeneity in these changes. Vigorous federal, state, city, and community responses to the pandemic may have played a role in improving food security and food choice healthfulness during the COVID-19 pandemic. Health crises differently impact health behaviors, but when accompanied by vigorous civic and community response, food security and food healthfulness can be fortified.
Affiliation(s)
- Suzannah Gerber
- Gerald J. and Dorothy R. Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, United States
- Betty and Guy Beatty Liver and Obesity Research Center, Inova Medical System, Fairfax, VA, United States
| | - Meghan O’Hearn
- Gerald J. and Dorothy R. Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, United States
| | - Sylara Marie Cruz
- Gerald J. and Dorothy R. Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, United States
| | - Julia Reedy
- Gerald J. and Dorothy R. Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, United States
| | - Dariush Mozaffarian
- Gerald J. and Dorothy R. Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, United States
- Tufts Medical Center, Tufts University School of Medicine, Boston, MA, United States
| |
|
27
|
Borghouts J, Huang Y, Gibbs S, Hopfer S, Li C, Mark G. Understanding underlying moral values and language use of COVID-19 vaccine attitudes on twitter. PNAS NEXUS 2023; 2:pgad013. [PMID: 36896130 PMCID: PMC9991494 DOI: 10.1093/pnasnexus/pgad013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 12/23/2022] [Accepted: 01/09/2023] [Indexed: 03/11/2023]
Abstract
Public sentiment toward the COVID-19 vaccine as expressed on social media can interfere with communication by public health agencies on the importance of getting vaccinated. We investigated Twitter data to understand differences in sentiment, moral values, and language use between political ideologies on the COVID-19 vaccine. We estimated political ideology, conducted a sentiment analysis, and guided by the tenets of moral foundations theory (MFT), we analyzed 262,267 English language tweets from the United States containing COVID-19 vaccine-related keywords between May 2020 and October 2021. We applied the Moral Foundations Dictionary and used topic modeling and Word2Vec to understand moral values and the context of words central to the discussion of the vaccine debate. A quadratic trend showed that extreme ideologies of both Liberals and Conservatives expressed a higher negative sentiment than Moderates, with Conservatives expressing more negative sentiment than Liberals. Compared to Conservative tweets, we found the expression of Liberal tweets to be rooted in a wider set of moral values, associated with moral foundations of care (getting the vaccine for protection), fairness (having access to the vaccine), liberty (related to the vaccine mandate), and authority (trusting the vaccine mandate imposed by the government). Conservative tweets were found to be associated with harm (around safety of the vaccine) and oppression (around the government mandate). Furthermore, political ideology was associated with the expression of different meanings for the same words, e.g. "science" and "death." Our results inform public health outreach communication strategies to best tailor vaccine information to different groups.
Affiliation(s)
- Judith Borghouts
- Department of Medicine, University of California Irvine, Irvine, CA 92617, USA
| | - Yicong Huang
- Department of Computer Science, University of California Irvine, Irvine, CA 92697, USA
| | - Sydney Gibbs
- Department of Computer Science, University of California Irvine, Irvine, CA 92697, USA
| | - Suellen Hopfer
- Department of Health, Society & Behavior in the Program in Public Health, University of California Irvine, Irvine, CA 92697, USA
| | - Chen Li
- Department of Computer Science, University of California Irvine, Irvine, CA 92697, USA
| | - Gloria Mark
- Department of Computer Science, University of California Irvine, Irvine, CA 92697, USA
| |
|
28
|
Kinariwala S, Deshmukh S. Short text topic modelling using local and global word-context semantic correlation. MULTIMEDIA TOOLS AND APPLICATIONS 2023; 82:1-23. [PMID: 36747894 PMCID: PMC9891888 DOI: 10.1007/s11042-023-14352-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Revised: 02/21/2022] [Accepted: 01/02/2023] [Indexed: 06/18/2023]
Abstract
Nowadays, people use short text to express their opinions on social media platforms such as Twitter, Facebook, and YouTube, as well as on e-commerce websites such as Amazon and Flipkart to share their purchasing experiences. Every day, billions of short texts are created worldwide as tweets, tags, keywords, search queries, etc. However, short text carries inadequate contextual information and can be ambiguous, sparse, and noisy, which remains a major challenge. State-of-the-art topic modeling strategies such as Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis are not suitable because a single short document contains only a limited number of words. This work proposes a new model named G_SeaNMF (Gensim_SeaNMF) to improve the word-context semantic relationship by using local and global word embedding techniques. Word embeddings learned from a large corpus provide general semantic and syntactic information about words; they can guide topic modeling for short text collections as supporting information for sparse co-occurrence patterns. In the proposed model, SeaNMF (Semantics-assisted Non-negative Matrix Factorization) is combined with the word2vec model of the Gensim library to strengthen the semantic relationships between words. In this article, short text topic modeling techniques based on DMM (Dirichlet Multinomial Mixture), self-aggregation, and global word co-occurrence were explored. These are evaluated using different measures to gauge cluster coherence on real-world datasets such as Search Snippet, Biomedicine, Pascal Flickr, Tweet, and TagMyNews. Empirical evaluation shows that a combination of local and global word embedding provides more appropriate words under each topic with improved outcomes.
Affiliation(s)
| | - Sachin Deshmukh
- Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, Maharashtra India
| |
|
29
|
Chen Z, Zeng G, Zhong S, Wang L. From the exotic to the everyday: The Avocado crossing borders via cyberspace. Appetite 2023; 180:106362. [PMID: 36368563 DOI: 10.1016/j.appet.2022.106362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2022] [Revised: 10/28/2022] [Accepted: 10/29/2022] [Indexed: 11/09/2022]
Abstract
With the globalization of food sales and consumption, exotic foods are now regularly crossing geographical and cultural borders and moving into local areas. This process is attracting ever-increasing attention from academics. Taking avocado consumption presented on Sina Weibo as an example, this research analyzes avocado-related user-generated content on Sina Weibo over three years (2013, 2015, and 2017) and employs topic modeling and semantic network methods to identify the mechanism by which exotic foods cross borders to appear in local consumers' daily food choices. Two specific links are explored: online information dissemination and offline daily consumption. The results indicate that a selective geographical narrative and framework for avocado information influence local consumers' choice of exotic foods according to three aspects: edibility, accessibility, and acceptability. For local consumers, the avocado is now connected with local objects and spaces, gradually transforming from a novelty to a functional daily food and from low- to high-frequency consumption, escaping the marginal and penetrating into the core cultural context and completing the process of embedment into the everyday. This study refutes the assertion that "globalized diets bring about homogenized diets," explores the mechanism of influence by which information dissemination in cyberspace affects cultural borders, complements the study of food consumption in Southern countries, and provides new thoughts on the theoretical and practical exploration of food globalization from the perspective of food geography.
Affiliation(s)
- Zheng Chen
- School of Tourism Management, Sun Yat-Sen University, China.
| | - Guojun Zeng
- School of Tourism Management, Sun Yat-Sen University, China.
| | - Shuru Zhong
- School of Tourism Management, Sun Yat-Sen University, China.
| | - Longjie Wang
- School of Management, Zhejiang University, China.
| |
|
30
|
Dehghani M, Ebrahimi F. ParsBERT topic modeling of Persian scientific articles about COVID-19. INFORMATICS IN MEDICINE UNLOCKED 2022; 36:101144. [PMID: 36573134 PMCID: PMC9771580 DOI: 10.1016/j.imu.2022.101144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2022] [Revised: 12/03/2022] [Accepted: 12/04/2022] [Indexed: 12/24/2022] Open
Abstract
Purpose The COVID-19 pandemic has indisputably impacted every aspect of human life, and a host of studies have investigated its different aspects. This paper models the contents of Persian literature on COVID-19. Method This is a descriptive-exploratory study in which 815 articles were collected from the Magiran database. The articles were published before March 2022. The abstracts and titles were used in the modeling. The modeling was performed by combining the latent Dirichlet allocation (LDA) algorithm with ParsBERT. Findings Topic modeling indicated ten major topics: medicine, psychology, humanities, politics, management, biology, economics, culture, engineering, and religion. The articles under the category of medicine formed the largest cluster (42.3%), while engineering and religion formed the smallest clusters (1.1% each). Conclusion The topics found in the created clusters have structural relationships. The COVID-19 effect on physical and mental health (medical and psychological topics) is the most crucial factor. These clusters provide evidence that COVID-19 affects all facets of human society at three levels: the individual, family, and society. Aside from the ten critical clusters in the humanities field, the utmost disorder is related to teaching and learning. For the first time, this research has presented a model of scientific communication in the field of COVID-19 based on the data collected from a Persian database - Magiran.
Affiliation(s)
- Mohammad Dehghani
- School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
| | - Fezzeh Ebrahimi
- Department of Knowledge and Information Science, University of Isfahan, Isfahan, Iran
| |
|
31
|
Thompson HM, Sharma B, Smith DL, Bhalla S, Erondu I, Hazra A, Ilyas Y, Pachwicewicz P, Sheth NK, Chhabra N, Karnik NS, Afshar M. Machine Learning Techniques to Explore Clinical Presentations of COVID-19 Severity and to Test the Association With Unhealthy Opioid Use: Retrospective Cross-sectional Cohort Study. JMIR Public Health Surveill 2022; 8:e38158. [PMID: 36265163 PMCID: PMC9746674 DOI: 10.2196/38158] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/21/2022] [Revised: 05/23/2022] [Accepted: 10/18/2022] [Indexed: 11/07/2022] Open
Abstract
BACKGROUND The COVID-19 pandemic has exacerbated health inequities in the United States. People with unhealthy opioid use (UOU) may face disproportionate challenges with COVID-19 precautions, and the pandemic has disrupted access to opioids and UOU treatments. UOU impairs the immunological, cardiovascular, pulmonary, renal, and neurological systems and may increase severity of outcomes for COVID-19. OBJECTIVE We applied machine learning techniques to explore clinical presentations of hospitalized patients with UOU and COVID-19 and to test the association between UOU and COVID-19 disease severity. METHODS This retrospective, cross-sectional cohort study was conducted based on data from 4110 electronic health record patient encounters at an academic health center in Chicago between January 1, 2020, and December 31, 2020. The inclusion criterion was an unplanned admission of a patient aged ≥18 years; encounters were counted as COVID-19-positive if there was a positive test for COVID-19 or 2 COVID-19 International Classification of Disease, Tenth Revision codes. Using a predefined cutoff with optimal sensitivity and specificity to identify UOU, we ran a machine learning UOU classifier on the data for patients with COVID-19 to estimate the subcohort of patients with UOU. Topic modeling was used to explore and compare the clinical presentations documented for 2 subgroups: encounters with UOU and COVID-19 and those with no UOU and COVID-19. Mixed effects logistic regression accounted for multiple encounters for some patients and tested the association between UOU and COVID-19 outcome severity. Severity was measured with 3 utilization metrics: low-severity unplanned admission, medium-severity unplanned admission and receiving mechanical ventilation, and high-severity unplanned admission with in-hospital death. All models controlled for age, sex, race/ethnicity, insurance status, and BMI. 
RESULTS Topic modeling yielded 10 topics per subgroup and highlighted unique comorbidities associated with UOU and COVID-19 (eg, HIV) and no UOU and COVID-19 (eg, diabetes). In the regression analysis, each incremental increase in the classifier's predicted probability of UOU was associated with 1.16 higher odds of COVID-19 outcome severity (odds ratio 1.16, 95% CI 1.04-1.29; P=.009). CONCLUSIONS Among patients hospitalized with COVID-19, UOU is an independent risk factor associated with greater outcome severity, including in-hospital death. Social determinants of health and opioid-related overdose are unique comorbidities in the clinical presentation of the UOU patient subgroup. Additional research is needed on the role of COVID-19 therapeutics and inpatient management of acute COVID-19 pneumonia for patients with UOU. Further research is needed to test associations between expanded evidence-based harm reduction strategies for UOU and vaccination rates, hospitalizations, and risks for overdose and death among people with UOU and COVID-19. Machine learning techniques may offer more exhaustive means for cohort discovery and a novel mixed methods approach to population health.
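As a reading aid for the regression result above, the following minimal sketch converts the reported odds ratio and 95% CI to the log-odds scale and shows how the odds scale over several increments. It assumes a Wald-type interval that is symmetric on the log scale; the abstract does not state the size of one "incremental increase", and the function names are hypothetical.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Return (coefficient, approximate standard error) on the log-odds scale.

    Assumes the 95% CI is a Wald interval, i.e. exp(beta +/- z_crit * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se


def odds_multiplier(odds_ratio, increments):
    """Multiplicative change in odds after a number of equal increments."""
    return odds_ratio ** increments


# Values reported in the abstract: odds ratio 1.16, 95% CI 1.04-1.29.
beta, se = log_odds_summary(1.16, 1.04, 1.29)
print(round(beta, 3), round(se, 3))        # 0.148 0.055
print(round(odds_multiplier(1.16, 3), 3))  # 1.561
```

Under these assumptions, three increments in the classifier's predicted probability would correspond to roughly 1.56 times the odds of greater outcome severity.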
Affiliation(s)
- Hale M Thompson
- Section of Community Behavioral Health, Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
- Center for Education, Research, and Advocacy, Department of Social and Behavioral Research, Howard Brown Health, Chicago, IL, United States
| | - Brihat Sharma
- Section of Community Behavioral Health, Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
| | - Dale L Smith
- Section of Community Behavioral Health, Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
| | - Sameer Bhalla
- Department of Internal Medicine, Rush University Medical Center, Chicago, IL, United States
| | - Ihuoma Erondu
- Section of Community Behavioral Health, Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
| | - Aniruddha Hazra
- Section of Infectious Diseases and Global Health, Department of Medicine, University of Chicago, Chicago, IL, United States
| | - Yousaf Ilyas
- Section of Community Behavioral Health, Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
| | - Paul Pachwicewicz
- Section of Community Behavioral Health, Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
| | - Neeral K Sheth
- Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
| | - Neeraj Chhabra
- Department of Emergency Medicine, Rush University Medical College, Rush University Medical Center, Chicago, IL, United States
| | - Niranjan S Karnik
- Section of Community Behavioral Health, Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
| | - Majid Afshar
- Division of Pulmonary and Critical Care, Department of Medicine, School of Medicine and Public Health, University of Wisconsin, Madison, WI, United States
| |
|
32
|
Ljajić A, Prodanović N, Medvecki D, Bašaragin B, Mitrović J. Uncovering the Reasons Behind COVID-19 Vaccine Hesitancy in Serbia: Sentiment-Based Topic Modeling. J Med Internet Res 2022; 24:e42261. [PMID: 36301673 PMCID: PMC9671489 DOI: 10.2196/42261] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/29/2022] [Revised: 09/29/2022] [Accepted: 09/29/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Since the first COVID-19 vaccine appeared, there has been a growing tendency to automatically determine public attitudes toward it. In particular, it was important to find the reasons for vaccine hesitancy, since it was directly correlated with pandemic protraction. Natural language processing (NLP) and public health researchers have turned to social media (eg, Twitter, Reddit, and Facebook) for user-created content from which they can gauge public opinion on vaccination. To automatically process such content, they use a number of NLP techniques, most notably topic modeling. Topic modeling enables the automatic uncovering and grouping of hidden topics in the text. When applied to content that expresses a negative sentiment toward vaccination, it can give direct insight into the reasons for vaccine hesitancy. OBJECTIVE This study applies NLP methods to classify vaccination-related tweets by sentiment polarity and uncover the reasons for vaccine hesitancy among the negative tweets in the Serbian language. METHODS To study the attitudes and beliefs behind vaccine hesitancy, we collected 2 batches of tweets that mention some aspects of COVID-19 vaccination. The first batch of 8817 tweets was manually annotated as either relevant or irrelevant regarding the COVID-19 vaccination sentiment, and then the relevant tweets were annotated as positive, negative, or neutral. We used the annotated tweets to train a sequential bidirectional encoder representations from transformers (BERT)-based classifier for 2 tweet classification tasks to augment this initial data set. The first classifier distinguished between relevant and irrelevant tweets. The second classifier used the relevant tweets and classified them as negative, positive, or neutral. This sequential classifier was used to annotate the second batch of tweets. 
The combined data sets resulted in 3286 tweets with a negative sentiment: 1770 (53.9%) from the manually annotated data set and 1516 (46.1%) as a result of automatic classification. Topic modeling methods (latent Dirichlet allocation [LDA] and nonnegative matrix factorization [NMF]) were applied using the 3286 preprocessed tweets to detect the reasons for vaccine hesitancy. RESULTS The relevance classifier achieved an F-score of 0.91 and 0.96 for relevant and irrelevant tweets, respectively. The sentiment polarity classifier achieved an F-score of 0.87, 0.85, and 0.85 for negative, neutral, and positive sentiments, respectively. By summarizing the topics obtained in both models, we extracted 5 main groups of reasons for vaccine hesitancy: concern over vaccine side effects, concern over vaccine effectiveness, concern over insufficiently tested vaccines, mistrust of authorities, and conspiracy theories. CONCLUSIONS This paper presents a combination of NLP methods applied to find the reasons for vaccine hesitancy in Serbia. Given these reasons, it is now possible to better understand the concerns of people regarding the vaccination process.
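The sequential classification described in the Methods above (relevance first, then sentiment on the relevant tweets only) can be sketched as a two-stage cascade. This is an illustrative sketch, not the authors' implementation: the two toy keyword rules stand in for the BERT-based classifiers, and all function names and sample tweets are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def classify_sequentially(
    tweets: Iterable[str],
    is_relevant: Callable[[str], bool],
    sentiment_of: Callable[[str], str],
) -> List[Tuple[str, Optional[str]]]:
    """Stage 1 filters by relevance; stage 2 labels only the relevant tweets."""
    results = []
    for tweet in tweets:
        if not is_relevant(tweet):
            results.append((tweet, None))  # irrelevant: no sentiment label
            continue
        results.append((tweet, sentiment_of(tweet)))
    return results


# Toy stand-ins for the two trained classifiers (assumptions for illustration).
def toy_is_relevant(tweet: str) -> bool:
    return "vaccine" in tweet.lower()


def toy_sentiment(tweet: str) -> str:
    text = tweet.lower()
    if "unsafe" in text:
        return "negative"
    if "grateful" in text:
        return "positive"
    return "neutral"


sample = [
    "The vaccine is unsafe",
    "Grateful for my vaccine today",
    "Vaccine appointment at noon",
    "Nice weather in Novi Sad",
]
labelled = classify_sequentially(sample, toy_is_relevant, toy_sentiment)

# Only negative tweets would be passed on to topic modeling (LDA or NMF).
negative_only = [tweet for tweet, label in labelled if label == "negative"]
print(negative_only)
```

Keeping the two stages separate means the sentiment classifier is never asked to label tweets that are unrelated to vaccination, which matches the order of the annotation steps in the abstract.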
Affiliation(s)
- Adela Ljajić
- The Institute for Artificial Intelligence Research and Development of Serbia, Novi Sad, Serbia
| | - Nikola Prodanović
- The Institute for Artificial Intelligence Research and Development of Serbia, Novi Sad, Serbia
| | - Darija Medvecki
- The Institute for Artificial Intelligence Research and Development of Serbia, Novi Sad, Serbia
| | - Bojana Bašaragin
- The Institute for Artificial Intelligence Research and Development of Serbia, Novi Sad, Serbia
| | - Jelena Mitrović
- The Institute for Artificial Intelligence Research and Development of Serbia, Novi Sad, Serbia
- Faculty of Computer Science and Mathematics, University of Passau, Passau, Germany
| |
|
33
|
A Novel Framework to Detect Irrelevant Software Requirements Based on MultiPhiLDA as the Topic Model. INFORMATICS 2022. [DOI: 10.3390/informatics9040087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022] Open
Abstract
Noise in requirements has been known to be a defect in software requirements specifications (SRS). Detecting defects at an early stage is crucial in the process of software development. Noise can be in the form of irrelevant requirements that are included within an SRS. A previous study had attempted to detect noise in SRS, in which noise was considered as an outlier. However, the resulting method only demonstrated a moderate reliability due to the overshadowing of unique actor words by unique action words in the topic–word distribution. In this study, we propose a framework to identify irrelevant requirements based on the MultiPhiLDA method. The proposed framework distinguishes the topic–word distribution of actor words and action words as two separate topic–word distributions with two multinomial probability functions. Weights are used to maintain a proportional contribution of actor and action words. We also explore the use of two outlier detection methods, namely percentile-based outlier detection (PBOD) and angle-based outlier detection (ABOD), to distinguish irrelevant requirements from relevant requirements. The experimental results show that the proposed framework was able to exhibit better performance than previous methods. Furthermore, the use of the combination of ABOD as the outlier detection method and topic coherence as the estimation approach to determine the optimal number of topics and iterations in the proposed framework outperformed the other combinations and obtained sensitivity, specificity, F1-score, and G-mean values of 0.59, 0.65, 0.62, and 0.62, respectively.
|
34
|
Murshed BAH, Mallappa S, Abawajy J, Saif MAN, Al-ariki HDE, Abdulwahab HM. Short text topic modelling approaches in the context of big data: taxonomy, survey, and analysis. Artif Intell Rev 2022; 56:5133-5260. [PMID: 36320612 PMCID: PMC9607740 DOI: 10.1007/s10462-022-10254-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 08/17/2022] [Indexed: 12/01/2022]
Abstract
Social media platforms such as Twitter, Facebook, and Weibo are being increasingly embraced by individuals, groups, and organizations as a valuable source of information. This social media generated information comes in the form of tweets or posts and is normally characterized as short, sparse, and low-density text in huge volumes. Since many real-world applications need semantic interpretation of such short texts, research in Short Text Topic Modeling (STTM) has recently gained a lot of interest to reveal unique and cohesive latent topics. This article examines the current state of the art in STTM algorithms. It presents a comprehensive survey and taxonomy of STTM algorithms. The article also includes a qualitative and quantitative study of the STTM algorithms, as well as analyses of the various strengths and drawbacks of STTM techniques. Moreover, a comparative analysis of the topic quality and performance of representative STTM models is presented. The performance evaluation is conducted on two real-world Twitter datasets, the Real-World Pandemic Twitter (RW-Pand-Twitter) dataset and the Real-world Cyberbullying Twitter (RW-CB-Twitter) dataset, in terms of several metrics such as topic coherence, purity, NMI, and accuracy. Finally, the open challenges and future research directions in this promising field are discussed to highlight the trends of research in STTM. The work presented in this paper is useful for researchers interested in learning state-of-the-art short text topic modelling and researchers focusing on developing new algorithms for short text topic modelling.
Affiliation(s)
- Belal Abdullah Hezam Murshed
- Department of Studies in Computer Science, Mysore University, Mysore, 570006 Karnataka India
- Department of Computer Science, College of Engineering and IT, Amran University, Amran, Yemen
| | - Suresha Mallappa
- Department of Studies in Computer Science, Mysore University, Mysore, 570006 Karnataka India
| | - Jemal Abawajy
- School of Information Technology, Faculty of Science, Engineering and Built Environment, Deakin University, Geelong, VIC 3220 Australia
| | - Mufeed Ahmed Naji Saif
- Department of Computer Applications, Sri Jayachamarajendra College of Engineering, VTU, Mysore, Karnataka India
| | - Hasib Daowd Esmail Al-ariki
- Department of Computer Networks and Distributed Systems, Al Saeed Faculty for Engineering and IT, Taiz University, Taiz, Yemen
- Department of Computer Networks Engineering and Technologies, Sana’a Community College, Sana’a, Yemen
|
35
|
Sankaranarayanan R, Leung J, Abramenka-Lachheb V, Seo G, Lachheb A. Microlearning in Diverse Contexts: A Bibliometric Analysis. TECHTRENDS : FOR LEADERS IN EDUCATION & TRAINING 2022; 67:260-276. [PMID: 36254216 PMCID: PMC9557991 DOI: 10.1007/s11528-022-00794-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/24/2022] [Indexed: 06/16/2023]
Abstract
In recent years, publications on microlearning have substantially increased, as this topic has received extensive attention from scholars in the instructional design and technology discipline. To better characterize and understand microlearning, there is a need for comprehensive bibliometrics assessments of the literature on microlearning. To this end, this bibliometric study collected 208 relevant publications on microlearning from the Scopus database, published in diverse contexts. Using quantitative topic modeling and qualitative content analysis methods, we identified four major themes in these publications, namely: (1) design of microlearning; (2) implementation of microlearning as an instructional method strategy and an intervention; (3) evaluation of microlearning; and (4) the utilization of mobile devices for microlearning. Based on the study findings, we discuss the significance of the study and provide implications for research and practice, particularly in fostering rigorous inquiry on the topic of microlearning, expanding the context of research to include K-12 settings, and focusing on mobile-based microlearning.
Affiliation(s)
| | | | | | - Grace Seo
- Seattle Pacific University, Seattle, WA USA
|
36
|
Lokanan M. The determinants of investment fraud: A machine learning and artificial intelligence approach. Front Big Data 2022; 5:961039. [PMID: 36299659 PMCID: PMC9589362 DOI: 10.3389/fdata.2022.961039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Accepted: 08/31/2022] [Indexed: 11/05/2022] Open
Abstract
Investment fraud continues to be a severe problem in the Canadian securities industry. This paper aims to employ machine learning algorithms and artificial neural networks (ANN) to predict investment fraud in Canada. Data for this study comes from cases heard by the Investment Industry Regulatory Organization of Canada (IIROC) between June 2008 and December 2019. In total, 406 cases were collected and coded for further analysis. After data cleaning and pre-processing, a total of 385 cases were coded for further analysis. The machine learning algorithms and artificial neural networks were able to predict investment fraud with very good results. In terms of standardized coefficient, the top five features in predicting fraud are offender experience, retired investors, the amount of money lost, the amount of money invested, and the investors' net worth. Machine learning and artificial intelligence have a pivotal role in regulation because they can identify the risks associated with fraud by learning from the data they ingest to survey past practices and come up with the best possible responses to predict fraud. If used correctly, machine learning in the form of regulatory technology can equip regulators with the tools to take corrective actions and make compliance more efficient to safeguard the markets and protect investors from unethical investment advisors.
|
37
|
Curley C, Siapera E, Carthy J. Covid-19 Protesters and the Far Right on Telegram: Co-Conspirators or Accidental Bedfellows? SOCIAL MEDIA + SOCIETY 2022; 8:20563051221129187. [PMID: 36317081 PMCID: PMC9597280 DOI: 10.1177/20563051221129187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
The COVID-19 pandemic led to the creation of a new protest movement, positioned against government lockdowns, mandatory vaccines, and related measures. Efforts to control misinformation by digital platforms resulted in takedowns of key accounts and posts. This led some of these protest groups to migrate to platforms with less stringent content moderation policies, such as Telegram. Telegram has also been one of the destinations of the far right, whose deplatforming from mainstream platforms began a few years ago. Given the co-existence of these two movements on Telegram, the article examines their connections. Empirically, the article focused on Irish Telegram groups and channels, identifying relevant protest movements and collecting their posts. Using computational social science methods, we examine whether far-right terms and discourses are present and how this varies across different clusters of Telegram COVID-19 protest groups. In addition, we examine which actors are posting far-right content and what kind of roles they play in the network of Telegram groups. The findings indicate the presence of far-right discourses among the COVID-19 groups. However, the existence of these groups was not solely driven by the extreme right, and the incidence of far-right discourses was not equal across all COVID-19 protest groups. We interpret these findings under the prism of the mediation opportunity structure: while the far right appears to have taken advantage of the network opportunity structure afforded by deplatforming and the migration to Telegram, it did not succeed in diffusing its ideas widely among the COVID-19 protest groups in the Irish Telegram.
Affiliation(s)
| | - Eugenia Siapera
- Eugenia Siapera, School of Information and Communication Studies, University College Dublin, Stillorgan Road, Dublin 4, Ireland.
|
38
|
Kang YB, McCosker A, Kamstra P, Farmer J. Resilience in Web-Based Mental Health Communities: Building a Resilience Dictionary With Semiautomatic Text Analysis. JMIR Form Res 2022; 6:e39013. [PMID: 36136394 PMCID: PMC9539645 DOI: 10.2196/39013] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/25/2022] [Revised: 06/06/2022] [Accepted: 08/18/2022] [Indexed: 11/13/2022] Open
Abstract
Background Resilience is an accepted strengths-based concept that responds to change, adversity, and crises. This concept underpins both personal and community-based preventive approaches to mental health issues and shapes digital interventions. Online mental health peer-support forums have played a prominent role in enhancing resilience by providing accessible places for sharing lived experiences of mental issues and finding support. However, little research has been conducted on whether and how resilience is realized, hindering service providers’ ability to optimize resilience outcomes. Objective This study aimed to create a resilience dictionary that reflects the characteristics and realization of resilience within online mental health peer-support forums. The findings can be used to guide further analysis and improve resilience outcomes in mental health forums through targeted moderation and management. Methods A semiautomatic approach to creating a resilience dictionary was proposed using topic modeling and qualitative content analysis. We present a systematic 4-phase analysis pipeline that preprocesses raw forum posts, discovers core themes, conceptualizes resilience indicators, and generates a resilience dictionary. Our approach was applied to a mental health forum run by SANE (Schizophrenia: A National Emergency) Australia, with 70,179 forum posts between 2018 and 2020 by 2357 users being analyzed. Results The resilience dictionary and taxonomy developed in this study reveal how resilience indicators (ie, “social capital,” “belonging,” “learning,” “adaptive capacity,” and “self-efficacy”) are characterized by themes commonly discussed in the forums; each theme’s top 10 most relevant descriptive terms and their synonyms; and the relatedness of resilience, reflecting a taxonomy of indicators that are more comprehensive (or compound) and more likely to facilitate the realization of others. The study showed that the resilience indicators “learning,” “belonging,” and “social capital” were more commonly realized, and “belonging” and “learning” served as foundations for “social capital” and “adaptive capacity” across the 2-year study period. Conclusions This study presents a resilience dictionary that improves our understanding of how aspects of resilience are realized in web-based mental health forums. The dictionary provides novel guidance on how to improve training to support and enhance automated systems for moderating mental health forum discussions.
Affiliation(s)
- Yong-Bin Kang
- Australian Research Council (ARC) Centre of Excellence for Automated Decision-Making and Society (ADM+S), Swinburne University of Technology, Victoria, Australia
| | - Anthony McCosker
- Australian Research Council (ARC) Centre of Excellence for Automated Decision-Making and Society (ADM+S), Swinburne University of Technology, Victoria, Australia
- Social Innovation Research Institute, Swinburne University of Technology, Victoria, Australia
| | - Peter Kamstra
- Social Innovation Research Institute, Swinburne University of Technology, Victoria, Australia
| | - Jane Farmer
- Social Innovation Research Institute, Swinburne University of Technology, Victoria, Australia
|
39
|
Savin I, Teplyakov N. Topics of the nationwide phone-ins with Vladimir Putin and their role for public support and Russian economy. Inf Process Manag 2022. [DOI: 10.1016/j.ipm.2022.103043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
40
|
Salvatore C, Biffignandi S, Bianchi A. Corporate Social Responsibility Activities Through Twitter: From Topic Model Analysis to Indexes Measuring Communication Characteristics. SOCIAL INDICATORS RESEARCH 2022; 164:1217-1248. [PMID: 36034542 PMCID: PMC9391216 DOI: 10.1007/s11205-022-02993-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 07/26/2022] [Indexed: 06/15/2023]
Abstract
The communication of corporate social responsibility (CSR) highlights the behavior of the business toward CSR and their framework of sustainable development (SD), thus helping policymakers understand the role businesses play with respect to the 2030 Agenda. Despite its importance, this is still a relatively underexamined and emerging topic. In our paper, we focus on what businesses communicate about CSR through social media and how this relates to the Sustainable Development Goals (SDGs). We identified the topics discussed on Twitter, their evolution over time, and the differences across sectors. We applied the structural topic model (STM) algorithm, which allowed us to estimate the model, including document-level metadata (time and sector). This model proved to be a powerful tool for topic detection and the estimation of the effects of time and sector on the discussion proportion of the topics. Indeed, we found that the topics were well identified overall, and the model allowed us to catch signals from the data. We derived CSR communication indexes directly from the topic model (TM) results and propose the use of dissimilarity and homogeneity indexes to describe the communication mix, highlight differences, and identify clusters.
Affiliation(s)
- Camilla Salvatore
- Department of Economics, Management and Statistics (DEMS), University of Milano-Bicocca, Piazza dell’Ateneo Nuovo, 1, 20126 Milan, Italy
|
41
|
Baguley SI, Pavlova A, Consedine NS. More than a feeling? What does compassion in healthcare 'look like' to patients? Health Expect 2022; 25:1691-1702. [PMID: 35661516 PMCID: PMC9327826 DOI: 10.1111/hex.13512] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/25/2021] [Revised: 04/08/2022] [Accepted: 04/11/2022] [Indexed: 12/30/2022] Open
Abstract
OBJECTIVE Compassion is important to patients and their families, predicts positive patient and practitioner outcomes, and is a professional requirement of physicians around the globe. Yet, despite the value placed on compassion, the empirical study of compassion remains in its infancy and little is known regarding what compassion 'looks like' to patients. The current study addresses limitations in prior work by asking patients what physicians do that helps them feel cared for. METHODS Topic modelling analysis was employed to identify empirical commonalities in the text responses of 767 patients describing physician behaviours that led to their feeling cared for. RESULTS Descriptively, seven meaningful groupings of physician actions experienced as compassion emerged: listening and paying attention (71% of responses), following-up and running tests (11%), continuity and holistic care (8%), respecting preferences (4%), genuine understanding (2%), body language and empathy (2%) and counselling and advocacy (1%). CONCLUSION These findings supplement prior work by identifying concrete actions that are experienced as caring by patients. These early data may provide clinicians with useful information to enhance their ability to customize care, strengthen patient-physician relationships and, ultimately, practice medicine in a way that is experienced as compassionate by patients. PUBLIC CONTRIBUTION This study involves the analysis of data provided by a diverse sample of patients from the general community population of New Zealand.
Affiliation(s)
- Sofie I. Baguley
- Department of Psychological Medicine, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
| | - Alina Pavlova
- Department of Psychological Medicine, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
| | - Nathan S. Consedine
- Department of Psychological Medicine, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
|
42
|
Ge R, Zhao H, Zhang S. Online Brand Community User Segments: A Text Mining Approach. Front Artif Intell 2022; 5:900775. [PMID: 35923837 PMCID: PMC9339712 DOI: 10.3389/frai.2022.900775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Accepted: 06/17/2022] [Indexed: 11/21/2022] Open
Abstract
Customers are increasingly joining online brand communities. However, evidence shows that there are nuances between different user segments, and only a small group of users are active. Thus, one key concern marketers face is identifying and targeting specific segments and decreasing user churn rates in an online environment. To this end, this study aims to propose a UGC-based segmentation of online brand community users, identify the characteristics of each segment, and consequently reduce online brand community users' churn rate. We used Python to obtain users' post data from a well-known online brand community in China between July 2012 and December 2019, resulting in 912,452 posts and 20,493 users. We then used text mining and clustering methods to segment the users and compare the differences between the segments. Three groups emerged: information-oriented users, entertainment-oriented users, and multi-motivation users. Our results imply that entertainment-oriented users were the most active, yet multi-directional users have the lowest probability of churn, with a churn rate only 0.607 times that of users who focus either on information or entertainment. Implications for marketing and future research opportunities are discussed.
|
43
|
Rashid J, Kim J, Hussain A, Naseem U, Juneja S. A novel multiple kernel fuzzy topic modeling technique for biomedical data. BMC Bioinformatics 2022; 23:275. [PMID: 35820793 PMCID: PMC9277941 DOI: 10.1186/s12859-022-04780-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/24/2022] [Accepted: 06/08/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Text mining in the biomedical field has received much attention and is regarded as an important research area since a lot of biomedical data is in text format. Topic modeling is one of the popular methods among text mining techniques used to discover hidden semantic structures, so-called topics. However, discovering topics from biomedical data is a challenging task due to the sparsity, redundancy, and unstructured format. METHODS In this paper, we proposed a novel multiple kernel fuzzy topic modeling (MKFTM) technique using fusion probabilistic inverse document frequency and a multiple kernel fuzzy c-means clustering algorithm for biomedical text mining. In detail, the proposed fusion probabilistic inverse document frequency method is used to estimate the weights of global terms while MKFTM generates frequencies of local and global terms with bag-of-words. In addition, principal component analysis is applied to eliminate higher-order negative effects for term weights. RESULTS Extensive experiments are conducted on six biomedical datasets. MKFTM achieved the highest classification accuracy of 99.04%, 99.62%, 99.69%, 99.61% in the Muchmore Springer dataset and 94.10%, 89.45%, 92.91%, 90.35% in the Ohsumed dataset. The CH index value of MKFTM is higher, which shows that its clustering performance is better than state-of-the-art topic models. CONCLUSION We have confirmed from the results that the proposed MKFTM approach handles the sparsity and redundancy problems in biomedical text documents very efficiently. MKFTM discovers semantically relevant topics with high accuracy for biomedical documents. It gives better results for classification and clustering in biomedical documents. MKFTM is a new approach to topic modeling, which has the flexibility to work with a variety of clustering methods.
Affiliation(s)
- Junaid Rashid
- Department of Computer Science and Engineering, Kongju National University, Cheonan, 31080 Korea
| | - Jungeun Kim
- Department of Software, Department of Computer Science and Engineering, Kongju National University, Cheonan, 31080 Korea
| | - Amir Hussain
- Data Science and Cyber Analytics Research Group, Edinburgh Napier University, Edinburgh, EH11 4DY UK
| | - Usman Naseem
- School of Computer Science, University of Sydney, Sydney, Australia
| | - Sapna Juneja
- Department of Computer Science, KIET Group of Institutions, Delhi NCR, Ghaziabad, India
|
44
|
ZareRavasan A, Jeyaraj A. Evolution of Information Systems Business Value Research: Topic Modeling Analysis. JOURNAL OF COMPUTER INFORMATION SYSTEMS 2022. [DOI: 10.1080/08874417.2022.2085212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
|
45
|
Egger R, Yu J. A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts. FRONTIERS IN SOCIOLOGY 2022; 7:886498. [PMID: 35602001 PMCID: PMC9120935 DOI: 10.3389/fsoc.2022.886498] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 02/28/2022] [Accepted: 04/19/2022] [Indexed: 05/28/2023]
Abstract
The richness of social media data has opened a new avenue for social science research to gain insights into human behaviors and experiences. In particular, emerging data-driven approaches relying on topic models provide entirely new perspectives on interpreting social phenomena. However, the short, text-heavy, and unstructured nature of social media content often leads to methodological challenges in both data collection and analysis. In order to bridge the developing field of computational science and empirical social research, this study aims to evaluate the performance of four topic modeling techniques, namely latent Dirichlet allocation (LDA), non-negative matrix factorization (NMF), Top2Vec, and BERTopic. In view of the interplay between human relations and digital media, this research takes Twitter posts as the reference point and assesses the performance of different algorithms concerning their strengths and weaknesses in a social science context. Based on certain details during the analytical procedures and on quality issues, this research sheds light on the efficacy of using BERTopic and NMF to analyze Twitter data.
Affiliation(s)
- Roman Egger
- Innovation and Management in Tourism, Salzburg University of Applied Sciences, Salzburg, Austria
| | - Joanne Yu
- Department of Tourism and Service Management, Modul University Vienna, Vienna, Austria
|
46
|
Ji J, Robbins M, Featherstone JD, Calabrese C, Barnett GA. Comparison of public discussions of gene editing on social media between the United States and China. PLoS One 2022; 17:e0267406. [PMID: 35500011 PMCID: PMC9060334 DOI: 10.1371/journal.pone.0267406] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/09/2021] [Accepted: 04/07/2022] [Indexed: 12/26/2022] Open
Abstract
The world’s first gene-edited babies event has stirred controversy on social media over the use of gene editing technology. Understanding public discussions about this controversy will provide important insights about opinions of science and facilitate informed policy decisions. This study compares public discussion topics about gene editing on Twitter and Weibo, as well as the evolution of these topics over four months. Latent Dirichlet allocation (LDA) was used to generate topics for 11,244 Weibo posts and 57,525 tweets from September 25, 2018, to January 25, 2019. Results showed a difference between the topics on Twitter versus Weibo: there were more nuanced discussions on Twitter, and the discussed topics between platforms focused on different areas. Temporal analysis showed that most discussions took place around gene-edited events. Based on our findings, suggestions were provided for policymakers and science communication practitioners to develop more effective communication strategies toward audiences in China and the U.S.
Affiliation(s)
- Jiaojiao Ji
- Department of Science and Technology Communication, University of Science and Technology of China, Hefei, China
| | - Matthew Robbins
- Department of Communication, University of California, Davis, California, United States of America
| | - Jieyu Ding Featherstone
- Department of Communication, University of California, Davis, California, United States of America
| | - Christopher Calabrese
- Department of Communication, University of California, Davis, California, United States of America
| | - George A. Barnett
- Department of Communication, University of California, Davis, California, United States of America
|
47
|
Qiao F, Williams J. Topic Modelling and Sentiment Analysis of Global Warming Tweets. J ORGAN END USER COM 2022. [DOI: 10.4018/joeuc.294901] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
With the increase in extreme weather events and various disasters, people are paying more attention to environmental issues than ever, particularly global warming. Public debate on it has grown on various platforms, including newspapers and social media. This paper examines the topics and sentiments of the discussion of global warming on Twitter over a span of 18 months using two big data analytics techniques: topic modelling and sentiment analysis. There are seven main topics concerning global warming frequently debated on Twitter: factors causing global warming, consequences of global warming, actions necessary to stop global warming, relations between global warming and Covid-19, global warming’s relation with politics, global warming as a hoax, and global warming as a reality. The sentiment analysis shows that most people express positive emotions about global warming, though the most evoked emotion found across the data is fear, followed by trust. The study provides a general and critical view of the public’s principal concerns and their feelings about global warming on Twitter.
Affiliation(s)
- Fang Qiao
- Xi'an International Studies University, China
|
48
|
A Hybrid Model for the Measurement of the Similarity between Twitter Profiles. SUSTAINABILITY 2022. [DOI: 10.3390/su14094909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
Social media platforms have been an undeniable part of our lifestyle for the past decade. Analyzing the information that is being shared is a crucial step to understanding human behavior. Social media analysis aims to guarantee a better experience for the user and to increase user satisfaction. To draw any further conclusions, first, it is necessary to know how to compare users. In this paper, a hybrid model is proposed to measure the degree of similarity between Twitter profiles by calculating features related to the users’ behavioral habits. For this, first, the timeline of each profile was extracted using the official Twitter API. Then, three aspects of a profile were considered in parallel. Behavioral ratios are time-series-related information showing the consistency and habits of the user. Dynamic time warping was utilized to compare the behavioral ratios of two profiles. Next, the audience network was extracted for each user, and the Jaccard similarity was used to estimate the similarity of the two sets. Finally, for the content similarity measurement, the tweets were preprocessed, TF-IDF and DistilBERT were each employed for feature extraction, and the resulting features were compared using cosine similarity. The results showed that TF-IDF had slightly better performance; it was therefore selected for use in the model. To measure the similarity level of different profiles, a Random Forest classification model was trained on 19,900 users, achieving an accuracy of 0.97 in distinguishing similar profiles from different ones. As a step further, this convoluted similarity measurement can find users with very short distances, which are indicative of duplicate users.
|
49
|
Lu X, Sun L, Xie Z, Li D. Perception of the Food and Drug Administration Electronic Cigarette Flavor Enforcement Policy on Twitter: Observational Study. JMIR Public Health Surveill 2022; 8:e25697. [PMID: 35348461 PMCID: PMC9006136 DOI: 10.2196/25697] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 11/11/2020] [Revised: 04/13/2021] [Accepted: 11/19/2021] [Indexed: 12/12/2022] Open
Abstract
Background
On January 2, 2020, the US Food and Drug Administration (FDA) released the electronic cigarette (e-cigarette) flavor enforcement policy to prohibit the sale of all flavored cartridge–based e-cigarettes, except for menthol and tobacco flavors.
Objective
This research aimed to examine the public perception of this FDA flavor enforcement policy and its impact on the public perception of e-cigarettes on Twitter.
Methods
A total of 2,341,660 e-cigarette–related tweets and 190,490 FDA flavor enforcement policy–related tweets in the United States were collected from Twitter before (between June 13 and August 22, 2019) and after (between January 2 and March 30, 2020) the announcement of the FDA flavor enforcement policy. Sentiment analysis was conducted to detect the changes in the public perceptions of the policy and e-cigarettes on Twitter. Topic modeling was used for finding frequently discussed topics about e-cigarettes.
Results
The proportion of negative sentiment tweets about e-cigarettes significantly increased after the announcement of the FDA flavor enforcement policy compared with before the announcement. In contrast, the overall sentiment toward the FDA flavor enforcement policy became less negative. The FDA flavor enforcement policy was the most popular topic associated with e-cigarettes after its announcement. Twitter users who discussed e-cigarettes started to talk about alternative ways of getting e-cigarettes after the FDA flavor enforcement policy.
Conclusions
Twitter users’ perceptions of e-cigarettes became more negative after the announcement of the FDA flavor enforcement policy.
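A before/after comparison of the share of negative tweets, as reported in the Results, can be checked with a two-proportion z-test. The abstract does not say which statistical test the authors used, so the test choice here is an assumption, and the function name and any counts passed to it are hypothetical rather than taken from the study.

```python
import math


def two_proportion_z_test(neg_before, total_before, neg_after, total_after):
    """Two-sided z-test for a change in the proportion of negative tweets.

    Returns (z, p_value). A positive z means the negative share rose
    in the "after" period.
    """
    p_before = neg_before / total_before
    p_after = neg_after / total_after
    pooled = (neg_before + neg_after) / (total_before + total_after)
    variance = pooled * (1 - pooled) * (1 / total_before + 1 / total_after)
    if variance == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p_after - p_before) / math.sqrt(variance)
    # two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With samples in the millions, as in this study, even small shifts in proportion yield very small p-values, so the size of the change matters as much as its significance.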
Affiliation(s)
- Xinyi Lu
- Goergen Institute for Data Science, University of Rochester, Rochester, NY, United States
- Li Sun
- Goergen Institute for Data Science, University of Rochester, Rochester, NY, United States
- Zidian Xie
- Department of Clinical & Translational Research, University of Rochester Medical Center, Rochester, NY, United States
- Dongmei Li
- Department of Clinical & Translational Research, University of Rochester Medical Center, Rochester, NY, United States
|
50
|
Predictive Fraud Analysis Applying the Fraud Triangle Theory through Data Mining Techniques. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12073382] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/01/2023]
Abstract
Fraud is increasingly common, and so are the losses caused by this phenomenon. There is, thus, an essential economic incentive to study this problem, particularly fraud prevention. One barrier complicating research in this direction is the lack of public data sets that embed fraudulent activities. In addition, although efforts have been made to detect fraud using machine learning, such efforts have not considered the component of human behavior. In this work, we propose a mechanism to detect potential fraud by analyzing human behavior within a data set. This approach combines a predefined topic model and a supervised classifier to generate an alert from possible fraud-related text. Potential fraud would be detected based on a model built from such a classifier. As a result of this work, a synthetic fraud-related data set is produced. Four topics associated with the vertices of the fraud triangle theory are unveiled when assessing different topic modeling techniques. After benchmarking topic modeling techniques and supervised and deep learning classifiers, we find that LDA, random forest, and CNN have the best performance in this scenario. The results of our work suggest that our approach is feasible in practice, since several such models obtain an average AUC higher than 0.8. Namely, the fraud triangle theory combined with topic modeling and linear classifiers could provide a promising framework for predictive fraud analysis.
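The AUC figure quoted in this abstract is the area under the ROC curve, which equals the probability that a randomly chosen positive (fraud-related) example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition; the function name and the example labels and scores are hypothetical and are not drawn from the paper's data set.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (1 = fraud-related).
    scores: classifier scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores from some classifier on four texts
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based formulation would be preferable for large evaluation sets.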
|