1
Lin Z, Lin X, Yang X. An Automated Analysis Framework for Epidemiological Survey on COVID-19. IEEE J Biomed Health Inform 2024; 28:3186-3199. [PMID: 38412074 DOI: 10.1109/jbhi.2024.3370253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Abstract
For a long time, the prevention and control of COVID-19 has received significant attention. A crucial aspect of controlling the disease's spread is the epidemiological survey of patients and the subsequent analysis of epidemiological survey reports (case reports). However, current mainstream analysis approaches are all performed manually, which is time-consuming and manpower-intensive. This paper designs an automated visual epidemiological survey analysis (AVESA) framework for the epidemiological survey on COVID-19. AVESA designs a deep neural network for information extraction from case reports and automatically constructs an epidemiological knowledge graph based on a predefined pattern. Moreover, a multi-dimensional knowledge reasoning model is developed for conducting knowledge reasoning in the complete COVID-19 epidemiological knowledge graph. In the entity extraction sub-task and multi-task extraction sub-task, AVESA achieved F1 scores of 85.12% and 92.29% respectively on the constructed dataset, significantly outperforming the standalone information extraction models. In full-graph computing, all three experiments align closely with manual analysis standards. In the risk analysis experiment, the weighted PageRank algorithm showed an average improvement of 11.21% in Top_Recall_n% over the standard PageRank algorithm. In the community detection experiment, the weighted Louvain algorithm showed a mere 4.34% community difference rate compared to manual analysis.
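The abstract contrasts weighted and standard PageRank but does not say how edge weights are defined. The following is a minimal sketch, assuming edges carry a positive weight standing for contact strength; the node names, weights, and the function `weighted_pagerank` are hypothetical illustrations, not the AVESA implementation.

```python
def weighted_pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank where each edge (source, target) has a positive weight."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    # Total outgoing weight per node, used to normalise each node's contribution.
    out_weight = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical contact graph: case_A had much closer contact with case_B than with case_C.
contacts = {
    ("case_A", "case_B"): 3.0,
    ("case_A", "case_C"): 1.0,
    ("case_B", "case_C"): 1.0,
    ("case_C", "case_A"): 1.0,
}
weighted = weighted_pagerank(contacts)
# Setting every weight to 1.0 recovers standard PageRank on the same graph.
unweighted = weighted_pagerank({edge: 1.0 for edge in contacts})
```

In this toy graph the heavier A-to-B edge raises the score of `case_B` relative to the unweighted run, which is the kind of effect a weighted risk ranking relies on.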
2
Shah NA, Li Z, McMann T, Calac AJ, Le N, Nali MC, Cuomo RE, Mackey TK. Identification and Characterization of Synthetic Nicotine Product Promotion and Sales on Instagram Using Natural Language Processing. Nicotine Tob Res 2024; 26:580-588. [PMID: 37947271 DOI: 10.1093/ntr/ntad222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 09/01/2023] [Accepted: 11/01/2023] [Indexed: 11/12/2023]
Abstract
INTRODUCTION There has been a rapid proliferation of synthetic nicotine products in recent years, despite newly established regulatory authority and limited research into its health risks. Previous research has implicated social media platforms as an avenue for nicotine product unregulated sales. Yet, little is known about synthetic nicotine product content on social media. We utilized natural language processing to characterize the sales of synthetic nicotine products on Instagram.
METHODS We collected Instagram posts by querying Instagram hashtags (eg, "#tobaccofreenicotine") related to synthetic nicotine. Using Bidirectional Encoder Representations from Transformers, collected posts were categorized into thematically related topic clusters. Posts within topic clusters relevant to study aims were then manually annotated for variables related to promotion and selling (eg, cost discussion, contact information for offline sales).
RESULTS A total of 7425 unique posts were collected with 2219 posts identified as related to promotion and selling of synthetic nicotine products. Nicotine pouches (52.9%, n = 1174), electronic nicotine delivery systems (30.6%, n = 679), and flavored e-liquids (14.1%, n = 313) were most commonly promoted. About 16.1% (n = 345) of posts contained embedded hyperlinks and 5.8% (n = 129) provided contact information for purported offline transactions. Only 17.6% (n = 391) of posts contained synthetic nicotine-specific health warnings.
CONCLUSIONS In the United States, synthetic nicotine products can only be legally marketed if they have received premarket authorization from the Food and Drug Administration (FDA). Despite these prohibitions, Instagram appears to be a hub for potentially unregulated sales of synthetic and "tobacco-free" products. Efforts are needed by platforms and regulators to enhance content moderation and prevent unregulated online sales of existing and emerging synthetic nicotine products.
IMPLICATIONS There is limited clinical understanding of synthetic nicotine's unique health risks and how these novel products are changing over time due to regulatory oversight. Despite synthetic nicotine-specific regulatory measures, such as the requirement for premarket authorization and FDA warning letters issued to unauthorized sellers, access to and promotion of synthetic nicotine is widely occurring on Instagram, a platform with over 2 billion users and one that is popular among youth and young adults. Activities include direct-to-consumer sales from questionable sources, inadequate health warning disclosure, and exposure with limited age restrictions, all conditions necessary for the sale of various tobacco products. Notably, the number of these Instagram posts increased in response to the announcement of new FDA regulations. In response, more robust online monitoring, content moderation, and proactive enforcement are needed from platforms who should work collaboratively with regulators to identify, report, and remove content in clear violation of platform policies and federal laws. Regulatory implementation and enforcement should prioritize digital platforms as conduits for unregulated access to synthetic nicotine products and other future novel and emerging tobacco products.
Affiliation(s)
- Neal A Shah
  - Department of Anesthesiology, University of California, San Diego School of Medicine, San Diego, CA, USA
  - Global Health Policy and Data Institute, San Diego, CA, USA
- Zhuoran Li
  - San Diego Supercomputer Center, University of California, San Diego, CA, USA
  - S-3 Research, San Diego, CA, USA
- Tiana McMann
  - Global Health Policy and Data Institute, San Diego, CA, USA
  - S-3 Research, San Diego, CA, USA
  - Global Health Program Department of Anthropology, University of California, San Diego, La Jolla, CA, USA
- Alec J Calac
  - Global Health Policy and Data Institute, San Diego, CA, USA
  - The Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego, CA, USA
- Nicolette Le
  - Global Health Program Department of Anthropology, University of California, San Diego, La Jolla, CA, USA
- Matthew C Nali
  - Department of Anesthesiology, University of California, San Diego School of Medicine, San Diego, CA, USA
  - Global Health Policy and Data Institute, San Diego, CA, USA
  - S-3 Research, San Diego, CA, USA
- Raphael E Cuomo
  - Department of Anesthesiology, University of California, San Diego School of Medicine, San Diego, CA, USA
  - Global Health Policy and Data Institute, San Diego, CA, USA
- Tim K Mackey
  - San Diego Supercomputer Center, University of California, San Diego, CA, USA
  - S-3 Research, San Diego, CA, USA
  - Global Health Program Department of Anthropology, University of California, San Diego, La Jolla, CA, USA
3
Liggett D, Frame B, Convey P, Hughes KA. How the COVID-19 pandemic signaled the demise of Antarctic exceptionalism. SCIENCE ADVANCES 2024; 10:eadk4424. [PMID: 38427734 PMCID: PMC10906921 DOI: 10.1126/sciadv.adk4424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 01/25/2024] [Indexed: 03/03/2024]
Abstract
This paper explores how the COVID-19 pandemic affected science and tourism activities and their governance in the Antarctic and Southern Ocean. The pandemic reduced the ability of Antarctic Treaty Parties to make decisions on policy issues and placed a considerable burden on researchers. Tourism was effectively suspended during the 2020-2021 Antarctic season and heavily reduced in 2021-2022 but rebounded to record levels in 2022-2023. The pandemic stimulated reflection on practices to facilitate dialog, especially through online events. Opportunities arose to integrate innovations developed during the pandemic more permanently into Antarctic practices, in relation to open science, reducing operational greenhouse gas footprints and barriers of access to Antarctic research and facilitating data sharing. However, as well as the long-term impacts arising directly from the pandemic, an assemblage of major geopolitical drivers are also in play and, combined, these signal a considerable weakening of Antarctic exceptionalism in the early Anthropocene.
Affiliation(s)
- Bob Frame
  - University of Canterbury, Christchurch, New Zealand
- Peter Convey
  - British Antarctic Survey, Cambridge, United Kingdom
  - University of Johannesburg, Johannesburg, South Africa
4
Cam H, Cam AV, Demirel U, Ahmed S. Sentiment analysis of financial Twitter posts on Twitter with the machine learning classifiers. Heliyon 2024; 10:e23784. [PMID: 38205287 PMCID: PMC10776998 DOI: 10.1016/j.heliyon.2023.e23784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 12/05/2023] [Accepted: 12/13/2023] [Indexed: 01/12/2024]
Abstract
This paper presents a sentiment analysis combining the lexicon-based and machine learning (ML)-based approaches in Turkish to investigate the public mood for the prediction of stock market behavior in BIST30, Borsa Istanbul. Our main motivation behind this study is to apply sentiment analysis to financial-related tweets in Turkish. We import 17189 tweets posted as "#Borsaistanbul, #Bist, #Bist30, #Bist100" on Twitter between November 7, 2022, and November 15, 2022, via MAXQDA 2020, a qualitative data analysis program. For the lexicon-based side, we use a multilingual sentiment method offered by the Orange program to label the polarities of the 17189 samples as positive, negative, or neutral. Neutral labels are discarded for the machine learning experiments. For the machine learning side, we select 9076 samples labeled positive or negative to implement the classification problem with six different supervised machine learning classifiers implemented in Python 3.6 with the sklearn library. In the experiments, 80% of the selected data is used for the training phase and the rest is used for the testing and validation phase. Results of the experiments show that the Support Vector Machine and Multilayer Perceptron classifiers perform better than the other classifiers, with 0.89 and 0.88 accuracy and AUC values of 0.8729 and 0.8647 respectively. Other classifiers obtain approximately a 78.5% accuracy rate. It is possible to increase sentiment analysis accuracy with parameter optimization on a larger, cleaner, and more balanced dataset by changing the pre-processing steps. This work can be expanded in the future to develop better sentiment analysis using deep learning approaches.
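The labeling and splitting steps described in this abstract can be sketched as follows. This is an illustration only: the study labeled Turkish tweets with the Orange multilingual sentiment method and trained sklearn classifiers, whereas the toy English lexicon `LEXICON`, the sample tweets, and the helper names below are assumptions made for the sketch.

```python
import random

# Hypothetical toy lexicon mapping words to polarity scores (not the Orange lexicon).
LEXICON = {"gain": 1, "rise": 1, "strong": 1, "loss": -1, "fall": -1, "weak": -1}


def label_polarity(text):
    """Sum word scores and map the total to positive, negative, or neutral."""
    score = sum(LEXICON.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def split_labeled(texts, train_fraction=0.8, seed=0):
    """Label texts, discard neutral ones, then shuffle and split into train and test."""
    labeled = [(text, label_polarity(text)) for text in texts]
    labeled = [(text, label) for text, label in labeled if label != "neutral"]
    random.Random(seed).shuffle(labeled)
    cut = int(len(labeled) * train_fraction)
    return labeled[:cut], labeled[cut:]


# Hypothetical sample tweets; two of them contain no lexicon word and are neutral.
tweets = [
    "bist30 shows strong gain", "banks rise today", "index may fall",
    "heavy loss in tech", "weak open for bist100", "market closed",
    "strong rise expected", "another fall ahead", "gain after gain",
    "loss widens", "borsa istanbul session", "weak outlook",
]
train_set, test_set = split_labeled(tweets)
```

The resulting `train_set` and `test_set` would then be vectorized and passed to a classifier; that step is omitted here because the abstract does not describe the feature representation.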
Affiliation(s)
- Handan Cam
  - Department of Management Information Systems, Faculty of Economic and Administrative Science, Gumushane University, 29000, Gumushane, Turkey
- Alper Veli Cam
  - Department of Health Care Management, Faculty of Health Sciences, Gumushane University, 29000, Gumushane, Turkey
- Ugur Demirel
  - Irfan Can Kose Vocational School, Gumushane University, 29000, Gumushane, Turkey
- Sana Ahmed
  - Henley Business School, University of Reading, Reading, RG6 6AH, UK
5
Morita PP, Zakir Hussain I, Kaur J, Lotto M, Butt ZA. Tweeting for Health Using Real-time Mining and Artificial Intelligence-Based Analytics: Design and Development of a Big Data Ecosystem for Detecting and Analyzing Misinformation on Twitter. J Med Internet Res 2023; 25:e44356. [PMID: 37294603 PMCID: PMC10337356 DOI: 10.2196/44356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Revised: 02/09/2023] [Accepted: 03/14/2023] [Indexed: 03/16/2023]
Abstract
BACKGROUND Digital misinformation, primarily on social media, has led to harmful and costly beliefs in the general population. Notably, these beliefs have resulted in public health crises to the detriment of governments worldwide and their citizens. However, public health officials need access to a comprehensive system capable of mining and analyzing large volumes of social media data in real time.
OBJECTIVE This study aimed to design and develop a big data pipeline and ecosystem (UbiLab Misinformation Analysis System [U-MAS]) to identify and analyze false or misleading information disseminated via social media on a certain topic or set of related topics.
METHODS U-MAS is a platform-independent ecosystem developed in Python that leverages the Twitter V2 application programming interface and the Elastic Stack. The U-MAS expert system has 5 major components: data extraction framework, latent Dirichlet allocation (LDA) topic model, sentiment analyzer, misinformation classification model, and Elastic Cloud deployment (indexing of data and visualizations). The data extraction framework queries the data through the Twitter V2 application programming interface, with queries identified by public health experts. The LDA topic model, sentiment analyzer, and misinformation classification model are independently trained using a small, expert-validated subset of the extracted data. These models are then incorporated into U-MAS to analyze and classify the remaining data. Finally, the analyzed data are loaded into an index in the Elastic Cloud deployment and can then be presented on dashboards with advanced visualizations and analytics pertinent to infodemiology and infoveillance analysis.
RESULTS U-MAS performed efficiently and accurately. Independent investigators have successfully used the system to extract significant insights into a fluoride-related health misinformation use case (2016 to 2021). The system is currently used for a vaccine hesitancy use case (2007 to 2022) and a heat wave-related illnesses use case (2011 to 2022). Each component in the system for the fluoride misinformation use case performed as expected. The data extraction framework handles large amounts of data within short periods. The LDA topic models achieved relatively high coherence values (0.54), and the predicted topics were accurate and befitting to the data. The sentiment analyzer performed at a correlation coefficient of 0.72 but could be improved in further iterations. The misinformation classifier attained a satisfactory correlation coefficient of 0.82 against expert-validated data. Moreover, the output dashboard and analytics hosted on the Elastic Cloud deployment are intuitive for researchers without a technical background and comprehensive in their visualization and analytics capabilities. In fact, the investigators of the fluoride misinformation use case have successfully used the system to extract interesting and important insights into public health, which have been published separately.
CONCLUSIONS The novel U-MAS pipeline has the potential to detect and analyze misleading information related to a particular topic or set of related topics.
Affiliation(s)
- Plinio Pelegrini Morita
  - School of Public Health Sciences, Faculty of Health, University of Waterloo, Waterloo, ON, Canada
  - Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada
  - Research Institute for Aging, University of Waterloo, Waterloo, ON, Canada
  - Institute of Health Policy, Management, and Evaluation, University of Toronto, Toronto, ON, Canada
  - Centre for Digital Therapeutics, Techna Institute, University Health Network, Toronto, ON, Canada
- Irfhana Zakir Hussain
  - School of Public Health Sciences, Faculty of Health, University of Waterloo, Waterloo, ON, Canada
  - Department of Data Science and Business Systems, School of Computing, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, India
- Jasleen Kaur
  - School of Public Health Sciences, Faculty of Health, University of Waterloo, Waterloo, ON, Canada
- Matheus Lotto
  - School of Public Health Sciences, Faculty of Health, University of Waterloo, Waterloo, ON, Canada
  - Department of Pediatric Dentistry, Orthodontics and Public Health, Bauru School of Dentistry, University of São Paulo, Bauru, Brazil
- Zahid Ahmad Butt
  - School of Public Health Sciences, Faculty of Health, University of Waterloo, Waterloo, ON, Canada
6
Theocharopoulos PC, Tsoukala A, Georgakopoulos SV, Tasoulis SK, Plagianakos VP. Analysing sentiment change detection of Covid-19 tweets. Neural Comput Appl 2023; 35:1-11. [PMID: 37362564 PMCID: PMC10230484 DOI: 10.1007/s00521-023-08662-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Accepted: 05/10/2023] [Indexed: 06/28/2023]
Abstract
The Covid-19 pandemic made a significant impact on society, including the widespread implementation of lockdowns to prevent the spread of the virus. This measure led to a decrease in face-to-face social interactions and, correspondingly, an increase in the use of social media platforms, such as Twitter. As part of Industry 4.0, sentiment analysis can be exploited to study public attitudes toward future pandemics and sociopolitical situations in general. This work presents an analysis framework by applying a combination of natural language processing techniques and machine learning algorithms to classify the sentiment of each tweet as positive or negative. Through extensive experimentation, we identify the best-performing model for this task and, subsequently, utilize sentiment predictions to perform time series analysis over the course of the pandemic. In addition, a change point detection algorithm was applied in order to identify the turning points in public attitudes toward the pandemic, which were validated by cross-referencing news reports from the corresponding period. Finally, we study the relationship between sentiment trends on social media and news coverage of the pandemic, providing insights into the public's perception of the pandemic and its influence on the news.
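The abstract does not name the change point detection algorithm it applied. As a rough illustration of the idea, the sketch below finds a single change point by choosing the split that minimizes the summed squared deviation from each segment's mean; the function `best_change_point` and the series values are hypothetical.

```python
def best_change_point(series):
    """Return the index k that best splits the series into two constant-mean segments."""

    def squared_error(segment):
        mean = sum(segment) / len(segment)
        return sum((value - mean) ** 2 for value in segment)

    best_k, best_cost = None, float("inf")
    # Try every split that leaves at least one point on each side.
    for k in range(1, len(series)):
        cost = squared_error(series[:k]) + squared_error(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical daily share of tweets classified as positive.
daily_positive_share = [0.62, 0.60, 0.63, 0.61, 0.35, 0.33, 0.36, 0.34]
turning_point = best_change_point(daily_positive_share)
```

Here `turning_point` is the index of the first day of the new regime; detecting several turning points over a long series would need a multi-change-point method, which this sketch does not attempt.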
Affiliation(s)
- Anastasia Tsoukala
  - Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
- Sotiris K. Tasoulis
  - Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
- Vassilis P. Plagianakos
  - Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
7
Qorib M, Oladunni T, Denis M, Ososanya E, Cotae P. COVID-19 Vaccine Hesitancy: A Global Public Health and Risk Modelling Framework Using an Environmental Deep Neural Network, Sentiment Classification with Text Mining and Emotional Reactions from COVID-19 Vaccination Tweets. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2023; 20:ijerph20105803. [PMID: 37239532 DOI: 10.3390/ijerph20105803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Revised: 03/31/2023] [Accepted: 04/04/2023] [Indexed: 05/28/2023]
Abstract
Popular social media platforms, such as Twitter, have become an excellent source of information with their swift information dissemination. Individuals with different backgrounds convey their opinions through social media platforms. Consequently, these platforms have become a profound instrument for collecting enormous datasets. We believe that compiling, organizing, exploring, and analyzing data from social media platforms, such as Twitter, can offer various perspectives to public health organizations and decision makers in identifying factors that contribute to vaccine hesitancy. In this study, public tweets were downloaded daily from Twitter using the Twitter API. Before performing computation, the tweets were preprocessed and labeled. Vocabulary normalization was based on stemming and lemmatization. The NRCLexicon technique was deployed to convert the tweets into ten classes: positive sentiment, negative sentiment, and eight basic emotions (joy, trust, fear, surprise, anticipation, anger, disgust, and sadness). A t-test was used to check the statistical significance of the relationships among the basic emotions. Our analysis shows that the p-values of joy-sadness, trust-disgust, fear-anger, surprise-anticipation, and negative-positive relations are close to zero. Finally, neural network architectures, including 1DCNN, LSTM, Multiple-Layer Perceptron, and BERT, were trained and tested in a COVID-19 multi-classification of sentiments and emotions (positive, negative, joy, sadness, trust, disgust, fear, anger, surprise, and anticipation). Our experiment attained an accuracy of 88.6% for 1DCNN at 1744 s, 89.93% accuracy for LSTM at 27,597 s, while MLP achieved an accuracy of 84.78% at 203 s. The study results show that the BERT model performed the best, with an accuracy of 96.71% at 8429 s.
Affiliation(s)
- Miftahul Qorib
  - Department of Computer Science and Information Technology, University of the District of Columbia, Washington, DC 20008, USA
  - Department of Mathematics and Statistics, University of the District of Columbia, Washington, DC 20008, USA
- Timothy Oladunni
  - Department of Computer Science, Morgan State University, Baltimore, MD 21251, USA
- Max Denis
  - Department of Mechanical and Biomedical Engineering, University of the District of Columbia, Washington, DC 20008, USA
- Esther Ososanya
  - Department of Electrical and Computer Engineering, University of the District of Columbia, Washington, DC 20008, USA
- Paul Cotae
  - Department of Electrical and Computer Engineering, University of the District of Columbia, Washington, DC 20008, USA
8
Zammarchi G, Mola F, Conversano C. Using sentiment analysis to evaluate the impact of the COVID-19 outbreak on Italy's country reputation and stock market performance. STAT METHOD APPL-GER 2023; 32:1-22. [PMID: 37360253 PMCID: PMC10068702 DOI: 10.1007/s10260-023-00690-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/12/2023] [Indexed: 04/05/2023]
Abstract
During the recent Coronavirus disease 2019 (COVID-19) outbreak, the microblogging service Twitter has been widely used to share opinions and reactions to events. Italy was one of the first European countries to be severely affected by the outbreak and to establish lockdown and stay-at-home orders, potentially leading to country reputation damage. We resort to sentiment analysis to investigate changes in opinions about Italy reported on Twitter before and after the COVID-19 outbreak. Using different lexicon-based methods, we find a breakpoint corresponding to the date of the first established case of COVID-19 in Italy that causes a relevant change in sentiment scores used as a proxy of the country's reputation. Next, we demonstrate that sentiment scores about Italy are associated with the values of the FTSE-MIB index, the Italian Stock Exchange main index, as they serve as early detection signals of changes in the values of FTSE-MIB. Lastly, we evaluate whether different machine learning classifiers were able to determine the polarity of tweets posted before and after the outbreak with a different level of accuracy.
Affiliation(s)
- Gianpaolo Zammarchi
  - Department of Economics and Business Science, University of Cagliari, Cagliari, Italy
- Francesco Mola
  - Department of Economics and Business Science, University of Cagliari, Cagliari, Italy
- Claudio Conversano
  - Department of Economics and Business Science, University of Cagliari, Cagliari, Italy
9
Hasan MM, Islam MU, Sadeq MJ, Fung WK, Uddin J. Review on the Evaluation and Development of Artificial Intelligence for COVID-19 Containment. SENSORS (BASEL, SWITZERLAND) 2023; 23:527. [PMID: 36617124 PMCID: PMC9824505 DOI: 10.3390/s23010527] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/28/2022] [Revised: 12/23/2022] [Accepted: 12/29/2022] [Indexed: 06/17/2023]
Abstract
Artificial intelligence has significantly enhanced the research paradigm and spectrum with a substantiated promise of continuous applicability in the real world domain. Artificial intelligence, the driving force of the current technological revolution, has been used in many frontiers, including education, security, gaming, finance, robotics, autonomous systems, entertainment, and most importantly the healthcare sector. With the rise of the COVID-19 pandemic, several prediction and detection methods using artificial intelligence have been employed to understand, forecast, handle, and curtail the ensuing threats. In this study, the most recent related publications, methodologies and medical reports were investigated with the purpose of studying artificial intelligence's role in the pandemic. This study presents a comprehensive review of artificial intelligence with specific attention to machine learning, deep learning, image processing, object detection, image segmentation, and few-shot learning studies that were utilized in several tasks related to COVID-19. In particular, genetic analysis, medical image analysis, clinical data analysis, sound analysis, biomedical data classification, socio-demographic data analysis, anomaly detection, health monitoring, personal protective equipment (PPE) observation, social control, and COVID-19 patients' mortality risk approaches were used in this study to forecast the threatening factors of COVID-19. This study demonstrates that artificial-intelligence-based algorithms integrated into Internet of Things wearable devices were quite effective and efficient in COVID-19 detection and forecasting insights which were actionable through wide usage. The results produced by the study prove that artificial intelligence is a promising arena of research that can be applied for disease prognosis, disease forecasting, drug discovery, and to the development of the healthcare sector on a global scale. 
We prove that artificial intelligence indeed played a significantly important role in helping to fight against COVID-19, and the insightful knowledge provided here could be extremely beneficial for practitioners and research experts in the healthcare domain to implement the artificial-intelligence-based systems in curbing the next pandemic or healthcare disaster.
Affiliation(s)
- Md. Mahadi Hasan
  - Department of Computer Science and Engineering, Asian University of Bangladesh, Ashulia 1349, Bangladesh
- Muhammad Usama Islam
  - School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette, LA 70504, USA
- Muhammad Jafar Sadeq
  - Department of Computer Science and Engineering, Asian University of Bangladesh, Ashulia 1349, Bangladesh
- Wai-Keung Fung
  - Department of Applied Computing and Engineering, Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff CF5 2YB, UK
- Jasim Uddin
  - Department of Applied Computing and Engineering, Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff CF5 2YB, UK
10
Swapnarekha H, Nayak J, Behera HS, Dash PB, Pelusi D. An optimistic firefly algorithm-based deep learning approach for sentiment analysis of COVID-19 tweets. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:2382-2407. [PMID: 36899539 DOI: 10.3934/mbe.2023112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
The unprecedented rise in the number of COVID-19 cases has drawn global attention, as it has caused an adverse impact on the lives of people all over the world. As of December 31, 2021, more than 286,901,222 people have been infected with COVID-19. The rise in the number of COVID-19 cases and deaths across the world has caused fear, anxiety and depression among individuals. Social media is the most dominant tool that disturbed human life during this pandemic. Among the social media platforms, Twitter is one of the most prominent and trusted social media platforms. To control and monitor the COVID-19 infection, it is necessary to analyze the sentiments of people expressed on their social media platforms. In this study, we proposed a deep learning approach known as a long short-term memory (LSTM) model for the analysis of tweets related to COVID-19 as positive or negative sentiments. In addition, the proposed approach makes use of the firefly algorithm to enhance the overall performance of the model. Further, the performance of the proposed model, along with other state-of-the-art ensemble and machine learning models, has been evaluated by using performance metrics such as accuracy, precision, recall, the AUC-ROC and the F1-score. The experimental results reveal that the proposed LSTM + Firefly approach obtained a better accuracy of 99.59% when compared with the other state-of-the-art models.
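The abstract does not say which quantities the firefly algorithm tunes in the LSTM. To show the basic mechanism only, the sketch below runs a plain firefly search on a stand-in objective (the sphere function); in practice the objective would be something like validation loss as a function of hyperparameters. The function `firefly_minimize`, its parameter defaults, and the objective are all assumptions for illustration.

```python
import math
import random


def firefly_minimize(objective, dim, bounds, n_fireflies=15, iterations=50,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimise objective over a box; lower cost means a brighter firefly."""
    rng = random.Random(seed)
    low, high = bounds
    swarm = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [objective(position) for position in swarm]
    best_index = min(range(n_fireflies), key=cost.__getitem__)
    best_position, best_cost = list(swarm[best_index]), cost[best_index]
    for _ in range(iterations):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter, so i moves toward it
                    distance_sq = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    # Attractiveness decays with squared distance.
                    beta = beta0 * math.exp(-gamma * distance_sq)
                    swarm[i] = [
                        min(high, max(low, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:
                        best_position, best_cost = list(swarm[i]), cost[i]
        alpha *= 0.97  # shrink the random step over time
    return best_position, best_cost


def sphere(position):
    """Stand-in objective with its minimum of 0 at the origin."""
    return sum(value ** 2 for value in position)


best_position, best_cost = firefly_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

The best position seen so far is tracked separately from the swarm because a firefly's random step can move it to a worse point after it has found a good one.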
Affiliation(s)
- H Swapnarekha
  - Department of Information Technology, Aditya Institute of Technology and Management (AITAM), Tekkali, Andhra Pradesh 532201, India
  - Department of Information Technology, Veer Surendra Sai University of Technology, Burla 768018, India
- Janmenjoy Nayak
  - Department of Computer Science, Maharaja Sriram Chandra Bhanja Deo University, Baripada, Odisha 757003, India
- H S Behera
  - Department of Information Technology, Veer Surendra Sai University of Technology, Burla 768018, India
- Pandit Byomakesha Dash
  - Department of Information Technology, Aditya Institute of Technology and Management (AITAM), Tekkali, Andhra Pradesh 532201, India
- Danilo Pelusi
  - Communication Sciences, University of Teramo, Coste Sant'agostino Campus, Teramo 64100, Italy
11
Sentiment analysis and emotion detection of post-COVID educational Tweets: Jordan case. SOCIAL NETWORK ANALYSIS AND MINING 2023; 13:39. [PMID: 36880094 PMCID: PMC9977637 DOI: 10.1007/s13278-023-01041-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 01/06/2023] [Accepted: 02/17/2023] [Indexed: 03/05/2023]
Abstract
Education evolved dramatically under Covid-19, and owing to the conditions, distance learning became mandatory. However, this has opened new realities to the educational business under the label of "Hybrid-Learning," where educational institutions are still using online learning in addition to face-to-face learning, which has changed people's lives and split their opinions and emotions. As a result, this study investigated the Jordanian community's perspectives and feelings on the transition from pure face-to-face education to blended education by examining related tweets in the post-COVID era. Specifically, it used NLP emotion detection and sentiment analysis approaches, as well as deep learning models. Analysis of the collected tweets shows that 18.75% of the studied Jordanian community sample are dissatisfied (Anger and Hate), 21.25% are negative (Sad), 13% are Happy, and 24.50% are Neutral about it.
|
12
|
A survey on the use of association rules mining techniques in textual social media. Artif Intell Rev 2023; 56:1175-1200. [PMID: 35578652 PMCID: PMC9096767 DOI: 10.1007/s10462-022-10196-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
The incursion of social media in our lives has been much accentuated in the last decade. This has led to a multiplication of data mining tools aimed at obtaining knowledge from these data sources. One of the greatest challenges in this area is to be able to obtain this knowledge without the need for training processes, which requires structured information and pre-labelled datasets. This is where unsupervised data mining techniques come in. These techniques can obtain value from these unstructured and unlabelled data, providing very interesting solutions to enhance the decision-making process. In this paper, we first address the problem of social media mining, as well as the need for unsupervised techniques, in particular association rules, for its treatment. We follow with a broad overview of the applications of association rules in the domain of social media mining, specifically, their application to the problems of mining textual entities, such as tweets. We also focus on the strengths and weaknesses of using association rules for solving different tasks in textual social media. Finally, the paper provides a perspective overview of the challenges that association rules must face in the next decade within the field of social media mining.
|
13
|
Kour H, Gupta MK. AI Assisted Attention Mechanism for Hybrid Neural Model to Assess Online Attitudes About COVID-19. Neural Process Lett 2022; 55:1-40. [PMID: 36575702 PMCID: PMC9780630 DOI: 10.1007/s11063-022-11112-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/10/2022] [Indexed: 12/24/2022]
Abstract
COVID-19 is a novel virus that presents challenges due to a lack of consistent and in-depth research. News of COVID-19 spread across the globe, resulting in a flood of posts on social media sites. Apart from the health, social, and economic disturbances brought by the COVID-19 pandemic, another important consequence involves public mental health crises, which are of greater concern. Data related to COVID-19 is a valuable asset for researchers in understanding people's feelings related to the pandemic. It is thus important to extract early information on evolving public sentiments on social platforms during the outbreak of COVID-19. The objective of this study is to look at perceptions of the COVID-19 pandemic among people who interact with each other and share tweets on the Twitter platform. COVIDSenti, a large-scale benchmark dataset comprising 90,000 COVID-19 tweets collected from February to March 2020, during the initial phases of the outbreak, served as the foundation for our experiments. A pre-trained bidirectional encoder representations from transformers (BERT) model is fine-tuned, and the embeddings generated are combined with two long short-term memory networks to propose the residual encoder transformation network model. The proposed model is used for multiclass text classification on a large dataset labeled as positive, negative, and neutral. The experimental outcomes validate that: (1) the proposed model is the best performing model, with 98% accuracy and 96% F1-score; (2) it also outperforms conventional machine learning algorithms and different variants of BERT; and (3) the approach achieves better results than the state of the art on different benchmark datasets.
Affiliation(s)
- Harnain Kour
- Department of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, India
| | - Manoj K. Gupta
- Department of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, India
|
14
|
Sadhukhan M, Bhattacherjee P, Mondal T, Dasgupta S, Bhattacharya I. Opinion classification at subtopic level from COVID vaccination-related tweets. INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING 2022:1-12. [PMID: 36531967 PMCID: PMC9734573 DOI: 10.1007/s11334-022-00516-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Accepted: 11/22/2022] [Indexed: 06/17/2023]
Abstract
Coronavirus disease 2019 (COVID-19) is a contagious disease that has affected a large share of the population, with a high mortality rate across the globe. To deal with the recent spread of COVID-19, one of the prime measures was to vaccinate people to the fullest extent. People across the globe have diverse opinions regarding the vaccination process, its side effects, and its effectiveness. Such opinions are posted on various micro-blogging sites, including Twitter. Opinion mining through analysis of the public sentiments in such micro-blogs is a common method for detecting public responses. This paper focuses on classifying public opinions related to COVID-19 vaccination at the subtopic level. The procedure first identifies keywords associated with positive, negative, and neutral sentences. From those keywords, related query sets are constructed using the Rocchio query expansion algorithm for positive, negative, and neutral sentiments. The extended query sets are then used to form subtopics with the LDA algorithm to identify the nature of the tweets. The proposed LDA model achieved a coherence score of 0.56 with twenty subtopics, which is adequate for classifying the tweets into different classes. The trained model is finally used to classify tweets in real time with the Apache Kafka framework into different subtopics based on positive, negative, or neutral sentiment.
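As a rough illustration of the Rocchio query expansion step mentioned in this abstract, the sketch below applies the textbook Rocchio update to term-weight dictionaries. The function name `rocchio`, the toy term weights, and the coefficients (alpha = 1.0, beta = 0.75, gamma = 0.15, commonly used textbook defaults) are assumptions for illustration; the abstract does not state the paper's actual settings.

```python
from collections import Counter

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Textbook Rocchio update on sparse term-weight dicts:
    alpha*q + beta*mean(relevant docs) - gamma*mean(non-relevant docs).
    Coefficients are illustrative defaults, not values from the paper."""
    expanded = Counter()
    for term, weight in query.items():
        expanded[term] += alpha * weight
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        for doc in docs:
            for term, weight in doc.items():
                expanded[term] += coeff * weight / len(docs)
    # Terms whose weight ends up non-positive are dropped from the query.
    return {term: w for term, w in expanded.items() if w > 0}

# Hypothetical seed query for the "positive" sentiment class.
seed = {"vaccine": 1.0}
positive_tweets = [{"vaccine": 1.0, "safe": 1.0}, {"vaccine": 1.0, "effective": 1.0}]
other_tweets = [{"vaccine": 1.0, "fever": 1.0}]
expanded = rocchio(seed, positive_tweets, other_tweets)
print(sorted(expanded))  # terms kept in the expanded query
```

In this toy run the expanded query gains "safe" and "effective" from the positive examples, while "fever" is dropped because its weight becomes negative.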
Affiliation(s)
- Mrinmoy Sadhukhan
- Computer Science, Indira Gandhi National Open University, New Delhi, India
| | - Pramita Bhattacherjee
- Department of IT, Government College of Engineering and Textile Technology, Serampore, West Bengal, India
| | - Tamal Mondal
- Computer Science and Engineering Department, D Y Patil International University, Pune, India
| | - Sudakshina Dasgupta
- Department of IT, Government College of Engineering and Textile Technology, Serampore, West Bengal, India
| | - Indrajit Bhattacharya
- Department of Computer Application, Kalyani Government Engineering College, Kalyani, Nadia, West Bengal, India
|
15
|
Improving Public Health Policy by Comparing the Public Response during the Start of COVID-19 and Monkeypox on Twitter in Germany: A Mixed Methods Study. Vaccines (Basel) 2022; 10:vaccines10121985. [PMID: 36560395 PMCID: PMC9787903 DOI: 10.3390/vaccines10121985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 11/06/2022] [Accepted: 11/17/2022] [Indexed: 11/24/2022] Open
Abstract
Little is known about public concerns regarding monkeypox since its widespread emergence in many countries. Tweets in Germany were examined in the first three months of COVID-19 and monkeypox to identify concerns and issues raised by the public. Understanding the views and positions of the public could help to shape future public health campaigns. Few qualitative studies have reviewed large datasets, and the results provide the first instance of comparing public thinking on COVID-19 and monkeypox. We retrieved 15,936 tweets from Germany using query words related to both epidemics in the first three months of each one. A sequential explanatory mixed methods design combined a machine learning approach with thematic analysis using a novel rapid tweet analysis protocol. In COVID-19 tweets, there was a selfing construct, or a feeling of being part of the emerging narrative of the spread and response. In contrast, during monkeypox, the public engaged in othering after the fatigue of the COVID-19 response, or showed an impersonal feeling toward the disease. During monkeypox, coherence and reconceptualization of new and competing information produced a customer rather than a consumer/producer model. Public healthcare policy should reconsider a one-size-fits-all model during information campaigns and produce a strategic approach embedded within a customer model to educate the public about preventative measures and updates. A multidisciplinary approach could prevent and minimize mis/disinformation.
|
16
|
A Systematic Literature Review and Meta-Analysis of Studies on Online Fake News Detection. INFORMATION 2022. [DOI: 10.3390/info13110527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
Abstract
The ubiquitous access and exponential growth of information available on social media networks have facilitated the spread of fake news, complicating the task of distinguishing between this and real news. Fake news is a significant social barrier that has a profoundly negative impact on society. Despite the large number of studies on fake news detection, they have not yet been combined to offer coherent insight into trends and advancements in this domain. Hence, the primary objective of this study was to fill this knowledge gap. The method for selecting the pertinent articles for extraction was created using the preferred reporting items for systematic reviews and meta-analyses (PRISMA). This study reviewed deep learning, machine learning, and ensemble-based fake news detection methods through a meta-analysis of 125 studies to aggregate their results quantitatively. The meta-analysis primarily focused on statistics and the quantitative analysis of data from numerous separate primary investigations to identify overall trends. The results of the meta-analysis were reported by spatial distribution, the approaches adopted, the sample size, and the performance of methods in terms of accuracy. According to the between-study variance statistics, high heterogeneity was found, with τ² = 3.441; the ratio of true heterogeneity to total observed variation was I² = 75.27%, with the heterogeneity chi-square (Q) = 501.34, degrees of freedom = 124, and p ≤ 0.001. A p-value of 0.912 from the Egger statistical test confirmed the absence of publication bias. The findings of the meta-analysis indicated that the approaches recommended by the included primary studies on fake news detection were effective. Furthermore, the findings can inform researchers about various approaches they can use to detect online fake news.
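The heterogeneity ratio quoted in this abstract can be checked against the reported Q statistic and degrees of freedom using the standard Higgins I² definition, I² = (Q − df) / Q. The helper name `i_squared` below is hypothetical; only Q = 501.34 and df = 124 come from the abstract.

```python
def i_squared(q: float, df: int) -> float:
    """Higgins' I^2 as a percentage: the share of total observed variation
    attributed to between-study heterogeneity, floored at zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Values reported in the abstract: Q = 501.34 with 124 degrees of freedom.
value = i_squared(501.34, 124)
print(f"I^2 = {value:.2f}%")  # prints "I^2 = 75.27%"
```

The result matches the 75.27% reported in the abstract, so the three statistics are mutually consistent.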
|
17
|
Jain V, Kashyap KL. Ensemble hybrid model for Hindi COVID-19 text classification with metaheuristic optimization algorithm. MULTIMEDIA TOOLS AND APPLICATIONS 2022; 82:16839-16859. [PMID: 36313485 PMCID: PMC9589711 DOI: 10.1007/s11042-022-13937-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/24/2022] [Revised: 08/08/2022] [Accepted: 09/12/2022] [Indexed: 06/16/2023]
Abstract
The SARS-CoV-2 virus has spread around the globe since March 2020, and millions of people worldwide have been infected with coronavirus. People from every country have expressed their sentiments about coronavirus on social media. The aim of this work is to determine the general opinion of Indian Twitter users about coronavirus. Hindi tweets posted about COVID-19 are used as input data for sentiment analysis. Natural language processing is applied to the input data for feature extraction. Further, the optimal features are selected from the pre-processed data using the metaheuristic-based Grey Wolf Optimization technique. Finally, a hybrid of a convolutional neural network (CNN) and a long short-term memory (LSTM) model pair is employed to categorize the sentiments as positive, negative, and neutral. The outcome of the proposed model is compared with other machine learning techniques, namely, Random Forest, Decision Tree, K-Nearest Neighbor, Naive Bayes, Support Vector Machine (SVM), CNN, LSTM, LSTM-CNN, and CNN-LSTM. The highest accuracies of 87.75%, 88.41%, 87.89%, 85.54%, 89.11%, 91.46%, 88.72%, 91.54%, and 92.34% are obtained by Random Forest, Decision Tree, K-Nearest Neighbor, Naive Bayes, SVM, CNN, LSTM, LSTM-CNN, and CNN-LSTM, respectively. The proposed ensemble hybrid model gives the highest classification accuracy, precision, recall, and F-score of 95.54%, 91.44%, 89.63%, and 90.87%, respectively.
Affiliation(s)
- Vipin Jain
- SCSE, VIT University Bhopal, 466114 Madhya Pradesh, India
|
18
|
Yenkikar A, Babu CN, Hemanth DJ. Semantic relational machine learning model for sentiment analysis using cascade feature selection and heterogeneous classifier ensemble. PeerJ Comput Sci 2022; 8:e1100. [PMID: 36262147 PMCID: PMC9575864 DOI: 10.7717/peerj-cs.1100] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/21/2022] [Accepted: 08/23/2022] [Indexed: 06/16/2023]
Abstract
The exponential rise in social media via microblogging sites like Twitter has sparked curiosity in sentiment analysis that exploits user feedback towards a targeted product or service. Considering its significance in business intelligence and decision-making, numerous efforts have been made in this area. However, lack of dictionaries, unannotated data, large-scale unstructured data, and low accuracies have plagued these approaches. Also, sentiment classification through classifier ensembles has been underexplored in the literature. In this article, we propose a Semantic Relational Machine Learning (SRML) model that automatically classifies the sentiment of tweets by using a classifier ensemble and optimal features. The model employs the Cascaded Feature Selection (CFS) strategy, a novel statistical assessment approach based on the Wilcoxon rank sum test, a univariate logistic regression assisted significant predictor test, and a cross-correlation test. It further uses the efficacy of word2vec-based continuous bag-of-words and n-gram feature extraction in conjunction with SentiWordNet for finding optimal features for classification. We experiment on six public Twitter sentiment datasets, the STS-Gold dataset, the Obama-McCain Debate (OMD) dataset, the healthcare reform (HCR) dataset, and the SemEval2017 Task 4A, 4B and 4C, on a heterogeneous classifier ensemble comprising fourteen individual classifiers from different paradigms. Results from the experimental study indicate that CFS helps attain a higher classification accuracy with up to 50% fewer features compared to the count vectorizer approach. In intra-model performance assessment, the Artificial Neural Network-Gradient Descent (ANN-GD) classifier performs comparatively better than other individual classifiers, but the Best Trained Ensemble (BTE) strategy outperforms on all metrics. In inter-model performance assessment with existing state-of-the-art systems, the proposed model achieved higher accuracy and outperforms more accomplished models employing quantum-inspired sentiment representation (QSR), transformer-based methods like BERT, BERTweet, and RoBERTa, and ensemble techniques. The research thus provides critical insights into implementing a similar strategy to build more generic and robust expert systems for sentiment analysis that can be leveraged across industries.
Affiliation(s)
- Anuradha Yenkikar
- Department of Computer Science and Engineering, M. S. Ramaiah University of Applied Sciences, Bengaluru, Karnataka, India
| | - C. Narendra Babu
- Department of Computer Science and Engineering, M. S. Ramaiah University of Applied Sciences, Bengaluru, Karnataka, India
| | - D. Jude Hemanth
- Department of Electronics and Communications Engineering, Karunya University, Coimbatore, Tamil Nadu, India
|
19
|
Chen M, Zhang L. Application of edge computing combined with deep learning model in the dynamic evolution of network public opinion in emergencies. THE JOURNAL OF SUPERCOMPUTING 2022; 79:1526-1543. [PMID: 35915780 PMCID: PMC9330939 DOI: 10.1007/s11227-022-04733-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Accepted: 07/16/2022] [Indexed: 06/15/2023]
Abstract
The aim is to clarify the evolution mechanism of Network Public Opinion (NPO) in public emergencies. This work makes up for the insufficient semantic understanding in NPO-oriented emotion analysis and tries to maintain social harmony and stability. The combination of the Edge Computing (EC) and Deep Learning (DL) model is applied to the NPO-oriented Emotion Recognition Model (ERM). Firstly, the NPO on public emergencies is introduced. Secondly, three types of NPO emergencies are selected as research cases. An emotional rule system is established based on the One-Class Classification (OCC) model as emotional standards. A word embedding method is used to represent the preprocessed Weibo text data. A Convolutional Neural Network (CNN) is used as the classifier. The NPO-oriented ERM is implemented on the CNN and verified through comparative experiments after the CNN's hyperparameters are adjusted. The research results show that the text annotation of the NPO based on OCC emotion rules can obtain better recognition performance. Additionally, the recognition performance of the improved CNN is significantly higher than that of the Support Vector Machine (SVM) in traditional Machine Learning (ML). This work realizes the technological innovation of automatic emotion recognition of NPO groups and provides a basis for the relevant government agencies to handle the NPO in public emergencies scientifically.
Affiliation(s)
- Min Chen
- School of Business, Wenzhou University, Wenzhou, China
| | - Lili Zhang
- School of Business, Wenzhou University, Wenzhou, China
|
20
|
Deep Learning-Based Mental Health Model on Primary and Secondary School Students’ Quality Cultivation. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:7842304. [PMID: 35845877 PMCID: PMC9279049 DOI: 10.1155/2022/7842304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2022] [Revised: 05/24/2022] [Accepted: 06/08/2022] [Indexed: 12/04/2022]
Abstract
The purpose was to identify, in a timely manner, the mental disorders (MDs) of students receiving primary and secondary education (PSE), referred to as PSE students, and improve their mental quality. Firstly, this work analyzes the research status of the mental health model (MHM) and the main contents of PSE student-oriented mental health quality cultivation under deep learning (DL). Secondly, an MHM is implemented based on big data technology (BDT) and the convolutional neural network (CNN). Simultaneously, the long short-term memory (LSTM) is introduced to optimize the proposed MHM. Finally, the performance of the MHM before and after optimization is evaluated, and the PSE student-oriented mental health quality training strategy based on the proposed MHM is offered. The results show that the accuracy curve is higher than the recall curve in all classification algorithms. The maximum recall rate is 0.58, and the minimum accuracy rate is 0.62. The decision tree (DT) algorithm has the best comprehensive performance among the five different classification algorithms, with accuracy of 0.68, recall rate of 0.58, and F1-measure of 0.69. Thus, the DT algorithm is selected as the classifier. The proposed MHM can identify 56% of students with MDs before optimization. After optimization, the accuracy is improved by 0.03, the recall rate is improved by 0.19, the F1-measure is improved by 0.05, and 75% of students with MDs can be identified. Diverse behavior data can improve the recognition of students' MDs. Meanwhile, from the 60th iteration, the model accuracy and loss tend to be stable. By comparison, batch_size has little influence on the experimental results, and the number of convolution kernels in the first convolution layer also has little influence. The proposed MHM based on DL and CNN will indirectly improve the mental health quality of PSE students. The research provides a reference for cultivating the mental health quality of PSE students.
|
21
|
Alkhaldi NA, Asiri Y, Mashraqi AM, Halawani HT, Abdel-Khalek S, Mansour RF. Leveraging Tweets for Artificial Intelligence Driven Sentiment Analysis on the COVID-19 Pandemic. Healthcare (Basel) 2022; 10:healthcare10050910. [PMID: 35628045 PMCID: PMC9141128 DOI: 10.3390/healthcare10050910] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/04/2022] [Revised: 05/09/2022] [Accepted: 05/10/2022] [Indexed: 01/25/2023] Open
Abstract
The COVID-19 pandemic has been a disastrous event that has exacerbated several psychological issues such as depression, given abrupt social changes and lack of employment. At the same time, social scientists and psychologists have shown significant interest in understanding the way people express emotions and sentiments at the time of pandemics. During the rise in COVID-19 cases with stricter lockdowns, people expressed their sentiments on social media. This offers a deep understanding of human psychology during catastrophic events. By exploiting user-generated content on social media such as Twitter, people’s thoughts and sentiments can be examined, which aids in introducing health intervention policies and awareness campaigns. Recent developments in natural language processing (NLP) and deep learning (DL) models have shown noteworthy performance in sentiment analysis. With this in mind, this paper presents a new sunflower optimization with deep-learning-driven sentiment analysis and classification (SFODLD-SAC) model for COVID-19 tweets. The presented SFODLD-SAC model focuses on the identification of people’s sentiments during the COVID-19 pandemic. To accomplish this, the SFODLD-SAC model initially preprocesses the tweets in distinct ways such as stemming and removal of stopwords, usernames, links, punctuation, and numerals. In addition, the TF-IDF model is applied to extract useful features from the preprocessed data. Moreover, the cascaded recurrent neural network (CRNN) model is employed to analyze and classify sentiments. Finally, the SFO algorithm is utilized to optimally adjust the hyperparameters involved in the CRNN model. The design of the SFODLD-SAC technique with the inclusion of an SFO algorithm-based hyperparameter optimizer for analyzing people's sentiments on COVID-19 shows the novelty of this study. The simulation analysis of the SFODLD-SAC model is performed using a benchmark dataset from the Kaggle repository. Extensive comparative results report the promising performance of the SFODLD-SAC model over recent state-of-the-art models, with a maximum accuracy of 99.65%.
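The preprocessing and TF-IDF steps described in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the paper's pipeline: the stopword list is a tiny placeholder, stemming is omitted, the TF-IDF variant (relative term frequency times log(N/df)) is one common choice among several, and the names `preprocess` and `tf_idf` are hypothetical.

```python
import math
import re
from collections import Counter

STOPWORDS = {"the", "a", "is", "of", "and", "to", "in"}  # tiny illustrative list

def preprocess(tweet: str) -> list[str]:
    """Lowercase, then strip usernames, links, punctuation, numerals, stopwords."""
    text = re.sub(r"@\w+|https?://\S+", " ", tweet.lower())  # usernames and links
    text = re.sub(r"[^a-z\s]", " ", text)                    # punctuation and numerals
    return [tok for tok in text.split() if tok not in STOPWORDS]

def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """One sparse vector per document: (count / doc length) * log(N / doc frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                        for term, count in counts.items()})
    return vectors

tweets = ["@user The lockdown is hard https://t.co/x", "Lockdown 2020 is over!!!"]
tokens = [preprocess(t) for t in tweets]
vectors = tf_idf(tokens)
print(tokens)
print(vectors)
```

In this toy corpus "lockdown" appears in both tweets, so its weight is zero, while "hard" and "over" carry the discriminating weight. The resulting vectors would then be passed to whatever classifier is used downstream.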
Affiliation(s)
- Nora A. Alkhaldi
- Department of Computer Science, College of Computer Sciences and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
| | - Yousef Asiri
- Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia
| | - Aisha M. Mashraqi
- Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia
| | - Hanan T. Halawani
- Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia
| | - Sayed Abdel-Khalek
- Department of Mathematics, College of Science, Taif University, Taif 21944, Saudi Arabia
| | - Romany F. Mansour
- Department of Mathematics, Faculty of Science, New Valley University, El-Kharga 72511, Egypt
|
22
|
Srikanth J, Damodaram A, Teekaraman Y, Kuppusamy R, Thelkar AR. Sentiment Analysis on COVID-19 Twitter Data Streams Using Deep Belief Neural Networks. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:8898100. [PMID: 35535182 PMCID: PMC9077450 DOI: 10.1155/2022/8898100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/23/2022] [Accepted: 03/16/2022] [Indexed: 01/09/2023]
Abstract
Social media is Internet-based by design, allowing people to share content quickly via electronic means. People can openly express their thoughts on social media sites such as Twitter, which can then be shared with other people. During the recent COVID-19 outbreak, public opinion analytics provided useful information for determining the best public health response. At the same time, the dissemination of misinformation, aided by social media and other digital platforms, has proven to be a greater threat to global public health than the virus itself, as the COVID-19 pandemic has shown. The public's feelings on social distancing can be discovered by analysing articulated messages from Twitter. The automated method of recognizing and classifying subjective information in text data is known as sentiment analysis. In this research work, we have proposed to use a combination of preprocessing approaches such as tokenization, filtering, stemming, and building N-gram models. Deep belief neural network (DBN) with pseudo labelling is used to classify the tweets. Top layers of the base classifiers are boosted in the pseudo labelling strategy, whereas lower levels of the base classifiers share weights for feature extraction. By introducing the pseudo boost mechanism, our suggested technique preserves the same time complexity as a DBN while achieving fast convergence to optimality. The pseudo labelling improves the performance of the classification. It extracts the keywords from the tweets with high precision. The results reveal that using the DBN classifier in conjunction with the bigram in the N-gram model outperformed other models by 90.3 percent. The proposed approach can also aid medical professionals and decision-makers in determining the best course of action for each location based on their views regarding the pandemic.
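The tokenization, filtering, and N-gram steps named in this abstract can be illustrated with a short standard-library sketch. It is only a sketch under assumptions: the stopword list is a placeholder, stemming is left out, and the helper names `tokenize`, `ngrams`, and `bigram_features` are hypothetical. The resulting bigram counts stand in for the features that would be fed to a classifier such as the DBN.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "to", "and", "of"}  # placeholder filter list

def tokenize(tweet):
    """Lowercase the tweet, keep alphabetic tokens, and filter stopwords."""
    return [tok for tok in re.findall(r"[a-z']+", tweet.lower())
            if tok not in STOPWORDS]

def ngrams(tokens, n=2):
    """Return the list of contiguous n-token tuples (bigrams when n=2)."""
    return list(zip(*(tokens[i:] for i in range(n))))

def bigram_features(tweets):
    """Count bigrams over a collection of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(ngrams(tokenize(tweet), 2))
    return counts

features = bigram_features(["Stay home and stay safe!", "Stay home"])
print(features.most_common(3))
```

Here the bigram ("stay", "home") is counted twice, once per tweet, which is the kind of frequency signal an N-gram model exposes to the classifier.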
Affiliation(s)
- Jatla Srikanth
- Department of Computer Science and Engineering, Aurora's Technological and Research Institute, Hyderabad 500098, TS, India
| | - Avula Damodaram
- School of Information Technology (SIT), JNTUH, Hyderabad 500085, TS, India
| | - Yuvaraja Teekaraman
- Department of Electronic and Electrical Engineering, The University of Sheffield, Sheffield S1 3JD, UK
| | - Ramya Kuppusamy
- Department of Electrical and Electronics Engineering, Sri Sairam College of Engineering, Bangalore 562106, India
| | - Amruth Ramesh Thelkar
- Faculty of Electrical & Computer Engineering, Jimma Institute of Technology, Jimma University, Jimma, Ethiopia
|
23
|
Twitter Sentiment Analysis Using Ensemble based Deep Learning Model towards COVID-19 in India and European Countries. Pattern Recognit Lett 2022; 158:164-170. [PMID: 35464347 PMCID: PMC9014659 DOI: 10.1016/j.patrec.2022.04.027] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/13/2021] [Revised: 04/06/2022] [Accepted: 04/16/2022] [Indexed: 11/22/2022]
Abstract
As of November 2021, more than 24.80 crore (248 million) people had been diagnosed with the coronavirus, of whom around 50.20 lakh (5.02 million) lost their lives to this infectious disease. Understanding the sentiments that people express on social media (Facebook, Twitter, Instagram, etc.) helps governments control, monitor, and eradicate the coronavirus. Compared with other social media, Twitter data are indispensable for extracting useful awareness information related to any crisis. In this article, a sentiment analysis model is proposed to analyze real-time tweets related to coronavirus. Initially, around 3100 tweets from Indian and European users are collected between 23.03.2020 and 01.11.2021. Next, data pre-processing and exploratory investigation are carried out for a better understanding of the collected data. Further, feature extraction is performed using Term Frequency-Inverse Document Frequency (TF-IDF), GloVe, pre-trained Word2Vec, and fastText embeddings. The obtained feature vectors are fed to the ensemble classifier (Gated Recurrent Unit (GRU) and Capsule Neural Network (CapsNet)) to classify the users' sentiments as anger, sadness, joy, and fear. The experimental outcomes showed that the proposed model achieved prediction accuracies of 97.28% and 95.20% in classifying the sentiments of Indian and European users, respectively.
|
24
|
Bahuguna A, Yadav D, Senapati A, Saha BN. A unified deep neuro-fuzzy approach for COVID-19 twitter sentiment classification. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2022. [DOI: 10.3233/jifs-219247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
COVID-19 has brought a serious mental health crisis across the world. Since a vast majority of the population uses social media platforms such as Twitter to exchange information, rapidly collecting and analyzing social media data to understand personal well-being, and subsequently adopting adequate measures, could avoid severe socio-economic damage. Sentiment analysis on Twitter data is very useful for understanding and identifying mental health issues. In this research, we propose a unified deep neuro-fuzzy approach for COVID-19 Twitter sentiment classification. Fuzzy logic has been a very powerful tool for Twitter data analysis, where approximate semantic and syntactic analysis is more relevant because correcting spelling and grammar in tweets is burdensome. We conducted experiments on three challenging COVID-19 Twitter sentiment datasets. Experimental results demonstrate that ensemble classifiers based on the fuzzy Sugeno integral outperform the individual base classifiers.
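To give a concrete sense of how a Sugeno integral can fuse the outputs of several base classifiers, the sketch below computes the discrete Sugeno integral per class and picks the class with the highest fused score. The fuzzy measure used here, g(A) = |A| / n, is a simple cardinality-based assumption chosen for readability; the abstract does not say which measure the authors use, and the confidence values and function names are made up.

```python
def sugeno_integral(scores):
    """Discrete Sugeno integral of classifier confidences in [0, 1] with the
    assumed symmetric fuzzy measure g(A) = |A| / n:
    max over i of min(i-th largest score, i / n)."""
    if not scores:
        raise ValueError("need at least one classifier score")
    ordered = sorted(scores, reverse=True)
    n = len(ordered)
    return max(min(h, (i + 1) / n) for i, h in enumerate(ordered))

def fuse(class_scores):
    """Fuse per-class confidences from several classifiers; return (label, fused)."""
    fused = {label: sugeno_integral(s) for label, s in class_scores.items()}
    return max(fused, key=fused.get), fused

# Hypothetical confidences from three base classifiers for one tweet.
label, fused = fuse({"positive": [0.9, 0.6, 0.3], "negative": [0.1, 0.3, 0.4]})
print(label, fused)
```

With this measure the integral behaves like a robust consensus score: one very confident classifier cannot dominate unless enough of the others agree.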
|
25
|
MIss RoBERTa WiLDe: Metaphor Identification Using Masked Language Model with Wiktionary Lexical Definitions. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12042081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Recent years have brought unprecedented and rapid development in the field of Natural Language Processing. To a large degree this is due to the emergence of modern language models like GPT-3 (Generative Pre-trained Transformer 3), XLNet, and BERT (Bidirectional Encoder Representations from Transformers), which are pre-trained on a large amount of unlabeled data. These powerful models can be further used in tasks that have traditionally suffered from a lack of material that could be used for training. The metaphor identification task, which aims at automatic recognition of figurative language, is one such task. The metaphorical use of words can be detected by comparing their contextual and basic meanings. In this work, we provide evidence that fully automatically collected dictionary definitions can be used as the optimal medium for retrieving the non-figurative word senses, which consequently may help improve the performance of the algorithms used in the metaphor detection task. As the source of the lexical information, we use the openly available Wiktionary. Our method can be applied without changes to any other dataset designed for token-level metaphor detection, provided it is binary labeled. In the set of experiments, our proposed method (MIss RoBERTa WiLDe) outperforms or performs on par with the competing models on several datasets commonly chosen in the research on metaphor processing.
|
26
|
Ismail H, Serhani MA, Hussien N, Elabyad R, Navaz A. Public wellbeing analytics framework using social media chatter data. SOCIAL NETWORK ANALYSIS AND MINING 2022; 12:163. [PMID: 36345490 PMCID: PMC9630074 DOI: 10.1007/s13278-022-00987-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2022] [Revised: 09/28/2022] [Accepted: 10/13/2022] [Indexed: 11/05/2022]
Abstract
Public wellbeing has always been crucial. Many governments around the globe prioritize the impact of their decisions on public wellbeing. In this paper, we propose an end-to-end public wellbeing analytics framework designed to predict the public’s wellbeing status and infer insights through the continuous analysis of social media content over several temporal events and across several locations. The proposed framework implements a novel distant supervision approach designed specifically to generate wellbeing-labeled datasets. In addition, it implements a wellbeing prediction model trained on contextualized sentence embeddings using BERT. Wellbeing predictions are visualized using several spatiotemporal analytics that can support decision-makers in gauging the impact of several government decisions and temporal events on the public, aiding in improving the decision-making process. Empirical experiments evaluate the effectiveness of the proposed distant supervision approach, the prediction model, and the utility of the produced analytics in gauging the public wellbeing status in a specific context.
Affiliation(s)
- Heba Ismail
- College of Engineering, Abu Dhabi University, Abu Dhabi, UAE (grid.444459.c; 0000 0004 1762 9315)
- M. Adel Serhani
- College of IT, United Arab Emirates University, Al Ain, UAE (grid.43519.3a; 0000 0001 2193 6666)
- Nada Hussien
- College of Engineering, Abu Dhabi University, Abu Dhabi, UAE (grid.444459.c; 0000 0004 1762 9315)
- Rawan Elabyad
- College of Engineering, Abu Dhabi University, Abu Dhabi, UAE (grid.444459.c; 0000 0004 1762 9315)
- Alramzana Navaz
- College of IT, United Arab Emirates University, Al Ain, UAE (grid.43519.3a; 0000 0001 2193 6666)
|
27
|
Chinnalagu A, Durairaj AK. Context-based sentiment analysis on customer reviews using machine learning linear models. PeerJ Comput Sci 2021; 7:e813. [PMID: 35036535 PMCID: PMC8725657 DOI: 10.7717/peerj-cs.813] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/06/2021] [Accepted: 11/22/2021] [Indexed: 06/14/2023]
Abstract
Customer satisfaction and positive customer sentiment are among the goals of successful companies. However, analyzing customer reviews to predict sentiment accurately has proven challenging and time-consuming due to the high volume of data collected from various sources. Several researchers have approached this problem with algorithms, methods, and models. These include machine learning and deep learning (DL) methods, unigram and skip-gram based algorithms, as well as the Artificial Neural Network (ANN) and bag-of-words (BOW) regression model. Studies have revealed incoherence in polarity, model overfitting and performance issues, as well as high data processing costs. This experiment was conducted to address these issues by building a high-performance yet cost-effective model for predicting accurate sentiments from large datasets containing customer reviews. This model uses the fastText library from Facebook's AI Research (FAIR) lab, as well as the traditional Linear Support Vector Machine (LSVM), to classify text and word embeddings. The model was also compared with the authors' custom multi-layer Sentiment Analysis (SA) Bi-directional Long Short-Term Memory (SA-BLSTM) model. Based on the results, the proposed fastText model obtains a higher accuracy of 90.71%, as well as a 20% improvement in performance, compared to the LSVM and SA-BLSTM models.
Affiliation(s)
- Anandan Chinnalagu
- Computer Science, Government Arts College (Affiliated to Bharathidasan University, Tiruchirappalli), Kulithalai, Karur, Tamil Nadu, India
- Ashok Kumar Durairaj
- Computer Science, Government Arts College (Affiliated to Bharathidasan University, Tiruchirappalli), Kulithalai, Karur, Tamil Nadu, India
|
28
|
Abstract
The media plays an important role in disseminating facts and knowledge to the public at critical times, and the COVID-19 pandemic is a good example of such a period. This research performs a comparative analysis of how topics connected with the pandemic are represented in the internet media of Kazakhstan and the Russian Federation. The main goal of the research is to propose a method that makes it possible to analyze the correlation between mass media dynamic indicators and the World Health Organization COVID-19 data. To solve this task, three approaches to representing mass media dynamics in numerical form (automatically obtained topics, average sentiment, and dynamic indicators) were proposed and applied according to a manually selected list of search queries. The results of the analysis indicate similarities and differences in the ways in which the epidemiological situation is reflected in publications in Russia and in Kazakhstan. In particular, the publication activity in both countries correlates with absolute indicators, such as the daily number of new infections and the daily number of deaths. However, mass media tend to ignore the positive rate of confirmed cases and the virus reproduction rate. With respect to the strictness of quarantine measures, mass media in Russia show a rather high correlation, while in Kazakhstan the correlation is much lower. Analysis of search queries revealed that in Kazakhstan the problem of fake news and disinformation is more acute during periods of deterioration of the epidemiological situation, when the levels of crime and poverty increase. The novelty of this work is the proposal and implementation of a method that enables a comparative analysis of objective COVID-19 statistics and several mass media indicators. In addition, it is the first time that such a comparative analysis, between different countries, has been performed on a corpus in a language other than English.
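The kind of correlation analysis this abstract describes can be illustrated with a short sketch. It assumes the Pearson coefficient, which the abstract does not name, and the two daily series in the usage example are made-up numbers rather than data from the study; the function name `pearson` is likewise only illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical daily counts: published articles vs. reported new infections.
    articles_per_day = [12, 15, 21, 30, 28, 35, 40]
    new_cases_per_day = [100, 130, 180, 260, 250, 300, 390]
    print(round(pearson(articles_per_day, new_cases_per_day), 3))
```

A coefficient near 1 would indicate that publication activity rises and falls with the epidemiological indicator, and a value near 0 that the media largely ignore it.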
|
29
|
Sentiment Analysis and Topic Modeling on Tweets about Online Education during COVID-19. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11188438] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 12/23/2022]
Abstract
Amid the worldwide COVID-19 pandemic lockdowns, the closure of educational institutes led to an unprecedented rise in online learning. To limit the impact of COVID-19 and curb its spread, educational institutions closed their campuses immediately and academic activities were moved to e-learning platforms. The effectiveness of e-learning is a critical concern for both students and parents, specifically in terms of its suitability to students and teachers and its technical feasibility with respect to different social scenarios. Such concerns must be reviewed from several aspects before e-learning can be adopted at such a large scale. This study endeavors to investigate the effectiveness of e-learning by analyzing the sentiments of people about e-learning. Due to the recent rise of social media as an important mode of communication, people’s views can be found on platforms such as Twitter, Instagram, Facebook, etc. This study uses a Twitter dataset containing 17,155 tweets about e-learning. Machine learning and deep learning approaches have shown their suitability, capability, and potential for image processing, object detection, and natural language processing tasks, and text analysis is no exception. Machine learning approaches have been largely used both for annotation and for text and sentiment analysis. Keeping in view the adequacy and efficacy of machine learning models, this study adopts TextBlob, VADER (Valence Aware Dictionary for Sentiment Reasoning), and SentiWordNet to analyze the polarity and subjectivity score of tweets’ text. Furthermore, bearing in mind the fact that machine learning models display high classification accuracy, various machine learning models have been used for sentiment classification. Two feature extraction techniques, TF-IDF (Term Frequency-Inverse Document Frequency) and BoW (Bag of Words), have been used to effectively build and evaluate the models. All the models have been evaluated in terms of various important performance metrics such as accuracy, precision, recall, and F1 score. The results reveal that the random forest and support vector machine classifiers achieve the highest accuracy of 0.95 when used with BoW features. Performance comparison is carried out for results of TextBlob, VADER, and SentiWordNet, as well as classification results of machine learning models and deep learning models such as CNN (Convolutional Neural Network), LSTM (Long Short Term Memory), CNN-LSTM, and Bi-LSTM (Bidirectional-LSTM). Additionally, topic modeling is performed to find the problems associated with e-learning, which indicates that uncertainty about the campus opening date, children’s inability to grasp online education, and a lack of efficient networks for online education are the top three problems.
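The two feature extraction techniques named in this abstract, BoW and TF-IDF, can be sketched in plain Python. This is an illustrative sketch only: the tokenized example tweets are invented, the function names are hypothetical, and the unsmoothed weighting tf × ln(N/df) is one common TF-IDF variant, assumed here because the abstract does not say which variant the study used.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count matrix) for a list of tokenized documents."""
    vocab = sorted({token for doc in docs for token in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[token] for token in vocab])  # 0 for absent tokens
    return vocab, rows


def tf_idf(docs):
    """Return (vocabulary, TF-IDF matrix) using tf * ln(N / df), no smoothing.

    Assumes every document contains at least one token.
    """
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    matrix = [
        [(row[j] / len(doc)) * idf[j] for j in range(len(vocab))]
        for row, doc in zip(counts, docs)
    ]
    return vocab, matrix


if __name__ == "__main__":
    # Invented, pre-tokenized tweets about e-learning.
    tweets = [
        ["online", "class", "good"],
        ["online", "class", "bad"],
    ]
    print(bag_of_words(tweets))
    print(tf_idf(tweets))
```

Terms that occur in every document (here "online" and "class") receive a TF-IDF weight of zero, whereas BoW keeps their raw counts; either matrix could then be passed to a classifier such as a random forest or support vector machine.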
|
30
|
Kansal AK, Gautam J, Chintalapudi N, Jain S, Battineni G. Google Trend Analysis and Paradigm Shift of Online Education Platforms during the COVID-19 Pandemic. Infect Dis Rep 2021; 13:418-428. [PMID: 34065817 PMCID: PMC8162359 DOI: 10.3390/idr13020040] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/08/2021] [Revised: 05/07/2021] [Accepted: 05/08/2021] [Indexed: 11/16/2022]
Abstract
Objective: The COVID-19 pandemic, the largest pandemic in history, has been declared a doomsday globally. The second wave spreading worldwide has had devastating consequences in every sector of life. Several measures to contain and curb infection have created significant challenges for the education community. With an estimated 1.6 billion learners, the closure of schools and other educational institutions has impacted more than 90% of students worldwide from the elementary to tertiary level. Methods: With a view to studying the impact on the student community, this article addresses alternative ways of educating (more specifically, online education) through the analysis of Google Trends for the past year. The study analyzed the online teaching and learning platforms that have been enabling remote learning, thereby limiting the impact on the education system. Thorough text analysis is performed on an existing dataset from Kaggle to gain insight into the clustering of words that were searched most often during this pandemic and to find the general patterns of their occurrence. Findings: The results show that the coronavirus patients are the most trending patterns in word search clustering, with the education system being at the control and preventive measures to bring equilibrium in the system of education. There has been significant growth in online platforms in the last year. Existing assets of educational establishments have effectively converted conventional education into new-age online education with the help of virtual classes and other key online tools in this continually fluctuating scholastic setting. Teaching tools such as Microsoft Teams, Zoom, Google Meet, and WebEx are the most used online platforms for conducting classes, and whiteboard software tools and learning apps such as Vedantu, Udemy, Byju's, and Whitehat Junior have been big market players in the education system over the pandemic year, especially in India. Conclusions: The article helps to draw a holistic picture of ongoing online teaching-learning methods during the lockdown and also highlights changes that took place in the conventional education system amid the COVID pandemic to overcome the persisting disruption in academic activities and to ensure a correct perception of the online procedure as a normal course of action in the new educational system. To fill the void of classroom learning and to minimize the virus spread over the last year, digital learning in various schools and colleges has been emphasized, leading to a significant increase in the usage of whiteboard software platforms.
Affiliation(s)
- Ashwani Kumar Kansal
- Kasturba Institute of Technology, Abdul Kalam Technological University, Lucknow 226031, India
- Jyoti Gautam
- JSS Academy of Technical Education, Noida 201301, India
- Nalini Chintalapudi
- Telemedicine and Telepharmacy Centre, School of Medicinal and Health Products Sciences, University of Camerino, 62032 Camerino, Italy
- Shivani Jain
- Department of Computer Science & Engineering, Indira Gandhi Delhi Technical University for Women, Delhi 110006, India
- Gopi Battineni
- Telemedicine and Telepharmacy Centre, School of Medicinal and Health Products Sciences, University of Camerino, 62032 Camerino, Italy
|