1
Khan S, Qasim I, Khan W, Khan A, Ali Khan J, Qahmash A, Ghadi YY. An automated approach to identify sarcasm in low-resource language. PLoS One 2024; 19:e0307186. [PMID: 39637015 PMCID: PMC11620596 DOI: 10.1371/journal.pone.0307186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 07/02/2024] [Indexed: 12/07/2024] Open
Abstract
Sarcasm detection has gained attention because of its applicability in natural language processing (NLP), but it remains largely unexplored in low-resource languages such as Urdu, Arabic, Pashto, and Roman-Urdu. Most existing work targets English, and comparatively few studies address low-resource languages. This research addresses the gap by exploring the efficacy of diverse machine learning (ML) algorithms in identifying sarcasm in Urdu. The scarcity of annotated datasets for low-resource languages is a challenge. To overcome it, we curated and released a comparatively large dataset named the Urdu Sarcastic Tweets (UST) Dataset, comprising user-generated comments from X (formerly Twitter). Automatic sarcasm detection in text uses computational methods to determine whether a given statement is intended to be sarcastic. This task is difficult because sarcasm depends on the user's behavior, attitude, and expression of emotions. We therefore employ various baseline ML classifiers and evaluate their effectiveness in detecting sarcasm in low-resource languages. The primary models evaluated in this study are support vector machine (SVM), decision tree (DT), K-Nearest Neighbor Classifier (K-NN), linear regression (LR), random forest (RF), Naïve Bayes (NB), and XGBoost. We validated the performance of these ML classifiers on two distinct datasets: the Tanz-Indicator and the UST dataset. The SVM classifier consistently outperformed other ML models with an accuracy of 0.85 across various experimental setups. This research underscores the importance of tailored sarcasm detection approaches that accommodate specific linguistic characteristics in low-resource languages, paving the way for future investigations. By providing open access to the UST dataset, we encourage its use as a benchmark for sarcasm detection research in similar linguistic contexts.
Affiliation(s)
- Shumaila Khan
  - Institute of CS & IT, University of Science & Technology, Bannu, Pakistan
- Iqbal Qasim
  - Institute of CS & IT, University of Science & Technology, Bannu, Pakistan
- Wahab Khan
  - Institute of CS & IT, University of Science & Technology, Bannu, Pakistan
- Aurangzeb Khan
  - Institute of CS & IT, University of Science & Technology, Bannu, Pakistan
- Javed Ali Khan
  - Department of Computer Science, School of Physics, Engineering & Computer Science, University of Hertfordshire, Hatfield, United Kingdom
- Ayman Qahmash
  - Department of Informatics and Computer Systems, King Khalid University, Abha, Saudi Arabia
2
Tucudean G, Bucos M, Dragulescu B, Caleanu CD. Natural language processing with transformers: a review. PeerJ Comput Sci 2024; 10:e2222. [PMID: 39145251 PMCID: PMC11322986 DOI: 10.7717/peerj-cs.2222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
Natural language processing (NLP) tasks can be addressed with several deep learning architectures, and many different approaches have proven to be efficient. This study aims to briefly summarize the use cases for NLP tasks along with the main architectures. This research presents transformer-based solutions for NLP tasks such as Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-Training (GPT) architectures. To achieve that, we conducted a step-by-step process in the review strategy: identify the recent studies that include Transformers, apply filters to extract the most consistent studies, identify and define inclusion and exclusion criteria, assess the strategy proposed in each study, and finally discuss the methods and architectures presented in the resulting articles. These steps facilitated the systematic summarization and comparative analysis of NLP applications based on Transformer architectures. The primary focus is the current state of the NLP domain, particularly regarding its applications, language models, and data set types. The results provide insights into the challenges encountered in this research domain.
Affiliation(s)
- Georgiana Tucudean
  - Communications Department, Politehnica University Timișoara, Timișoara, Timiș, România
- Marian Bucos
  - Communications Department, Politehnica University Timișoara, Timișoara, Timiș, România
- Bogdan Dragulescu
  - Communications Department, Politehnica University Timișoara, Timișoara, Timiș, România
- Catalin Daniel Caleanu
  - Applied Electronics Department, Politehnica University Timișoara, Timișoara, Timiș, România
3
Yakura H. Evaluating Large Language Models' Ability Using a Psychiatric Screening Tool Based on Metaphor and Sarcasm Scenarios. J Intell 2024; 12:70. [PMID: 39057190 PMCID: PMC11278383 DOI: 10.3390/jintelligence12070070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 07/15/2024] [Accepted: 07/18/2024] [Indexed: 07/28/2024] Open
Abstract
Metaphors and sarcasm are precious fruits of our highly evolved social communication skills. However, children with the condition then known as Asperger syndrome are known to have difficulties in comprehending sarcasm, even if they possess adequate verbal IQs for understanding metaphors. Accordingly, researchers had employed a screening test that assesses metaphor and sarcasm comprehension to distinguish Asperger syndrome from other conditions with similar external behaviors (e.g., attention-deficit/hyperactivity disorder). This study employs a standardized test to evaluate recent large language models' (LLMs) understanding of nuanced human communication. The results indicate improved metaphor comprehension with increased model parameters; however, no similar improvement was observed for sarcasm comprehension. Considering that a human's ability to grasp sarcasm has been associated with the amygdala, a pivotal cerebral region for emotional learning, a distinctive strategy for training LLMs would be imperative to imbue them with the ability in a cognitively grounded manner.
Affiliation(s)
- Hiromu Yakura
  - Max-Planck Institute for Human Development, 14195 Berlin, Germany
4
Zhang H, Shafiq MO. Survey of transformers and towards ensemble learning using transformers for natural language processing. Journal of Big Data 2024; 11:25. [PMID: 38321999 PMCID: PMC10838835 DOI: 10.1186/s40537-023-00842-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 10/11/2023] [Indexed: 02/08/2024]
Abstract
The transformer model is a famous natural language processing model proposed by Google in 2017. Now, with the extensive development of deep learning, many natural language processing tasks can be solved by deep learning methods. After the BERT model was proposed, many pre-trained models such as the XLNet model, the RoBERTa model, and the ALBERT model were also proposed in the research community. These models perform very well in various natural language processing tasks. In this paper, we describe and compare these well-known models. We also apply several existing, well-known models (BERT, XLNet, RoBERTa, GPT2, and ALBERT) to different well-known natural language processing tasks and analyze each model based on its performance. There are a few papers that comprehensively compare various transformer models. In our paper, we use six types of well-known tasks (sentiment analysis, question answering, text generation, text summarization, named entity recognition, and topic modeling) to compare the performance of various transformer models. In addition, using the existing models, we propose ensemble learning models for the different natural language processing tasks. The results show that our ensemble learning models perform better than a single classifier on specific tasks.
Affiliation(s)
- Hongzhi Zhang
  - School of Information Technology, Carleton University, Ottawa, ON Canada
- M. Omair Shafiq
  - School of Information Technology, Carleton University, Ottawa, ON Canada
5
Singh S, Kumar M, Kumar A, Verma BK, Abhishek K, Selvarajan S. Efficient pneumonia detection using Vision Transformers on chest X-rays. Sci Rep 2024; 14:2487. [PMID: 38291130 PMCID: PMC10827725 DOI: 10.1038/s41598-024-52703-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/20/2023] [Accepted: 01/22/2024] [Indexed: 02/01/2024] Open
Abstract
Pneumonia is a widespread and acute respiratory infection that impacts people of all ages. Early detection and treatment of pneumonia are essential for avoiding complications and enhancing clinical results. We can reduce mortality, improve healthcare efficiency, and contribute to the global battle against a disease that has plagued humanity for centuries by devising and deploying effective detection methods. Detecting pneumonia is not only a medical necessity but also a humanitarian imperative and a technological frontier. Chest X-rays are a frequently used imaging modality for diagnosing pneumonia. This paper examines in detail a cutting-edge method for detecting pneumonia implemented on the Vision Transformer (ViT) architecture on a public dataset of chest X-rays available on Kaggle. To acquire global context and spatial relationships from chest X-ray images, the proposed framework deploys the ViT model, which integrates self-attention mechanisms and transformer architecture. According to our experimentation with the proposed Vision Transformer-based framework, it achieves a higher accuracy of 97.61%, sensitivity of 95%, and specificity of 98% in detecting pneumonia from chest X-rays. The ViT model is preferable for capturing global context, comprehending spatial relationships, and processing images that have different resolutions. The framework establishes its efficacy as a robust pneumonia detection solution by surpassing convolutional neural network (CNN) based architectures.
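For readers comparing the figures quoted in this abstract, the following minimal sketch shows how accuracy, sensitivity, and specificity are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import NamedTuple


class BinaryMetrics(NamedTuple):
    accuracy: float
    sensitivity: float
    specificity: float


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn count positive (e.g. pneumonia) cases predicted correctly/incorrectly;
    tn/fp count negative (e.g. normal) cases predicted correctly/incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return BinaryMetrics(
        accuracy=(tp + tn) / (positives + negatives),
        sensitivity=tp / positives,   # true positive rate (recall)
        specificity=tn / negatives,   # true negative rate
    )


# Hypothetical counts for illustration only (not from the paper).
example = binary_metrics(tp=95, fp=2, tn=98, fn=5)
```

Because accuracy pools both classes, it can sit above or below sensitivity and specificity depending on class balance, which is why all three are usually reported together.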
Affiliation(s)
- Manoj Kumar
  - JSS Academy of Technical Education, Noida, India
- Abhay Kumar
  - National Institute of Technology Patna, Patna, India
- Shitharth Selvarajan
  - School of Built Environment, Engineering and Computing, Leeds Beckett University, LS1 3HE, Leeds, UK.
6
Samaras L, García-Barriocanal E, Sicilia MA. Sentiment analysis of COVID-19 cases in Greece using Twitter data. Expert Systems with Applications 2023; 230:120577. [PMID: 37317687 PMCID: PMC10245283 DOI: 10.1016/j.eswa.2023.120577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 04/29/2023] [Accepted: 05/27/2023] [Indexed: 06/16/2023]
Abstract
Background: Syndromic surveillance with the use of Internet data has been used to track and forecast epidemics for the last two decades, using different sources from social media to search engine records. More recently, studies have addressed how the World Wide Web could be used as a valuable source for analysing the reactions of the public to outbreaks and revealing emotions and sentiment impact from certain events, notably that of pandemics. Objective: The objective of this research is to evaluate the capability of Twitter messages (tweets) in estimating, in real time, the sentiment impact of COVID-19 in Greece as related to cases. Methods: 153,528 tweets were gathered from 18,730 Twitter users, totalling 2,840,024 words over exactly one year, and were examined against two sentiment lexicons: one English lexicon translated into Greek (using the Vader library) and one in Greek. We then used the sentiment rankings included in these lexicons to track i) the positive and negative impact of COVID-19, ii) six types of sentiments: Surprise, Disgust, Anger, Happiness, Fear and Sadness, and iii) the correlations between real cases of COVID-19 and sentiments and between sentiments and the volume of data. Results: Surprise (25.32%), followed by Disgust (19.88%), were found to be the prevailing sentiments of COVID-19. The correlation coefficient (R2) for the Vader lexicon is -0.07454 related to cases and -0.70668 to the tweets, while the other lexicon had 0.167387 and -0.93095 respectively, all measured at a significance level of p < 0.01. Evidence shows that the sentiment does not correlate with the spread of COVID-19, possibly since the interest in COVID-19 declined after a certain time.
Affiliation(s)
- Loukas Samaras
  - Computer Science Department, Polytechnic Building, University of Alcalá, Ctra. De Barcelona km. 33.6, 28871 Alcalá de Henares (Madrid), Spain
- Elena García-Barriocanal
  - Computer Science Department, Polytechnic Building, University of Alcalá, Ctra. De Barcelona km. 33.6, 28871 Alcalá de Henares (Madrid), Spain
- Miguel-Angel Sicilia
  - Computer Science Department, Polytechnic Building, University of Alcalá, Ctra. De Barcelona km. 33.6, 28871 Alcalá de Henares (Madrid), Spain
7
Hossain MR, Hoque MM, Siddique N, Sarker IH. CovTiNet: Covid text identification network using attention-based positional embedding feature fusion. Neural Comput Appl 2023; 35:13503-13527. [PMCID: PMC10011801 DOI: 10.1007/s00521-023-08442-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 02/24/2023] [Indexed: 03/28/2023]
Abstract
Covid text identification (CTI) is a crucial research concern in natural language processing (NLP). Social and electronic media are simultaneously adding a large volume of Covid-affiliated text on the World Wide Web due to the effortless access to the Internet, electronic gadgets and the Covid outbreak. Most of these texts are uninformative and contain misinformation, disinformation and malinformation that create an infodemic. Thus, Covid text identification is essential for controlling societal distrust and panic. Though very little Covid-related research (such as Covid disinformation, misinformation and fake news) has been reported in high-resource languages (e.g. English), CTI in low-resource languages (like Bengali) is in the preliminary stage to date. However, automatic CTI in Bengali text is challenging due to the deficit of benchmark corpora, complex linguistic constructs, immense verb inflexions and scarcity of NLP tools. On the other hand, the manual processing of Bengali Covid texts is arduous and costly due to their messy or unstructured forms. This research proposes a deep learning-based network (CovTiNet) to identify Covid text in Bengali. The CovTiNet incorporates an attention-based position embedding feature fusion for text-to-feature representation and attention-based CNN for Covid text identification. Experimental results show that the proposed CovTiNet achieved the highest accuracy of 96.61±.001% on the developed dataset (BCovC) compared to the other methods and baselines (i.e. BERT-M, IndicBERT, ELECTRA-Bengali, DistilBERT-M, BiLSTM, DCNN, CNN, LSTM, VDCNN and ACNN).
Affiliation(s)
- Md. Rajib Hossain
  - Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, 4349 Bangladesh
- Mohammed Moshiul Hoque
  - Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, 4349 Bangladesh
- Nazmul Siddique
  - School of Computing, Engineering and Intelligent Systems, Ulster University, Londonderry, UK
- Iqbal H. Sarker
  - Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, 4349 Bangladesh
  - Security Research Institute, Edith Cowan University, Joondalup, WA 6027 Australia
8
Replicable semi-supervised approaches to state-of-the-art stance detection of tweets. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2022.103199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
9
End-to-End Transformer-Based Models in Textual-Based NLP. AI 2023. [DOI: 10.3390/ai4010004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023] Open
Abstract
Transformer architectures are highly expressive because they use self-attention mechanisms to encode long-range dependencies in the input sequences. In this paper, we present a literature review on Transformer-based (TB) models, providing a detailed overview of each model in comparison to the Transformer’s standard architecture. This survey focuses on TB models used in the field of Natural Language Processing (NLP) for textual-based tasks. We begin with an overview of the fundamental concepts at the heart of the success of these models. Then, we classify them based on their architecture and training mode. We compare the advantages and disadvantages of popular techniques in terms of architectural design and experimental value. Finally, we discuss open research, directions, and potential future work to help solve current TB application challenges in NLP.
10
Das S, Ghosh S, Kolya AK, Ekbal A. Unparalleled sarcasm: a framework of parallel deep LSTMs with cross activation functions towards detection and generation of sarcastic statements. Lang Resour Eval 2022. [DOI: 10.1007/s10579-022-09622-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
11
Automatic Sarcasm Detection: Systematic Literature Review. Information 2022. [DOI: 10.3390/info13080399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Sarcasm is an integral part of human language and culture. Naturally, it has garnered great interest from researchers from varied fields of study, including Artificial Intelligence, especially Natural Language Processing. Automatic sarcasm detection has become an increasingly popular topic in the past decade. The research conducted in this paper presents, through a systematic literature review, the evolution of the automatic sarcasm detection task from its inception in 2010 to the present day. No such work has been conducted thus far and it is essential to establish the progress that researchers have made when tackling this task and, moving forward, what the trends are. This study finds that multi-modal approaches and transformer-based architectures have become increasingly popular in recent years. Additionally, this paper presents a critique of the work carried out so far and proposes future directions of research in the field.
12
Zhou L, Zhou K, Liu C. Stance detection of user reviews on social network with integrated structural information. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-221953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Stance detection is the task of classifying user reviews towards a given topic as either supporting, denying, querying, or commenting (SDQC). Most approaches for solving this problem use only the textual features, including the linguistic features and users’ vocabulary choice. A few approaches have shown that information from the network structure like graph model can add value, in addition to the textual features, by providing social connections and interactions that may be vital for the stance detection task. In this paper, we present a novel model that combines the text features with the network structure by (1) creating a graph-structure model based on conversational structure towards specific topics and (2) constructing a tree-gated neural network model (TreeGGNN) to capture structure information among reviews. We evaluate our model on four baseline models, which shows that the combination of text and network can achieve an improvement of 2–6% over the state-of-the-art baselines.
Affiliation(s)
- Lixin Zhou
  - Business School, University of Shanghai for Science and Technology, Shanghai, China
- Kexin Zhou
  - Business School, University of Shanghai for Science and Technology, Shanghai, China
- Chen Liu
  - Business School, University of Shanghai for Science and Technology, Shanghai, China
13
Sliding space-disparity transformer for stereo matching. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07621-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
14
Ahuja R, Sharma SC. Transformer-Based Word Embedding With CNN Model to Detect Sarcasm and Irony. Arabian Journal for Science and Engineering 2022. [DOI: 10.1007/s13369-021-06193-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
15
Hosted Cuckoo Optimization Algorithm with Stacked Autoencoder-Enabled Sarcasm Detection in Online Social Networks. Applied Sciences-Basel 2022. [DOI: 10.3390/app12147119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Sarcasm detection has received considerable interest in online social media networks due to the dramatic expansion in Internet usage. Sarcasm is a linguistic expression of dislikes or negative emotions by using overstated language constructs. Recently, detecting sarcastic posts on social networking platforms has gained popularity, especially since sarcastic comments in the form of tweets typically involve positive words that describe undesirable or negative characteristics. Simultaneously, the emergence of machine learning (ML) algorithms has made it easier to design efficacious sarcasm detection techniques. This study introduces a new Hosted Cuckoo Optimization Algorithm with Stacked Autoencoder-Enabled Sarcasm Detection and Classification (HCOA-SACDC) model. The presented HCOA-SACDC model predominantly focuses on the detection and classification of sarcasm in the OSN environment. To achieve this, the HCOA-SACDC model pre-processes input data to make them compatible for further processing. Furthermore, the term frequency–inverse document frequency (TF-IDF) model is employed for the useful extraction of features. Moreover, the stacked autoencoder (SAE) model is utilized for the recognition and categorization of sarcasm. Since the parameters related to the SAE model considerably affect the overall classification performance, the HCO algorithm is exploited to fine-tune the parameters involved in the SAE, showing the novelty of the work. A comprehensive experimental analysis of a benchmark dataset is performed to highlight the superior outcomes of the HCOA-SACDC model. The simulation results indicate that the HCOA-SACDC model accomplished enhanced performance over other techniques.
16
Liu J, Tian S, Yu L, Long J, Zhou T, Wang B. Attention-based multi-modal fusion sarcasm detection. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-213501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Sarcasm is a way to express the thoughts of a person. The intended meaning of the ideas expressed through sarcasm is often the opposite of the apparent meaning. Previous work on sarcasm detection mainly focused on the text. But nowadays most information is multi-modal, including text and images. Therefore, the task of targeting multi-modal sarcasm detection is becoming an increasingly hot research topic. In order to better detect the accurate meaning of multi-modal sarcasm information, this paper proposed a multi-modal fusion sarcasm detection model based on the attention mechanism, which introduced Vision Transformer (ViT) to extract image features and designed a Double-Layer Bi-Directional Gated Recurrent Unit (D-BiGRU) to extract text features. The features of the two modalities are fused into one feature vector and predicted after attention enhancement. The model presented in this paper gained significant experimental results on the baseline datasets, which are 0.71% and 0.38% higher than that of the best baseline model proposed on F1-score and accuracy respectively.
Affiliation(s)
- Jing Liu
  - School of Software, Xinjiang University, Xinjiang, China
- Shengwei Tian
  - School of Software, Xinjiang University, Xinjiang, China
- Long Yu
  - Network and Information Center, Xinjiang University, Xinjiang, China
- Jun Long
  - School of Information Science and Engineering, Central South University, Changsha, China
  - Big Data and Knowledge Engineering Institute, Central South University, Changsha, China
- Tiejun Zhou
  - Xinjiang Internet Information Center, Xinjiang, China
- Bo Wang
  - School of Software, Xinjiang University, Xinjiang, China
17
Building towards Automated Cyberbullying Detection: A Comparative Analysis. Computational Intelligence and Neuroscience 2022; 2022:4794227. [PMID: 35789611 PMCID: PMC9250443 DOI: 10.1155/2022/4794227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/27/2022] [Accepted: 05/30/2022] [Indexed: 11/17/2022]
Abstract
The increased use of social media among digitally anonymous users, sharing their thoughts and opinions, can facilitate participation and collaboration. However, this anonymity feature which gives users freedom of speech and allows them to conduct activities without being judged by others can also encourage cyberbullying and hate speech. Predators can hide their identity and reach a wide range of audience anytime and anywhere. According to the detrimental effect of cyberbullying, there is a growing need for cyberbullying detection approaches. In this survey paper, a comparative analysis of the automated cyberbullying techniques from different perspectives is discussed including data annotation, data preprocessing, and feature engineering. In addition, the importance of emojis in expressing emotions as well as their influence on sentiment classification and text comprehension leads us to discuss the role of incorporating emojis in the process of cyberbullying detection and their influence on the detection performance. Furthermore, the different domains for using self-supervised learning (SSL) as an annotation technique for cyberbullying detection are explored.
18
Semantic-aware conditional variational autoencoder for one-to-many dialogue generation. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07182-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
19
A ranked solution for social media fact checking using epidemic spread modeling. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
20
Ortega-Bueno R, Rosso P, Medina Pagola JE. Multi-view informed attention-based model for Irony and Satire detection in Spanish variants. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107597] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
21
A human-centred deep learning approach facilitating design pedagogues to frame creative questions. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06511-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/27/2023]
22
Subba B, Kumari S. A heterogeneous stacking ensemble based sentiment analysis framework using multiple word embeddings. Comput Intell 2021. [DOI: 10.1111/coin.12478] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/01/2022]
Affiliation(s)
- Basant Subba
  - Department of CSE, National Institute of Technology Hamirpur, Hamirpur, India
- Simpy Kumari
  - Department of CSE, National Institute of Technology Hamirpur, Hamirpur, India
23
Public Sentiment toward Solar Energy—Opinion Mining of Twitter Using a Transformer-Based Language Model. Sustainability 2021. [DOI: 10.3390/su13052673] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/17/2022]
Abstract
Public acceptance and support for renewable energy are important determinants of the low-carbon energy transition. This paper examines public sentiment toward solar energy in the United States using data from Twitter, a micro-blogging platform on which people post messages, known as tweets. We filtered tweets specific to solar energy and performed a classification task using Robustly optimized Bidirectional Encoder Representations from Transformers (RoBERTa). Our RoBERTa-based sentiment classification model, fine-tuned with 6300 manually annotated tweets specific to solar energy, attains 80.2% accuracy for ternary (positive, neutral, or negative) classification. Analyzing 266,686 tweets during the period of January to December 2020, we find public sentiment varies widely across states (Coefficient of Variation =164.66%). Within the study period, the Northeast U.S. region shows more positive sentiment toward solar energy than did the South U.S. region. Public opinion on solar energy is more positive in states with a larger share of Democratic voters in the 2020 presidential election. Public sentiment toward solar energy is more positive in states with consumer-friendly net metering policies and a more mature solar market. States that wish to gain public support for solar energy might want to consider implementing consumer-friendly net metering policies and support the growth of solar businesses.
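The coefficient of variation quoted in this abstract is the standard deviation expressed as a percentage of the mean, so a value above 100% indicates that the spread across states exceeds the average sentiment score. The sketch below shows the calculation under stated assumptions: the function name `coefficient_of_variation` is hypothetical, and whether the authors used the sample or the population standard deviation is not given in the abstract, so both are offered.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(values: Sequence[float], sample: bool = True) -> float:
    """Return the coefficient of variation as a percentage.

    sample=True uses the sample standard deviation (n - 1 denominator);
    sample=False uses the population standard deviation. Which one the
    paper used is an assumption, not something the abstract states.
    """
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    spread = statistics.stdev(values) if sample else statistics.pstdev(values)
    return 100.0 * spread / abs(mean)


# Hypothetical per-state sentiment scores, for illustration only.
state_scores = [0.10, -0.05, 0.30, 0.02]
cv_percent = coefficient_of_variation(state_scores)
```

Sentiment scores that can be negative or close to zero on average tend to produce very large coefficients of variation, which is worth keeping in mind when reading figures such as the one reported here.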