1
|
Li Z, Wang C. Achieving Sharp Upper Bounds on the Expressive Power of Neural Networks via Tropical Polynomials. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2931-2945. [PMID: 38315593 DOI: 10.1109/tnnls.2024.3350786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]
Abstract
The expressive power of neural networks describes their ability to represent or approximate complex functions. The number of linear regions is the standard and most natural measure of expressive power. However, a major challenge in using the number of linear regions as this measure is the exponential gap between the theoretical upper and lower bounds, which becomes more pronounced as the neural network capacity increases. In this article, we aim to derive a sharp upper bound for piecewise linear neural networks (PLNNs) to bridge this gap. Specifically, we first establish the relationship between tropical polynomials and PLNNs. In the unexpanded tropical polynomial form, we propose that the hyperplanes are not all in general position, which reduces the number of intersecting hyperplanes. We propose a rank-based approach and present an empirical analysis showing that this approach outperforms previous methods based on Zaslavsky's theorem. In the expanded tropical polynomial form, accounting for limitations in weight initialization and computational precision, we introduce the idea that the value range of each term is bounded. We propose a precision-based approach that transforms the approximately exponential growth of the number of linear regions into polynomial growth with width, which is effective at larger layer widths. Finally, we compare the number of linear regions that can be represented by each hidden layer in both forms and derive a sharp upper bound for PLNNs. Empirical analysis and experimental results on both simulated experiments and real datasets provide compelling evidence for the efficacy and feasibility of this sharp upper bound.
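For orientation, the Zaslavsky-style baseline that this abstract compares against can be illustrated with the classical count of regions formed by hyperplanes in general position. The sketch below is not the paper's rank-based or precision-based bound; the function name `max_regions` and the single-layer framing (one hyperplane per ReLU neuron) are assumptions made for illustration.

```python
from math import comb

def max_regions(n_hyperplanes: int, dim: int) -> int:
    """Maximum number of regions that n hyperplanes in general position
    cut out of R^dim: the sum of C(n, i) for i = 0..dim."""
    return sum(comb(n_hyperplanes, i) for i in range(dim + 1))

# Assumed framing: one ReLU layer of a given width on 2 inputs,
# where each neuron contributes one hyperplane.
for width in (4, 8, 16):
    print(width, max_regions(width, 2), 2 ** width)
```

The count grows polynomially in the width for a fixed input dimension and only reaches 2^n when the dimension is at least the number of hyperplanes, which is why hyperplanes that are not in general position yield fewer regions.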
|
2
|
Li P, Fu X, Chen J, Hu J. CoGraphNet for enhanced text classification using word-sentence heterogeneous graph representations and improved interpretability. Sci Rep 2025; 15:356. [PMID: 39747366 PMCID: PMC11696360 DOI: 10.1038/s41598-024-83535-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Accepted: 12/16/2024] [Indexed: 01/04/2025] Open
Abstract
Text Graph Representation Learning through Graph Neural Networks (TG-GNN) is a powerful approach in natural language processing and information retrieval. However, it faces challenges in computational complexity and interpretability. In this work, we propose CoGraphNet, a novel graph-based model for text classification, addressing key issues. To overcome information loss, we construct separate heterogeneous graphs for words and sentences, capturing multi-tiered contextual information. We enhance interpretability by incorporating positional bias weights, improving model clarity. CoGraphNet provides precise analysis, highlighting important words or sentences. We achieve enhanced contextual comprehension and accuracy through novel graph structures and the SwiGLU activation function. Experiments on Ohsumed, MR, R52, and 20NG datasets confirm CoGraphNet's effectiveness in complex classification tasks, demonstrating its superiority.
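The abstract names the SwiGLU activation without defining it. As a reminder, SwiGLU gates one linear projection of the input with the Swish (SiLU) of another. The sketch below applies only the gating step to two already-projected vectors; the function names and the omission of the projection matrices are illustrative assumptions, not CoGraphNet's implementation.

```python
import math

def swish(x: float) -> float:
    """Swish / SiLU with beta = 1: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def swiglu(gate, value):
    """Element-wise SwiGLU gating: swish(gate_i) * value_i.

    `gate` and `value` stand in for two linear projections of the same
    input; computing those projections is omitted in this sketch.
    """
    return [swish(g) * v for g, v in zip(gate, value)]

print(swiglu([0.0, 1.0, -1.0], [5.0, 2.0, 2.0]))
```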
Affiliation(s)
- Pengyi Li: Suzhou Yuelan Technology Development Co., Ltd, Suzhou, 215128, China
- Xueying Fu: School of Computer Sciences, Universiti Sains Malaysia, 11800, Penang, Malaysia
- Juntao Chen: School of Mathematics and Artificial Intelligence, Chongqing University of Arts and Sciences, Chongqing, 402160, China
- Junyi Hu: School of Economics and Management, Beijing Jiaotong University, Shandong, 264401, China
|
3
|
Jiang Z, Ding Q. A framework for hardware trojan detection based on contrastive learning. Sci Rep 2024; 14:30847. [PMID: 39730555 DOI: 10.1038/s41598-024-81473-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Accepted: 11/26/2024] [Indexed: 12/29/2024] Open
Abstract
With the rapid development of the semiconductor industry, Hardware Trojans (HTs), malicious functions that can be implanted at will in any stage of integrated circuit design, manufacturing, and deployment, have become a great threat in the field of hardware security. Side-channel analysis is widely used in the detection of HTs due to its high efficiency, non-contact nature, and accuracy. In this paper, we propose a framework for HT detection based on contrastive learning using power consumption information in unsupervised or weakly supervised scenarios. First, the framework augments the data, for example by using a one-dimensional discrete chaotic mapping to perturb the data, to improve the generalization capabilities of the model. Second, the model representation is learned by comparing the similarities and differences between samples, freeing it from the dependence on labels. Finally, the detection of HTs is accomplished more efficiently by categorizing the side information during circuit operation through the backbone network. Experiments on data from nine different public HTs show that the proposed method exhibits better generalization capabilities using the same network model within a contrastive learning framework. The model trained on the dataset of small Trojan T100 has a detection efficiency advantage of up to 44% in detecting large Trojans, while the model trained on the dataset of large Trojan T2100 has a detection efficiency advantage of up to 10% in detecting small Trojans. The results in data-imbalanced and noisy environments also show that the contrastive learning framework in this paper can better fulfill the requirements of detecting unknown HTs in unsupervised or weakly supervised scenarios.
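The abstract does not say which one-dimensional discrete chaotic mapping is used for augmentation. As a rough sketch of the idea, the example below assumes the logistic map, a common choice, and adds its centered output to a power trace as a small deterministic perturbation. The map, the parameter values, and the names `logistic_map_sequence` and `perturb` are all assumptions rather than the authors' method.

```python
def logistic_map_sequence(length, x0=0.37, r=3.99):
    """Iterate the logistic map x -> r * x * (1 - x) and return `length` values."""
    xs = []
    x = x0
    for _ in range(length):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

def perturb(trace, scale=0.01, x0=0.37):
    """Add a small chaotic perturbation, centered around zero, to each sample."""
    noise = logistic_map_sequence(len(trace), x0=x0)
    return [value + scale * (n - 0.5) for value, n in zip(trace, noise)]

# Hypothetical power trace; different x0 values give different augmented views.
trace = [0.50, 0.52, 0.49, 0.51]
view_a = perturb(trace, x0=0.37)
view_b = perturb(trace, x0=0.41)
print(view_a, view_b)
```

Two views produced from different seeds of the same trace could then serve as a positive pair for a contrastive objective.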
Affiliation(s)
- Zijing Jiang: Electronic Engineering College, Heilongjiang University, Harbin, 150080, China
- Qun Ding: Electronic Engineering College, Heilongjiang University, Harbin, 150080, China
|
4
|
Xu N, Wang Y. Decision model of public opinion risk in campus social network based on hybrid dynamic deletion and shortest path algorithm. PLoS One 2024; 19:e0310894. [PMID: 39556588 PMCID: PMC11573221 DOI: 10.1371/journal.pone.0310894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 09/08/2024] [Indexed: 11/20/2024] Open
Abstract
To address the inefficiency and resource waste of traditional monitoring and searching of online public opinion, the study first classifies network text with a dynamic deletion-shortest path algorithm. On this basis, it constructs a text sentiment classification model based on variants of the convolutional neural network and the recurrent neural network, and then improves the classification model with an attention mechanism. The results show that the average precision, recall, and F-value of the dynamic deletion-shortest path algorithm are 97.30%, 79.55%, and 87.53%, and the classification speed is 397 KB/s, which is better than the traditional shortest path algorithm. In the classification of long text, the accuracy and F-value of the recurrent neural network variant model are above 84%, and the accuracy of the text sentiment classification model with the attention mechanism is 3.89% higher than before the improvement. In summary, the dynamic deletion-shortest path algorithm proposed in the study and the sentiment classification model with the attention mechanism perform well and can provide application value for campus social network opinion risk decision-making.
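The three reported figures are mutually consistent if the "F-value" is read as the balanced F1 score, the harmonic mean of precision and recall. That reading is an assumption, since the abstract does not define the metric; the small check below shows the arithmetic.

```python
def f_value(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, in percent.
print(round(f_value(97.30, 79.55), 2))  # 87.53
```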
Affiliation(s)
- Nan Xu: School of Economics and Management, Xidian University, Xi’an, China
- Yifeng Wang: School of Economics and Management, Xidian University, Xi’an, China
|
5
|
Liu Z, Wen C, Su Z, Liu S, Sun J, Kong W, Yang Z. Emotion-Semantic-Aware Dual Contrastive Learning for Epistemic Emotion Identification of Learner-Generated Reviews in MOOCs. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:16464-16477. [PMID: 37486839 DOI: 10.1109/tnnls.2023.3294636] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 07/26/2023]
Abstract
Identifying the epistemic emotions of learner-generated reviews in massive open online courses (MOOCs) can help instructors provide adaptive guidance and interventions for learners. The epistemic emotion identification task is a fine-grained identification task that contains multiple categories of emotions arising during the learning process. Previous studies only consider emotional or semantic information within the review texts alone, which leads to insufficient feature representation. In addition, some categories of epistemic emotions are ambiguously distributed in feature space, making them hard to be distinguished. In this article, we present an emotion-semantic-aware dual contrastive learning (ES-DCL) approach to tackle these issues. In order to learn sufficient feature representation, implicit semantic features and human-interpretable emotional features are, respectively, extracted from two different views to form complementary emotional-semantic features. On this basis, by leveraging the experience of domain experts and the input emotional-semantic features, two types of contrastive losses (label contrastive loss and feature contrastive loss) are formulated. They are designed to train the discriminative distribution of emotional-semantic features in the sample space and to solve the anisotropy problem between different categories of epistemic emotions. The proposed ES-DCL is compared with 11 other baseline models on four different disciplinary MOOCs review datasets. Extensive experimental results show that our approach improves the performance of epistemic emotion identification, and significantly outperforms state-of-the-art deep learning-based methods in learning more discriminative sentence representations.
|
6
|
Zhang K, Wu L, Lv G, Chen E, Ruan S, Liu J, Zhang Z, Zhou J, Wang M. Description-Enhanced Label Embedding Contrastive Learning for Text Classification. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14889-14902. [PMID: 37327102 DOI: 10.1109/tnnls.2023.3282020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Text classification is one of the fundamental tasks in natural language processing, which requires an agent to determine the most appropriate category for input sentences. Recently, deep neural networks have achieved impressive performance in this area, especially pretrained language models (PLMs). Usually, these methods concentrate on input sentences and corresponding semantic embedding generation. However, for another essential component, the labels, most existing works either treat them as meaningless one-hot vectors or use vanilla embedding methods to learn label representations along with model training, underestimating the semantic information and guidance that these labels reveal. To alleviate this problem and better exploit label information, in this article, we employ self-supervised learning (SSL) in the model learning process and design a novel self-supervised relation of relation ([Formula: see text]) classification task for label utilization from a one-hot manner perspective. Then, we propose a novel relation of relation learning network ([Formula: see text]-Net) for text classification, in which text classification and [Formula: see text] classification are treated as optimization targets. Meanwhile, triplet loss is employed to enhance the analysis of differences and connections among labels. Moreover, considering that one-hot usage is still short of exploiting label information, we incorporate external knowledge from WordNet to obtain multiaspect descriptions for label semantic learning and extend [Formula: see text]-Net to a novel description-enhanced label embedding network (DELE) from a label embedding perspective. One step further, since these fine-grained descriptions may introduce unexpected noise, we develop a mutual interaction module to select appropriate parts from input sentences and labels simultaneously based on contrastive learning (CL) for noise mitigation. Extensive experiments on different text classification tasks reveal that [Formula: see text]-Net can effectively improve the classification performance and DELE can make better use of label information and further improve the performance. As a byproduct, we have released the codes to facilitate other research.
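The abstract mentions a triplet loss over labels without giving its form. The standard margin-based triplet loss is sketched below for reference; the Euclidean distance, the margin value, and the function names are assumptions and may differ from what the authors use.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

print(triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 3.0]))  # 0.0
print(triplet_loss([0.0, 0.0], [0.0, 2.0], [0.0, 1.0]))  # 2.0
```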
|
7
|
Le-Khac UN, Bolton M, Boxall NJ, Wallace SMN, George Y. Living review framework for better policy design and management of hazardous waste in Australia. The Science of the Total Environment 2024; 924:171556. [PMID: 38458450 DOI: 10.1016/j.scitotenv.2024.171556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 02/25/2024] [Accepted: 03/04/2024] [Indexed: 03/10/2024]
Abstract
The significant increase in hazardous waste generation in Australia has led to the discussion over the incorporation of artificial intelligence into the hazardous waste management system. Recent studies explored the potential applications of artificial intelligence in various processes of managing waste. However, no study has examined the use of text mining in the hazardous waste management sector for the purpose of informing policymakers. This study developed a living review framework which applied supervised text classification and text mining techniques to extract knowledge using the domain literature data between 2022 and 2023. The framework employed statistical classification models trained using iterative training and the best model XGBoost achieved an F1 score of 0.87. Using a small set of 126 manually labelled global articles, XGBoost automatically predicted the labels of 678 Australian articles with high confidence. Then, keyword extraction and unsupervised topic modelling with Latent Dirichlet Allocation (LDA) were performed. Results indicated that there were 2 main research themes in Australian literature: (1) the key waste streams and (2) the resource recovery and recycling of waste. The implication of this framework would benefit the policymakers, researchers, and hazardous waste management organisations by serving as a real time guideline of the current key waste streams and research themes in the literature which allow robust knowledge to be applied to waste management and highlight where the gap in research remains.
Affiliation(s)
- Uyen N Le-Khac: Data Science and AI Department, Faculty of Information Technology, Monash University, Australia
- Mitzi Bolton: Monash Sustainable Development Institute, Monash University, Australia
- Naomi J Boxall: Environment, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
- Stephanie M N Wallace: Centre for Anthropogenic Pollution Impact and Management (CAPIM), School of BioSciences, University of Melbourne, Australia
- Yasmeen George: Data Science and AI Department, Faculty of Information Technology, Monash University, Australia
|
8
|
Zhang D, Li J, Xie Y, Wulamu A. Research on performance variations of classifiers with the influence of pre-processing methods for Chinese short text classification. PLoS One 2023; 18:e0292582. [PMID: 37824464 PMCID: PMC10569603 DOI: 10.1371/journal.pone.0292582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 09/24/2023] [Indexed: 10/14/2023] Open
Abstract
Text pre-processing is an important component of Chinese text classification. At present, however, most studies on this topic explore the influence of preprocessing methods on a few text classification algorithms using English text. In this paper we experimentally compared fifteen commonly used classifiers on two Chinese datasets using three widely used Chinese preprocessing methods: word segmentation, Chinese-specific stop word removal, and Chinese-specific symbol removal. We then explored the influence of the preprocessing methods on the final classifications under various conditions such as classification evaluation, combination style, and classifier selection. Finally, we conducted a battery of additional experiments and found that most of the classifiers improved in performance after proper preprocessing was applied. Our general conclusion is that the systematic use of preprocessing methods can have a positive impact on the classification of Chinese short text, using classification evaluation such as macro-F1, combinations of preprocessing methods such as word segmentation with Chinese-specific stop word and symbol removal, and classifier selection such as machine learning and deep learning models. We find that the best macro-F1 scores for the two datasets are 92.13% and 91.99%, which represent improvements of 0.3% and 2%, respectively, over the compared baselines.
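Since macro-F1 is the headline metric here, a minimal sketch of how it is computed may help: F1 is calculated per class and then averaged without weighting by class size. The function name and the toy labels are illustrative, not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
print(macro_f1(y_true, y_pred))  # (2/3 + 4/5) / 2 = 11/15
```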
Affiliation(s)
- Dezheng Zhang: School of Computer and Communication Engineering, University of Science and Technology Beijing, Haidian, Beijing, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, University of Science and Technology Beijing, Haidian, Beijing, China
- Jing Li: School of Computer and Communication Engineering, University of Science and Technology Beijing, Haidian, Beijing, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, University of Science and Technology Beijing, Haidian, Beijing, China
- Yonghong Xie: School of Computer and Communication Engineering, University of Science and Technology Beijing, Haidian, Beijing, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, University of Science and Technology Beijing, Haidian, Beijing, China
- Aziguli Wulamu: School of Computer and Communication Engineering, University of Science and Technology Beijing, Haidian, Beijing, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, University of Science and Technology Beijing, Haidian, Beijing, China
|
9
|
Guo Y, Zhou D, Ruan X, Cao J. Variational gated autoencoder-based feature extraction model for inferring disease-miRNA associations based on multiview features. Neural Netw 2023; 165:491-505. [PMID: 37336034 DOI: 10.1016/j.neunet.2023.05.052] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 10/04/2022] [Revised: 05/19/2023] [Accepted: 05/28/2023] [Indexed: 06/21/2023]
Abstract
MicroRNAs (miRNAs) play critical roles in diverse biological processes of diseases. Inferring potential disease-miRNA associations via computational algorithms enables us to better understand the development and diagnosis of complex human diseases. This work presents a variational gated autoencoder-based feature extraction model to extract complex contextual features for inferring potential disease-miRNA associations. Specifically, our model fuses three different similarities of miRNAs into a comprehensive miRNA network and combines two different similarities of diseases into a comprehensive disease network. Then, a novel graph autoencoder is designed to extract multilevel representations based on variational gate mechanisms from heterogeneous networks of miRNAs and diseases. Finally, a gate-based association predictor is devised to combine multiscale representations of miRNAs and diseases via a novel contrastive cross-entropy function, and then infer disease-miRNA associations. Experimental results indicate that our proposed model achieves remarkable association prediction performance, supporting the efficacy of the variational gate mechanism and contrastive cross-entropy loss for inferring disease-miRNA associations.
Affiliation(s)
- Yanbu Guo: College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
- Dongming Zhou: School of Information Science and Engineering, Yunnan University, Kunming 650500, China
- Xiaoli Ruan: State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Jinde Cao: School of Mathematics, Southeast University, Nanjing 211189, China; Yonsei Frontier Lab, Yonsei University, Seoul 03722, South Korea
|
10
|
Liang W, Chen X, Huang S, Xiong G, Yan K, Zhou X. Federal learning edge network based sentiment analysis combating global COVID-19. Computer Communications 2023; 204:33-42. [PMID: 36970130 PMCID: PMC10030440 DOI: 10.1016/j.comcom.2023.03.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2022] [Revised: 01/01/2023] [Accepted: 03/07/2023] [Indexed: 06/18/2023]
Abstract
As one of the important research topics in the field of natural language processing, sentiment analysis aims to analyze web data related to COVID-19, e.g., to support Chinese government agencies combating COVID-19. There are popular sentiment analysis models based on deep learning techniques, but their performance is limited by the size and distribution of the dataset. In this study, we propose a model based on a federated learning framework with BERT and a multi-scale convolutional neural network (Fed_BERT_MSCNN), which contains a Bidirectional Encoder Representations from Transformers (BERT) module and a multi-scale convolution layer. The federated learning framework contains a central server and local deep learning machines that train on local datasets. Parameter communications were processed through edge networks. The weighted average of each participant's model parameters was communicated in the edge network for final utilization. The proposed federated network not only solves the problem of insufficient data, but also ensures the data privacy of the social platforms during the training process and improves the communication efficiency. In the experiment, we used datasets from six social platforms, and used accuracy and F1-score as evaluation criteria to conduct comparative studies. The performance of the proposed Fed_BERT_MSCNN model was generally superior to that of the existing models in the literature.
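The aggregation step described above, a weighted average of each participant's model parameters, can be sketched as follows. Weighting each client by its local sample count is the usual federated averaging convention and is assumed here; the abstract does not state the weights, and the function name and flat parameter lists are illustrative.

```python
def weighted_average(client_params, client_sizes):
    """Average flat parameter vectors, weighting each client by its sample count.

    client_params: list of equal-length lists of floats, one per client.
    client_sizes: number of local training samples per client (assumed weights).
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]

# Two hypothetical clients holding 1 and 3 samples respectively.
print(weighted_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```

Only parameters travel between clients and server in this scheme, which is what lets the raw platform data stay local.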
Affiliation(s)
- Wei Liang: Business School, Central South University, Changsha, 410083, China; Changsha Social Laboratory of Artificial Intelligence, Hunan University of Technology and Business, Changsha, 410205, China
- Xiaohong Chen: Business School, Central South University, Changsha, 410083, China; Changsha Social Laboratory of Artificial Intelligence, Hunan University of Technology and Business, Changsha, 410205, China
- Suzhen Huang: Big Data Institute, Central South University, Changsha, 410083, China
- Guanghao Xiong: College of Information Engineering, China Jiliang University, Hangzhou, 310018, China
- Ke Yan: Department of the Built Environment, College of Design and Engineering, National University of Singapore, 4 Architecture Drive, Singapore 117566, Singapore
- Xiaokang Zhou: Faculty of Data Science, Shiga University, Hikone, 5228522, Japan; RIKEN Center for Advanced Intelligence Project, RIKEN, Tokyo, 1030027, Japan
|
11
|
Ai W, Wang Z, Shao H, Meng T, Li K. A multi-semantic passing framework for semi-supervised long text classification. Appl Intell 2023. [DOI: 10.1007/s10489-023-04556-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]
|
12
|
A multi-view method of scientific paper classification via heterogeneous graph embeddings. Scientometrics 2022. [DOI: 10.1007/s11192-022-04419-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
13
|
Jia S, Jiang S, Zhang S, Xu M, Jia X. Graph-in-Graph Convolutional Network for Hyperspectral Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:1157-1171. [PMID: 35724277 DOI: 10.1109/tnnls.2022.3182715] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/15/2023]
Abstract
With the development of hyperspectral sensors, accessible hyperspectral images (HSIs) are increasing, and pixel-oriented classification has attracted much attention. Recently, graph convolutional networks (GCNs) have been proposed to process graph-structured data in non-Euclidean domains and have been employed in HSI classification. But most methods based on GCN are hard to sufficiently exploit information of ground objects due to feature aggregation. To solve this issue, in this article, we proposed a graph-in-graph (GiG) model and a related GiG convolutional network (GiGCN) for HSI classification from a superpixel viewpoint. The GiG representation covers information inside and outside superpixels, respectively, corresponding to the local and global characteristics of ground objects. Concretely, after segmenting HSI into disjoint superpixels, each one is converted to an internal graph. Meanwhile, an external graph is constructed according to the spatial adjacent relationships among superpixels. Significantly, each node in the external graph embeds a corresponding internal graph, forming the so-called GiG structure. Then, GiGCN composed of internal and External graph convolution (EGC) is designed to extract hierarchical features and integrate them into multiple scales, improving the discriminability of GiGCN. Ensemble learning is incorporated to further boost the robustness of GiGCN. It is worth noting that we are the first to propose the GiG framework from the superpixel point and the GiGCN scheme for HSI classification. Experiment results on four benchmark datasets demonstrate that our proposed method is effective and feasible for HSI classification with limited labeled samples. For study replication, the code developed for this study is available at https://github.com/ShuGuoJ/GiGCN.git.
|
14
|
Dai K, Li X, Huang X, Ye Y. SentATN: learning sentence transferable embeddings for cross-domain sentiment classification. Appl Intell 2022. [DOI: 10.1007/s10489-022-03434-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
15
|
Abstract
As a vital technology of natural language understanding, sentence representation reasoning mainly focuses on sentence representation methods and reasoning models. Although performance has improved, some problems remain, such as incomplete sentence semantic expression, insufficient depth of the reasoning model, and lack of interpretability of the reasoning process. To address the reasoning model's lack of depth and interpretability, a deep fusion matching network is designed in this paper, which mainly includes a coding layer, matching layer, dependency convolution layer, information aggregation layer, and inference prediction layer. The matching layer is improved on the basis of a deep matching network: a heuristic matching algorithm replaces the bidirectional long short-term memory neural network to simplify the interactive fusion, which improves the reasoning depth and reduces the complexity of the model. The dependency convolution layer uses a tree-based convolution network to extract sentence structure information along the sentence dependency tree, which improves the interpretability of the reasoning process. Finally, the performance of the model is verified on several datasets. The results show that the reasoning effect of the model is better than that of the shallow reasoning model, and the accuracy on the SNLI test set reaches 89.0%. At the same time, the semantic correlation analysis results show that the dependency convolution layer is beneficial in improving the interpretability of the reasoning process.
|
16
|
A Study of Text Vectorization Method Combining Topic Model and Transfer Learning. Processes (Basel) 2022. [DOI: 10.3390/pr10020350] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/04/2023] Open
Abstract
With the development of Internet cloud technology, the scale of data is expanding. Traditional processing methods find it difficult to deal with the problem of information extraction of big data. Therefore, it is necessary to use machine-learning-assisted intelligent processing to extract information from data in order to solve the optimization problem in complex systems. There are many forms of data storage. Among them, text data is an important data type that directly reflects semantic information. Text vectorization is an important concept in natural language processing tasks. Because text data can not be directly used for model parameter training, it is necessary to vectorize the original text data and make it numerical, and then the feature extraction operation can be carried out. The traditional text digitization method is often realized by constructing a bag of words, but the vector generated by this method can not reflect the semantic relationship between words, and it also easily causes the problems of data sparsity and dimension explosion. Therefore, this paper proposes a text vectorization method combining a topic model and transfer learning. Firstly, the topic model is selected to model the text data and extract its keywords, to grasp the main information of the text data. Then, with the help of the bidirectional encoder representations from transformers (BERT) model, which belongs to the pretrained model, model transfer learning is carried out to generate vectors, which are applied to the calculation of similarity between texts. By setting up a comparative experiment, this method is compared with the traditional vectorization method. 
The experimental results show that the vector generated by the topic-modeling- and transfer-learning-based text vectorization (TTTV) proposed in this paper can obtain better results when calculating the similarity between texts with the same topic, which means that it can more accurately judge whether the contents of the given two texts belong to the same topic.
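The abstract evaluates vectors by the similarity they give between texts but does not name the similarity measure. Cosine similarity is a common choice for comparing text vectors and is assumed in the sketch below; the function name and the toy vectors are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical document vectors: the first two point the same way, the third does not.
doc_a, doc_b, doc_c = [1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [0.0, 0.0, 3.0]
print(cosine_similarity(doc_a, doc_b), cosine_similarity(doc_a, doc_c))
```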
|
17
|
Bert-Enhanced Text Graph Neural Network for Classification. Entropy 2021; 23:e23111536. [PMID: 34828233 PMCID: PMC8624482 DOI: 10.3390/e23111536] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/20/2021] [Revised: 11/14/2021] [Accepted: 11/17/2021] [Indexed: 11/25/2022]
Abstract
Text classification is a fundamental research direction that aims to assign tags to text units. Recently, graph neural networks (GNNs) have exhibited excellent properties in textual information processing, and pre-trained language models have achieved promising results in many tasks. However, many text processing methods either cannot model the structure of a single text unit or ignore its semantic features. To solve these problems and make comprehensive use of both the structural and the semantic information of a text, we propose a Bert-Enhanced text Graph Neural Network model (BEGNN). For each text, we construct a separate text graph according to the co-occurrence relationships of its words and use a GNN to extract text features. Moreover, we employ BERT to extract semantic features. The former part accounts for structural information, and the latter focuses on modeling semantic information. Finally, we let these two features of different granularity interact and aggregate them to obtain a more effective representation. Experiments on standard datasets demonstrate the effectiveness of BEGNN.
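The per-text graph construction mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the BEGNN implementation: the abstract does not give the co-occurrence rule, so the sliding window, its size, whitespace tokenization, and the function name `build_cooccurrence_graph` are all hypothetical.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window_size=3):
    """Build an undirected weighted graph {word: {neighbor: count}}.

    Two words are linked when they appear within `window_size` consecutive
    tokens of each other (an assumed rule; the source does not specify one).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        # Pair the current word with the next (window_size - 1) tokens.
        for j in range(i + 1, min(i + window_size, len(tokens))):
            neighbor = tokens[j]
            if neighbor == word:
                continue  # skip self-loops
            graph[word][neighbor] += 1
            graph[neighbor][word] += 1
    return {word: dict(neighbors) for word, neighbors in graph.items()}


# Hypothetical example text.
tokens = "graph neural networks extract text features from the text graph".split()
graph = build_cooccurrence_graph(tokens, window_size=3)
print(graph["text"])
```

Each text yields its own small graph whose nodes are its words and whose edge weights count co-occurrences. A GNN would then operate on this graph; that step is outside the scope of this sketch.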
|
18
|
Gaye B, Zhang D, Wulamu A. Sentiment classification for employees reviews using regression vector-stochastic gradient descent classifier (RV-SGDC). PeerJ Comput Sci 2021; 7:e712. [PMID: 34712795 PMCID: PMC8507482 DOI: 10.7717/peerj-cs.712] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/17/2021] [Accepted: 08/22/2021] [Indexed: 06/13/2023]
Abstract
Employee satisfaction is very important for any organization to make sufficient progress in production and to achieve its goals. Organizations try to keep their employees satisfied by shaping their policies according to employees' demands, which helps to create a good environment for the collective. For this reason, it is beneficial for organizations to conduct and analyze staff satisfaction surveys, allowing them to gauge the level of satisfaction among employees. Sentiment analysis can assist in this regard, as it categorizes the sentiments of reviews as positive or negative. In this study, we perform experiments for the world's big six companies and classify their employees' reviews based on their sentiments. To do so, we propose an approach that uses lexicon-based and machine-learning-based techniques. First, we extract the sentiments of employees from text reviews and label the dataset as positive or negative using TextBlob. Then we propose a hybrid voting model named Regression Vector-Stochastic Gradient Descent Classifier (RV-SGDC) for sentiment classification. RV-SGDC is a combination of logistic regression, support vector machines, and stochastic gradient descent, which we combine under a majority voting criterion. We also compare the performance of RV-SGDC against other machine learning models. Further, three feature extraction techniques are used to train the learning models: term frequency-inverse document frequency (TF-IDF), bag of words, and global vectors. We evaluate the performance of all models in terms of accuracy, precision, recall, and F1 score. The results show that RV-SGDC outperforms the other models, reaching an accuracy of 0.97 with TF-IDF features, which we attribute to its hybrid architecture.
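The majority voting criterion described in this abstract can be sketched in a few lines. This is an illustration only, not the authors' RV-SGDC code: the prediction lists are made-up placeholders standing in for the outputs of the three base classifiers, and the function name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Return, for each sample, the label predicted by most models.

    `predictions_per_model` is a list with one equal-length list of
    predicted labels per model.
    """
    n_samples = len(predictions_per_model[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(preds[i] for preds in predictions_per_model)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Hypothetical predictions from the three base models on four reviews.
lr_preds = ["pos", "neg", "pos", "neg"]   # logistic regression
svm_preds = ["pos", "pos", "pos", "neg"]  # support vector machine
sgd_preds = ["neg", "neg", "pos", "pos"]  # stochastic gradient descent

ensemble = majority_vote([lr_preds, svm_preds, sgd_preds])
print(ensemble)
```

With three voters and two classes, a tie cannot occur, so every sample receives the label that at least two models agree on. In scikit-learn, the same hard-voting scheme is available as `VotingClassifier(voting="hard")`; the abstract does not state which library the authors used.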
Affiliation(s)
- Babacar Gaye, School of Computer and Communication Engineering, University of Science and Technology, Beijing, China
- Dezheng Zhang, School of Computer and Communication Engineering, University of Science and Technology, Beijing, China
- Aziguli Wulamu, School of Computer and Communication Engineering, University of Science and Technology, Beijing, China
|