1
Li J, Fu S, Fu W, Wang L, Pan X. An efficient framework based on local multi-representatives and noise-robust synthetic example generation for self-labeled semi-supervised classification. Neural Netw 2025; 185:107142. [PMID: 39889375 DOI: 10.1016/j.neunet.2025.107142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2024] [Revised: 01/03/2025] [Accepted: 01/09/2025] [Indexed: 02/03/2025]
Abstract
While self-labeled methods can exploit unlabeled and labeled instances to train classifiers, they are restricted by the number and distribution of labeled instances. SEG-SSC, k-means-SSC, LC-SSC, and LCSEG-SSC are sophisticated solutions for overcoming these restrictions. However, when some classes overlap, they suffer from the following technical defects: (a) they fail to effectively improve the labeled instance distribution because they identify only a single local representative in a cluster; (b) they predict the identified unlabeled local representatives with low accuracy or require a high degree of manual intervention; and (c) they fail to effectively increase the number of labeled instances because they generate noise. To address these issues, a framework based on local multi-representatives and noise-robust synthetic example generation (LMR-NRSEG-SSC) is proposed for self-labeled semi-supervised classification. First, a newly proposed local multi-representatives search algorithm with multi-granularity ideas is used to partition labeled and unlabeled instances into independent clusters and identify unlabeled local multi-representatives in each cluster. Second, a newly proposed divide-and-conquer self-labeling is used to predict unlabeled local multi-representatives, with the goal of improving the labeled instance distribution. Third, a newly proposed noise-robust oversampling technique based on local multi-representatives is used to create safe labeled synthetic instances with little noise, with the goal of increasing the number of labeled instances. Finally, almost any self-labeled method can be run on the improved labeled and unlabeled instances to train classifiers effectively. Experiments demonstrated that LMR-NRSEG-SSC outperformed 7 sophisticated self-labeled frameworks in improving 2 advanced self-labeled methods on extensive benchmark datasets.
Affiliation(s)
- Junnan Li
- School of Artificial Intelligence and Big Data, Chongqing Industry Polytechnic College, 401120, China.
- Shun Fu
- School of Artificial Intelligence and Big Data, Chongqing Industry Polytechnic College, 401120, China.
- Wei Fu
- School of Artificial Intelligence and Big Data, Chongqing Industry Polytechnic College, 401120, China.
- Lufeng Wang
- School of Artificial Intelligence and Big Data, Chongqing Industry Polytechnic College, 401120, China.
- Xin Pan
- School of Artificial Intelligence and Big Data, Chongqing Industry Polytechnic College, 401120, China; School of Cybersecurity, University of Electronic Science and Technology of China, Chengdu 611731, China; Mashang Consumer Finance Co., Ltd., 401120, China
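The abstract of entry 1 builds on self-labeled methods, which grow the labeled set by labeling unlabeled instances with the current model. The sketch below is a generic self-training loop with a 1-nearest-neighbor rule, included only to illustrate that baseline idea. It is not LMR-NRSEG-SSC; the function name `self_training_1nn` and the `batch` parameter are hypothetical.

```python
import math

def self_training_1nn(labeled, unlabeled, batch=1):
    """Generic self-training with a 1-NN rule (illustrative assumption).

    labeled:   list of (point, label) pairs, points are tuples of floats
    unlabeled: list of points
    Returns the enlarged list of (point, label) pairs.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        # Score every unlabeled point by its distance to the nearest labeled one.
        scored = []
        for idx, p in enumerate(pool):
            d, lab = min((math.dist(p, q), l) for q, l in labeled)
            scored.append((d, idx, lab))
        scored.sort()
        # Self-label only the most confident (closest) points in this round.
        chosen = scored[:batch]
        for _, idx, lab in chosen:
            labeled.append((pool[idx], lab))
        # Remove the newly labeled points from the pool, highest index first.
        for idx in sorted((idx for _, idx, _ in chosen), reverse=True):
            pool.pop(idx)
    return labeled

seed = [((0.0,), 0), ((10.0,), 1)]
result = dict(self_training_1nn(seed, [(1.0,), (2.0,), (9.0,), (5.0,)]))
```

In this toy run the point at 5.0 is equidistant from both seeds at the start, but it ends up in class 0 because the points at 1.0 and 2.0 are labeled first and pull the class boundary toward it. This dependence on where the labeled instances sit is the kind of restriction the abstract refers to.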
2
Xia S, Lian X, Wang G, Gao X, Chen J, Peng X. GBSVM: An Efficient and Robust Support Vector Machine Framework via Granular-Ball Computing. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:9253-9267. [PMID: 38954574 DOI: 10.1109/tnnls.2024.3417433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Abstract
Granular-ball support vector machine (GBSVM) is a significant attempt to construct a classifier using the coarse-to-fine granularity of a granular ball as input, rather than a single data point. It is the first classifier whose input contains no points. However, the existing model has some errors, and its dual model has not been derived. As a result, the current algorithm cannot be implemented or applied. To address these problems, we fix the errors of the original model of the existing GBSVM and derive its dual model. Furthermore, a particle swarm optimization (PSO) algorithm is designed to solve the dual problem. The sequential minimal optimization (SMO) algorithm is also carefully designed to solve the dual problem. The latter is faster and more stable. The experimental results on the UCI benchmark datasets demonstrate that GBSVM is more robust and efficient. All codes have been released in the open source library available at: http://www.cquptshuyinxia.com/GBSVM.html or https://github.com/syxiaa/GBSVM.
3
Liu Z, Li J, Zhang X, Wang XZ. Incremental Incomplete Concept-Cognitive Learning Model: A Stochastic Strategy. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:393-406. [PMID: 37999965 DOI: 10.1109/tnnls.2023.3333537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2023]
Abstract
Concept-cognitive learning is an emerging area of cognitive computing, which refers to continuously learning new knowledge by imitating the human cognition process. However, the existing research on concept-cognitive learning is still at the level of complete cognition and cognitive operators, which is far from the real cognition process. Meanwhile, the current classification algorithms based on concept-cognitive learning models (CCLMs) are not yet mature, since their cognitive results depend heavily on the cognition order of attributes. To address these problems, this article presents a novel concept-cognitive learning method, namely, the stochastic incremental incomplete concept-cognitive learning method (SI2CCLM), whose cognition process adopts a stochastic strategy that is independent of the order of attributes. Moreover, a new classification algorithm based on SI2CCLM is developed, and its parameters and convergence are analyzed. Finally, we show the cognitive effectiveness of SI2CCLM by comparing it with other concept-cognitive learning methods. In addition, the average accuracy of our model on 24 datasets is 82.02%, which is higher than that of the 20 compared classification algorithms, and the elapsed time of our model also compares favorably.
4
Xia S, Wang C, Wang G, Gao X, Ding W, Yu J, Zhai Y, Chen Z. GBRS: A Unified Granular-Ball Learning Model of Pawlak Rough Set and Neighborhood Rough Set. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1719-1733. [PMID: 37943647 DOI: 10.1109/tnnls.2023.3325199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]
Abstract
Pawlak rough set (PRS) and neighborhood rough set (NRS) are the two most common rough set theoretical models. Although the PRS can use equivalence classes to represent knowledge, it is unable to process continuous data. On the other hand, NRSs, which can process continuous data, lose the ability to use equivalence classes to represent knowledge. To remedy this deficit, this article presents a granular-ball rough set (GBRS) based on granular-ball computing, combining the robustness and the adaptability of granular-ball computing. The GBRS can simultaneously represent both the PRS and the NRS, enabling it not only to deal with continuous data but also to use equivalence classes for knowledge representation. In addition, we propose an implementation algorithm of the GBRS by introducing the positive region of GBRS into the PRS framework. The experimental results on benchmark datasets demonstrate that the learning accuracy of the GBRS is significantly improved compared with the PRS and the traditional NRS. The GBRS also outperforms nine popular or state-of-the-art feature selection methods. We have open-sourced all the source code of this article at https://www.cquptshuyinxia.com/GBRS.html, https://github.com/syxiaa/GBRS.
5
Cheng D, Li Y, Xia S, Wang G, Huang J, Zhang S. A Fast Granular-Ball-Based Density Peaks Clustering Algorithm for Large-Scale Data. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:17202-17215. [PMID: 37566496 DOI: 10.1109/tnnls.2023.3300916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2023]
Abstract
The density peaks clustering algorithm (DP) has difficulty clustering large-scale data because it requires the distance matrix to compute the density and δ-distance for each object, which has O(n^2) time complexity. Granular ball (GB) is a coarse-grained representation of data. It is based on the fact that an object and its local neighbors have a similar distribution and have a high probability of belonging to the same class. It has been introduced into supervised learning by Xia et al. to improve the efficiency of supervised learning methods such as support vector machine, k-nearest neighbor classification, rough set, etc. Inspired by the idea of GB, we introduce it into unsupervised learning for the first time and propose a GB-based DP algorithm, called GB-DP. First, it generates GBs from the original data with an unsupervised partitioning method. Then, it defines the density of GBs, instead of the density of objects, according to the centers, radius, and distances between its members and centers, without setting any parameters. After that, it computes the distance between the centers of GBs as the distance between GBs and defines the δ-distance of GBs. Finally, it uses the GBs' density and δ-distance to plot the decision graph, employs the DP algorithm to cluster them, and expands the clustering result to the original data. Since there is no need to calculate the distance between any two objects and the number of GBs is far smaller than the scale of the data, it greatly reduces the running time of the DP algorithm. Compared with k-means, ball k-means, DP, DPC-KNN-PCA, FastDPeak, and DLORE-DP, GB-DP can get similar or even better clustering results in much less running time without setting any parameters. The source code is available at https://github.com/DongdongCheng/GB-DP.
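To make the cost mentioned in the abstract concrete, the following is a minimal sketch of the classic density peaks procedure on raw points: a full pairwise distance matrix, a cutoff-kernel density, the distance to the nearest denser point, and label propagation from the selected peaks. It is not the GB-DP implementation from the linked repository. The cutoff `d_c`, the `n_clusters` argument, and the automatic choice of centers by the product of density and distance (instead of reading a decision graph) are assumptions made for illustration.

```python
import math

def density_peaks(points, d_c, n_clusters):
    """Classic density peaks clustering on raw points (illustrative sketch)."""
    n = len(points)
    # Full pairwise distance matrix: this is the quadratic cost.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = {i: r for r, i in enumerate(order)}
    delta = [0.0] * n    # distance to the nearest point visited earlier
    parent = [-1] * n    # that nearest denser point
    for r, i in enumerate(order):
        if r == 0:
            delta[i] = max(dist[i])  # convention for the densest point
            continue
        j = min(order[:r], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j
    # Pick centers with the largest rho * delta (stand-in for the decision graph).
    centers = sorted(range(n),
                     key=lambda i: (-(rho[i] * delta[i]), rank[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every remaining point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
        (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels = density_peaks(data, d_c=1.0, n_clusters=2)
```

According to the abstract, GB-DP runs the same kind of procedure over granular balls rather than over individual objects, so the quadratic step applies to the much smaller number of balls.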
6
Guo J, Jiang Z, Ying J, Feng X, Zheng F. Optimal allocation model of port emergency resources based on the improved multi-objective particle swarm algorithm and TOPSIS method. MARINE POLLUTION BULLETIN 2024; 209:117214. [PMID: 39500175 DOI: 10.1016/j.marpolbul.2024.117214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Revised: 10/25/2024] [Accepted: 10/25/2024] [Indexed: 12/12/2024]
Abstract
Busy maritime traffic and the occurrence of ship accidents have led to growing recognition of the need for maritime emergency resource allocation. Port emergency resource allocation is of significant importance for maritime safety. This paper presents an optimized allocation model for port emergency resources based on the improved multi-objective particle swarm optimization (IMOPSO). The model introduces the crowding distance and improves the external archive update strategy. The particle inertia weight is adjusted and a dynamic mutation operator is incorporated. The entropy-weighted technique for order preference by similarity to an ideal solution (TOPSIS) method is also employed to identify the optimal solution. A comprehensive comparison with MOPSO is presented and discussed. Three metrics, generational distance (GD), spacing (SP) and delta indicator (Δ), were employed for performance evaluation. The results demonstrated that the proposed IMOPSO algorithm exhibited superior performance and robustness, with average values of GD = 0.0386, SP = 0.0023 and Δ = 0.6468 for ZDT test functions. The model's efficacy is further validated by a case study of oil spill dispersant configuration at Zhanjiang Port, China. Seven alternative schemes were obtained, among which the optimal scheme was selected by the entropy-weighted TOPSIS method. The overall cost could potentially be reduced by approximately 33.03%. The present study provides a reference for water pollutant control and environmental management in port waters.
Affiliation(s)
- Jianqun Guo
- State Key Laboratory of Maritime Technology and Safety, Wuhan University of Technology, Wuhan 430063, China; National Engineering Research Center for Water Transport Safety, Wuhan University of Technology, Wuhan 430063, China
- Zhonglian Jiang
- State Key Laboratory of Maritime Technology and Safety, Wuhan University of Technology, Wuhan 430063, China; National Engineering Research Center for Water Transport Safety, Wuhan University of Technology, Wuhan 430063, China.
- Jianglong Ying
- State Key Laboratory of Maritime Technology and Safety, Wuhan University of Technology, Wuhan 430063, China; National Engineering Research Center for Water Transport Safety, Wuhan University of Technology, Wuhan 430063, China
- Xuejun Feng
- Institute of Maritime Logistics and Green Development, Hohai University, Nanjing 210098, China
- Fengfan Zheng
- Wuhan Rules and Research Institute, China Classification Society, Wuhan 430022, China
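The abstract of entry 6 selects one scheme from a set of alternatives with an entropy-weighted TOPSIS step. The sketch below shows the standard textbook form of that technique: Shannon-entropy weights per criterion, vector normalization, and closeness to the ideal solution. The function names, the three-scheme cost matrix, and the choice of normalization are assumptions for illustration and are not taken from the paper.

```python
import math

def entropy_weights(matrix):
    """Criterion weights from the Shannon entropy of each column.

    Assumes at least two alternatives and strictly positive entries.
    """
    m, n = len(matrix), len(matrix[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [x / total for x in column]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        # A criterion that varies more across alternatives gets more weight.
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread <= 1e-12:          # all criteria constant: fall back to equal weights
        return [1.0 / n] * n
    return [d / spread for d in divergences]

def topsis(matrix, weights, benefit):
    """Closeness of each alternative to the ideal solution (1.0 is best)."""
    n = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)]
                for row in matrix]
    cols = [[row[j] for row in weighted] for j in range(n)]
    # benefit[j] is True when larger is better, False for cost criteria.
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in weighted:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom > 0 else 0.5)
    return scores

# Hypothetical schemes scored on two cost criteria (lower is better).
schemes = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
weights = entropy_weights(schemes)
scores = topsis(schemes, weights, benefit=[False, False])
best_scheme = max(range(len(schemes)), key=lambda i: scores[i])
```

Here the first scheme is cheapest on both criteria, so it coincides with the ideal solution and is selected. With real Pareto-front output the alternatives trade criteria off against each other, and the entropy weights decide how much each criterion counts.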
7
Zhang Q, Zhao F, Cheng Y, Gao M, Wang G, Xia S, Ding W. Effective Value Analysis of Fuzzy Similarity Relation in HQSS for Efficient Granulation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:12849-12863. [PMID: 37058387 DOI: 10.1109/tnnls.2023.3265310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Hierarchical quotient space structure (HQSS), as a typical description of granular computing (GrC), focuses on hierarchically granulating fuzzy data and mining hidden knowledge. The key step of constructing HQSS is to transform the fuzzy similarity relation into a fuzzy equivalence relation. However, on the one hand, the transformation process has high time complexity. On the other hand, it is difficult to mine knowledge directly from the fuzzy similarity relation due to its information redundancy, i.e., the sparsity of effective information. Therefore, this article focuses on proposing an efficient granulation approach for constructing HQSS by quickly extracting the effective values of the fuzzy similarity relation. First, the effective value and effective position of the fuzzy similarity relation are defined according to whether they are retained in the fuzzy equivalence relation. Second, the number and composition of effective values are presented to confirm which elements are effective values. Based on these theories, redundant information and sparse effective information in the fuzzy similarity relation can be completely distinguished. Next, both isomorphism and similarity between two fuzzy similarity relations are studied based on the effective value. The isomorphism between two fuzzy equivalence relations is also discussed based on the effective value. Then, an algorithm with low time complexity for extracting the effective values of the fuzzy similarity relation is introduced. On this basis, an algorithm for constructing HQSS is presented to realize efficient granulation of fuzzy data. The proposed algorithms can accurately extract effective information from the fuzzy similarity relation and construct the same HQSS as the fuzzy equivalence relation while greatly reducing the time complexity. Finally, experiments on 15 UCI datasets, 3 UKB datasets, and 5 image datasets are presented and analyzed to verify the effectiveness and efficiency of the proposed algorithm.
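For context, the transformation the abstract calls costly is commonly done with a max-min transitive closure: the similarity matrix is composed with itself until it stops changing, and cutting the resulting equivalence relation at different thresholds gives the layers of the hierarchy. The sketch below shows that baseline procedure, not the effective-value algorithm proposed in the paper. The function names and the 3x3 example matrix are assumptions.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]

def transitive_closure(similarity):
    """Square a reflexive, symmetric fuzzy relation until it is max-min transitive."""
    current = similarity
    while True:
        squared = max_min_compose(current, current)
        if squared == current:   # fixed point: fuzzy equivalence relation
            return current
        current = squared

def lambda_cut_partition(equivalence, lam):
    """Partition indices by the crisp relation equivalence[i][j] >= lam."""
    n = len(equivalence)
    blocks, seen = [], set()
    for i in range(n):
        if i in seen:
            continue
        block = [j for j in range(n) if equivalence[i][j] >= lam]
        seen.update(block)
        blocks.append(block)
    return blocks

# Hypothetical fuzzy similarity relation over three objects.
R = [[1.0, 0.8, 0.4],
     [0.8, 1.0, 0.5],
     [0.4, 0.5, 1.0]]
E = transitive_closure(R)
# One layer of the hierarchy per distinct threshold, coarse to fine.
layers = {lam: lambda_cut_partition(E, lam) for lam in (0.5, 0.8, 1.0)}
```

In this example only the entry between objects 0 and 2 changes (from 0.4 to 0.5), while the values 0.8 and 0.5 already present in `R` carry over unchanged into `E`. This loosely mirrors the abstract's notion of values that are retained in the equivalence relation, although the paper's formal definition of effective values is not reproduced here.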
8
Chen J, Yang S, Xiong J, Xiong Y. An effective emotion tendency perception model in empathic dialogue. PLoS One 2023; 18:e0282926. [PMID: 36897862 PMCID: PMC10004494 DOI: 10.1371/journal.pone.0282926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Accepted: 02/27/2023] [Indexed: 03/11/2023]
Abstract
The effectiveness of open-domain dialogue systems depends heavily on emotion. In dialogue systems, previous models primarily detected emotions by looking for emotional words embedded in sentences. However, they did not precisely quantify the association of all words with emotions, which has led to a certain bias. To overcome this issue, we propose an emotion tendency perception model. The model uses an emotion encoder to accurately quantify the emotional tendencies of all words. Meanwhile, it uses a shared fusion decoder to equip the decoder with the sentiment and semantic capabilities of the encoder. We conducted extensive evaluations on Empathetic Dialogue. Experimental results demonstrate its efficacy. Compared with the state of the art, our approach has distinctive advantages.
Affiliation(s)
- Jiancu Chen
- College of Computer Science and Engineering, Chongqing Three Gorges University, Chongqing, China
- Siyuan Yang
- College of Computer and Big Data, Fuzhou University, Fuzhou, China
- Jiang Xiong
- College of Computer Science and Engineering, Chongqing Three Gorges University, Chongqing, China
- Yiping Xiong
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China