1
Wu J, Yang B, Xue Z, Zhang X, Lin Z, Chen B. Fast multi-view clustering via correntropy-based orthogonal concept factorization. Neural Netw 2024; 173:106170. [PMID: 38387199 DOI: 10.1016/j.neunet.2024.106170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 01/15/2024] [Accepted: 02/08/2024] [Indexed: 02/24/2024]
Abstract
Owing to its ability to handle negative data and its promising clustering performance, concept factorization (CF), an improved version of non-negative matrix factorization, has recently been incorporated into multi-view clustering. Nevertheless, existing CF-based multi-view clustering methods still have the following issues: (1) they factorize directly in the original data space, so their efficiency is sensitive to the feature dimension; (2) they ignore the high degree of factorization freedom of standard CF, which may lead to non-unique factorizations and thereby reduce effectiveness; (3) the traditional robust norms they use cannot handle complex noise, which limits their robustness. To address these issues, we propose fast multi-view clustering via correntropy-based orthogonal concept factorization (FMVCCF). Specifically, FMVCCF executes factorization on a learned consensus anchor graph rather than directly decomposing the original data, lessening the dimensionality sensitivity. Then, a lightweight graph regularization term is incorporated to refine the factorization process with a low computational burden. Moreover, an improved multi-view correntropy-based orthogonal CF model is developed, which enhances effectiveness through the orthogonal constraint and robustness through the correntropy criterion. Extensive experiments demonstrate that FMVCCF achieves promising effectiveness and robustness on various real-world datasets with high efficiency.
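To illustrate why factorizing an anchor graph can reduce sensitivity to the feature dimension, the following is a minimal sketch of a single-view anchor graph. It is not the consensus anchor graph learned by FMVCCF: the function name `anchor_graph`, the random anchor selection, the Gaussian weighting, and the bandwidth heuristic are all assumptions made for illustration. The point is only that the resulting matrix has shape n x m (samples by anchors), whatever the feature dimension d is.

```python
import numpy as np

def anchor_graph(X, anchors, k=3):
    """Build a row-normalized n x m graph linking each sample to its k nearest anchors.

    X: n x d array (rows are samples); anchors: m x d array.
    Hypothetical helper for illustration, not the FMVCCF construction.
    """
    # Squared Euclidean distance from every sample to every anchor (n x m).
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    n, m = d2.shape
    idx = np.argsort(d2, axis=1)[:, :k]        # k nearest anchors per sample
    rows = np.arange(n)[:, None]
    sigma2 = d2[rows, idx].mean() + 1e-12      # assumed bandwidth heuristic
    Z = np.zeros((n, m))
    Z[rows, idx] = np.exp(-d2[rows, idx] / sigma2)
    Z /= Z.sum(axis=1, keepdims=True)          # each row sums to 1
    return Z

rng = np.random.default_rng(0)
X = rng.random((50, 100))                      # 50 samples, 100 features
anchors = X[rng.choice(50, size=8, replace=False)]  # anchors sampled at random
Z = anchor_graph(X, anchors, k=3)
```

Here `Z` is 50 x 8, so any factorization applied to it works with a matrix whose size depends on the number of samples and anchors rather than on the 100 input features.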
Affiliation(s)
- Jinghan Wu
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an 710049, China; National Engineering Research Center for Visual Information and Applications, Xi'an 710049, China; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China
- Ben Yang
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an 710049, China; National Engineering Research Center for Visual Information and Applications, Xi'an 710049, China; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China
- Zhiyuan Xue
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an 710049, China; National Engineering Research Center for Visual Information and Applications, Xi'an 710049, China; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China
- Xuetao Zhang
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an 710049, China; National Engineering Research Center for Visual Information and Applications, Xi'an 710049, China; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China
- Zhiping Lin
  - School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Badong Chen
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an 710049, China; National Engineering Research Center for Visual Information and Applications, Xi'an 710049, China; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China
2
Xu Y, Zhang W, Zheng X, Cai X. Combining Global-Constrained Concept Factorization and a Regularized Gaussian Graphical Model for Clustering Single-Cell RNA-seq Data. Interdiscip Sci 2024; 16:1-15. [PMID: 37815679 DOI: 10.1007/s12539-023-00587-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 09/14/2023] [Accepted: 09/17/2023] [Indexed: 10/11/2023]
Abstract
Single-cell RNA sequencing technology is one of the most cost-effective ways to uncover transcriptomic heterogeneity. With the rapid rise of this technology, enormous amounts of scRNA-seq data have been produced. Due to the high dimensionality, noise, sparsity and missing features of the available scRNA-seq data, accurately clustering these data for downstream analysis is a significant challenge. Many computational methods have been designed to address this issue; nevertheless, the efficacy of the available methods is still inadequate. In addition, most similarity-based methods require the number of clusters as input, which is difficult to determine in real applications. In this study, we developed a novel computational method, named GCFG, for clustering scRNA-seq data by considering both global and local information. This method characterizes the global properties of the data by applying concept factorization, and a regularized Gaussian graphical model is used to evaluate the local embedding relationship of the data. To learn the cell-cell similarity matrix, we integrated the two components and developed an iterative optimization algorithm. The categorization of single cells is obtained by applying Louvain, a modularity-based community discovery algorithm, to the similarity matrix. The performance of GCFG is assessed on 14 real scRNA-seq datasets in terms of ACC and ARI, and comparisons with 17 other competitive methods suggest that GCFG is effective and robust.
Affiliation(s)
- Yaxin Xu
  - School of Sciences, East China Jiaotong University, Nanchang, 330013, China
- Wei Zhang
  - School of Sciences, East China Jiaotong University, Nanchang, 330013, China
- Xiaoying Zheng
  - Operations Research and Planning Department, Naval University of Engineering, Wuhan, 430033, China
- Xianxian Cai
  - School of Sciences, East China Jiaotong University, Nanchang, 330013, China
3
Peng S, Yang Z, Nie F, Chen B, Lin Z. Correntropy based semi-supervised concept factorization with adaptive neighbors for clustering. Neural Netw 2022; 154:203-217. [PMID: 35907358 DOI: 10.1016/j.neunet.2022.07.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2022] [Revised: 07/11/2022] [Accepted: 07/16/2022] [Indexed: 10/17/2022]
Abstract
Concept factorization (CF) has proven effective for data clustering. In this paper, a novel and robust semi-supervised CF method, called correntropy based semi-supervised concept factorization with adaptive neighbors (CSCF), is proposed to improve clustering performance. On the one hand, CSCF adopts correntropy as the cost function to increase robustness to non-Gaussian noise and outliers, and combines two different types of supervised information simultaneously to obtain a compact low-dimensional representation of the original data. On the other hand, CSCF assigns adaptive neighbors to each data point to construct a good data similarity matrix, reducing sensitivity to the data. Moreover, a generalized version of CSCF is derived to broaden the range of clustering applications. The relationship between CSCF and several typical CF methods is also analyzed. Experimental results show that CSCF achieves better clustering performance than several state-of-the-art CF methods.
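The robustness claim can be illustrated with a small sketch of the sample correntropy estimator, compared against mean squared error on data with one gross outlier. This is not the CSCF objective itself: the helper names, the unnormalized Gaussian kernel, and the bandwidth `sigma=1.0` are assumptions chosen for illustration.

```python
import numpy as np

def correntropy(a, b, sigma=1.0):
    """Sample correntropy of a and b with an unnormalized Gaussian kernel (assumed form)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.mean(np.exp(-diff ** 2 / (2.0 * sigma ** 2))))

def mean_squared_error(a, b):
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.mean(diff ** 2))

target = np.zeros(10)
clean = np.zeros(10)
corrupted = clean.copy()
corrupted[0] = 100.0            # a single gross outlier

mse_clean = mean_squared_error(target, clean)
mse_corrupted = mean_squared_error(target, corrupted)
sim_clean = correntropy(target, clean)
sim_corrupted = correntropy(target, corrupted)
```

The outlier moves the mean squared error from 0 to 1000, while the correntropy only drops from 1.0 to 0.9, because each sample contributes at most 1 to the kernel average. This bounded per-sample influence is the usual intuition for why correntropy-based costs tolerate non-Gaussian noise and outliers.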
Affiliation(s)
- Siyuan Peng
  - School of Information Engineering, Guangdong University of Technology, 510006, China
- Zhijing Yang
  - School of Information Engineering, Guangdong University of Technology, 510006, China
- Feiping Nie
  - School of Computer Science and Center for OPTical IMagery Analysis and Learning, Northwestern Polytechnical University, Xi'an 710072, China
- Badong Chen
  - Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China
- Zhiping Lin
  - School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore
4
Wang J, Lu CH, Kong XZ, Dai LY, Yuan S, Zhang X. Multi-view manifold regularized compact low-rank representation for cancer samples clustering on multi-omics data. BMC Bioinformatics 2022; 22:334. [PMID: 35057729 PMCID: PMC8772048 DOI: 10.1186/s12859-021-04220-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2021] [Accepted: 05/27/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The identification of cancer types is of great significance for the early diagnosis and clinical treatment of cancer. Clustering cancer samples is an important means of identifying cancer types and has received much attention in bioinformatics. The purpose of cancer clustering is to find the expression patterns of different cancer types, so that samples with similar expression patterns can be grouped into the same type. To improve the accuracy and reliability of cancer clustering, many clustering methods have begun to focus on the integrated analysis of cancer multi-omics data. Methods based on multi-omics data have advantages over those using single-omics data. However, the high heterogeneity and noise of cancer multi-omics data pose a great challenge to multi-omics analysis methods.
RESULTS: In this study, to extract more complementary information from cancer multi-omics data for cancer clustering, we propose a low-rank subspace clustering method called multi-view manifold regularized compact low-rank representation (MmCLRR). In MmCLRR, each omics data type is regarded as a view, and a consistent subspace representation is learned by imposing a consistency constraint on the low-rank affinity matrix of each view to balance the agreement between different views. Moreover, manifold regularization and concept factorization are introduced into our method. With concept factorization, the dictionary can be updated during learning, which greatly improves the subspace learning ability of low-rank representation. We adopt the linearized alternating direction method with adaptive penalty to solve the optimization problem of MmCLRR.
CONCLUSIONS: We apply MmCLRR to the clustering of cancer samples based on multi-omics data, and the clustering results show that our method outperforms existing multi-view methods.
Affiliation(s)
- Juan Wang
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Cong-Hai Lu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Xiang-Zhen Kong
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Ling-Yun Dai
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Shasha Yuan
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Xiaofeng Zhang
  - School of Information and Electrical Engineering, Ludong University, Yantai, 264025 China
5
He Y, Lu H, Huang L, Xie S. Pairwise constrained concept factorization for data representation. Neural Netw 2014; 52:1-17. [PMID: 24413280 DOI: 10.1016/j.neunet.2013.12.007] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 11/14/2012] [Revised: 09/27/2013] [Accepted: 12/18/2013] [Indexed: 11/30/2022]
Abstract
Concept factorization (CF) is a variant of non-negative matrix factorization (NMF). In CF, each concept is represented by a linear combination of data points, and each data point is represented by a linear combination of concepts. More specifically, each concept is represented by more than one data point with different weights, and each data point carries weights, called memberships, that represent its degree of belonging to each concept. However, CF is an unsupervised method that makes no use of prior information about the data. In this paper, we propose a novel semi-supervised concept factorization method, called Pairwise Constrained Concept Factorization (PCCF), which incorporates pairwise constraints into the CF framework. Data points with pairwise must-link constraints should share the same class label as far as possible, while data points with pairwise cannot-link constraints should have different class labels as far as possible. Incorporating the pairwise constraints significantly enhances the learning quality of CF. Experimental results show the effectiveness of the proposed method in comparison with state-of-the-art algorithms on several real-world applications.
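The description of plain CF above can be sketched in a few lines of NumPy. This is a minimal illustration of unsupervised CF only, not PCCF: it omits the pairwise constraints, assumes non-negative input data so that the Gram matrix is non-negative, and uses the commonly cited multiplicative update rules for minimizing the Frobenius error of X ≈ X W Vᵀ. The function names and the toy data are assumptions for illustration.

```python
import numpy as np

def concept_factorization(X, k, n_iter=200, eps=1e-10, seed=0):
    """Approximate X (d x n, columns are data points) as X @ W @ V.T.

    Sketch of plain CF with multiplicative updates; assumes X >= 0.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    K = X.T @ X                  # n x n Gram matrix of the data points
    W = rng.random((n, k))       # concept weights: concepts are columns of X @ W
    V = rng.random((n, k))       # memberships of each data point in each concept
    for _ in range(n_iter):
        # Element-wise multiplicative updates keep W and V non-negative.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    return W, V

def reconstruction_error(X, W, V):
    return float(np.linalg.norm(X - X @ W @ V.T))

X = np.random.default_rng(1).random((6, 20))   # toy data: 20 points, 6 features
W1, V1 = concept_factorization(X, k=3, n_iter=1)
W, V = concept_factorization(X, k=3, n_iter=200)
err_start = reconstruction_error(X, W1, V1)
err_end = reconstruction_error(X, W, V)
```

Each column of `X @ W` is a concept expressed as a weighted combination of data points, and each row of `V` holds one data point's memberships, so `np.argmax(V, axis=1)` gives a simple cluster assignment. Because the updates only touch the Gram matrix `K`, the same loop could in principle be run on a kernel matrix instead.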
Affiliation(s)
- Yangcheng He
  - MOE-Microsoft Laboratory for Intelligent Computing and Intelligent Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, PR China
- Hongtao Lu
  - MOE-Microsoft Laboratory for Intelligent Computing and Intelligent Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, PR China
- Lei Huang
  - MOE-Microsoft Laboratory for Intelligent Computing and Intelligent Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, PR China
- Saining Xie
  - MOE-Microsoft Laboratory for Intelligent Computing and Intelligent Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, PR China