1
Li Y, Luo P, Lu Y, Wu FX. Identifying cell types from single-cell data based on similarities and dissimilarities between cells. BMC Bioinformatics 2021; 22:255. [PMID: 34006217 PMCID: PMC8132444 DOI: 10.1186/s12859-020-03873-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/04/2020] [Accepted: 11/09/2020] [Indexed: 12/15/2022] Open
Abstract
Background: With the development of single-cell sequencing technology, revealing homogeneity and heterogeneity between cells has become a new area of computational systems biology research. However, clustering cell types becomes more complex because of the mutual penetration between different types of cells and the instability of gene expression. One way of overcoming this problem is to group similar, related single cells together by means of various clustering analysis methods. Although some methods, such as spectral clustering, do well in identifying cell types, they consider only the similarities between cells and ignore the influence of dissimilarities on clustering results. This may limit the performance of most conventional clustering algorithms in identifying clusters, so special methods need to be developed for high-dimensional sparse categorical data.
Results: Inspired by the observation that cells of the same type have similar gene expression patterns, whereas cells of different types have dissimilar gene expression patterns, we improve the existing spectral clustering method for single-cell data by basing it on both similarities and dissimilarities between cells. The method first measures the similarity and dissimilarity among cells, then constructs the incidence matrix by fusing the similarity matrix with the dissimilarity matrix, and finally uses the eigenvalues of the incidence matrix to perform dimensionality reduction and employs the K-means algorithm in the low-dimensional space to achieve clustering. The proposed improved spectral clustering method is compared with the conventional spectral clustering method in recognizing cell types on several real single-cell RNA-seq datasets.
Conclusions: In summary, we show that adding intercellular dissimilarity can effectively improve accuracy and achieve robustness, and that the improved spectral clustering method outperforms the traditional spectral clustering method in grouping cells.
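The pipeline described in this abstract (measure similarity and dissimilarity, fuse them, reduce dimensionality spectrally, then run K-means) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the similarity measure, the dissimilarity measure, or the fusion rule, so the Pearson correlation, the rescaled Euclidean distance, the rule `S - alpha * dissim`, the parameter `alpha`, and the use of a normalised graph Laplacian are all hypothetical choices, as are the function names.

```python
import numpy as np


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's K-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest already-chosen center.
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def fused_spectral_clustering(X, k, alpha=0.5):
    """Cluster the rows of X (cells x genes) into k groups."""
    X = np.asarray(X, dtype=float)
    # Similarity (assumed): Pearson correlation between cells, negatives set to 0.
    S = np.clip(np.corrcoef(X), 0.0, None)
    # Dissimilarity (assumed): pairwise Euclidean distance rescaled to [0, 1].
    sq = (X ** 2).sum(axis=1)
    dissim = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))
    dissim = dissim / max(dissim.max(), 1e-12)
    # Fusion (hypothetical rule): penalise similarity by dissimilarity.
    W = np.clip(S - alpha * dissim, 0.0, None)
    np.fill_diagonal(W, 0.0)
    # Symmetric normalised Laplacian: I - Deg^{-1/2} W Deg^{-1/2}.
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    # Embed each cell with the eigenvectors of the k smallest eigenvalues.
    _, vecs = np.linalg.eigh(lap)
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return kmeans(U, k)
```

Setting `alpha=0` in this sketch reduces it to ordinary similarity-only spectral clustering, which gives a simple way to compare the two variants on the same data.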
Affiliation(s)
- Yuanyuan Li
  - School of Mathematics and Physics, Wuhan Institute of Technology, No.206, Guanggu 1st road, Wuhan, 430205, Hubei, China
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Ping Luo
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Yi Lu
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
  - Department of Mechanical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
  - Department of Computer Science, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
2
Zhang W, Li Y, Zou X. SCCLRR: A Robust Computational Method for Accurate Clustering Single Cell RNA-Seq Data. IEEE J Biomed Health Inform 2021; 25:247-256. [PMID: 32356764 DOI: 10.1109/jbhi.2020.2991172] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]
Abstract
Single-cell RNA transcriptome data present a tremendous opportunity for studying cellular heterogeneity. Identifying subpopulations based on scRNA-seq data has been a hot topic in recent years. Although many researchers have focused on designing elegant computational methods for identifying new cell types, the performance of these methods is still unsatisfactory due to the high dimensionality, sparsity, and noise of scRNA-seq data. In this study, we propose a new cell type detection method, named SCCLRR, that learns a robust and accurate similarity matrix. The method simultaneously captures both global and local intrinsic properties of the data based on a low-rank representation (LRR) framework. The integrated normalized Euclidean distance and cosine similarity are used to balance the intrinsic linear and nonlinear manifold of the data in the local regularization term. To solve the non-convex optimization model, we present an iterative optimization procedure using the alternating direction method of multipliers (ADMM) algorithm. We evaluate the performance of the SCCLRR method on nine real scRNA-seq datasets and compare it with seven state-of-the-art methods. The simulation results show that SCCLRR outperforms the other methods and is robust and effective for clustering scRNA-seq data. (The code of SCCLRR is freely available for academic use at https://github.com/wzhangwhu/SCCLRR.)
3
Zhu X, Li J, Li HD, Xie M, Wang J. Sc-GPE: A Graph Partitioning-Based Cluster Ensemble Method for Single-Cell. Front Genet 2020; 11:604790. [PMID: 33384718 PMCID: PMC7770236 DOI: 10.3389/fgene.2020.604790] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/10/2020] [Accepted: 11/23/2020] [Indexed: 01/23/2023] Open
Abstract
Clustering is an efficient way to analyze single-cell RNA sequencing data. It is commonly used to identify cell types, which can help in understanding cell differentiation processes. However, different single-cell clustering methods can produce different clustering results, sometimes leading to conflicting conclusions, so biologists often fail to obtain the right clustering results and to interpret their biological significance. The cluster ensemble strategy can be an effective solution to this problem. As graph partitioning-based clustering methods are good at clustering single-cell data, we developed Sc-GPE, a novel cluster ensemble method combining five single-cell graph partitioning-based clustering methods. The five methods are SNN-cliq, PhenoGraph, SC3, SSNN-Louvain, and MPGS-Louvain. In Sc-GPE, a consensus matrix is constructed from the five clustering solutions by calculating the probability that each pair of cells is assigned to the same cluster. This solves a problem of the hypergraph-based ensemble approach: the individual clustering methods assign different cluster labels, which makes it difficult to find corresponding cluster labels across all methods. Then, to distinguish the importance of each method in the clustering ensemble, a weighted consensus matrix is constructed using an importance score strategy. Finally, hierarchical clustering is performed on the weighted consensus matrix to cluster the cells. To evaluate the performance, we compared Sc-GPE with the individual clustering methods and the state-of-the-art SAME-clustering on 12 single-cell RNA-seq datasets. The results show that Sc-GPE obtained the best average performance and achieved the highest NMI and ARI values on five datasets.
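The consensus-matrix step in this abstract can be illustrated with a small sketch. It shows why co-assignment counting sidesteps the label-matching problem: only whether two cells share a label within one clustering matters, never the label names themselves. The function name `consensus_matrix` and the `weights` argument are hypothetical; the paper's importance score strategy is not described in the abstract, so the weights here are simply supplied by the caller.

```python
def consensus_matrix(labelings, weights=None):
    """Weighted fraction of clusterings that put cells i and j together.

    labelings: list of label sequences, one per clustering method,
               each of length n_cells (labels may be any hashable values).
    weights:   optional per-method importance (assumed given by the caller).
    """
    n = len(labelings[0])
    if any(len(labels) != n for labels in labelings):
        raise ValueError("every labeling must cover the same cells")
    if weights is None:
        weights = [1.0] * len(labelings)
    total = float(sum(weights))
    M = [[0.0] * n for _ in range(n)]
    for labels, w in zip(labelings, weights):
        for i in range(n):
            for j in range(n):
                # Compare labels only within one clustering, so label names
                # never need to be matched across methods.
                if labels[i] == labels[j]:
                    M[i][j] += w / total
    return M
```

In a full pipeline, `1 - M[i][j]` could serve as a distance for a hierarchical clustering step, which is the final stage the abstract mentions.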
Affiliation(s)
- Xiaoshu Zhu
  - School of Computer Science and Engineering, Yulin Normal University, Yulin, China
  - Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Jian Li
  - School of Computer Science and Engineering, Yulin Normal University, Yulin, China
- Hong-Dong Li
  - Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Miao Xie
  - School of Computer Science and Engineering, Yulin Normal University, Yulin, China
- Jianxin Wang
  - Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
4
Shi H, Yan KK, Ding L, Qian C, Chi H, Yu J. Network Approaches for Dissecting the Immune System. iScience 2020; 23:101354. [PMID: 32717640 PMCID: PMC7390880 DOI: 10.1016/j.isci.2020.101354] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 05/09/2020] [Revised: 06/21/2020] [Accepted: 07/08/2020] [Indexed: 02/06/2023] Open
Abstract
The immune system is a complex biological network composed of hierarchically organized genes, proteins, and cellular components that combat external pathogens and monitor the onset of internal disease. To meet and ultimately defeat these challenges, the immune system orchestrates an exquisitely complex interplay of numerous cells, often with highly specialized functions, in a tissue-specific manner. One of the major methodologies of systems immunology is to measure quantitatively the components and interaction levels in the immunologic networks to construct a computational network and predict the response of the components to perturbations. The recent advances in high-throughput sequencing techniques have provided us with a powerful approach to dissecting the complexity of the immune system. Here we summarize the latest progress in integrating omics data and network approaches to construct networks and to infer the underlying signaling and transcriptional landscape, as well as cell-cell communication, in the immune system, with a focus on hematopoiesis, adaptive immunity, and tumor immunology. Understanding the network regulation of immune cells has provided new insights into immune homeostasis and disease, with important therapeutic implications for inflammation, cancer, and other immune-mediated disorders.
Affiliation(s)
- Hao Shi
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Koon-Kiu Yan
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Liang Ding
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Chenxi Qian
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Hongbo Chi
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Jiyang Yu
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
5
Zhang J, Feng J, Wu FX. Finding Community of Brain Networks Based on Neighbor Index and DPSO with Dynamic Crossover. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191017100657] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
Background: Brain networks provide an effective way to analyze brain function and to detect brain disease. In brain networks, there exist some important neural unit modules, which contain meaningful biological insights.
Objective: Therefore, we need to find the optimal neural unit modules effectively and efficiently.
Method: In this study, we propose a novel algorithm to find community modules of brain networks by combining Neighbor Index and Discrete Particle Swarm Optimization (DPSO) with dynamic crossover, abbreviated as NIDPSO. This study differs from existing ones in that NIDPSO is the first method proposed to find community modules of brain networks and does not need the number of communities to be predefined or estimated in advance.
Results: We generate a neighbor index table to reduce or eliminate ineffective searches and design a novel coding by which we can determine the community without computing the distances among vertices in brain networks. Furthermore, dynamic crossover and mutation operators are designed to modify NIDPSO so as to alleviate the drawback of premature convergence in DPSO.
Conclusion: Numerical results on several resting-state functional MRI brain networks demonstrate that NIDPSO outperforms or is comparable with other competing methods in terms of modularity, coverage and conductance metrics.
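For readers unfamiliar with the three evaluation metrics named here, the sketch below computes them for an undirected, unweighted graph using their common textbook definitions (Newman modularity, fraction of intra-community edges for coverage, and cut size over the smaller volume for conductance). These definitions are an assumption: the abstract does not state which variants the paper uses, and real brain networks may be weighted. The function name `partition_metrics` is hypothetical, and self-loops are not handled.

```python
from collections import defaultdict


def partition_metrics(edges, community_of):
    """Return (modularity, coverage, conductance per community).

    edges:        list of (u, v) pairs of an undirected, unweighted graph.
    community_of: dict mapping each node to its community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both ends in the community
    cut = defaultdict(int)       # edges with exactly one end in the community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            internal[cu] += 1
        else:
            cut[cu] += 1
            cut[cv] += 1
    volume = defaultdict(int)    # sum of node degrees inside the community
    for node, d in degree.items():
        volume[community_of[node]] += d
    communities = set(community_of.values())
    # Newman modularity: Q = sum_c [ L_c / m - (vol_c / 2m)^2 ].
    modularity = sum(
        internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in communities
    )
    # Coverage: share of all edges that stay inside a community.
    coverage = sum(internal[c] for c in communities) / m
    # Conductance: cut edges over the smaller of the two side volumes.
    conductance = {}
    for c in communities:
        denom = min(volume[c], 2 * m - volume[c])
        conductance[c] = cut[c] / denom if denom else 0.0
    return modularity, coverage, conductance
```

Higher modularity and coverage, and lower conductance, indicate a partition whose communities are denser inside than between each other.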
Affiliation(s)
- Jie Zhang
  - School of Computer Science and Engineering; Guangxi Colleges and Universities Key Lab of Complex System Optimization and Big Data Processing, Yulin Normal University, Yulin 537000, Guangxi, China
- Junhong Feng
  - School of Computer Science and Engineering; Guangxi Colleges and Universities Key Lab of Complex System Optimization and Big Data Processing, Yulin Normal University, Yulin 537000, Guangxi, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, S7N5A9, Saskatchewan, Canada
6
Zhang J, Feng J, Yang Y, Wang JH. Finding Community Modules for Brain Networks Combined Uniform Design with Fruit Fly Optimization Algorithm. Interdiscip Sci 2020; 12:178-192. [PMID: 32424670 DOI: 10.1007/s12539-020-00371-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/24/2019] [Revised: 02/13/2020] [Accepted: 04/24/2020] [Indexed: 11/29/2022]
Abstract
There is a huge number of neural units in brain networks. Some of the neural units are tightly connected and form neural unit modules. These unit modules are helpful for disease detection and targeted therapy. A good method can find neural unit modules accurately and effectively. This study proposes a new algorithm to analyze a brain network and obtain its neural unit modules. The proposed algorithm combines the uniform design and the fruit fly optimization algorithm (FOA); therefore, we call it UFOA. It makes the most of the respective merits of the uniform design and the FOA, so as to obtain feasible solutions scattered uniformly over the vector domain and find the optimal solution as quickly as possible. In contrast to existing methods, this is the first work to integrate FOA with the uniform design and the first to use UFOA to find unit modules in brain networks. 37 TD resting-state functional MRI brain networks are used to test the performance of UFOA. The experimental results show that UFOA is clearly superior to the other five methods in terms of modularity and is comparable with them in terms of conductance. Additionally, the comparative analysis of UFOA and FOA demonstrates that the uniform design contributes to the improvement of UFOA.
Affiliation(s)
- Jie Zhang
  - School of Computer Science and Engineering, Yulin Normal University, Yulin, 537000, Guangxi, People's Republic of China
- Junhong Feng
  - School of Computer Science and Engineering, Yulin Normal University, Yulin, 537000, Guangxi, People's Republic of China
- Yifang Yang
  - College of Science of Xi'an Shiyou University, Xi'an, 710065, Shaanxi, People's Republic of China
- Jian-Hong Wang
  - School of Computer Science and Engineering, Yulin Normal University, Yulin, 537000, Guangxi, People's Republic of China
7
Zheng R, Li M, Liang Z, Wu FX, Pan Y, Wang J. SinNLRR: a robust subspace clustering method for cell type detection by non-negative and low-rank representation. Bioinformatics 2019; 35:3642-3650. [DOI: 10.1093/bioinformatics/btz139] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 09/08/2018] [Revised: 02/13/2019] [Accepted: 02/24/2019] [Indexed: 12/29/2022] Open
Abstract
Motivation
The development of single-cell RNA-sequencing (scRNA-seq) provides a new perspective for studying biological problems at the single-cell level. One of the key issues in scRNA-seq analysis is to resolve the heterogeneity and diversity of cells, that is, to cluster the cells into several groups. However, many existing clustering methods were designed to analyze bulk RNA-seq data, so new scRNA-seq clustering methods are urgently needed. Moreover, the high noise in scRNA-seq data also poses many challenges for computational methods.
Results
In this study, we propose a novel scRNA-seq cell type detection method based on similarity learning, called SinNLRR. The method is motivated by the self-expression of cells within the same group. Specifically, we impose a non-negative and low-rank structure on the similarity matrix. We apply the alternating direction method of multipliers to solve the optimization problem and propose an adaptive penalty selection method to avoid sensitivity to the parameters. The learned similarity matrix can be combined with spectral clustering, with t-distributed stochastic neighbor embedding for visualization, and with the Laplace score for prioritizing gene markers. Compared with other scRNA-seq clustering methods, our method achieves more robust and accurate results on different datasets.
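As a rough illustration of what a non-negative, low-rank self-expression problem can look like, one generic formulation is written below. It is an assumption based on the standard low-rank representation literature and may differ from the exact objective in the paper; the symbols $X$ (expression matrix with one column per cell), $Z$ (cell-by-cell coefficient matrix), and $\lambda$ (trade-off parameter) are introduced here for illustration only.

```latex
\min_{Z} \; \lVert X - XZ \rVert_F^2 + \lambda \lVert Z \rVert_*
\quad \text{subject to} \quad Z \ge 0
```

Here $\lVert \cdot \rVert_F$ is the Frobenius norm and $\lVert \cdot \rVert_*$ is the nuclear norm (the sum of singular values), a convex surrogate for rank. Under this reading, each cell is reconstructed as a non-negative combination of other cells, and a symmetrized version of $Z$, for example $(Z + Z^{\top})/2$, could then serve as the similarity matrix passed to spectral clustering.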
Availability and implementation
Our MATLAB implementation of SinNLRR is available at https://github.com/zrq0123/SinNLRR.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ruiqing Zheng
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Zhenlan Liang
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Fang-Xiang Wu
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada
- Yi Pan
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - Department of Computer Science, Georgia State University, Atlanta, GA, USA
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, China