Reference Citation Analysis

Cited by Other Article(s)

Wei X, Wu J, Li G, Liu J, Wu X, He C. scPEDSSC: proximity enhanced deep sparse subspace clustering method for scRNA-seq data. PLoS Comput Biol 2025;21:e1012924. [PMID: 40294099 PMCID: PMC12036905 DOI: 10.1371/journal.pcbi.1012924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Accepted: 03/03/2025] [Indexed: 04/30/2025]

Lan J, Zhuo X, Ye S, Deng J. A semi-supervised non-negative matrix factorization model for scRNA-seq data analysis. Appl Soft Comput 2025;174:112982. [DOI: 10.1016/j.asoc.2025.112982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]

Liu X, Chapple RH, Bennett D, Wright WC, Sanjali A, Culp E, Zhang Y, Pan M, Geeleher P. CSI-GEP: A GPU-based unsupervised machine learning approach for recovering gene expression programs in atlas-scale single-cell RNA-seq data. Cell Genom 2025;5:100739. [PMID: 39788105 PMCID: PMC11770216 DOI: 10.1016/j.xgen.2024.100739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Revised: 11/06/2024] [Accepted: 12/13/2024] [Indexed: 01/12/2025]

Anter JM, Yakimovich A. Artificial Intelligence Methods in Infection Biology Research. Methods Mol Biol 2025;2890:291-333. [PMID: 39890733 DOI: 10.1007/978-1-0716-4326-6_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2025]

Xu Y, Lv D, Zou X, Wu L, Xu X, Zhao X. BFAST: joint dimension reduction and spatial clustering with Bayesian factor analysis for zero-inflated spatial transcriptomics data. Brief Bioinform 2024;25:bbae594. [PMID: 39552067 PMCID: PMC11570543 DOI: 10.1093/bib/bbae594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 09/03/2024] [Accepted: 11/01/2024] [Indexed: 11/19/2024]

Rana V, Peng J, Pan C, Lyu H, Cheng A, Kim M, Milenkovic O. Interpretable online network dictionary learning for inferring long-range chromatin interactions. PLoS Comput Biol 2024;20:e1012095. [PMID: 38753877 PMCID: PMC11135774 DOI: 10.1371/journal.pcbi.1012095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 05/29/2024] [Accepted: 04/20/2024] [Indexed: 05/18/2024]

Abstract

Dictionary learning (DL), implemented via matrix factorization (MF), is commonly used in computational biology to tackle ubiquitous clustering problems. The method is favored due to its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability in terms of real biological data. Additionally, they are not optimized for graph-structured data and hence often fail to handle such data in a scalable manner. To address these limitations, we propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL). Like classical DL algorithms, online cvxNDL is implemented via MF; unlike them, it is designed to handle extremely large datasets by virtue of its online nature. Importantly, it enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to data with a network structure by incorporating specialized subnetwork sampling techniques. To demonstrate the utility of our approach, we apply online cvxNDL to 3D-genome RNAPII ChIA-Drop data with the goal of identifying important long-range interaction patterns (long-range dictionary elements). ChIA-Drop probes higher-order interactions and produces data in the form of hypergraphs whose nodes represent genomic fragments and whose hyperedges represent observed physical contacts. Our hypergraph model analysis has the objective of creating an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Through the use of dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions. To accomplish this, we focus on RNAPII-enriched ChIA-Drop data from Drosophila melanogaster S2 cell lines. Our results offer two key insights.
First, we demonstrate that online cvxNDL retains the accuracy of classical DL (MF) methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology (GO) enrichment analysis and perform multiple RNA coexpression studies.
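
The interpretability property described in this abstract, dictionary elements expressed as convex combinations of real measurements, can be illustrated with a minimal sketch. This is not the online cvxNDL algorithm: the data matrix, the raw weights, and the helper name `project_to_simplex` are hypothetical, and the projection is the standard sort-based Euclidean projection onto the probability simplex.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]                        # entries sorted in descending order
    css = np.cumsum(u) - 1.0                    # shifted cumulative sums
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]   # last index meeting the condition
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

rng = np.random.default_rng(0)
X = rng.random((5, 8))           # hypothetical data: 5 features x 8 measurements
raw = rng.normal(size=(8, 3))    # unconstrained weights for 3 dictionary elements

# Each column of A is nonnegative and sums to 1 (a point on the simplex).
A = np.column_stack([project_to_simplex(raw[:, j]) for j in range(3)])

# Each column of D is therefore a convex combination of the columns of X,
# so it lies inside the convex hull of the real measurements.
D = X @ A
```

Because every dictionary element is a weighted average of observed columns, the nonzero entries of a column of `A` identify which measurements that element summarizes.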

Xu Y, Zhang W, Zheng X, Cai X. Combining Global-Constrained Concept Factorization and a Regularized Gaussian Graphical Model for Clustering Single-Cell RNA-seq Data. Interdiscip Sci 2024;16:1-15. [PMID: 37815679 DOI: 10.1007/s12539-023-00587-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 09/14/2023] [Accepted: 09/17/2023] [Indexed: 10/11/2023]

Zhang H, Lu X, Lu B, Gullo G, Chen L. Measuring the composition of the tumor microenvironment with transcriptome analysis: past, present and future. Future Oncol 2024;20:1207-1220. [PMID: 38362731 PMCID: PMC11318690 DOI: 10.2217/fon-2023-0658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 01/24/2024] [Indexed: 02/17/2024]

Lan W, Liu M, Chen J, Ye J, Zheng R, Zhu X, Peng W. JLONMFSC: Clustering scRNA-seq data based on joint learning of non-negative matrix factorization and subspace clustering. Methods 2024;222:1-9. [PMID: 38128706 DOI: 10.1016/j.ymeth.2023.11.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 11/07/2023] [Accepted: 11/29/2023] [Indexed: 12/23/2023]

Johnson JAI, Tsang AP, Mitchell JT, Zhou DL, Bowden J, Davis-Marcisak E, Sherman T, Liefeld T, Loth M, Goff LA, Zimmerman JW, Kinny-Köster B, Jaffee EM, Tamayo P, Mesirov JP, Reich M, Fertig EJ, Stein-O'Brien GL. Inferring cellular and molecular processes in single-cell data with non-negative matrix factorization using Python, R and GenePattern Notebook implementations of CoGAPS. Nat Protoc 2023;18:3690-3731. [PMID: 37989764 PMCID: PMC10961825 DOI: 10.1038/s41596-023-00892-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Accepted: 07/21/2023] [Indexed: 11/23/2023]

Affiliation(s)

Jeanette A I Johnson Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
Ashley P Tsang Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
Jacob T Mitchell Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
David L Zhou Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA
Julia Bowden Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
Emily Davis-Marcisak Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
Thomas Sherman Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
Ted Liefeld Department of Medicine, Moores Cancer Center, University of California San Diego, San Diego, CA, USA
Melanie Loth Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
Loyal A Goff Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA Kavli Neurodiscovery Institute, Johns Hopkins University, Baltimore, MD, USA Single Cell Training and Analysis Center, Johns Hopkins University, Baltimore, MD, USA
Jacquelyn W Zimmerman Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
Ben Kinny-Köster Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Elizabeth M Jaffee Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
Pablo Tamayo Department of Medicine, Moores Cancer Center, University of California San Diego, San Diego, CA, USA
Jill P Mesirov Department of Medicine, Moores Cancer Center, University of California San Diego, San Diego, CA, USA
Michael Reich Department of Medicine, Moores Cancer Center, University of California San Diego, San Diego, CA, USA
Elana J Fertig Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA. Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA. Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA. Single Cell Training and Analysis Center, Johns Hopkins University, Baltimore, MD, USA. Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA.
Genevieve L Stein-O'Brien Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA. Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA. Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA. Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA. Kavli Neurodiscovery Institute, Johns Hopkins University, Baltimore, MD, USA. Single Cell Training and Analysis Center, Johns Hopkins University, Baltimore, MD, USA.

Yoon SH, Nam JW. Clustering malignant cell states using universally variable genes. Brief Bioinform 2023;25:bbad460. [PMID: 38084922 PMCID: PMC10783859 DOI: 10.1093/bib/bbad460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/20/2023] [Accepted: 11/22/2023] [Indexed: 12/18/2023]

Li R, Guan J, Wang Z, Zhou S. A new and effective two-step clustering approach for single cell RNA sequencing data. BMC Genomics 2023;23:864. [PMID: 37946133 PMCID: PMC10636845 DOI: 10.1186/s12864-023-09577-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2021] [Accepted: 08/10/2023] [Indexed: 11/12/2023]

Wu W, Zhang W, Hou W, Ma X. Multi-View Clustering With Graph Learning for scRNA-Seq Data. IEEE/ACM Trans Comput Biol Bioinform 2023;20:3535-3546. [PMID: 37486829 DOI: 10.1109/tcbb.2023.3298334] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 07/26/2023]

Carbonetto P, Luo K, Sarkar A, Hung A, Tayeb K, Pott S, Stephens M. GoM DE: interpreting structure in sequence count data with differential expression analysis allowing for grades of membership. Genome Biol 2023;24:236. [PMID: 37858253 PMCID: PMC10588049 DOI: 10.1186/s13059-023-03067-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/03/2023] [Accepted: 09/20/2023] [Indexed: 10/21/2023]

Wrobel TJ, Brilhaus D, Stefanski A, Stühler K, Weber APM, Linka N. Mapping the castor bean endosperm proteome revealed a metabolic interaction between plastid, mitochondria, and peroxisomes to optimize seedling growth. Front Plant Sci 2023;14:1182105. [PMID: 37868318 PMCID: PMC10588648 DOI: 10.3389/fpls.2023.1182105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 08/07/2023] [Indexed: 10/24/2023]

Gunawan I, Vafaee F, Meijering E, Lock JG. An introduction to representation learning for single-cell data analysis. Cell Rep Methods 2023;3:100547. [PMID: 37671013 PMCID: PMC10475795 DOI: 10.1016/j.crmeth.2023.100547] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 09/07/2023]

Zhang H, Lu X, Lu B, Chen L. scGEM: Unveiling the Nested Tree-Structured Gene Co-Expressing Modules in Single Cell Transcriptome Data. Cancers (Basel) 2023;15:4277. [PMID: 37686554 PMCID: PMC10486867 DOI: 10.3390/cancers15174277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 08/22/2023] [Accepted: 08/25/2023] [Indexed: 09/10/2023]

Su Y, Lin R, Wang J, Tan D, Zheng C. Denoising adaptive deep clustering with self-attention mechanism on single-cell sequencing data. Brief Bioinform 2023;24:7008799. [PMID: 36715275 DOI: 10.1093/bib/bbad021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2022] [Revised: 12/20/2022] [Accepted: 01/05/2023] [Indexed: 01/31/2023]

Cheng X, Yan C, Jiang H, Qiu Y. scHOIS: Determining Cell Heterogeneity Through Hierarchical Clustering Based on Optimal Imputation Strategy. IEEE/ACM Trans Comput Biol Bioinform 2023;20:1431-1444. [PMID: 37815942 DOI: 10.1109/tcbb.2022.3203592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2023]

Jee DJ, Kong Y, Chun H. Deep Nonnegative Matrix Factorization Using a Variational Autoencoder With Application to Single-Cell RNA Sequencing Data. IEEE/ACM Trans Comput Biol Bioinform 2023;20:883-893. [PMID: 35511832 DOI: 10.1109/tcbb.2022.3172723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Ning Z, Dai Z, Zhang H, Chen Y, Yuan Z. A clustering method for small scRNA-seq data based on subspace and weighted distance. PeerJ 2023;11:e14706. [PMID: 36710872 PMCID: PMC9879162 DOI: 10.7717/peerj.14706] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/25/2022] [Accepted: 12/15/2022] [Indexed: 01/24/2023]

Wu W, Ma X. Network-Based Structural Learning Nonnegative Matrix Factorization Algorithm for Clustering of scRNA-Seq Data. IEEE/ACM Trans Comput Biol Bioinform 2023;20:566-575. [PMID: 35316190 DOI: 10.1109/tcbb.2022.3161131] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 02/02/2023]

Abstract

Single-cell RNA sequencing (scRNA-seq) measures expression profiles at the single-cell level, which sheds light on the heterogeneity and functional diversity among cell populations. The vast majority of current algorithms identify cell types by directly clustering transcriptional profiles, which ignores indirect relations among cells and results in undesirable performance on cell type discovery and trajectory inference. Therefore, there is a critical need for inferring cell types and trajectories by exploiting the interactions among cells. In this study, we propose a network-based structural learning nonnegative matrix factorization algorithm (SLNMF) for the identification of cell types in scRNA-seq data, which is formulated as a constrained optimization problem. SLNMF first constructs the similarity network for cells and then extracts latent features of the cells by exploiting the topological structure of the cell-cell network. To improve the clustering performance, a structural constraint is imposed on the model so that the learned latent features preserve the structural information of the network. Finally, we track the trajectory of cells by exploring the relationships among cell types. Fourteen scRNA-seq datasets, with the number of single cells varying from 49 to 26,484, are adopted to validate the performance of the algorithm. The experimental results demonstrate that SLNMF significantly outperforms fifteen state-of-the-art methods, with a 15.32% improvement in accuracy, and that it accurately identifies the trajectories of cells. The proposed model and methods provide an effective strategy to analyze scRNA-seq data. (The software is coded in MATLAB and is freely available for academic use at https://github.com/xkmaxidian/SLNMF.)
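
The two steps named in this abstract, building a cell-cell similarity network and then factorizing under a constraint that preserves its structure, can be sketched with plain graph-regularized NMF. This is a swapped-in, simpler technique and not SLNMF itself (the authors' MATLAB code is at the URL above); the function names, the cosine k-nearest-neighbor graph, the regularization weight `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def knn_similarity(X, k=3):
    """Symmetric cosine k-nearest-neighbor graph over the columns (cells) of X."""
    Xn = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-12)
    S = Xn.T @ Xn                              # cosine similarity between cells
    np.fill_diagonal(S, 0.0)                   # no self-loops
    idx = np.argsort(-S, axis=1)[:, :k]        # k most similar cells per row
    rows = np.arange(S.shape[0])[:, None]
    A = np.zeros_like(S)
    A[rows, idx] = S[rows, idx]
    return np.maximum(A, A.T)                  # symmetrize

def graph_nmf(X, A, rank, lam=1.0, n_iter=200, eps=1e-9, seed=0):
    """Minimize ||X - WH||_F^2 + lam * tr(H L H^T) with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank))         # genes x rank
    H = rng.random((rank, X.shape[1]))         # rank x cells (latent features)
    D = np.diag(A.sum(axis=1))                 # degree matrix
    L = D - A                                  # graph Laplacian
    history = []
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        H *= (W.T @ X + lam * H @ A) / (W.T @ W @ H + lam * H @ D + eps)
        history.append(np.linalg.norm(X - W @ H) ** 2 + lam * np.trace(H @ L @ H.T))
    return W, H, history

# Synthetic example: 30 genes x 12 cells with two groups of cells.
rng = np.random.default_rng(1)
X = 0.1 * rng.random((30, 12))
X[:15, :6] += 1.0
X[15:, 6:] += 1.0

A = knn_similarity(X, k=3)
W, H, history = graph_nmf(X, A, rank=2)
labels = H.argmax(axis=0)                      # cluster = dominant latent feature
```

The Laplacian term penalizes neighboring cells whose latent features differ, which is one common way to make a factorization respect a similarity network.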

Shu Z, Long Q, Zhang L, Yu Z, Wu XJ. Robust Graph Regularized NMF with Dissimilarity and Similarity Constraints for ScRNA-seq Data Clustering. J Chem Inf Model 2022;62:6271-6286. [PMID: 36459053 DOI: 10.1021/acs.jcim.2c01305] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/03/2022]

Su M, Pan T, Chen QZ, Zhou WW, Gong Y, Xu G, Yan HY, Li S, Shi QZ, Zhang Y, He X, Jiang CJ, Fan SC, Li X, Cairns MJ, Wang X, Li YS. Data analysis guidelines for single-cell RNA-seq in biomedical studies and clinical applications. Mil Med Res 2022;9:68. [PMID: 36461064 PMCID: PMC9716519 DOI: 10.1186/s40779-022-00434-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022]

Affiliation(s)

Min Su State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
Tao Pan College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
Qiu-Zhen Chen State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
Wei-Wei Zhou College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
Yi Gong State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China Department of Immunology, Nanjing Medical University, Nanjing, 211166 China
Gang Xu College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
Huan-Yu Yan State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
Si Li College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
Qiao-Zhen Shi State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
Ya Zhang College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
Xiao He Department of Laboratory Medicine, Women and Children’s Hospital of Chongqing Medical University, Chongqing, 401174 China
Chun-Jie Jiang Baylor College of Medicine, Houston, TX 77030 USA
Shi-Cai Fan Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110 Guangdong China
Xia Li College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
Murray J. Cairns School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, the University of Newcastle, University Drive, Callaghan, NSW 2308 Australia Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
Xi Wang State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
Yong-Sheng Li College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China

Li RY, Wang Z, Guan J, Zhou S. Effectively Clustering Single Cell RNA Sequencing Data by Sparse Representation. IEEE/ACM Trans Comput Biol Bioinform 2022;19:3425-3434. [PMID: 34788219 DOI: 10.1109/tcbb.2021.3128576] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]

Cuevas-Diaz Duran R, González-Orozco JC, Velasco I, Wu JQ. Single-cell and single-nuclei RNA sequencing as powerful tools to decipher cellular heterogeneity and dysregulation in neurodegenerative diseases. Front Cell Dev Biol 2022;10:884748. [PMID: 36353512 PMCID: PMC9637968 DOI: 10.3389/fcell.2022.884748] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 02/27/2022] [Accepted: 10/06/2022] [Indexed: 08/10/2023]

Breitenbach T, Schmitt MJ, Dandekar T. Optimization of synthetic molecular reporters for a mesenchymal glioblastoma transcriptional program by integer programing. Bioinformatics 2022;38:4162-4171. [PMID: 35809064 DOI: 10.1093/bioinformatics/btac488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Revised: 06/05/2022] [Accepted: 07/07/2022] [Indexed: 12/24/2022]

Abstract

MOTIVATION

A recent approach to genetic tracing of complex biological problems involves the generation of synthetic deoxyribonucleic acid (DNA) probes that specifically mark cells with a phenotype of interest. These synthetic locus control regions (sLCRs), in turn, drive the expression of a reporter gene, such as a fluorescent protein. To build functional and specific sLCRs, it is critical to accurately select multiple bona fide cis-regulatory elements from the target cell phenotype cistrome. This selection occurs by maximizing the number and diversity of transcription factors (TFs) within the sLCR, yet the size of the final sLCR should remain limited.

RESULTS

In this work, we discuss how optimization, in particular integer programing, can be used to systematically address the construction of a specific sLCR and optimize pre-defined properties of the sLCR. Our presented instance of a linear optimization problem maximizes the activation potential of the sLCR such that its size is limited to a pre-defined length and a minimum number of all TFs deemed sufficiently characteristic for the phenotype of interest is covered. We generated an sLCR to trace the mesenchymal glioblastoma program in patients by solving our corresponding linear program with the software optimizer Gurobi. Considering the binding strength of transcription factor binding sites (TFBSs) with their TFs as a proxy for activation potential, the optimized sLCR scores similarly to an sLCR experimentally validated in vivo, and is smaller in size while having the same coverage of TFBSs.

AVAILABILITY AND IMPLEMENTATION

We provide a Python implementation of the presented framework in the Supplementary Material with which an optimal selection of cis-regulatory elements can be calculated once the target set of TFs and their binding strength with their TFBSs is known.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
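
The selection problem described in the Results section above (maximize activation potential, keep total length within a limit, cover every required TF at least a minimum number of times) can be illustrated on a toy instance. The sketch below enumerates all subsets instead of calling an integer-programming solver such as Gurobi, so it only suits very small inputs; the element names, lengths, scores, TF labels, and the function `select_elements` are all hypothetical and are not taken from the paper's Supplementary Material.

```python
from itertools import combinations

def select_elements(elements, required_tfs, max_length, min_cover=1):
    """Exhaustively pick the subset of elements with the highest total score
    subject to a length limit and a minimum coverage of each required TF."""
    best_names, best_score = None, float("-inf")
    names = sorted(elements)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            # Constraint 1: total size of the construct is limited.
            if sum(elements[n]["length"] for n in subset) > max_length:
                continue
            # Constraint 2: each required TF is covered at least min_cover times.
            covered = all(
                sum(tf in elements[n]["tfs"] for n in subset) >= min_cover
                for tf in required_tfs
            )
            if not covered:
                continue
            # Objective: summed binding-strength score as a proxy for activation.
            score = sum(elements[n]["score"] for n in subset)
            if score > best_score:
                best_names, best_score = subset, score
    return best_names, best_score

# Hypothetical candidate cis-regulatory elements.
elements = {
    "e1": {"length": 100, "score": 5.0, "tfs": {"A", "B"}},
    "e2": {"length": 150, "score": 6.0, "tfs": {"B", "C"}},
    "e3": {"length": 80,  "score": 3.0, "tfs": {"A"}},
    "e4": {"length": 120, "score": 4.5, "tfs": {"C"}},
}

best, score = select_elements(elements, required_tfs={"A", "B", "C"}, max_length=300)
print(best, score)  # ('e1', 'e3', 'e4') 12.5
```

Enumeration grows exponentially with the number of candidate elements, which is why realistic instances of this kind are handed to a dedicated solver.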

Unified K-means coupled self-representation and neighborhood kernel learning for clustering single-cell RNA-sequencing data. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.06.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]

Single-cell multiomics analysis reveals regulatory programs in clear cell renal cell carcinoma. Cell Discov 2022;8:68. [PMID: 35853872 PMCID: PMC9296597 DOI: 10.1038/s41421-022-00415-0] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 11/16/2021] [Accepted: 04/26/2022] [Indexed: 01/01/2023]

Liang Z, Zheng R, Chen S, Yan X, Li M. A deep matrix factorization based approach for single-cell RNA-seq data clustering. Methods 2022;205:114-122. [PMID: 35777719 DOI: 10.1016/j.ymeth.2022.06.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2022] [Revised: 05/28/2022] [Accepted: 06/24/2022] [Indexed: 11/17/2022]

Zeira R, Land M, Strzalkowski A, Raphael BJ. Alignment and integration of spatial transcriptomics data. Nat Methods 2022;19:567-575. [PMID: 35577957 PMCID: PMC9334025 DOI: 10.1038/s41592-022-01459-6] [Citation(s) in RCA: 116] [Impact Index Per Article: 38.7] [Received: 07/16/2021] [Accepted: 03/17/2022] [Indexed: 01/05/2023]

Wu W, Zhang W, Ma X. Network-based integrative analysis of single-cell transcriptomic and epigenomic data for cell types. Brief Bioinform 2022;23:bbab546. [PMID: 35043143 DOI: 10.1093/bib/bbab546] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/07/2021] [Revised: 11/09/2021] [Accepted: 11/27/2021] [Indexed: 02/02/2023]

Ou-Yang L, Lu F, Zhang ZC, Wu M. Matrix factorization for biomedical link prediction and scRNA-seq data imputation: an empirical survey. Brief Bioinform 2021;23:6447434. [PMID: 34864871 DOI: 10.1093/bib/bbab479] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/09/2021] [Revised: 09/25/2021] [Accepted: 10/18/2021] [Indexed: 02/02/2023]

Oh S, Park H, Zhang X. Hybrid Clustering of Single-Cell Gene Expression and Spatial Information via Integrated NMF and K-Means. Front Genet 2021;12:763263. [PMID: 34819947 PMCID: PMC8606648 DOI: 10.3389/fgene.2021.763263] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/23/2021] [Accepted: 10/13/2021] [Indexed: 11/13/2022]

Fang Q, Su D, Ng W, Feng J. An Effective Biclustering-Based Framework for Identifying Cell Subpopulations From scRNA-seq Data. IEEE/ACM Trans Comput Biol Bioinform 2021;18:2249-2260. [PMID: 32167906 DOI: 10.1109/tcbb.2020.2979717] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/10/2023]

Shiga M, Seno S, Onizuka M, Matsuda H. SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization. PeerJ 2021;9:e12087. [PMID: 34532161 PMCID: PMC8404576 DOI: 10.7717/peerj.12087] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/10/2021] [Accepted: 08/07/2021] [Indexed: 11/20/2022]

Abstract

Single-cell RNA-sequencing is a rapidly evolving technology that enables us to understand biological processes at unprecedented resolution. Single-cell expression analysis requires a complex data processing pipeline, which is divided into two main parts: the quantification part, which converts the sequence information into gene-cell matrix data, and the analysis part, which analyzes the matrix data using statistics and/or machine learning techniques. In the analysis part, unsupervised cell clustering plays an important role in identifying cell types and discovering cell diversity and subpopulations. Identified cell clusters are also used for subsequent analysis, such as finding differentially expressed genes and inferring cell trajectories. However, clustering results are greatly affected by the quantification method used in the upstream process: even if the original RNA-sequence data is the same, gene expression profiles processed by different quantification methods will produce different clusters. In this article, we propose a robust and highly accurate clustering method based on joint non-negative matrix factorization (joint-NMF) that utilizes the information from multiple gene expression profiles quantified using different methods from the same RNA-sequence data. Our joint-NMF can extract common factors among multiple gene expression profiles by applying NMF to each profile under the constraint that one of the factorized matrices is shared among the multiple NMFs. By leveraging multiple quantification methods, the joint-NMF produces more robust and accurate cell clustering results than conventional clustering methods, which use only a single gene expression profile. Additionally, we show that the features extracted by our method are useful for discovering marker genes.
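
The shared-factor constraint described in this abstract can be sketched as follows: each expression matrix X_k (genes x cells) is approximated as W_k H, with a separate W_k per quantification method and a single H shared across all of them. This is a generic joint-NMF with standard multiplicative updates, not the SC-JNMF implementation; the function name `joint_nmf`, the unweighted sum of Frobenius losses, and the synthetic matrices are assumptions made for illustration.

```python
import numpy as np

def joint_nmf(matrices, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize every X_k as W_k @ H with one H shared by all matrices."""
    rng = np.random.default_rng(seed)
    n_cells = matrices[0].shape[1]
    Ws = [rng.random((X.shape[0], rank)) for X in matrices]  # one W per profile
    H = rng.random((rank, n_cells))                          # shared cell factors
    losses = []
    for _ in range(n_iter):
        # Update each W_k with H held fixed.
        for k, X in enumerate(matrices):
            Ws[k] *= (X @ H.T) / (Ws[k] @ (H @ H.T) + eps)
        # Update the shared H using contributions from every profile.
        numer = sum(W.T @ X for W, X in zip(Ws, matrices))
        denom = sum(W.T @ W for W in Ws) @ H + eps
        H *= numer / denom
        losses.append(sum(np.linalg.norm(X - W @ H) ** 2
                          for W, X in zip(Ws, matrices)))
    return Ws, H, losses

# Two synthetic "quantifications" of the same 20 cells (two groups of 10),
# standing in for profiles produced by different quantification methods.
rng = np.random.default_rng(2)
H_true = np.kron(np.eye(2), np.ones((1, 10)))   # 2 x 20 block structure
X1 = rng.random((30, 2)) @ H_true               # 30 genes x 20 cells
X2 = rng.random((25, 2)) @ H_true               # 25 genes x 20 cells

Ws, H, losses = joint_nmf([X1, X2], rank=2)
labels = H.argmax(axis=0)                       # cluster = dominant shared factor
```

Because H must explain every profile at once, structure that appears under only one quantification method has less influence on the resulting cell assignments than structure common to all of them.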

A Multiple Comprehensive Analysis of scATAC-seq Based on Auto-Encoder and Matrix Decomposition. Symmetry (Basel) 2021. [DOI: 10.3390/sym13081467] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Zhao Y, Fang ZY, Lin CX, Deng C, Xu YP, Li HD. RFCell: A Gene Selection Approach for scRNA-seq Clustering Based on Permutation and Random Forest. Front Genet 2021;12:665843. [PMID: 34386033 PMCID: PMC8354212 DOI: 10.3389/fgene.2021.665843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2021] [Accepted: 04/01/2021] [Indexed: 11/13/2022] Open

Zhang W, Xue X, Zheng X, Fan Z. NMFLRR: Clustering scRNA-seq data by integrating non-negative matrix factorization with low rank representation. IEEE J Biomed Health Inform 2021;26:1394-1405. [PMID: 34310328 DOI: 10.1109/jbhi.2021.3099127] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]

Zhu YL, Yuan SS, Liu JX. Similarity and Dissimilarity Regularized Nonnegative Matrix Factorization for Single-Cell RNA-seq Analysis. Interdiscip Sci 2021;14:45-54. [PMID: 34231183 DOI: 10.1007/s12539-021-00457-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2021] [Revised: 06/24/2021] [Accepted: 06/27/2021] [Indexed: 10/20/2022]

Abstract

In traditional sequencing techniques, the different functions of cells and the different roles they play in differentiation are often ignored. With the advancement of single-cell RNA sequencing (scRNA-seq) techniques, scientists can measure gene expression at the single-cell level, which helps reveal the heterogeneity hidden in cells. One of the most powerful ways to find heterogeneity is to use unsupervised clustering to obtain separate subpopulations. In this paper, we propose a novel clustering method, Similarity and Dissimilarity Regularized Nonnegative Matrix Factorization (SDCNMF), that simultaneously imposes similarity and dissimilarity constraints on low-dimensional representations. SDCNMF considers both the similarity of closer cells and the dissimilarity of cells that are farther away. It not only keeps similar cells close together in the low-dimensional space, but also pushes dissimilar cells away from each other. We test the validity of our proposed method on five scRNA-seq datasets. Clustering results show that SDCNMF is better than other comparative methods, and the gene markers we find are also consistent with previous studies. Therefore, we can conclude that SDCNMF is effective in scRNA-seq data analysis.

Kharchenko PV. The triumphs and limitations of computational methods for scRNA-seq. Nat Methods 2021;18:723-732. [PMID: 34155396 DOI: 10.1038/s41592-021-01171-x] [Citation(s) in RCA: 154] [Impact Index Per Article: 38.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2018] [Accepted: 04/29/2021] [Indexed: 02/05/2023]

Li HD, Xu Y, Zhu X, Liu Q, Omenn GS, Wang J. ClusterMine: A knowledge-integrated clustering approach based on expression profiles of gene sets. J Bioinform Comput Biol 2021;18:2040009. [PMID: 32698720 DOI: 10.1142/s0219720020400090] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

Li RY, Guan J, Zhou S. Boosting scRNA-seq data clustering by cluster-aware feature weighting. BMC Bioinformatics 2021;22:130. [PMID: 34078287 PMCID: PMC8171019 DOI: 10.1186/s12859-021-04033-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2021] [Accepted: 02/16/2021] [Indexed: 12/26/2022] Open

Li Y, Luo P, Lu Y, Wu FX. Identifying cell types from single-cell data based on similarities and dissimilarities between cells. BMC Bioinformatics 2021;22:255. [PMID: 34006217 PMCID: PMC8132444 DOI: 10.1186/s12859-020-03873-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2020] [Accepted: 11/09/2020] [Indexed: 12/15/2022] Open

Abstract

Background

With the development of single-cell sequencing technology, revealing homogeneity and heterogeneity between cells has become a new area of computational systems biology research. However, the clustering of cell types becomes more complex with the mutual penetration between different types of cells and the instability of gene expression. One way of overcoming this problem is to group similar, related single cells together by means of various clustering analysis methods. Although some methods such as spectral clustering can do well in the identification of cell types, they only consider the similarities between cells and ignore the influence of dissimilarities on clustering results. This may limit the performance of most conventional clustering algorithms in identifying clusters, so special methods need to be developed for high-dimensional sparse categorical data.

Results

Inspired by the phenomenon that cells of the same type have similar gene expression patterns, whereas cells of different types have dissimilar gene expression patterns, we improve the existing spectral clustering method for single-cell data by basing it on both similarities and dissimilarities between cells. The method first measures the similarity/dissimilarity among cells, then constructs the incidence matrix by fusing the similarity matrix with the dissimilarity matrix, and finally uses the eigenvalues of the incidence matrix to perform dimensionality reduction and employs the K-means algorithm in the low-dimensional space to achieve clustering. The proposed improved spectral clustering method is compared with the conventional spectral clustering method in recognizing cell types on several real single-cell RNA-seq datasets.

Conclusions

In summary, we show that adding intercellular dissimilarity can effectively improve accuracy and achieve robustness, and that the improved spectral clustering method outperforms the traditional spectral clustering method in grouping cells.
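The three steps named in the Results paragraph above (measure similarity and dissimilarity, fuse them into one matrix, reduce dimensions spectrally and run K-means) can be sketched roughly as follows. The abstract does not specify the measures or the fusion rule, so everything concrete here is an assumption: a Gaussian kernel for similarity, a correlation-based dissimilarity scaled to [0, 1], the elementwise fusion `S * (1 - D)`, the use of leading eigenvectors of the normalized matrix (the usual choice in spectral clustering), and the helper names `fused_affinity`, `spectral_embedding` and `kmeans`.

```python
import numpy as np

def fused_affinity(X):
    """X is cells x genes. Returns a fused cell-cell affinity matrix."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])          # heuristic kernel width (assumed)
    S = np.exp(-d2 / (2 * sigma2))          # similarity: Gaussian kernel
    D = (1.0 - np.corrcoef(X)) / 2.0        # dissimilarity in [0, 1]
    A = S * (1.0 - D)                       # hypothetical fusion rule
    np.fill_diagonal(A, 0.0)
    return A

def spectral_embedding(A, k):
    """Top-k eigenvectors of the symmetrically normalized affinity."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1) + 1e-12)
    M = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)             # eigenvalues in ascending order
    U = vecs[:, -k:]
    return U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)

def kmeans(U, k, n_iter=50):
    """Small K-means with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

# Synthetic data: 40 cells x 30 genes, two groups with opposite marker genes.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.5, size=(40, 30))
X[:20, :15] += 5.0
X[20:, 15:] += 5.0

labels = kmeans(spectral_embedding(fused_affinity(X), k=2), k=2)
print(labels)
```

Multiplying by `1 - D` suppresses affinities between cells whose expression profiles are anticorrelated, which is one simple way to let dissimilarity sharpen the block structure that the spectral step relies on.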

Liang Z, Li M, Zheng R, Tian Y, Yan X, Chen J, Wu FX, Wang J. SSRE: Cell Type Detection Based on Sparse Subspace Representation and Similarity Enhancement. GENOMICS PROTEOMICS & BIOINFORMATICS 2021;19:282-291. [PMID: 33647482 PMCID: PMC8602764 DOI: 10.1016/j.gpb.2020.09.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/30/2019] [Revised: 08/13/2020] [Accepted: 10/29/2020] [Indexed: 11/25/2022]

Xu Y, Li HD, Pan Y, Luo F, Wu FX, Wang J. A Gene Rank Based Approach for Single Cell Similarity Assessment and Clustering. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:431-442. [PMID: 31369384 DOI: 10.1109/tcbb.2019.2931582] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]

Zhang W, Li Y, Zou X. SCCLRR: A Robust Computational Method for Accurate Clustering Single Cell RNA-Seq Data. IEEE J Biomed Health Inform 2021;25:247-256. [PMID: 32356764 DOI: 10.1109/jbhi.2020.2991172] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Yu B, Chen C, Qi R, Zheng R, Skillman-Lawrence PJ, Wang X, Ma A, Gu H. scGMAI: a Gaussian mixture model for clustering single-cell RNA-Seq data based on deep autoencoder. Brief Bioinform 2020;22:6029147. [PMID: 33300547 DOI: 10.1093/bib/bbaa316] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2020] [Revised: 10/19/2020] [Indexed: 01/01/2023] Open

Wu P, An M, Zou HR, Zhong CY, Wang W, Wu CP. A robust semi-supervised NMF model for single cell RNA-seq data. PeerJ 2020;8:e10091. [PMID: 33088619 PMCID: PMC7571410 DOI: 10.7717/peerj.10091] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2020] [Accepted: 09/13/2020] [Indexed: 11/20/2022] Open

Sun YS, Ou-Yang L, Dai DQ. LRSK: a low-rank self-representation K-means method for clustering single-cell RNA-sequencing data. Mol Omics 2020;16:465-473. [PMID: 32572422 DOI: 10.1039/d0mo00034e] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]