1. Lall S, Ray S, Bandyopadhyay S. Enhancing Single-Cell RNA-seq Data Completeness with a Graph Learning Framework. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; PP:64-72. [PMID: 39504287 DOI: 10.1109/tcbb.2024.3492384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2024]
Abstract
Single-cell RNA sequencing (scRNA-seq) is a powerful tool for capturing gene expression snapshots in individual cells. However, the low amount of RNA in individual cells results in dropout events, which introduce a large number of zero counts into the single-cell expression matrix. We have developed VAImpute, a variational graph autoencoder-based imputation technique that learns the inherent distribution of a large network/graph constructed from the scRNA-seq data, leveraging copula correlation among cells/genes. The trained model is used to predict dropout events by computing the probability of all non-edges (cell-gene) in the network. We devise an algorithm to impute the missing expression values of the detected dropouts. The performance of the proposed model is assessed on both simulated and real scRNA-seq datasets and compared with established single-cell imputation methods. VAImpute yields significant improvements in dropout detection, thereby achieving superior performance in cell clustering, rare-cell detection, and differential expression analysis. All code and datasets are available on GitHub: https://github.com/sumantaray/VAImpute.
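The idea of scoring cell-gene non-edges can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an inner-product decoder (a common choice for variational graph autoencoders) and uses random embeddings as stand-ins for trained ones. The function name `score_non_edges`, the 0.5 threshold, and the toy data are all hypothetical.

```python
import numpy as np

def score_non_edges(expr, cell_emb, gene_emb, threshold=0.5):
    """Flag zero entries (non-edges) whose predicted edge probability is high.

    expr:     cells x genes count matrix; a nonzero count is treated as an edge.
    cell_emb: cells x d embeddings (assumed to come from a trained encoder).
    gene_emb: genes x d embeddings (same assumption).
    """
    logits = cell_emb @ gene_emb.T            # inner-product decoder (assumption)
    probs = 1.0 / (1.0 + np.exp(-logits))     # sigmoid -> edge probability
    non_edges = expr == 0                     # zeros are the only dropout candidates
    candidates = non_edges & (probs > threshold)
    return probs, candidates

# Toy data: random placeholders, not real scRNA-seq measurements.
rng = np.random.default_rng(0)
expr = rng.poisson(0.8, size=(6, 5))
cell_emb = rng.normal(size=(6, 3))
gene_emb = rng.normal(size=(5, 3))

probs, candidates = score_non_edges(expr, cell_emb, gene_emb)
print(int(candidates.sum()), "zero entries flagged as candidate dropouts")
```

Observed nonzero entries are never flagged; only zeros with a high predicted probability become candidates for imputation.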
2. Lac L, Leung CK, Hu P. Computational frameworks integrating deep learning and statistical models in mining multimodal omics data. J Biomed Inform 2024; 152:104629. [PMID: 38552994 DOI: 10.1016/j.jbi.2024.104629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 02/26/2024] [Accepted: 03/25/2024] [Indexed: 04/04/2024]
Abstract
BACKGROUND In health research, multimodal omics data analysis is widely used to address important clinical and biological questions. Traditional statistical methods, such as hypothesis testing and differential expression analysis, are commonly used in omics analysis but rely on strong distributional assumptions. Deep learning, on the other hand, is an advanced computer science technique that is powerful in mining high-dimensional omics data for prediction tasks. Recently, integrative frameworks and methods that combine statistical models and deep learning algorithms have been developed for omics studies. METHODS AND RESULTS The aim of these integrative frameworks is to combine the strengths of statistical methods and deep learning algorithms to improve prediction accuracy while also providing interpretability and explainability. This review discusses the current state-of-the-art integrative frameworks, their limitations, and potential future directions in survival and time-to-event longitudinal analysis, dimension reduction and clustering, regression and classification, feature selection, and causal and transfer learning.
Affiliation(s)
- Leann Lac
  - Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada; Department of Statistics, University of Manitoba, Winnipeg, Manitoba, Canada
- Carson K Leung
  - Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
- Pingzhao Hu
  - Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada; Department of Biochemistry, Western University, London, Ontario, Canada; Department of Computer Science, Western University, London, Ontario, Canada; Department of Oncology, Western University, London, Ontario, Canada; Department of Epidemiology and Biostatistics, Western University, London, Ontario, Canada; The Children's Health Research Institute, Lawson Health Research Institute, London, Ontario, Canada
3. Zhang C, Gao J, Chen HY, Kong L, Cao G, Guo X, Liu W, Ren B, Wei DQ. STGIC: A graph and image convolution-based method for spatial transcriptomic clustering. PLoS Comput Biol 2024; 20:e1011935. [PMID: 38416785 PMCID: PMC10927115 DOI: 10.1371/journal.pcbi.1011935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 03/11/2024] [Accepted: 02/20/2024] [Indexed: 03/01/2024] Open
Abstract
Spatial transcriptomic (ST) clustering uses spatial and transcriptional information to group spatially coherent and transcriptionally similar spots into the same spatial domain. Graph convolution networks (GCNs) and graph attention networks (GATs), fed with an adjacency matrix derived from spatial coordinates and a feature matrix derived from transcription profiles, are often used to solve this problem. Our proposed method, STGIC (spatial transcriptomic clustering with graph and image convolution), is designed for techniques with regular lattices on chips. It uses an adaptive graph convolution (AGC) to obtain high-quality pseudo-labels and then applies a dilated convolution framework (DCF) to a virtual image converted from the gene expression information and spatial coordinates of spots. The dilation rates and kernel sizes are set appropriately, and the updating of kernel weights is made subject to the spatial distance from each kernel element to the kernel center, so that feature extraction for each spot is better guided by its spatial distance to neighboring spots. The training objective of the DCF combines self-supervision realized by Kullback-Leibler (KL) divergence, a spatial continuity loss, and a cross-entropy loss calculated among spots with high-confidence pseudo-labels. STGIC attains state-of-the-art (SOTA) clustering performance on the benchmark 10x Visium human dorsolateral prefrontal cortex (DLPFC) dataset. It is also capable of depicting fine structures in other tissues from other species and of guiding the identification of marker genes. In addition, STGIC can be extended to high-spatial-resolution Stereo-seq data.
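KL-divergence self-supervision in clustering is often implemented in the style of deep embedded clustering: soft assignments are sharpened into a target distribution, and the KL divergence between the two is minimized. The sketch below shows that general pattern only. The abstract does not state which formulation STGIC uses, so the Student's t kernel, the function names, and the toy embeddings are assumptions.

```python
import numpy as np

def soft_assign(z, centers):
    """Soft cluster assignments q_ij from a Student's t kernel (1 degree of freedom)."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets p_ij: square q and normalize by soft cluster frequency."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(p, q):
    """Mean KL(p || q) over spots; q is strictly positive by construction."""
    return float((p * np.log(p / q)).sum() / q.shape[0])

# Toy spot embeddings and cluster centers (hypothetical values).
rng = np.random.default_rng(0)
z = rng.normal(size=(12, 2))
centers = z[:3].copy()

q = soft_assign(z, centers)
p = target_distribution(q)
loss = kl_loss(p, q)
print("KL self-supervision loss:", loss)
```

In a full training loop this term would be added to the other losses named in the abstract (spatial continuity and cross-entropy on high-confidence pseudo-labels), which are not sketched here.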
Affiliation(s)
- Chen Zhang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Junhui Gao
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Hong-Yu Chen
  - College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Lingxin Kong
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Guangshuo Cao
  - State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang
- Xiangyu Guo
  - Smart-Health Initiative, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
- Wei Liu
  - Marine Science and Technology College, Zhejiang Ocean University, Zhoushan, China
- Bin Ren
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Dong-Qing Wei
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
4. Lan W, Liu M, Chen J, Ye J, Zheng R, Zhu X, Peng W. JLONMFSC: Clustering scRNA-seq data based on joint learning of non-negative matrix factorization and subspace clustering. Methods 2024; 222:1-9. [PMID: 38128706 DOI: 10.1016/j.ymeth.2023.11.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 11/07/2023] [Accepted: 11/29/2023] [Indexed: 12/23/2023] Open
Abstract
The development of single-cell RNA sequencing (scRNA-seq) has provided new perspectives for studying biological problems at the single-cell level. One of the key issues in scRNA-seq data analysis is dividing cells into clusters to discover the heterogeneity and diversity of cells. However, existing scRNA-seq data are high-dimensional, sparse, and noisy, which challenges existing single-cell clustering methods. In this study, we propose a joint learning framework (JLONMFSC) for clustering scRNA-seq data. In our method, the dimension of the original data is reduced to minimize the effect of noise. In addition, graph-regularized matrix factorization is used to learn the local features. Further, Low-Rank Representation (LRR) subspace clustering is used to learn the global features. Finally, joint learning of the local and global features is performed to obtain the clustering results. We compare the proposed algorithm with eight state-of-the-art algorithms for clustering performance on six datasets, and the experimental results demonstrate that JLONMFSC achieves better performance on all datasets. The code is available at https://github.com/lanbiolab/JLONMFSC.
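For readers unfamiliar with graph-regularized matrix factorization, the sketch below shows the standard graph-regularized NMF with multiplicative updates. It is a generic illustration, not the JLONMFSC objective, which additionally couples the factorization with LRR subspace clustering. The function name `gnmf`, the top-3 neighbour graph, and the random toy matrix are assumptions.

```python
import numpy as np

def gnmf(X, W, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Graph-regularized NMF: X (features x samples) ~ U @ V.T with U, V >= 0.

    Minimizes ||X - U V^T||_F^2 + lam * tr(V^T L V), where L = D - W is the
    Laplacian of the sample-similarity graph W. Returns U, V and the
    objective value after each iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))
    V = rng.random((n, k))
    D = np.diag(W.sum(axis=1))
    L = D - W
    history = []
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
        obj = np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)
        history.append(float(obj))
    return U, V, history

# Toy data: 8 "genes" x 10 "cells" of random non-negative values.
rng = np.random.default_rng(1)
X = rng.random((8, 10))

# Hypothetical similarity graph: connect each cell to its top-3 neighbours.
sim = X.T @ X
np.fill_diagonal(sim, 0.0)
W = (sim >= np.sort(sim, axis=1)[:, [-3]]).astype(float)
W = np.maximum(W, W.T)  # symmetrize

U, V, history = gnmf(X, W, k=3)
print("objective: first", history[0], "last", history[-1])
```

The rows of `V` are low-dimensional representations of the cells; the graph term pulls neighbouring cells toward similar representations, which is what "learning local features" refers to in this setting.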
Affiliation(s)
- Wei Lan
  - School of Computer, Electronic and Information, Guangxi University, Nanning, China; Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, China
- Mingyang Liu
  - School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Jianwei Chen
  - School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Jin Ye
  - School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Ruiqing Zheng
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Xiaoshu Zhu
  - School of Computer Science and Information Security, Guilin University of Science and Technology, Guilin, China
- Wei Peng
  - School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
5. Yuan LL, Chen Z, Qin J, Qin CJ, Bian J, Dong RF, Yuan TB, Xu YT, Kong LY, Xia YZ. Single-cell sequencing reveals the landscape of the tumor microenvironment in a skeletal undifferentiated pleomorphic sarcoma patient. Front Immunol 2022; 13:1019870. [PMID: 36466840 PMCID: PMC9709471 DOI: 10.3389/fimmu.2022.1019870] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/16/2022] [Accepted: 10/25/2022] [Indexed: 09/12/2024] Open
Abstract
Skeletal undifferentiated pleomorphic sarcoma (SUPS) is an invasive pleomorphic soft tissue sarcoma with a high degree of malignancy and poor prognosis. It is prone to recurrence and metastasis. The tumor microenvironment (TME) and the pathophysiology of SUPS have barely been described. Single-cell RNA sequencing (scRNA-seq) provides an opportunity to dissect the landscape of human diseases at an unprecedented resolution, particularly in diseases lacking animal models, such as SUPS. We performed scRNA-seq to analyze tumor tissues and paracancerous tissues from a SUPS patient. We identified the cell types and the corresponding marker genes in this SUPS case. We further showed that CD8+ exhausted T cells and Tregs highly expressed PDCD1, CTLA4 and TIGIT. Thus, PDCD1, CTLA4 and TIGIT were identified as potential targets in this case. We applied copy number karyotyping of aneuploid tumors (CopyKAT) to distinguish malignant cells from normal cells among fibroblasts. Our study identified eight malignant fibroblast subsets in SUPS with distinct gene expression profiles. C1-malignant Fibroblast and C6-malignant Fibroblast in the TME play crucial roles in tumor growth, angiogenesis, metastasis and immune response. Hence, targeting malignant fibroblasts could represent a potential strategy for SUPS therapy. Intervention via tirelizumab enabled disease control, and immune checkpoint inhibitors (ICIs) targeting PD-1 may be considered as a first-line option in patients with SUPS. Taken together, scRNA-seq analyses provided a powerful basis for treating this SUPS case, improved our understanding of complex human diseases, and may afford an alternative approach for personalized medicine in the future.
Affiliation(s)
- Liu-Liu Yuan
  - Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Zhong Chen
  - Department of Orthopedics, Sir Run Run Hospital, Nanjing Medical University, Nanjing, China
- Jian Qin
  - Department of Orthopedics, Sir Run Run Hospital, Nanjing Medical University, Nanjing, China
- Cheng-Jiao Qin
  - Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Jing Bian
  - Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Rui-Fang Dong
  - Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Tang-Bo Yuan
  - Department of Orthopedics, Sir Run Run Hospital, Nanjing Medical University, Nanjing, China
- Yi-Ting Xu
  - Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Ling-Yi Kong
  - Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Yuan-Zheng Xia
  - Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
  - Key Laboratory of Early Prevention and Treatment for Regional High Frequency Tumor (Guangxi Medical University), Ministry of Education and Guangxi Key Laboratory of Early Prevention and Treatment for Regional High Frequency Tumor, Nanning, China