1
Jogani S, Pol AS, Prajapati M, Samal A, Bhatia K, Parmar J, Patel U, Shah F, Vyas N, Gupta S. scaLR: a low-resource deep neural network-based platform for single cell analysis and biomarker discovery. Brief Bioinform 2025; 26:bbaf243. [PMID: 40439670 PMCID: PMC12121358 DOI: 10.1093/bib/bbaf243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2024] [Revised: 04/14/2025] [Accepted: 05/02/2025] [Indexed: 06/02/2025] Open
Abstract
Single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) produces vast amounts of individual cell profiling data. Analyzing these data presents a significant challenge in accurately annotating cell types and their associated biomarkers. Different pipelines based on deep neural network (DNN) methods have been employed to tackle these issues. These pipelines have emerged as a promising resource and can extract meaningful and concise features from noisy, diverse, and high-dimensional data to enhance annotations and subsequent analysis. Existing tools require high computational resources to process datasets with large numbers of samples. We have developed a cutting-edge platform known as scaLR (Single-cell analysis using low resource) that efficiently processes data into feature subsets, samples in batches to reduce the memory required for processing large datasets, and runs DNN models on multiple central processing units. scaLR is equipped with data processing, feature extraction, training, evaluation, and downstream analysis. Its novel feature extraction algorithm first trains the model on a feature subset and stores the importance of every feature in that subset. Once all subsets have been trained, the top-K features are selected based on their importance. The final model is trained on the top-K features; its performance evaluation and associated downstream analysis provide significant biomarkers for different cell types and diseases/traits. Our findings indicate that scaLR offers comparable prediction accuracy and requires less model training time and computational resources than existing Python-based pipelines. We present scaLR, a Python-based platform, engineered to utilize minimal computational resources while maintaining comparable execution times and analysis costs to existing frameworks.
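The subset-wise feature extraction described in this abstract can be illustrated with a minimal sketch. It is not scaLR's code or API: the function names, the `score_chunk` callback, and the toy mean-difference scorer (which stands in for training a DNN on each subset) are all assumptions made for illustration.

```python
def chunked_top_k(feature_names, score_chunk, chunk_size, k):
    """Score features one subset at a time, then keep the k best overall.

    score_chunk is a hypothetical callback: it receives a list of feature
    names, fits a model on only those features, and returns one importance
    score per feature in the same order.
    """
    importance = {}
    for start in range(0, len(feature_names), chunk_size):
        chunk = feature_names[start:start + chunk_size]
        # Only this subset needs to be held in memory while it is scored.
        importance.update(zip(chunk, score_chunk(chunk)))
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:k]


def make_mean_diff_scorer(expression, labels):
    """Toy stand-in for model training: |mean(class 1) - mean(class 0)|."""
    def score_chunk(chunk):
        scores = []
        for gene in chunk:
            values = expression[gene]
            pos = [v for v, y in zip(values, labels) if y == 1]
            neg = [v for v, y in zip(values, labels) if y == 0]
            scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
        return scores
    return score_chunk


# Made-up expression values for four cells (two per class).
labels = [0, 0, 1, 1]
expression = {
    "geneA": [1.0, 1.0, 5.0, 5.0],
    "geneB": [2.0, 2.0, 2.0, 2.0],
    "geneC": [0.0, 1.0, 3.0, 4.0],
    "geneD": [4.0, 4.0, 3.0, 3.0],
    "geneE": [0.0, 0.0, 0.5, 0.5],
}
scorer = make_mean_diff_scorer(expression, labels)
top = chunked_top_k(list(expression), scorer, chunk_size=2, k=2)
print(top)
```

One caveat of any scheme like this is that importance scores produced by separately trained subsets are only comparable if they are on a common scale; how scaLR handles this is not stated in the abstract.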
Affiliation(s)
- Saiyam Jogani
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Laxman Nagar Baner, Pune 411045, Maharashtra, India
- Anand Santosh Pol
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Laxman Nagar Baner, Pune 411045, Maharashtra, India
- Mayur Prajapati
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Gala-hub, Bopal, Ahmedabad 380058, Gujarat, India
- Amit Samal
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Gala-hub, Bopal, Ahmedabad 380058, Gujarat, India
- Kriti Bhatia
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Laxman Nagar Baner, Pune 411045, Maharashtra, India
- Jayendra Parmar
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Gala-hub, Bopal, Ahmedabad 380058, Gujarat, India
- Urvik Patel
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Gala-hub, Bopal, Ahmedabad 380058, Gujarat, India
- Falak Shah
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Gala-hub, Bopal, Ahmedabad 380058, Gujarat, India
- Nisarg Vyas
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Gala-hub, Bopal, Ahmedabad 380058, Gujarat, India
- Saurabh Gupta
- Department of Generative AI & Bioinformatics, Infocusp Innovations, Gala-hub, Bopal, Ahmedabad 380058, Gujarat, India
2
Dai Y, Li Q, Deng J, Wu S, Zhang G, Hu Y, Shen Y, Liu D, Wu H, Gong J. Rhpn2 regulates the development and function of vestibular sensory hair cells through the RhoA signaling in zebrafish. J Genet Genomics 2025:S1673-8527(25)00115-8. [PMID: 40254160 DOI: 10.1016/j.jgg.2025.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2025] [Revised: 04/10/2025] [Accepted: 04/10/2025] [Indexed: 04/22/2025]
Abstract
Hearing and balance disorders are significant health issues primarily caused by developmental defects or the irreversible loss of sensory hair cells (HCs). Identifying the underlying genes involved in the morphogenesis and development of HCs is crucial. Our current study highlights rhpn2, a member of rho-binding proteins, as essential for vestibular HC development. The rhpn2 gene is highly expressed in the crista and macula HCs. Loss of rhpn2 function in zebrafish reduces the otic vesicle area and vestibular HC number, accompanied by vestibular dysfunction. Shorter stereocilia and compromised mechanotransduction channel function are found in the crista HCs of rhpn2 mutants. Transcriptome RNA sequencing analysis predicts the potential interaction of rhpn2 with rhoab. Furthermore, co-immunoprecipitation confirms that Rhpn2 directly binds to RhoA, validating the interaction of the two proteins. rhpn2 knockout leads to a decreased expression of rock2b, a canonical RhoA signaling pathway gene. Treatment with the RhoA activator or exogenous rock2b mRNA injection mitigates crista HC stereocilia defects in rhpn2 mutants. This study uncovers the role of rhpn2 in vestibular HC development and stereocilia formation via mediating the RhoA signaling pathway, providing a target for the treatment of balance disorders.
Affiliation(s)
- Yubei Dai
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China
- Qianqian Li
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China
- Jiaju Deng
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China
- Sihang Wu
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China
- Guiyi Zhang
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China
- Yuebo Hu
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China
- Yuqian Shen
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China
- Dong Liu
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China.
- Han Wu
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China.
- Jie Gong
- Department of Clinical and Translational Research Center, Department of Gastrointestinal Surgery, Affiliated Hospital of Nantong University, Nantong Laboratory of Development and Diseases, School of Life Science, Nantong University, Nantong, Jiangsu 226001, China.
3
Dai Q, Liu W, Yu X, Duan X, Liu Z. Self-Supervised Graph Representation Learning for Single-Cell Classification. Interdiscip Sci 2025:10.1007/s12539-025-00700-y. [PMID: 40180773 DOI: 10.1007/s12539-025-00700-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2024] [Revised: 03/02/2025] [Accepted: 03/04/2025] [Indexed: 04/05/2025]
Abstract
Accurately identifying cell types in single-cell RNA sequencing data is critical for understanding cellular differentiation and pathological mechanisms in downstream analysis. As traditional biological approaches are laborious and time-intensive, it is imperative to develop computational biology methods for cell classification. However, it remains a challenge for existing methods to adequately utilize the potential gene expression information within the vast amount of unlabeled cell data, which limits their classification and generalization performance. Therefore, we propose a novel self-supervised graph representation learning framework for single-cell classification, named scSSGC. Specifically, in the pre-training stage of self-supervised learning, multiple K-means clustering tasks conducted on unlabeled cell data are jointly employed for model training, thereby mitigating the issue of limited labeled data. To effectively capture the potential interactions among cells, we introduce a locally augmented graph neural network to enhance the information aggregation capability for nodes with fewer neighbors in the cell graph. A range of benchmark experiments demonstrates that scSSGC outperforms existing state-of-the-art cell classification methods. More importantly, scSSGC provides stable performance when faced with cross-datasets, indicating better generalization ability.
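The pre-training idea in this abstract, several K-means clustering tasks run on unlabeled cells, can be sketched as pseudo-label generation. This is not the scSSGC implementation: the abstract does not say how the clustering tasks differ, so using a different number of clusters per task is an assumption, as are the function names, the toy 2-D "cells", and the deterministic initialization. The graph neural network and the joint training step are not shown.

```python
def kmeans_labels(points, k, n_iter=20):
    """Plain Lloyd's algorithm; centroids start at the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def multi_kmeans_pseudo_labels(points, ks):
    """One pseudo-label vector per clustering task (one task per value of k)."""
    return {k: kmeans_labels(points, k) for k in ks}


# Six made-up cells in a 2-D embedding, forming three well-separated groups.
cells = [(0.0, 0.1), (5.0, 5.1), (9.0, 0.0), (0.2, 0.0), (5.2, 4.9), (9.1, 0.2)]
tasks = multi_kmeans_pseudo_labels(cells, ks=[2, 3])
for k, pseudo in tasks.items():
    print(k, pseudo)
```

Each pseudo-label vector could then serve as the target of its own classification head during pre-training, so the encoder learns from unlabeled cells before any annotated labels are used.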
Affiliation(s)
- Qiguo Dai
- School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116650, China.
- SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, Dalian, 116650, China.
- Wuhao Liu
- School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116650, China
- SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, Dalian, 116650, China
- Xianhai Yu
- School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116650, China
- SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, Dalian, 116650, China
- Xiaodong Duan
- SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, Dalian, 116650, China
- Ziqiang Liu
- SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, Dalian, 116650, China
- Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou, 310018, China
4
Fievet G, Broséus J, Meyre D, Hergalant S. adverSCarial: assessing the vulnerability of single-cell RNA-sequencing classifiers to adversarial attacks. Bioinformatics 2025; 41:btaf168. [PMID: 40234247 PMCID: PMC12036967 DOI: 10.1093/bioinformatics/btaf168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/01/2025] [Accepted: 04/13/2025] [Indexed: 04/17/2025] Open
Abstract
MOTIVATION Several machine learning (ML) algorithms dedicated to the detection of healthy and diseased cell types from single-cell RNA sequencing (scRNA-seq) data have been proposed for biomedical purposes. This raises concerns about their vulnerability to adversarial attacks, exploiting threats causing malicious alterations of the classifiers' output with defective and well-crafted input. RESULTS With adverSCarial, adversarial attacks of single-cell transcriptomic data can easily be simulated in a range of ways, from expanded but undetectable modifications to aggressive and targeted ones, enabling vulnerability assessment of scRNA-seq classifiers to variations of gene expression, whether technical, biological, or intentional. We exemplify the usefulness and performance with a panel of attack modes proposed in adverSCarial by assessing the robustness of five scRNA-seq classifiers, each belonging to a distinct class of ML algorithm, and explore the potential unlocked by exposing their inner workings and sensitivities on four different datasets. These analyses can guide the development of more reliable models, with improved interpretability, usable in biomedical research and future clinical applications. AVAILABILITY AND IMPLEMENTATION adverSCarial is a freely available R package accessible from Bioconductor: https://bioconductor.org/packages/adverSCarial/ or https://doi.org/10.18129/B9.bioc.adverSCarial. A development version is available at https://github.com/GhislainFievet/adverSCarial.
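To make the notion of a targeted attack concrete, the following is a minimal sketch of a single-gene perturbation search. It does not use or reproduce the adverSCarial API (which is an R package); the toy classifier, the gene values, and the function names are hypothetical and chosen only to show the idea of overwriting one gene and checking whether the predicted label changes.

```python
def classify(cell):
    """Hypothetical toy classifier: 'T cell' when CD3E exceeds MS4A1."""
    return "T cell" if cell["CD3E"] > cell["MS4A1"] else "B cell"


def single_gene_attack(cell, classifier, candidate_values):
    """List (gene, value) overwrites that change the classifier's output."""
    original = classifier(cell)
    flips = []
    for gene in cell:
        for value in candidate_values:
            perturbed = dict(cell)      # copy so the input cell is untouched
            perturbed[gene] = value     # overwrite exactly one gene
            if classifier(perturbed) != original:
                flips.append((gene, value))
    return flips


# Made-up expression profile for one cell.
cell = {"CD3E": 4.0, "MS4A1": 0.5, "ACTB": 7.0}
flips = single_gene_attack(cell, classify, candidate_values=[0.0, 10.0])
print(flips)
```

A classifier for which many single-gene overwrites appear in `flips` would be considered fragile under this kind of attack; genes that never appear, such as the housekeeping gene in this toy example, do not influence its decision.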
Affiliation(s)
- Ghislain Fievet
- INSERM U1256, Nutrition, Genetics, and Environmental Risk Exposure (NGERE), University of Lorraine, Nancy, 54500, France
- Julien Broséus
- INSERM U1256, Nutrition, Genetics, and Environmental Risk Exposure (NGERE), University of Lorraine, Nancy, 54500, France
- Department of Biological Hematology, Laboratory Center, University Hospital of Nancy, Nancy, 54500, France
- David Meyre
- INSERM U1256, Nutrition, Genetics, and Environmental Risk Exposure (NGERE), University of Lorraine, Nancy, 54500, France
- Department of Molecular Medicine, Division of Biochemistry, Molecular Biology, and Nutrition, University Hospital of Nancy, Nancy, 54500, France
- Sébastien Hergalant
- INSERM U1256, Nutrition, Genetics, and Environmental Risk Exposure (NGERE), University of Lorraine, Nancy, 54500, France
5
Liu M, Zheng S, Li H, Budowle B, Wang L, Lou Z, Ge J. High resolution tissue and cell type identification via single cell transcriptomic profiling. PLoS One 2025; 20:e0318151. [PMID: 40138334 PMCID: PMC11940611 DOI: 10.1371/journal.pone.0318151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Accepted: 01/11/2025] [Indexed: 03/29/2025] Open
Abstract
Tissue identification can be instrumental in reconstructing a crime scene but remains a challenging task in forensic investigations. Conventional bulk approaches, which detect a tissue in a tissue mixture using predefined cell type markers, are limited in sensitivity and accuracy. In contrast, single-cell RNA sequencing (scRNA-seq) is a promising technology that has the potential to enhance or even revolutionize tissue and cell type identification. In this study, we developed a highly sensitive, general-purpose single-cell annotation pipeline, scTissueID, to accurately evaluate single-cell profile quality and precisely determine cell and tissue types based on scRNA profiles. By incorporating a crucial and unique reference cell quality differentiation phase that uses only high-confidence cells as the reference, scTissueID achieved better and more consistent performance in determining cell and tissue types than 8 state-of-the-art single-cell annotation pipelines and 6 widely adopted machine learning algorithms, as demonstrated through a large-scale, comprehensive comparison study using both forensic-relevant and Human Cell Atlas (HCA) data. We highlight the significance of cell quality differentiation, a previously undervalued factor. Thus, this study offers a tool capable of accurately and efficiently identifying cell and tissue types, with broad applicability to forensic investigations and other biomedical research endeavors.
Affiliation(s)
- Muyi Liu
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, Texas, United States of America
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Suilan Zheng
- Department of Chemistry, Purdue University, West Lafayette, Indiana, United States of America
- Hongmin Li
- Department of Computer Science, California State University, East Bay, Hayward, California, United States of America
- Bruce Budowle
- Department of Forensic Medicine, University of Helsinki, Finland
- Le Wang
- Department of Electronic and Information Engineering, North China University of Technology, Beijing, China
- Zhaohuan Lou
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, China
- Jianye Ge
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, Texas, United States of America
6
Liang DM, Du PF. scMUG: deep clustering analysis of single-cell RNA-seq data on multiple gene functional modules. Brief Bioinform 2025; 26:bbaf138. [PMID: 40188497 PMCID: PMC11972635 DOI: 10.1093/bib/bbaf138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2024] [Revised: 02/11/2025] [Accepted: 03/09/2025] [Indexed: 04/08/2025] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of cellular heterogeneity by providing gene expression data at the single-cell level. Unlike bulk RNA-seq, scRNA-seq allows identification of different cell types within a given tissue, leading to a more nuanced comprehension of cell functions. However, the analysis of scRNA-seq data presents challenges due to its sparsity and high dimensionality. Bioinformatics methods, which play an important role in the analysis of large-scale data, have therefore been widely applied to scRNA-seq data. To address these challenges, we introduce the scMUG computational pipeline, which incorporates gene functional module information to enhance scRNA-seq clustering analysis. The pipeline includes data preprocessing, cell representation generation, cell-cell similarity matrix construction, and clustering analysis. The scMUG pipeline also introduces a novel similarity measure that combines local density and global distribution in the latent cell representation space. As far as we can tell, this is the first attempt to integrate gene functional associations into scRNA-seq clustering analysis. We curated nine human scRNA-seq datasets to evaluate our scMUG pipeline. With the help of gene functional information and the novel similarity measure, the clustering results from the scMUG pipeline provide deep insights into functional relationships between gene expression patterns and cellular heterogeneity. In addition, the scMUG pipeline achieves comparable or better clustering performance than other state-of-the-art methods. All source code of scMUG has been deposited in a GitHub repository with instructions for reproducing all results (https://github.com/degiminnal/scMUG).
Affiliation(s)
- De-Min Liang
- College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
- Pu-Feng Du
- College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
7
Zhao J, Wang Y, Feng C, Yin M, Gao Y, Wei L, Song C, Ai B, Wang Q, Zhang J, Zhu J, Li C. SCInter: A comprehensive single-cell transcriptome integration database for human and mouse. Comput Struct Biotechnol J 2024; 23:77-86. [PMID: 38125297 PMCID: PMC10731004 DOI: 10.1016/j.csbj.2023.11.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 11/12/2023] [Accepted: 11/13/2023] [Indexed: 12/23/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq), which profiles gene expression at the cellular level, has effectively explored cell heterogeneity and reconstructed developmental trajectories. With the increasing research on diseases and biological processes, scRNA-seq datasets are accumulating rapidly, highlighting the urgent need for collecting and processing these data to support comprehensive and effective annotation and analysis. Here, we have developed a comprehensive Single-Cell transcriptome integration database for human and mouse (SCInter, https://bio.liclab.net/SCInter/index.php), which aims to provide a manually curated database that supports the provision of gene expression profiles across various cell types at the sample level. The current version of SCInter includes 115 integrated datasets and 1016 samples, covering nearly 150 tissues/cell lines. It contains 8,016,646 cell markers in 457 identified cell types. SCInter enabled comprehensive analysis of cataloged single-cell data encompassing quality control (QC), clustering, cell markers, multi-method cell type automatic annotation, predicting cell differentiation trajectories and so on. At the same time, SCInter provided a user-friendly interface to query, browse, analyze and visualize each integrated dataset and single cell sample, along with comprehensive QC reports and processing results. It will facilitate the identification of cell type in different cell subpopulations and explore developmental trajectories, enhancing the study of cell heterogeneity in the fields of immunology and oncology.
Affiliation(s)
- Jun Zhao
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Yuezhu Wang
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Chenchen Feng
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- Mingxue Yin
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics And Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan, 421001, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Yu Gao
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Ling Wei
- Institute of Medical Innovation and Research, Peking University Third Hospital, Beijing 100191, China
- Cancer Center, Peking University Third Hospital, Beijing 100191, China
- Chao Song
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics And Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan, 421001, China
- Bo Ai
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Qiuyu Wang
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics And Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan, 421001, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Jian Zhang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Jiang Zhu
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Chunquan Li
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics And Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan, 421001, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- National Health Commission Key Laboratory of Birth Defect Research and Prevention, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
8
Shao T, Gao Q, Tang W, Ma Y, Gu J, Yu Z. The Role of Immunocyte Infiltration Regulatory Network Based on hdWGCNA and Single-Cell Bioinformatics Analysis in Intervertebral Disc Degeneration. Inflammation 2024; 47:1987-1999. [PMID: 38630169 DOI: 10.1007/s10753-024-02020-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2024] [Revised: 04/02/2024] [Accepted: 04/03/2024] [Indexed: 11/30/2024]
Abstract
Immune infiltration plays a crucial role in intervertebral disc degeneration (IDD). In this study, we explored the immune microenvironment of IDD through single-cell bioinformatics analysis. Three single-cell datasets were integrated into this study. Nucleus pulposus cells (NPCs) were divided into subgroups based on characteristic genes, and the role of each subgroup in the IDD process was analyzed through pseudo-time trajectory analysis. The hub genes were obtained using hdWGCNA, further identified by bulk datasets and pseudo-time sequence. The expression of the hub genes defined the NPCs related to immune infiltration, and the interaction between these NPCs and immunocytes was explored. The NPCs were divided into four subgroups: reserve NPCs, HCL-NPCs, response NPCs, and support NPCs, which, respectively, dominate the four processes of IDD: non, mild, moderate, and severe degeneration. SPP1 and ICAM1 were identified as the nucleus pulposus immune infiltration hub genes. Macrophages and myelocytes played pro-inflammatory roles in the SPP1-ICAM both-up NPC group through the SPP1-CD44 pathway and ICAM1-ITGB2 ligand-receptor pathway, respectively. At the same time, both-up NPCs sought self-help inflammation remission from neutrophils through the ANXA1-FPR1 pathway. The systematic analysis of the differentiation and immune infiltration landscapes helps to understand IDD's overall development process. Our data suggest that SPP1 and ICAM1 may be new targets for the treatment of inflammatory infiltration in IDD.
Affiliation(s)
- Tuo Shao
- Department of Spinal Surgery, First Affiliated Hospital of Harbin Medical University, No.23 Youzheng Street, Harbin, 150001, China
- Qichang Gao
- Department of Spinal Surgery, First Affiliated Hospital of Harbin Medical University, No.23 Youzheng Street, Harbin, 150001, China
- Weilong Tang
- Department of Spinal Surgery, First Affiliated Hospital of Harbin Medical University, No.23 Youzheng Street, Harbin, 150001, China
- Yiming Ma
- Department of Spinal Surgery, First Affiliated Hospital of Harbin Medical University, No.23 Youzheng Street, Harbin, 150001, China
- Jiaao Gu
- Department of Spinal Surgery, First Affiliated Hospital of Harbin Medical University, No.23 Youzheng Street, Harbin, 150001, China
- Zhange Yu
- Department of Spinal Surgery, First Affiliated Hospital of Harbin Medical University, No.23 Youzheng Street, Harbin, 150001, China.
9
He X, Ma J, Yan X, Yang X, Wang P, Zhang L, Li N, Shi Z. CDT1 is a Potential Therapeutic Target for the Progression of NAFLD to HCC and the Exacerbation of Cancer. Curr Genomics 2024; 26:225-243. [PMID: 40433415 PMCID: PMC12107793 DOI: 10.2174/0113892029313473240919105819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 06/26/2024] [Accepted: 09/06/2024] [Indexed: 05/29/2025] Open
Abstract
Aims This study aimed to identify potential therapeutic targets in the progression from non-alcoholic fatty liver disease (NAFLD) to hepatocellular carcinoma (HCC), with a focus on genes that could influence disease development and progression. Background Hepatocellular carcinoma, significantly driven by non-alcoholic fatty liver disease, represents a major global health challenge due to late-stage diagnosis and limited treatment options. This study utilized bioinformatics to analyze data from GEO and TCGA, aiming to uncover molecular biomarkers that bridge NAFLD to HCC. Through identifying critical genes and pathways, our research seeks to advance early detection and develop targeted therapies, potentially improving prognosis and personalizing treatment for NAFLD-HCC patients. Objectives Identify key genes that differ between NAFLD and HCC; Analyze these genes to understand their roles in disease progression; Validate the functions of these genes in NAFLD to HCC transition. Methods Initially, we identified a set of genes differentially expressed in both NAFLD and HCC using second-generation sequencing data from the GEO and TCGA databases. We then employed a Cox proportional hazards model and a Lasso regression model, applying machine learning techniques to the large sample data from TCGA. This approach was used to screen for key disease-related genes, and an external dataset was utilized for model validation. Additionally, pseudo-temporal sequencing analysis of single-cell sequencing data was performed to further examine the variations in these genes in NAFLD and HCC. Results The machine learning analysis identified IGSF3, CENPW, CDT1, and CDC6 as key genes. Furthermore, constructing a machine learning model for CDT1 revealed it to be the most critical gene, with model validation yielding an ROC value greater than 0.80. The single-cell sequencing data analysis confirmed significant variations in the four predicted key genes between the NAFLD and HCC groups. 
Conclusion Our study underscores the pivotal role of CDT1 in the progression from NAFLD to HCC. This finding opens new avenues for early diagnosis and targeted therapy of HCC, highlighting CDT1 as a potential therapeutic target.
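The "ROC value greater than 0.80" reported for model validation presumably refers to the area under the ROC curve (AUC); that reading is an assumption. As a minimal sketch of how such a value is computed, the function below uses the pairwise-ranking definition of AUC: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. The labels and scores are made up and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up validation set: 1 = HCC, 0 = NAFLD; scores are model risk scores.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy case five of the six positive-negative pairs are ranked correctly, so the AUC is 5/6, which would clear a 0.80 threshold.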
Affiliation(s)
- Xingyu He
- Clinical Medical College & Affiliated Hospital of Chengdu University, Chengdu University, 610083, P.R. China
| | - Jun Ma
- Clinical Medical College & Affiliated Hospital of Chengdu University, Chengdu University, 610083, P.R. China
| | - Xue Yan
- Clinical Medical College & Affiliated Hospital of Chengdu University, Chengdu University, 610083, P.R. China
| | - Xiangyu Yang
- West China Hospital, Sichuan University, 610083, P.R. China
| | - Ping Wang
- West China Hospital, Sichuan University, 610083, P.R. China
| | - Lijie Zhang
- Clinical Medical College & Affiliated Hospital of Chengdu University, Chengdu University, 610083, P.R. China
| | - Na Li
- Clinical Medical College & Affiliated Hospital of Chengdu University, Chengdu University, 610083, P.R. China
| | - Zheng Shi
- Clinical Medical College & Affiliated Hospital of Chengdu University, Chengdu University, 610083, P.R. China
| |
Collapse
|
10
|
Bi X, Zhu S, Liu F, Wu X. Dynamics of alternative polyadenylation in single root cells of Arabidopsis thaliana. Front Plant Sci 2024; 15:1437118. [PMID: 39372861 PMCID: PMC11449893 DOI: 10.3389/fpls.2024.1437118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Accepted: 09/02/2024] [Indexed: 10/08/2024]
Abstract
Introduction Single-cell RNA-seq (scRNA-seq) technologies have been widely used to reveal the diversity and complexity of cells, and pioneering scRNA-seq studies in plants have emerged since 2019. However, existing plant studies using scRNA-seq have focused only on the regulation of gene expression. As an essential post-transcriptional mechanism for regulating gene expression, alternative polyadenylation (APA) generates diverse mRNA isoforms with distinct 3' ends through the selective use of different polyadenylation sites in a gene. APA plays important roles in regulating multiple developmental processes in plants, such as flowering time and stress response. Methods In this study, we developed a pipeline to identify and integrate APA sites from different scRNA-seq datasets and analyze APA dynamics in single cells. First, high-confidence poly(A) sites in single root cells were identified and quantified. Second, three kinds of APA markers were identified for exploring APA dynamics in single cells: differentially expressed poly(A) sites based on APA site expression, APA markers based on APA usage, and APA switching genes based on 3' UTR (untranslated region) length change. Moreover, cell type annotations of single root cells were refined by integrating both the APA information and the gene expression profile. Results We comprehensively compiled a single-cell APA atlas from five scRNA-seq studies, covering over 150,000 cells spanning four major tissue branches, twelve cell types, and three developmental stages. Moreover, we quantified the dynamic APA usage in single cells and identified APA markers across tissues and cell types. Further, we integrated complementary information of gene expression and APA profiles to annotate cell types and reveal subtle differences between cell types.
Discussion This study reveals that APA provides an additional layer of information for determining cell identity and provides a landscape of APA dynamics during Arabidopsis root development.
Affiliation(s)
- Xingyu Bi
- Cancer Institute, Suzhou Medical College, Soochow University, Suzhou, China
- Sheng Zhu
- Operational Technology Research and Evaluation Center, China Nuclear Power Operation Technology Corporation, Ltd, Wuhan, China
- Fei Liu
- Cancer Institute, Suzhou Medical College, Soochow University, Suzhou, China
- Xiaohui Wu
- Cancer Institute, Suzhou Medical College, Soochow University, Suzhou, China
- Jiangsu Key Laboratory of Infection and Immunity, Soochow University, Suzhou, China
11
Qin J, Huang X, Gou S, Zhang S, Gou Y, Zhang Q, Chen H, Sun L, Chen M, Liu D, Han C, Tang M, Feng Z, Niu S, Zhao L, Tu Y, Liu Z, Xuan W, Dai L, Jia D, Xue Y. Ketogenic diet reshapes cancer metabolism through lysine β-hydroxybutyrylation. Nat Metab 2024; 6:1505-1528. [PMID: 39134903 DOI: 10.1038/s42255-024-01093-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Accepted: 07/02/2024] [Indexed: 08/29/2024]
Abstract
Lysine β-hydroxybutyrylation (Kbhb) is a post-translational modification induced by the ketogenic diet (KD), a diet showing therapeutic effects on multiple human diseases. Little is known about how cellular processes are regulated by Kbhb. Here we show that protein Kbhb is strongly affected by the KD through a multi-omics analysis of mouse livers. Using a small training dataset with known functions, we developed a bioinformatics method for the prediction of functionally important lysine modification sites (pFunK), which revealed functionally relevant Kbhb sites on various proteins, including aldolase B (ALDOB) Lys108. KD consumption or β-hydroxybutyrate supplementation in hepatocellular carcinoma cells increases ALDOB Lys108bhb and inhibits the enzymatic activity of ALDOB. A Kbhb-mimicking mutation (p.Lys108Gln) attenuates ALDOB activity and its binding to substrate fructose-1,6-bisphosphate, inhibits mammalian target of rapamycin signalling and glycolysis, and markedly suppresses cancer cell proliferation. Our study reveals a critical role of Kbhb in regulating cancer cell metabolism and provides a generally applicable algorithm for predicting functionally important lysine modification sites.
Affiliation(s)
- Junhong Qin
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Xinhe Huang
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Shengsong Gou
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Sitao Zhang
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Yujie Gou
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Qian Zhang
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Hongyu Chen
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Lin Sun
- Frontiers Science Center for Synthetic Biology, Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Tianjin University, Tianjin, China
- Miaomiao Chen
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Dan Liu
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Cheng Han
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Min Tang
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Zihao Feng
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Shenghui Niu
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Lin Zhao
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Yingfeng Tu
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Zexian Liu
- Department of Medical Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou, China
- Weimin Xuan
- Frontiers Science Center for Synthetic Biology, Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Tianjin University, Tianjin, China
- Lunzhi Dai
- National Clinical Research Center for Geriatrics and Department of General Practice, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, and Collaborative Innovation Center of Biotherapy, Chengdu, China
- Da Jia
- Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China.
- Yu Xue
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
- Nanjing University Institute of Artificial Intelligence Biomedicine, Nanjing, China.
12
Zhang Y, Sun H, Zhang W, Fu T, Huang S, Mou M, Zhang J, Gao J, Ge Y, Yang Q, Zhu F. CellSTAR: a comprehensive resource for single-cell transcriptomic annotation. Nucleic Acids Res 2024; 52:D859-D870. [PMID: 37855686 PMCID: PMC10767908 DOI: 10.1093/nar/gkad874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/12/2023] [Accepted: 09/27/2023] [Indexed: 10/20/2023] Open
Abstract
Large-scale studies of single-cell sequencing and biological experiments have successfully revealed expression patterns that distinguish different cell types in tissues, emphasizing the importance of studying cellular heterogeneity and accurately annotating cell types. Analysis of gene expression profiles in these experiments provides two essential types of data for cell type annotation: annotated references and canonical markers. In this study, the first comprehensive database of single-cell transcriptomic annotation resource (CellSTAR) was thus developed. It is unique in (a) offering the comprehensive expertly annotated reference data for annotating hundreds of cell types for the first time and (b) enabling the collective consideration of reference data and marker genes by incorporating tens of thousands of markers. Given its unique features, CellSTAR is expected to attract broad research interests from the technological innovations in single-cell transcriptomics, the studies of cellular heterogeneity & dynamics, and so on. It is now publicly accessible without any login requirement at: https://idrblab.org/cellstar.
Affiliation(s)
- Ying Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Huaicheng Sun
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Wei Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Tingting Fu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Shijie Huang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Minjie Mou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jinsong Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jianqing Gao
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yichao Ge
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Qingxia Yang
- Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
- Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
13
Yao Y, Wang D, Zhang Y, Tang Q, Xu Z, Qin L, Qu Y, Yan Z. Peroxisome proliferator-activated receptors signature reveal the head and neck squamous cell carcinoma energy metabolism phenotype and clinical outcome. J Gene Med 2024; 26:e3605. [PMID: 37932968 DOI: 10.1002/jgm.3605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 09/13/2023] [Accepted: 09/20/2023] [Indexed: 11/08/2023] Open
Abstract
BACKGROUND Peroxisome proliferator-activated receptors (PPARs) are important regulators of nuclear hormone receptor function, and they play a key role in biological processes such as lipid metabolism, inflammation and cell proliferation. However, their role in head and neck squamous cell carcinoma (HNSC) is unclear. METHODS We used multiple datasets, including TCGA-HNSC, GSE41613, GSE139324, PRJEB23709 and IMVigor, to perform a comprehensive analysis of PPAR-related genes in HNSC. Single-cell sequencing data were preprocessed using the Seurat package, and intercellular communication was analyzed using the CellChat package. Functional enrichment analysis of PPAR-related genes was performed using clusterProfiler and GSEA. Prognostic models were constructed using LASSO and Cox regression models, and immunohistochemical analyses were performed using The Human Protein Atlas. RESULTS Our single-cell RNA sequencing analysis revealed distinct cell populations in HNSC, with T cells having the most significant transcriptome differences between tumors and normal tissues. The PPAR features were higher in most cell types in tumor tissues compared with normal tissues. We identified 17 PPAR-associated differentially expressed genes between tumors and normal tissues. A prognostic model based on seven PPAR-associated genes was constructed with high accuracy in predicting 1-, 2- and 3-year survival in patients with HNSC. In addition, patients with a low risk score had a higher immune score and a higher proportion of T cells, CD8+ T cells and cytotoxic lymphocytes. They also showed higher immune checkpoint gene expression, suggesting that they might benefit from immunotherapy. PPAR-related genes were found to be closely related to energy metabolism. CONCLUSIONS Our study provides a comprehensive understanding of the role of PPAR-related genes in HNSC.
The identified PPAR features and constructed prognostic models may serve as potential biomarkers for HNSC prognosis and treatment response. In addition, our study found that PPAR-related genes can differentiate energy metabolism and distinguish energy metabolic heterogeneity in HNSC, providing new insights into the molecular mechanisms of HNSC progression and therapeutic response.
Affiliation(s)
- Yuan Yao
- Department of Interventional Radiology, The People's Hospital of Liaoning Province, Shenyang, China
- Di Wang
- Otolaryngology, The Second Affiliated Hospital of Shenyang Medical College, Shenyang, China
- Yu Zhang
- Pharmacy Department, General Hospital of Northern Theater Command, Shenyang, China
- Qiaofei Tang
- Otolaryngology, The Second Affiliated Hospital of Shenyang Medical College, Shenyang, China
- Zhi Xu
- Otolaryngology, The Second Affiliated Hospital of Shenyang Medical College, Shenyang, China
- Zhiyong Yan
- Otolaryngology, The Second Affiliated Hospital of Shenyang Medical College, Shenyang, China
14
Li X, Qu X, Li S, Lin K, Yao N, Wang N, Shi Y. Development of a Novel CD8+ T Cell-Associated Signature for Prognostic Assessment in Hepatocellular Carcinoma. Cancer Control 2024; 31:10732748241270583. [PMID: 39152700 PMCID: PMC11331481 DOI: 10.1177/10732748241270583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 05/22/2024] [Accepted: 06/13/2024] [Indexed: 08/19/2024] Open
Abstract
OBJECTIVE The aim of this study was to analyze the clinical significance and prognostic value of CD8+ T cell-related regulatory genes in hepatocellular carcinoma (HCC). METHODS This was a retrospective study. We combined TCGA-LIHC and single-cell RNA sequencing data for Lasso-Cox regression analysis to screen for CD8+ T cell-associated genes to construct a novel signature. The expression of the signature genes was detected at cellular and tissue levels using qRT-PCR, immunohistochemistry, and tissue microarrays. The CIBERSORT algorithm was then used to assess the immune microenvironmental differences between the different risk groups and a drug sensitivity analysis was performed to screen for potential HCC therapeutic agents. RESULTS An 8-gene CD8+ T cell-associated signature (FABP5, GZMH, ANXA2, KLRB1, CD7, IL7R, BATF, and RGS2) was constructed. Survival analysis showed that high-risk patients had a poorer prognosis in all cohorts. Tumor immune microenvironment analysis revealed 22 immune cell types that differed significantly between patients in different risk groups, with patients in the low-risk group showing more active immune function. Patients in the high-risk group were more prone to immune escape and had a poorer response to immunotherapy, and AZD7762 was screened as the most sensitive drug in the high-risk group. Finally, preliminary experiments showed that BATF promotes the proliferation, migration and invasion of HuH-7 cells. CONCLUSIONS The CD8+ T cell-associated signature is expected to be a tool for optimizing individual patient decision-making and monitoring protocols, and to provide new ideas for treatment and prognostic assessment of HCC.
Affiliation(s)
- Xuezhi Li
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, China
- Xiaodong Qu
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, China
- Songbo Li
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, China
- Kexin Lin
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, China
- Nuo Yao
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, China
- Na Wang
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, China
- Yongquan Shi
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, China
15
Du ZH, Hu WL, Li JQ, Shang X, You ZH, Chen ZZ, Huang YA. scPML: pathway-based multi-view learning for cell type annotation from single-cell RNA-seq data. Commun Biol 2023; 6:1268. [PMID: 38097699 PMCID: PMC10721875 DOI: 10.1038/s42003-023-05634-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/30/2023] [Accepted: 11/24/2023] [Indexed: 12/17/2023] Open
Abstract
Recent developments in single-cell technology have enabled the exploration of cellular heterogeneity at an unprecedented level, providing invaluable insights into various fields, including medicine and disease research. Cell type annotation is an essential step in single-cell omics research. The mainstream approach is to use well-annotated single-cell data for supervised learning to annotate cell types in new single-cell data. However, existing methods lack good generalization and robustness in cell annotation tasks, partly because they struggle with technical differences between datasets and do not consider the heterogeneous associations of genes at the level of regulatory mechanisms. Here, we propose the scPML model, which utilizes various gene signaling pathway data to partition the genetic features of cells, thus characterizing different interaction maps between cells. Extensive experiments demonstrate that scPML performs better in cell type annotation and detection of unknown cell types from different species, platforms, and tissues.
Affiliation(s)
- Zhi-Hua Du
- College of Computer Science and Software Engineering, ShenZhen University, 3688 Nanhai Avenue, Shenzhen, China
- Wei-Lin Hu
- College of Computer Science and Software Engineering, ShenZhen University, 3688 Nanhai Avenue, Shenzhen, China
- Jian-Qiang Li
- College of Computer Science and Software Engineering, ShenZhen University, 3688 Nanhai Avenue, Shenzhen, China
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Zhuang-Zhuang Chen
- College of Computer Science and Software Engineering, ShenZhen University, 3688 Nanhai Avenue, Shenzhen, China
- Yu-An Huang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
16
Zhang X, Hong B, Sun Z, Zhao J, Li M, Wei D, Wang Y, Zhang N. Development and validation of a circulating tumor cells-related signature focusing on biochemical recurrence and immunotherapy response in prostate cancer. Heliyon 2023; 9:e22648. [PMID: 38107322 PMCID: PMC10724679 DOI: 10.1016/j.heliyon.2023.e22648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 11/11/2023] [Accepted: 11/15/2023] [Indexed: 12/19/2023] Open
Abstract
Background Studies have shown that circulating tumor cells (CTCs) play a key role in invasion and the formation of distant metastases in prostate cancer (PCa). However, few CTC-related genes (CRGs) have been developed for biochemical recurrence (BCR) prediction and clinical application in PCa patients. Materials and methods Bioinformatics analysis of public PCa datasets was used to investigate the relationship between the differentially expressed CRGs and BCR. Lasso-Cox regression analysis was used to construct and validate a CRG-based BCR prediction signature for PCa. Single-cell data were used to validate the expression levels of signature genes in different cell types and to explore cell-cell communication relationships. Finally, the expression levels of signature genes were verified and the CRGs involved in immunotherapy response were further identified. Results Thirteen CRGs were differentially expressed and closely associated with BCR in PCa. We then constructed and validated a BCR prediction signature for PCa patients based on 3 differentially expressed CRGs (EMID1, SPP1 and UBE2C), and the signature was an independent predictor of BCR in PCa. Single-cell data showed the specific expression patterns of the signature genes, and the SPP1 pathway played an important role in cell-cell communication. Further analyses suggested that UBE2C was highly expressed in the BCR group and that high UBE2C expression was associated with a better response in patients who received immunotherapy. Moreover, the expression levels of UBE2C in CTCs were higher than in other cells and tissues, indicating that UBE2C may affect the BCR event of PCa patients through CTCs. Conclusion Our findings demonstrated that CRGs were significantly associated with BCR and immunotherapy efficacy in PCa and that CRGs may influence the BCR event through CTCs.
Affiliation(s)
- Zhipeng Sun
- Department of Urology, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- Jiahui Zhao
- Department of Urology, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- Mingchuan Li
- Department of Urology, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- Dechao Wei
- Department of Urology, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- Yongxing Wang
- Department of Urology, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- Ning Zhang
- Department of Urology, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
17
Quan F, Liang X, Cheng M, Yang H, Liu K, He S, Sun S, Deng M, He Y, Liu W, Wang S, Zhao S, Deng L, Hou X, Zhang X, Xiao Y. Annotation of cell types (ACT): a convenient web server for cell type annotation. Genome Med 2023; 15:91. [PMID: 37924118 PMCID: PMC10623726 DOI: 10.1186/s13073-023-01249-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/14/2023] [Accepted: 10/18/2023] [Indexed: 11/06/2023] Open
Abstract
BACKGROUND The advancement of single-cell sequencing has improved our ability to answer biological questions. Cell type annotation is of vital importance to this process, allowing for the analysis and interpretation of enormous single-cell datasets. At present, however, manual cell annotation, which is the predominant approach, remains limited by both speed and the requirement of expert knowledge. METHODS To address these challenges, we constructed a hierarchically organized marker map by manually curating over 26,000 cell marker entries from about 7000 publications. We then developed WISE, a weighted and integrated gene set enrichment method, to integrate the prevalence of canonical markers and ordered differentially expressed genes of specific cell types in the marker map. Benchmarking analysis suggested that our method outperformed state-of-the-art methods. RESULTS By integrating the marker map and WISE, we developed a user-friendly and convenient web server, ACT ( http://xteam.xbio.top/ACT/ or http://biocc.hrbmu.edu.cn/ACT/ ), which takes only a simple list of upregulated genes as input and provides interactive hierarchy maps, together with well-designed charts and statistical information, to accelerate the assignment of cell identities and make the results comparable to expert manual annotation. In addition, a pan-tissue marker map was constructed to assist in cell assignments in less-studied tissues. Applying ACT to three case studies showed that all cell clusters were quickly and accurately annotated, and multi-level and more refined cell types were identified. CONCLUSIONS We developed a knowledge-based resource and a corresponding method, together with an intuitive graphical web interface, for cell type annotation. We believe that ACT, emerging as a powerful tool for cell type annotation, will be widely used in single-cell research and considerably accelerate the process of cell type identification.
Affiliation(s)
- Fei Quan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Xin Liang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Mingjiang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Huan Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Kun Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Shengyuan He
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Shangqin Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Menglan Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Yanzhen He
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Wei Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Shuai Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Shuxiang Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Lantian Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Xiaobo Hou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Xinxin Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China.
| | - Yun Xiao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China.
| |
Collapse
|
18
|
Zhu Q, Chai Y, Jin L, Ma Y, Lu H, Chen Y, Feng W. Construction and validation of a novel prognostic model of neutrophil‑related genes signature of lung adenocarcinoma. Sci Rep 2023; 13:18226. [PMID: 37880277 PMCID: PMC10600204 DOI: 10.1038/s41598-023-45289-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Accepted: 10/18/2023] [Indexed: 10/27/2023] Open
Abstract
Lung adenocarcinoma (LUAD) remains an incurable disease with a poor prognosis. This study aimed to explore neutrophil‑related genes (NRGs) and develop a prognostic signature for predicting the prognosis of LUAD. NRGs were obtained by intersecting modular genes identified by weighted gene co-expression network analysis (WGCNA) of bulk RNA-seq data with the neutrophil marker genes identified from single-cell RNA-sequencing (scRNA-seq) data. Univariate Cox regression, least absolute shrinkage and selection operator (LASSO), and multivariate Cox analyses were run to construct a prognostic signature, followed by delineation of risk groups and external validation. ESTIMATE, immune function, Tumor Immune Dysfunction and Exclusion (TIDE) scores, Immune cell Proportion Score (IPS), and immune checkpoint genes were compared between high- and low-risk groups, and drug sensitivity analyses were then performed to screen for sensitive anticancer drugs in the high-risk group. A total of 45 candidate NRGs were identified, of which PLTP, EREG, CD68, CD69, PLAUR, and CYP27A1 were considered to be significantly associated with prognosis in LUAD and were used to construct a prognostic signature. Correlation analysis showed significant differences in the immune landscape between high- and low-risk groups. In addition, our prognostic signature was important for predicting drug sensitivity in the high-risk group. Our study screened for NRGs in LUAD and constructed a novel and effective signature, revealing the immune landscape and providing more appropriate guidance protocols in LUAD treatment.
Affiliation(s)
- Qianjun Zhu
  - Department of Cardiothoracic Surgery, Third Xiangya Hospital, Central South University, Changsha, 410013, Hunan, China
- Yanfei Chai
  - Department of Cardiothoracic Surgery, Third Xiangya Hospital, Central South University, Changsha, 410013, Hunan, China
  - Center for Experimental Medicine, Third Xiangya Hospital, Central South University, Changsha, 410013, Hunan, China
- Longyu Jin
  - Department of Cardiothoracic Surgery, Third Xiangya Hospital, Central South University, Changsha, 410013, Hunan, China
- Yuchao Ma
  - Department of Cardiothoracic Surgery, Third Xiangya Hospital, Central South University, Changsha, 410013, Hunan, China
- Hongwei Lu
  - Center for Experimental Medicine, Third Xiangya Hospital, Central South University, Changsha, 410013, Hunan, China
- Yingji Chen
  - Department of Cardiothoracic Surgery, Third Xiangya Hospital, Central South University, Changsha, 410013, Hunan, China
- Wei Feng
  - Department of Cardiothoracic Surgery, Third Xiangya Hospital, Central South University, Changsha, 410013, Hunan, China
|
19
|
Guo G, Fan L, Yan Y, Xu Y, Deng Z, Tian M, Geng Y, Xia Z, Xu Y. Shared metabolic shifts in endothelial cells in stroke and Alzheimer's disease revealed by integrated analysis. Sci Data 2023; 10:666. [PMID: 37775708 PMCID: PMC10542331 DOI: 10.1038/s41597-023-02512-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 08/30/2023] [Indexed: 10/01/2023] Open
Abstract
Since metabolic dysregulation is a hallmark of both stroke and Alzheimer's disease (AD), mining shared metabolic patterns in these diseases will help to identify their possible pathogenic mechanisms and potential intervention targets. However, a systematic integration analysis of the metabolic networks of these diseases is still lacking. In this study, we integrated single-cell RNA sequencing datasets of ischemic stroke (IS), hemorrhagic stroke (HS) and AD models to construct metabolic flux profiles at the single-cell level. We discovered that the three disorders cause shared metabolic shifts in endothelial cells. These altered metabolic modules were mainly enriched in the transporter-related pathways and were predicted to potentially lead to a decrease in metabolites such as pyruvate and fumarate. We further found that Lef1, Elk3 and Fosl1 may be upstream transcriptional regulators causing metabolic shifts and may be possible targets for interventions that halt the course of neurodegeneration.
Affiliation(s)
- Guangyu Guo
  - Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
  - NHC Key Laboratory of Prevention and treatment of Cerebrovascular Diseases, Zhengzhou, China
  - Clinical Systems Biology Laboratories, the First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Liyuan Fan
  - Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
  - Academy of Medical Sciences of Zhengzhou University, Zhengzhou, China
- Yingxue Yan
  - Clinical Systems Biology Laboratories, the First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - Academy of Medical Sciences of Zhengzhou University, Zhengzhou, China
- Yunhao Xu
  - Clinical Systems Biology Laboratories, the First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - Academy of Medical Sciences of Zhengzhou University, Zhengzhou, China
- Zhifen Deng
  - Clinical Systems Biology Laboratories, the First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Miaomiao Tian
  - Clinical Systems Biology Laboratories, the First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yaoqi Geng
  - Clinical Systems Biology Laboratories, the First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - Department of Endocrinology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Zongping Xia
  - Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
  - NHC Key Laboratory of Prevention and treatment of Cerebrovascular Diseases, Zhengzhou, China
  - Clinical Systems Biology Laboratories, the First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yuming Xu
  - Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
  - NHC Key Laboratory of Prevention and treatment of Cerebrovascular Diseases, Zhengzhou, China
|
20
|
Fiannaca A, La Rosa M, La Paglia L, Gaglio S, Urso A. GOWDL: gene ontology-driven wide and deep learning model for cell typing of scRNA-seq data. Brief Bioinform 2023; 24:bbad332. [PMID: 37756593 PMCID: PMC10530315 DOI: 10.1093/bib/bbad332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 08/17/2023] [Accepted: 09/04/2023] [Indexed: 09/29/2023] Open
Abstract
Single-cell RNA-sequencing (scRNA-seq) allows for obtaining genomic and transcriptomic profiles of individual cells. These data make it possible to characterize tissues at the cell level. In this context, one of the main analyses exploiting scRNA-seq data is identifying the cell types within a tissue to estimate the quantitative composition of cell populations. Due to the massive amount of available scRNA-seq data, automatic classification approaches for cell typing, based on the most recent deep learning technology, are needed. Here, we present the gene ontology-driven wide and deep learning (GOWDL) model for classifying cell types in several tissues. GOWDL implements a hybrid architecture that considers the functional annotations found in Gene Ontology and the marker genes typical of specific cell types. We performed cross-validation and independent external testing, comparing our algorithm with 12 other state-of-the-art predictors. Classification scores demonstrated that GOWDL reached the best results over five different tissues, except for recall, where it reached about 92% versus 97% for the best tool. Finally, we presented a case study on classifying immune cell populations in breast cancer using a hierarchical approach based on GOWDL.
Affiliation(s)
- Antonino Fiannaca
  - ICAR-CNR, National Research Council of Italy, Via Ugo La Malfa 153, 90146, Palermo, Italy
- Massimo La Rosa
  - ICAR-CNR, National Research Council of Italy, Via Ugo La Malfa 153, 90146, Palermo, Italy
- Laura La Paglia
  - ICAR-CNR, National Research Council of Italy, Via Ugo La Malfa 153, 90146, Palermo, Italy
- Salvatore Gaglio
  - ICAR-CNR, National Research Council of Italy, Via Ugo La Malfa 153, 90146, Palermo, Italy
  - Dipartimento di Ingegneria, Università degli studi di Palermo, Viale Delle Scienze, ed. 6, 90128, Palermo, Italy
- Alfonso Urso
  - ICAR-CNR, National Research Council of Italy, Via Ugo La Malfa 153, 90146, Palermo, Italy
|
21
|
Aybey B, Zhao S, Brors B, Staub E. Immune cell type signature discovery and random forest classification for analysis of single cell gene expression datasets. Front Immunol 2023; 14:1194745. [PMID: 37609075 PMCID: PMC10441575 DOI: 10.3389/fimmu.2023.1194745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 07/14/2023] [Indexed: 08/24/2023] Open
Abstract
Background: Robust immune cell gene expression signatures are central to the analysis of single cell studies. Nearly all known sets of immune cell signatures have been derived from a single gene expression dataset each. Utilizing the power of multiple integrated datasets could lead to high-quality immune cell signatures which could be used as superior inputs to machine learning-based cell type classification approaches. Results: We established a novel workflow for the discovery of immune cell type signatures based primarily on gene-versus-gene expression similarity. It leverages multiple datasets, here seven single cell expression datasets from six different cancer types, and resulted in eleven immune cell type-specific gene expression signatures. We used these to train random forest classifiers for immune cell type assignment in single-cell RNA-seq datasets. We obtained similar or better prediction results compared to commonly used methods for cell type assignment in independent benchmarking datasets. Our gene signature set yields higher prediction scores than other published immune cell type gene sets in random forest-based cell type classification. We further demonstrate how our approach helps to avoid bias in downstream statistical analyses by re-analysis of a published IFN stimulation experiment. Discussion and conclusion: We demonstrated the quality of our immune cell signatures and their strong performance in a random forest-based cell typing approach. We argue that classifying cells based on our comparably slim sets of genes, accompanied by a random forest-based approach, not only matches or outperforms widely used published approaches but also facilitates unbiased downstream statistical analyses of differential gene expression between cell types for significantly more genes compared to previous cell classification algorithms.
Affiliation(s)
- Bogac Aybey
  - Oncology Data Science, Merck Healthcare KGaA, Darmstadt, Germany
  - Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
- Sheng Zhao
  - Oncology Data Science, Merck Healthcare KGaA, Darmstadt, Germany
- Benedikt Brors
  - Division of Applied Bioinformatics, German Cancer Research Center, Heidelberg, Germany
  - German Cancer Consortium, German Cancer Research Center, Heidelberg, Germany
- Eike Staub
  - Oncology Data Science, Merck Healthcare KGaA, Darmstadt, Germany
|
22
|
Nicholas CA, Smith MJ. Application of single-cell RNA sequencing methods to develop B cell targeted treatments for autoimmunity. Front Immunol 2023; 14:1103690. [PMID: 37520578 PMCID: PMC10382068 DOI: 10.3389/fimmu.2023.1103690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2022] [Accepted: 06/29/2023] [Indexed: 08/01/2023] Open
Abstract
The COVID-19 pandemic coincided with several transformative advances in single-cell analysis. These new methods, along with decades of research and trials with antibody therapeutics and RNA-based technologies, allowed highly effective vaccines and treatments to be produced at astonishing speed. While these tools were initially focused on models of infection, they also show promise in an autoimmune setting. Self-reactive B cells play important roles as antigen-presenting cells and as cytokine and autoantibody producers in many autoimmune diseases. Yet current therapies that target autoreactive B cells deplete all B cells irrespective of their pathogenicity. Therapies that target self-reactive B cells while sparing non-pathogenic B cells are needed to treat disease while allowing effective immune responses to other ailments. Single-cell RNA sequencing (scRNA-seq) approaches will aid in the identification of the pathogenic self-reactive B cells operative in autoimmunity and help with the development of more favorable precision-targeted therapies.
Affiliation(s)
- Catherine A. Nicholas
  - Barbara Davis Center for Diabetes, University of Colorado School of Medicine, Aurora, CO, United States
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, United States
- Mia J. Smith
  - Barbara Davis Center for Diabetes, University of Colorado School of Medicine, Aurora, CO, United States
  - Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, United States
  - Department of Immunology and Microbiology, University of Colorado School of Medicine, Aurora, CO, United States
|
23
|
Liu Y, Shen D, Wang HY, Qi MY, Zeng QY. Development and validation to predict visual acuity and keratometry two years after corneal crosslinking with progressive keratoconus by machine learning. Front Med (Lausanne) 2023; 10:1146529. [PMID: 37534322 PMCID: PMC10393251 DOI: 10.3389/fmed.2023.1146529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 06/16/2023] [Indexed: 08/04/2023] Open
Abstract
Purpose: To explore and validate the utility of machine learning (ML) methods using a limited sample size to predict changes in visual acuity and keratometry 2 years following corneal crosslinking (CXL) for progressive keratoconus. Methods: To develop the model, the study included all consecutive patients with progressive keratoconus who underwent CXL from July 2014 to December 2020 and completed a 2-year follow-up before July 2022. Variables collected included patient demographics, visual acuity, spherical equivalence, and Pentacam parameters. Available case data were divided into training and testing data sets. Three ML models were evaluated based on their performance in predicting case corrected distance visual acuity (CDVA) and maximum keratometry (Kmax) changes compared to actual values, as indicated by average root mean squared error (RMSE) and R-squared (R2) values. Patients followed from July 2022 to December 2022 were included in the validation set. Results: A total of 277 eyes from 195 patients were included in the training and testing sets, and 43 eyes from 35 patients were included in the validation set. The baseline CDVA (26.7%) and the ratio of steep keratometry to flat keratometry (K2/K1; 13.8%) were closely associated with case CDVA changes. The baseline ratio of Kmax to mean keratometry (Kmax/Kmean; 20.9%) was closely associated with case Kmax changes. Using these metrics, the best-performing ML model was XGBoost, which produced predicted values closest to the actual values for both CDVA and Kmax changes in the testing set (R2 = 0.9993 and 0.9888) and the validation set (R2 = 0.8956 and 0.8382). Conclusion: Application of an ML approach using XGBoost, and incorporation of identifiable parameters, considerably improved the accuracy of predicting changes in both CDVA and Kmax 2 years after CXL for treatment of progressive keratoconus.
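For readers unfamiliar with the two metrics named in this abstract, the following is a minimal sketch of how RMSE and R2 are conventionally computed from actual and predicted values. It is not the authors' code: the function names `rmse` and `r_squared` and the sample numbers are illustrative assumptions, and the abstract does not state which implementation was used.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical values, for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0]
constant_guess = [2.0, 2.0, 2.0]
print(rmse(observed, constant_guess))
print(r_squared(observed, constant_guess))
```

Under this definition, a model that always predicts the mean of the actual values scores R2 = 0, and a perfect predictor scores R2 = 1 with RMSE = 0.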
Affiliation(s)
- Yu Liu
  - Aier School of Ophthalmology, Central South University, Changsha, China
  - Aier Eye Hospital of Wuhan University, Wuhan, China
- Dan Shen
  - Aier Eye Hospital of Wuhan University, Wuhan, China
- Hao-yu Wang
  - Aier Eye Hospital of Wuhan University, Wuhan, China
- Meng-ying Qi
  - Aier Eye Hospital of Wuhan University, Wuhan, China
- Qing-yan Zeng
  - Aier School of Ophthalmology, Central South University, Changsha, China
  - Aier Eye Hospital of Wuhan University, Wuhan, China
  - Aier Cornea Institute, Beijing, China
  - Aier School of Ophthalmology and Optometry, Hubei University of Science and Technology, Xianning, China
|
24
|
Zhao HC, Chen CZ, Tian YZ, Song HQ, Wang XX, Li YJ, He JF, Zhao HL. CD168+ macrophages promote hepatocellular carcinoma tumor stemness and progression through TOP2A/β-catenin/YAP1 axis. iScience 2023; 26:106862. [PMID: 37275516 PMCID: PMC10238939 DOI: 10.1016/j.isci.2023.106862] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/10/2022] [Revised: 03/20/2023] [Accepted: 05/08/2023] [Indexed: 06/07/2023] Open
Abstract
Liver cancer stem-like cells (LCSCs) are the main cause of heterogeneity and poor prognosis in hepatocellular carcinoma (HCC). In this study, we aimed to explore the origin of LCSCs and the role of the TOP2A/β-catenin/YAP1 axis in tumor stemness and progression. Using single-cell RNA-seq analysis, we identified TOP2A+CENPF+ LCSCs, which were mainly regulated by CD168+ M2-like macrophages. Furthermore, spatial location analysis and fluorescent staining confirmed that LCSCs were enriched at tumor margins, constituting the spatial heterogeneity of HCC. Mechanistically, TOP2A competitively binds to β-catenin, leading to disassociation of β-catenin from YAP1, promoting HCC stemness and overgrowth. Our study provides valuable insights into the spatial transcriptome heterogeneity of the HCC microenvironment and the critical role of TOP2A/β-catenin/YAP1 axis in HCC stemness and progression.
Affiliation(s)
- Hai-Chao Zhao
  - Third Hospital of Shanxi Medical University, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China
- Chang-Zhou Chen
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China
- Yan-Zhang Tian
  - Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
- Huang-Qin Song
  - Third Hospital of Shanxi Medical University, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
- Xiao-Xiao Wang
  - Third Hospital of Shanxi Medical University, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
- Yan-Jun Li
  - Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
- Jie-Feng He
  - Third Hospital of Shanxi Medical University, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
  - Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
- Hao-Liang Zhao
  - Third Hospital of Shanxi Medical University, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
  - Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Taiyuan 030032, China
|
25
|
Wang K, Cai B, Song Y, Chen Y, Zhang X. Somatosensory neuron types and their neural networks as revealed via single-cell transcriptomics. Trends Neurosci 2023:S0166-2236(23)00130-3. [PMID: 37268541 DOI: 10.1016/j.tins.2023.05.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 01/19/2023] [Revised: 04/24/2023] [Accepted: 05/06/2023] [Indexed: 06/04/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) has allowed profiling cell types of the dorsal root ganglia (DRG) and their transcriptional states in physiology and chronic pain. However, the evaluation criteria used in previous studies to classify DRG neurons varied, which presents difficulties in determining the various types of DRG neurons. In this review, we aim to integrate findings from previous transcriptomic studies of the DRG. We first briefly introduce the history of DRG-neuron cell-type profiling, and discuss the advantages and disadvantages of different scRNA-seq methods. We then examine the classification of DRG neurons based on single-cell profiling under physiological and pathological conditions. Finally, we propose further studies on the somatosensory system at the molecular, cellular, and neural network levels.
Affiliation(s)
- Kaikai Wang
  - Guangdong Institute of Intelligence Science and Technology, Hengqin 519031, Zhuhai, Guangdong, China; Research Unit of Pain Medicine, Chinese Academy of Medical Sciences, Hengqin, Zhuhai, China
- Bing Cai
  - Guangdong Institute of Intelligence Science and Technology, Hengqin 519031, Zhuhai, Guangdong, China; Research Unit of Pain Medicine, Chinese Academy of Medical Sciences, Hengqin, Zhuhai, China
- Yurang Song
  - Guangdong Institute of Intelligence Science and Technology, Hengqin 519031, Zhuhai, Guangdong, China; Research Unit of Pain Medicine, Chinese Academy of Medical Sciences, Hengqin, Zhuhai, China
- Yan Chen
  - Guangdong Institute of Intelligence Science and Technology, Hengqin 519031, Zhuhai, Guangdong, China; Research Unit of Pain Medicine, Chinese Academy of Medical Sciences, Hengqin, Zhuhai, China; Xuhui Central Hospital, Shanghai, 200031, China
- Xu Zhang
  - Guangdong Institute of Intelligence Science and Technology, Hengqin 519031, Zhuhai, Guangdong, China; SIMR Joint Lab of Drug Innovation, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, 201210, China; Research Unit of Pain Medicine, Chinese Academy of Medical Sciences, Hengqin, Zhuhai, China; Xuhui Central Hospital, Shanghai, 200031, China
|
26
|
Xu J, Zhang A, Liu F, Chen L, Zhang X. CIForm as a Transformer-based model for cell-type annotation of large-scale single-cell RNA-seq data. Brief Bioinform 2023:7169137. [PMID: 37200157 DOI: 10.1093/bib/bbad195] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/28/2023] [Revised: 04/03/2023] [Accepted: 04/30/2023] [Indexed: 05/20/2023] Open
Abstract
Single-cell omics technologies have made it possible to analyze the individual cells within a biological sample, providing a more detailed understanding of biological systems. Accurately determining the cell type of each cell is a crucial goal in single-cell RNA-seq (scRNA-seq) analysis. Apart from overcoming the batch effects arising from various factors, single-cell annotation methods also face the challenge of effectively processing large-scale datasets. With the increasing availability of scRNA-seq datasets, integrating multiple datasets and addressing batch effects originating from diverse sources are also challenges in cell-type annotation. In this work, to overcome these challenges, we developed a supervised method called CIForm, based on the Transformer, for cell-type annotation of large-scale scRNA-seq data. To assess the effectiveness and robustness of CIForm, we compared it with some leading tools on benchmark datasets. Through systematic comparisons under various cell-type annotation scenarios, we show that the effectiveness of CIForm is particularly pronounced in cell-type annotation. The source code and data are available at https://github.com/zhanglab-wbgcas/CIForm.
Affiliation(s)
- Jing Xu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Aidi Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Fang Liu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Liang Chen
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Xiujun Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
|
27
|
Yan D, Sun Z, Fang J, Cao S, Wang W, Chang X, Badirli S, Fu H, Liu Y. scRAA: the development of a robust and automatic annotation procedure for single-cell RNA sequencing data. J Biopharm Stat 2023:1-14. [PMID: 37162278 DOI: 10.1080/10543406.2023.2208671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
A critical task in single-cell RNA sequencing (scRNA-Seq) data analysis is to identify cell types from heterogeneous tissues. While the majority of classification methods have demonstrated high performance in scRNA-Seq annotation problems, a robust and accurate solution is desired to generate reliable outcomes for downstream analyses, for instance, marker gene identification, differential expression analysis, and pathway analysis. It is hard to establish a universally good metric, and thus a universally good classification method for all kinds of scenarios does not exist. In addition, reference and query data in cell classification usually come from different experimental batches, and failure to consider batch effects may result in misleading conclusions. To overcome this bottleneck, we propose a robust ensemble approach to classify cells and utilize a batch correction method between reference and query data. We simulated four scenarios that range from simple to complex batch effects and account for varying cell-type proportions. We further tested our approach on both lung and pancreas data. We found improved prediction accuracy and robust performance across simulation scenarios and real data. The incorporation of batch effect correction between reference and query, together with the ensemble approach, improves cell-type prediction accuracy while maintaining robustness. We demonstrated these through simulated and real scRNA-Seq data.
Affiliation(s)
- Dongyan Yan
  - Global Statistical Science, Eli Lilly & Co, Indianapolis, Indiana, USA
- Zhe Sun
  - Global Statistical Science, Eli Lilly & Co, Indianapolis, Indiana, USA
- Jiyuan Fang
  - Global Statistical Science, Eli Lilly & Co, Indianapolis, Indiana, USA
- Shanshan Cao
  - Global Statistical Science, Eli Lilly & Co, Indianapolis, Indiana, USA
- Wenjie Wang
  - Advance Analytics and Data Science, Eli Lilly & Co, Indianapolis, Indiana, USA
- Xinyue Chang
  - Advance Analytics and Data Science, Eli Lilly & Co, Indianapolis, Indiana, USA
- Sarkhan Badirli
  - Advance Analytics and Data Science, Eli Lilly & Co, Indianapolis, Indiana, USA
- Haoda Fu
  - Advance Analytics and Data Science, Eli Lilly & Co, Indianapolis, Indiana, USA
- Yushi Liu
  - Global Statistical Science, Eli Lilly & Co, Indianapolis, Indiana, USA
|
28
|
Li M, Song J, Yin P, Chen H, Wang Y, Xu C, Jiang F, Wang H, Han B, Du X, Wang W, Li G, Zhong D. Single-cell analysis reveals novel clonally expanded monocytes associated with IL1β-IL1R2 pair in acute inflammatory demyelinating polyneuropathy. Sci Rep 2023; 13:5862. [PMID: 37041166 PMCID: PMC10088807 DOI: 10.1038/s41598-023-32427-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/04/2023] [Accepted: 03/27/2023] [Indexed: 04/13/2023] Open
Abstract
Guillain-Barré syndrome (GBS) is an autoimmune disorder wherein the composition and gene expression patterns of peripheral blood immune cells change significantly. It is triggered by antigens with similar epitopes to Schwann cells that stimulate a maladaptive immune response against peripheral nerves. However, an atlas for peripheral blood immune cells in patients with GBS has not yet been constructed. This is a monocentric, prospective study. We enrolled 5 acute inflammatory demyelinating polyneuropathy (AIDP) patients and 3 healthy controls hospitalized in the First Affiliated Hospital of Harbin Medical University from December 2020 to May 2021; 3 AIDP patients were in the peak stage and 2 were in the convalescent stage. We performed single-cell RNA sequencing (scRNA-seq) of peripheral blood mononuclear cells (PBMCs) from these patients. Furthermore, we performed cell clustering, cell annotation, cell-cell communication analysis, differentially expressed gene (DEG) identification and pseudotime trajectory analysis. Our study identified a novel clonally expanded CD14+ CD163+ monocyte subtype in the peripheral blood of patients with AIDP, and it was enriched in cellular response to IL1 and chemokine signaling pathways. Furthermore, we observed increased IL1β-IL1R2 cell-cell communication between CD14+ and CD16+ monocytes. In short, by analyzing the single-cell landscape of the PBMCs in patients with AIDP, we hope to widen our understanding of the composition of peripheral immune cells in patients with GBS and provide a theoretical basis for future studies.
Collapse
Affiliation(s)
- Meng Li
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Jihe Song
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Pengqi Yin
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Hongping Chen
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Yingju Wang
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Chen Xu
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Fangchao Jiang
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Haining Wang
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Baichao Han
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Xinshu Du
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Wei Wang
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Guozhong Li
- Department of Neurology, Heilongjiang Provincial Hospital, Harbin, 150081, Heilongjiang, China.
| | - Di Zhong
- Department of Neurology, First Affiliated Hospital, Harbin Medical University, Harbin, 150081, Heilongjiang, China.
| |
|
29
|
Christensen E, Luo P, Turinsky A, Husić M, Mahalanabis A, Naidas A, Diaz-Mejia JJ, Brudno M, Pugh T, Ramani A, Shooshtari P. Evaluation of single-cell RNAseq labelling algorithms using cancer datasets. Brief Bioinform 2022; 24:6965910. [PMID: 36585784 PMCID: PMC9851326 DOI: 10.1093/bib/bbac561] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/02/2022] [Revised: 09/19/2022] [Accepted: 11/01/2022] [Indexed: 01/01/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) clustering and labelling methods are used to determine precise cellular composition of tissue samples. Automated labelling methods rely on either unsupervised, cluster-based approaches or supervised, cell-based approaches to identify cell types. The high complexity of cancer poses a unique challenge, as tumor microenvironments are often composed of diverse cell subpopulations with unique functional effects that may lead to disease progression, metastasis and treatment resistance. Here, we assess 17 cell-based and 9 cluster-based scRNA-seq labelling algorithms using 8 cancer datasets, providing a comprehensive large-scale assessment of such methods in a cancer-specific context. Using several performance metrics, we show that cell-based methods generally achieved higher performance and were faster compared to cluster-based methods. Cluster-based methods more successfully labelled non-malignant cell types, likely because of a lack of gene signatures for relevant malignant cell subpopulations. Larger cell numbers present in some cell types in training data positively impacted prediction scores for cell-based methods. Finally, we examined which methods performed favorably when trained and tested on separate patient cohorts in scenarios similar to clinical applications, and which were able to accurately label particularly small or under-represented cell populations in the given datasets. We conclude that scPred and SVM show the best overall performances with cancer-specific data and provide further suggestions for algorithm selection. Our analysis pipeline for assessing the performance of cell type labelling algorithms is available in https://github.com/shooshtarilab/scRNAseq-Automated-Cell-Type-Labelling.
Affiliation(s)
| | | | - Andrei Turinsky
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Mia Husić
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Alaina Mahalanabis
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Alaine Naidas
- Children’s Health Research Institute, Lawson Research Institute, London, ON, Canada
- Department of Pathology and Lab Medicine, University of Western Ontario, London, ON, Canada
| | | | - Michael Brudno
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
| | - Trevor Pugh
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
| | - Arun Ramani
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Parisa Shooshtari
- Corresponding author: Parisa Shooshtari, Department of Pathology and Lab Medicine, University of Western Ontario, London, ON, Canada. Tel.: +1 (519) 685-8500 x55427. E-mail:
| |
|
30
|
Yang F, Wang W, Wang F, Fang Y, Tang D, Huang J, Lu H, Yao J. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00534-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
31
|
Zhou Y, Peng M, Yang B, Tong T, Zhang B, Tang N. scDLC: a deep learning framework to classify large sample single-cell RNA-seq data. BMC Genomics 2022; 23:504. [PMID: 35831808 PMCID: PMC9281153 DOI: 10.1186/s12864-022-08715-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/03/2021] [Accepted: 06/21/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Using single-cell RNA sequencing (scRNA-seq) data to diagnose disease is an effective technique in medical research. Several statistical methods have been developed for the classification of RNA sequencing (RNA-seq) data, including, for example, Poisson linear discriminant analysis (PLDA), negative binomial linear discriminant analysis (NBLDA), and zero-inflated Poisson logistic discriminant analysis (ZIPLDA). Nevertheless, few existing methods perform well for large sample scRNA-seq data, in particular when the distribution assumption is also violated. Results: We propose a deep learning classifier (scDLC) for large sample scRNA-seq data, based on long short-term memory recurrent neural networks (LSTMs). Our new scDLC does not require prior knowledge of the data distribution; instead, it takes into account the dependency of the most outstanding feature genes in the LSTM model. An LSTM is a special recurrent neural network that can learn long-term dependencies of a sequence. Conclusions: Simulation studies show that our new scDLC performs consistently better than the existing methods in a wide range of settings with large sample sizes. Four real scRNA-seq datasets are also analyzed, and the results agree with the simulations in that our new scDLC always performs the best. The code named “scDLC” is publicly available at https://github.com/scDLC-code/code. Supplementary Information: The online version contains supplementary material available at (10.1186/s12864-022-08715-1).
Affiliation(s)
- Yan Zhou
- College of Mathematics and Statistics, Institute of Statistical Sciences, Shenzhen Key Laboratory of Advanced Machine Learning and Applications, Shenzhen University, Shenzhen, China
| | - Minjiao Peng
- College of Mathematics and Statistics, Institute of Statistical Sciences, Shenzhen Key Laboratory of Advanced Machine Learning and Applications, Shenzhen University, Shenzhen, China
| | - Bin Yang
- College of Mathematics and Statistics, Institute of Statistical Sciences, Shenzhen Key Laboratory of Advanced Machine Learning and Applications, Shenzhen University, Shenzhen, China
| | - Tiejun Tong
- Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong
| | - Baoxue Zhang
- School of Statistics, Capital University of Economics and Business, Beijing, China
| | - Niansheng Tang
- Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, China.
| |
|
32
|
Wang Z, Zhao Z, Xia Y, Cai Z, Wang C, Shen Y, Liu R, Qin H, Jia J, Yuan G. Potential biomarkers in the fibrosis progression of nonalcoholic steatohepatitis (NASH). J Endocrinol Invest 2022; 45:1379-1392. [PMID: 35226336 DOI: 10.1007/s40618-022-01773-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/14/2021] [Accepted: 02/17/2022] [Indexed: 12/13/2022]
Abstract
PURPOSE: Fibrosis is the only histological feature reflecting the severity and prognosis of nonalcoholic steatohepatitis (NASH). We aim to explore novel genes associated with fibrosis progression in NASH. METHODS: Two human RNA-seq datasets were downloaded from the public database. Weighted gene co-expression network analysis (WGCNA) was used to identify their co-expressed modules, and further bioinformatics analysis was performed to identify hub genes within the modules. Finally, based on two single-cell RNA-seq datasets from mice and one microarray dataset from humans, we further observed the expression of hub genes in different cell clusters and liver tissues. RESULTS: Seven hub genes (SPP1, PROM1, SOX9, EPCAM, THY1, CD34 and MCAM) associated with fibrosis progression were identified. Single-cell RNA-seq analysis revealed that those hub genes were expressed by different cell clusters such as cholangiocytes, natural killer (NK) cells, and hepatic stellate cells (HSCs). We also found that SPP1 and CD34 serve as markers of different HSC clusters, which are associated with inflammatory response and fibrogenesis, respectively. Further study suggested that SPP1, SOX9, MCAM and THY1 might be related to NASH-associated hepatocellular carcinoma (HCC). Receiver operating characteristic (ROC) analysis showed that the high expression of these genes could well predict the occurrence of HCC. At the same time, there were significant differences in metabolism-related pathway changes between different HCC subtypes, and SOX9 may be involved in these changes. CONCLUSIONS: The present study identified novel genes associated with NASH fibrosis and explored their effects on fibrosis from a single-cell perspective, which might provide new ideas for the early diagnosis, monitoring, evaluation, and prediction of fibrosis progression in NASH.
Affiliation(s)
- Z Wang
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China
| | - Z Zhao
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China
| | - Y Xia
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China
| | - Z Cai
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China
| | - C Wang
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China
| | - Y Shen
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China
| | - R Liu
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China
| | - H Qin
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China
| | - J Jia
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China.
| | - G Yuan
- Department of Endocrinology, Affiliated Hospital of Jiangsu University, Zhenjiang, 212001, Jiangsu, China.
| |
|
33
|
Caligola S, De Sanctis F, Canè S, Ugel S. Breaking the Immune Complexity of the Tumor Microenvironment Using Single-Cell Technologies. Front Genet 2022; 13:867880. [PMID: 35651929 PMCID: PMC9149246 DOI: 10.3389/fgene.2022.867880] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/01/2022] [Accepted: 04/27/2022] [Indexed: 12/31/2022] Open
Abstract
Tumors are not a simple aggregate of transformed cells but rather a complicated ecosystem containing various components, including infiltrating immune cells, tumor-related stromal cells, endothelial cells, soluble factors, and extracellular matrix proteins. Profiling the immune contexture of this intricate framework is now mandatory to develop more effective cancer therapies and precise immunotherapeutic approaches by identifying exact targets or predictive biomarkers, respectively. Conventional technologies are limited in reaching this goal because they lack high resolution. Recent developments in single-cell technologies, such as single-cell RNA transcriptomics, mass cytometry, and multiparameter immunofluorescence, have revolutionized the cancer immunology field, capturing the heterogeneity of tumor-infiltrating immune cells and the dynamic complexity of tenets that regulate cell networks in the tumor microenvironment. In this review, we describe some of the current single-cell technologies and computational techniques applied for immune-profiling the cancer landscape and discuss future directions of how integrating multi-omics data can guide a new "precision oncology" advancement.
Affiliation(s)
| | | | | | - Stefano Ugel
- Immunology Section, Department of Medicine, University of Verona, Verona, Italy
| |
|
34
|
Zhang Y, Zhang F, Wang Z, Wu S, Tian W. scMAGIC: accurately annotating single cells using two rounds of reference-based classification. Nucleic Acids Res 2022; 50:e43. [PMID: 34986249 PMCID: PMC9071478 DOI: 10.1093/nar/gkab1275] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/20/2021] [Revised: 11/08/2021] [Accepted: 12/14/2021] [Indexed: 11/21/2022] Open
Abstract
Here, we introduce scMAGIC (Single Cell annotation using MArker Genes Identification and two rounds of reference-based Classification [RBC]), a novel method that uses well-annotated single-cell RNA sequencing (scRNA-seq) data as the reference to assist in the classification of query scRNA-seq data. A key innovation in scMAGIC is the introduction of a second-round RBC in which those query cells whose cell identities are confidently validated in the first round are used as a new reference to again classify query cells, therefore eliminating the batch effects between the reference and the query data. scMAGIC significantly outperforms 13 competing RBC methods with their optimal parameter settings across 86 benchmark tests, especially when the cell types in the query dataset are not completely covered by the reference dataset and when there exist significant batch effects between the reference and the query datasets. Moreover, when no reference dataset is available, scMAGIC can annotate query cells with reasonably high accuracy by using an atlas dataset as the reference.
Affiliation(s)
- Yu Zhang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai 200438, P.R. China
| | - Feng Zhang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai 200438, P.R. China
- Department of Histoembryology, Genetics and Developmental Biology, Shanghai Key Laboratory of Reproductive Medicine, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
| | - Zekun Wang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai 200438, P.R. China
| | - Siyi Wu
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai 200438, P.R. China
| | - Weidong Tian
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai 200438, P.R. China
- Qilu Children's Hospital of Shandong University, No 23976 Jingshi Road, Jinan, Shandong, China
- Children’s Hospital of Fudan University, Shanghai 201102, China
| |
|
35
|
CASSL: A cell-type annotation method for single cell transcriptomics data using semi-supervised learning. Appl Intell 2022. [DOI: 10.1007/s10489-022-03440-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
36
|
Chen C, Chen Y, Jin X, Ding Y, Jiang J, Wang H, Yang Y, Lin W, Chen X, Huang Y, Teng L. Identification of Tumor Mutation Burden, Microsatellite Instability, and Somatic Copy Number Alteration Derived Nine Gene Signatures to Predict Clinical Outcomes in STAD. Front Mol Biosci 2022; 9:793403. [PMID: 35480879 PMCID: PMC9037630 DOI: 10.3389/fmolb.2022.793403] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/12/2021] [Accepted: 03/14/2022] [Indexed: 12/11/2022] Open
Abstract
Genomic features, including tumor mutation burden (TMB), microsatellite instability (MSI), and somatic copy number alteration (SCNA), have been demonstrated to be involved in the tumor microenvironment (TME) and outcome of gastric cancer (GC). We obtained profiles of TMB, MSI, and SCNA by processing data from 405 GC cases in The Cancer Genome Atlas (TCGA) and then conducted a comprehensive analysis through “iClusterPlus.” Two subgroups were generated, with distinct prognosis, somatic mutation burden, copy number changes, and immune landscape. We revealed that Cluster1 was marked by a better prognosis, accompanied by higher TMB, MSIsensor score, and TMEscore, and a lower SCNA burden. Based on these clusters, we screened 196 differentially expressed genes (DEGs), which were subsequently subjected to univariate Cox survival analysis. We constructed a 9-gene immune risk score (IRS) model using LASSO-penalized logistic regression. Moreover, the prognostic prediction of the IRS was verified by receiver operating characteristic (ROC) curve analysis and a nomogram plot. An independent Gene Expression Omnibus (GEO) dataset containing specimens from 109 GC patients was used as an external validation. Our work suggests that the 9-gene-signature prediction model, which was derived from TMB, MSI, and SCNA, is a promising predictive tool for clinical outcomes in GC patients. This novel methodology may help clinicians uncover the underlying mechanisms and guide future treatment strategies.
Affiliation(s)
- Chuanzhi Chen
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Yi Chen
- Department of Oncology-Pathology, Karolinska Institute, Solna, Sweden
| | - Xin Jin
- Department of Breast Surgery, Zhuji Affiliated Hospital of Shaoxing University, Zhuji, China
| | - Yongfeng Ding
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Junjie Jiang
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Haohao Wang
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Yan Yang
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Wu Lin
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Xiangliu Chen
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Yingying Huang
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Lisong Teng
- Department of Surgical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- *Correspondence: Lisong Teng,
| |
|
37
|
Deng Y, Choi J, Lê Cao KA. Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references. Brief Bioinform 2022; 23:6561437. [PMID: 35362513 PMCID: PMC9155616 DOI: 10.1093/bib/bbac088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2021] [Revised: 02/18/2022] [Accepted: 02/21/2022] [Indexed: 11/23/2022] Open
Abstract
Characterizing the molecular identity of a cell is an essential step in single-cell RNA sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single-cell reference atlases. However, many challenges remain, including correcting for inherent batch effects between reference and query data and insufficient phenotype data from the reference. One solution is to project single-cell data onto established bulk reference atlases to leverage their rich phenotype information. Sincast is a computational framework to query scRNA-seq data by projection onto bulk reference atlases. Prior to projection, single-cell data are transformed to be directly comparable to bulk data, either with pseudo-bulk aggregation or graph-based imputation to address sparse single-cell expression profiles. Sincast avoids batch effect correction, and cell identity is predicted along a continuum to highlight new cell states not found in the reference atlas. In several case study scenarios, we show that Sincast projects single cells into the correct biological niches in the expression space of the bulk reference atlas. We demonstrate the effectiveness of our imputation approach that was specifically developed for querying scRNA-seq data based on bulk reference atlases. We show that Sincast is an efficient and powerful tool for single-cell profiling that will facilitate downstream analysis of scRNA-seq data.
Affiliation(s)
- Yidi Deng
- Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Parkville, 3010, VIC, Australia; Centre for Stem Cell Systems, School of Biomedical Sciences, The University of Melbourne, Parkville, 3010, VIC, Country
| | - Jarny Choi
- Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Parkville, 3010, VIC, Australia
| | - Kim-Anh Lê Cao
- Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Parkville, 3010, VIC, Australia
| |
|
38
|
Sun X, Lin X, Li Z, Wu H. A comprehensive comparison of supervised and unsupervised methods for cell type identification in single-cell RNA-seq. Brief Bioinform 2022; 23:6502554. [PMID: 35021202 PMCID: PMC8921620 DOI: 10.1093/bib/bbab567] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 09/15/2021] [Revised: 11/19/2021] [Accepted: 12/11/2021] [Indexed: 01/26/2023] Open
Abstract
The cell type identification is among the most important tasks in single-cell RNA-sequencing (scRNA-seq) analysis. Many in silico methods have been developed and can be roughly categorized as either supervised or unsupervised. In this study, we investigated the performances of 8 supervised and 10 unsupervised cell type identification methods using 14 public scRNA-seq datasets of different tissues, sequencing protocols and species. We investigated the impacts of a number of factors, including total amount of cells, number of cell types, sequencing depth, batch effects, reference bias, cell population imbalance, unknown/novel cell type, and computational efficiency and scalability. Instead of merely comparing individual methods, we focused on factors' impacts on the general category of supervised and unsupervised methods. We found that in most scenarios, the supervised methods outperformed the unsupervised methods, except for the identification of unknown cell types. This is particularly true when the supervised methods use a reference dataset with high informational sufficiency, low complexity and high similarity to the query dataset. However, such outperformance could be undermined by some undesired dataset properties investigated in this study, which lead to uninformative and biased reference datasets. In these scenarios, unsupervised methods could be comparable to supervised methods. Our study not only explained the cell typing methods' behaviors under different experimental settings but also provided a general guideline for the choice of method according to the scientific goal and dataset properties. Finally, our evaluation workflow is implemented as a modularized R pipeline that allows future evaluation of new methods. Availability: All the source codes are available at https://github.com/xsun28/scRNAIdent.
Affiliation(s)
- Xiaobo Sun
- Department of Statistics, School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, Hubei, China
| | - Xiaochu Lin
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| | - Ziyi Li
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Hao Wu
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| |
|
39
|
Tang H, Yu X, Liu R, Zeng T. Vec2image: an explainable artificial intelligence model for the feature representation and classification of high-dimensional biological data by vector-to-image conversion. Brief Bioinform 2022; 23:6518046. [PMID: 35106553 PMCID: PMC8921615 DOI: 10.1093/bib/bbab584] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/14/2021] [Revised: 12/06/2021] [Accepted: 12/20/2021] [Indexed: 01/05/2023] Open
Abstract
Feature representation and discriminative learning are proven models and technologies in artificial intelligence fields; however, major challenges for machine learning on large biological datasets are learning an effective model with mechanistical explanation on the model determination and prediction. To satisfy such demands, we developed Vec2image, an explainable convolutional neural network framework for characterizing the feature engineering, feature selection and classifier training that is mainly based on the collaboration of principal component coordinate conversion, deep residual neural networks and embedded k-nearest neighbor representation on pseudo images of high-dimensional biological data, where the pseudo images represent feature measurements and feature associations simultaneously. Vec2image has achieved better performance compared with other popular methods and illustrated its efficiency on feature selection in cell marker identification from tissue-specific single-cell datasets. In particular, in a case study on type 2 diabetes (T2D) by multiple human islet scRNA-seq datasets, Vec2image first displayed robust performance on T2D classification model building across different datasets, then a specific Vec2image model was trained to accurately recognize the cell state and efficiently rank feature genes relevant to T2D which uncovered potential T2D cellular pathogenesis; and next the cell activity changes, cell composition imbalances and cell–cell communication dysfunctions were associated to our finding T2D feature genes from both population-shared and individual-specific perspectives. Collectively, Vec2image is a new and efficient explainable artificial intelligence methodology that can be widely applied in human-readable classification and prediction on the basis of pseudo image representation of biological deep sequencing data.
Affiliation(s)
- Hui Tang
- School of Mathematics, South China University of Technology, Guangzhou, 510640, China
| | - Xiangtian Yu
- Clinical Research Center, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
| | - Rui Liu
- School of Mathematics, South China University of Technology, Guangzhou, 510640, China; Pazhou Lab, Guangzhou 510330, China
| | - Tao Zeng
- Guangzhou Laboratory, Guangzhou, China; Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| |
|
40
|
Grisanti Canozo FJ, Zuo Z, Martin JF, Samee MAH. Cell-type modeling in spatial transcriptomics data elucidates spatially variable colocalization and communication between cell-types in mouse brain. Cell Syst 2022; 13:58-70.e5. [PMID: 34626538 PMCID: PMC8776574 DOI: 10.1016/j.cels.2021.09.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/31/2021] [Revised: 08/06/2021] [Accepted: 09/10/2021] [Indexed: 01/21/2023]
Abstract
Single-cell spatial transcriptomics (sc-ST) holds the promise to elucidate architectural aspects of complex tissues. Such analyses require modeling cell types in sc-ST datasets through their integration with single-cell RNA-seq datasets. However, this integration is nontrivial since the two technologies differ widely in the number of profiled genes, and the datasets often do not share many marker genes for given cell types. We developed a neural network model, spatial transcriptomics cell-types assignment using neural networks (STANN), to overcome these challenges. Analysis of STANN's predicted cell types in mouse olfactory bulb (MOB) sc-ST data delineated MOB architecture beyond its morphological layer-based conventional description. We find that cell-type proportions remain consistent within individual morphological layers but vary significantly between layers. Notably, even within a layer, cellular colocalization patterns and intercellular communication mechanisms show high spatial variations. These observations imply a refinement of major cell types into subtypes characterized by spatially localized gene regulatory networks and receptor-ligand usage.
Affiliation(s)
| | - Zhen Zuo
- Baylor College of Medicine, Houston, TX 77030, USA
| | - James F Martin
- Baylor College of Medicine, Houston, TX 77030, USA; Texas Heart Institute, Houston, TX 77030, USA
| | | |
|
41
|
Ascensión AM, Araúzo-Bravo MJ, Izeta A. Challenges and Opportunities for the Translation of Single-Cell RNA Sequencing Technologies to Dermatology. Life (Basel) 2022; 12:67. [PMID: 35054460 PMCID: PMC8781146 DOI: 10.3390/life12010067] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/30/2021] [Revised: 12/21/2021] [Accepted: 12/28/2021] [Indexed: 12/19/2022] Open
Abstract
Skin is a complex and heterogeneous organ at the cellular level. This complexity is beginning to be understood through the application of single-cell genomics and computational tools. A large number of datasets that shed light on how the different human skin cell types interact in homeostasis (and what ceases to work in diverse dermatological diseases) have been generated and are publicly available. However, translation of these novel aspects to the clinic is lacking. This review aims to summarize the state of the art of skin biology using single-cell technologies, with a special focus on skin pathologies and the translation of mechanistic findings to the clinic. The main implications of this review are to summarize the benefits and limitations of single-cell analysis and thus help translate the emerging insights from these novel techniques to the bedside.
Affiliation(s)
- Alex M. Ascensión
- Tissue Engineering Group, Biodonostia Health Research Institute, 20014 Donostia-San Sebastián, Spain;
- Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, 20014 Donostia-San Sebastián, Spain;
| | - Marcos J. Araúzo-Bravo
- Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, 20014 Donostia-San Sebastián, Spain;
- Max Planck Institute for Molecular Biomedicine, 48167 Muenster, Germany
- IKERBASQUE, Basque Foundation for Science, 48012 Bilbao, Spain
| | - Ander Izeta
- Tissue Engineering Group, Biodonostia Health Research Institute, 20014 Donostia-San Sebastián, Spain;
- School of Engineering, Tecnun-University of Navarra, 20009 Donostia-San Sebastián, Spain
| |
|
42
|
Yin Q, Wang Y, Guan J, Ji G. scIAE: an integrative autoencoder-based ensemble classification framework for single-cell RNA-seq data. Brief Bioinform 2021; 23:6463428. [PMID: 34913057 DOI: 10.1093/bib/bbab508] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/15/2021] [Revised: 10/28/2021] [Accepted: 11/04/2021] [Indexed: 12/12/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) allows quantitative analysis of gene expression at the level of single cells, beneficial to study cell heterogeneity. The recognition of cell types facilitates the construction of cell atlas in complex tissues or organisms, which is the basis of almost all downstream scRNA-seq data analyses. Using disease-related scRNA-seq data to perform the prediction of disease status can facilitate the specific diagnosis and personalized treatment of disease. Since single-cell gene expression data are high-dimensional and sparse with dropouts, we propose scIAE, an integrative autoencoder-based ensemble classification framework, to firstly perform multiple random projections and apply integrative and devisable autoencoders (integrating stacked, denoising and sparse autoencoders) to obtain compressed representations. Then base classifiers are built on the lower-dimensional representations and the predictions from all base models are integrated. The comparison of scIAE and common feature extraction methods shows that scIAE is effective and robust, independent of the choice of dimension, which is beneficial to subsequent cell classification. By testing scIAE on different types of data and comparing it with existing general and single-cell-specific classification methods, it is proven that scIAE has a great classification power in cell type annotation intradataset, across batches, across platforms and across species, and also disease status prediction. The architecture of scIAE is flexible and devisable, and it is available at https://github.com/JGuan-lab/scIAE.
Affiliation(s)
- Qingyang Yin
- Department of Automation, Xiamen University, Xiamen, Fujian 361102, China; Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California 90089, USA
| | - Yang Wang
- Department of Automation, Xiamen University, Xiamen, Fujian 361102, China
| | - Jinting Guan
- Department of Automation, Xiamen University, Xiamen, Fujian 361102, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
| | - Guoli Ji
- Department of Automation, Xiamen University, Xiamen, Fujian 361102, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
| |
|
43
|
Mädler SC, Julien-Laferriere A, Wyss L, Phan M, Sonrel A, Kang ASW, Ulrich E, Schmucki R, Zhang JD, Ebeling M, Badi L, Kam-Thong T, Schwalie PC, Hatje K. Besca, a single-cell transcriptomics analysis toolkit to accelerate translational research. NAR Genom Bioinform 2021; 3:lqab102. [PMID: 34761219 PMCID: PMC8573822 DOI: 10.1093/nargab/lqab102] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/30/2021] [Revised: 10/08/2021] [Accepted: 10/12/2021] [Indexed: 02/07/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) revolutionized our understanding of disease biology. The promise it presents to also transform translational research requires highly standardized and robust software workflows. Here, we present the toolkit Besca, which streamlines scRNA-seq analyses and their use to deconvolute bulk RNA-seq data according to current best practices. Beyond a standard workflow covering quality control, filtering, and clustering, two complementary Besca modules, utilizing hierarchical cell signatures and supervised machine learning, automate cell annotation and provide harmonized nomenclatures. Subsequently, the gene expression profiles can be employed to estimate cell type proportions in bulk transcriptomics data. Using multiple, diverse scRNA-seq datasets, some stemming from highly heterogeneous tumor tissue, we show how Besca aids acceleration, interoperability, reusability and interpretability of scRNA-seq data analyses, meeting crucial demands in translational research and beyond.
Affiliation(s)
- Sophia Clara Mädler
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Alice Julien-Laferriere
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Luis Wyss
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Miroslav Phan
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Anthony Sonrel
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Albert S W Kang
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Eric Ulrich
- Roche Pharma Research and Early Development, I2O Disease Translational Area, Roche Innovation Center Basel, Basel, Switzerland
| | - Roland Schmucki
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Jitao David Zhang
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Martin Ebeling
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Laura Badi
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Tony Kam-Thong
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Petra C Schwalie
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Klas Hatje
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| |
|
44
|
Abstract
Cell atlases are essential companions to the genome as they elucidate how genes are used in a cell type-specific manner or how the usage of genes changes over the lifetime of an organism. This review explores recent advances in whole-organism single-cell atlases, which enable understanding of cell heterogeneity and tissue and cell fate, both in health and disease. Here we provide an overview of recent efforts to build cell atlases across species and discuss the challenges that the field is currently facing. Moreover, we propose the concept of having a knowledgebase that can scale with the number of experiments and computational approaches and a new feedback loop for development and benchmarking of computational methods that includes contributions from the users. These two aspects are key for community efforts in single-cell biology that will help produce a comprehensive annotated map of cell types and states with unparalleled resolution.
Affiliation(s)
| | - Bruno Tojo
- Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisbon, Portugal
| | - Aaron McGeever
- Chan Zuckerberg Biohub, San Francisco, California 94103, USA;
| |
|
45
|
Wang T, Bai J, Nabavi S. Single-cell classification using graph convolutional networks. BMC Bioinformatics 2021; 22:364. [PMID: 34238220 PMCID: PMC8268184 DOI: 10.1186/s12859-021-04278-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/24/2020] [Accepted: 06/24/2021] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Analyzing single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the identification of cell types. With the availability of a huge amount of single cell sequencing data and discovering more and more cell types, classifying cells into known cell types has become a priority nowadays. Several methods have been introduced to classify cells utilizing gene expression data. However, incorporating biological gene interaction networks has been proved valuable in cell classification procedures. RESULTS In this study, we propose a multimodal end-to-end deep learning model, named sigGCN, for cell classification that combines a graph convolutional network (GCN) and a neural network to exploit gene interaction networks. We used standard classification metrics to evaluate the performance of the proposed method on the within-dataset classification and the cross-dataset classification. We compared the performance of the proposed method with those of the existing cell classification tools and traditional machine learning classification methods. CONCLUSIONS Results indicate that the proposed method outperforms other commonly used methods in terms of classification accuracy and F1 scores. This study shows that the integration of prior knowledge about gene interactions with gene expressions using GCN methodologies can extract effective features improving the performance of cell classification.
Affiliation(s)
- Tianyu Wang
- Computer Science and Engineering Department, University of Connecticut, Storrs, CT USA
| | - Jun Bai
- Computer Science and Engineering Department, University of Connecticut, Storrs, CT USA
| | - Sheida Nabavi
- Computer Science and Engineering Department, University of Connecticut, Storrs, CT USA
| |
|
46
|
Prasad S, Rankine A, Prasad T, Song P, Dokukin ME, Makarova N, Backman V, Sokolov I. Atomic Force Microscopy Detects the Difference in Cancer Cells of Different Neoplastic Aggressiveness via Machine Learning. Advanced NanoBiomed Research 2021. [DOI: 10.1002/anbr.202000116] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/23/2023] Open
Affiliation(s)
- Siona Prasad
- Department of Mechanical Engineering Tufts University Medford MA 02155 USA
- Department of Computer Science Harvard University Cambridge MA 02138 USA
| | - Alex Rankine
- Department of Mechanical Engineering Tufts University Medford MA 02155 USA
- Department of Computer Science Harvard University Cambridge MA 02138 USA
| | - Tarun Prasad
- Department of Mechanical Engineering Tufts University Medford MA 02155 USA
- Department of Computer Science Harvard University Cambridge MA 02138 USA
| | - Patrick Song
- Department of Mechanical Engineering Tufts University Medford MA 02155 USA
- Department of Computer Science Harvard University Cambridge MA 02138 USA
| | - Maxim E. Dokukin
- NanoScience Solutions, Inc Arlington VA 22203 USA
- Department of Information Technology and Electronics Sarov Physics and Technology Institute Sarov Russian Federation
- Institute of Nanoengineering in Electronics, Spintronics and Photonics National Research Nuclear University MEPhI Moscow Russian Federation
| | - Nadezda Makarova
- Department of Mechanical Engineering Tufts University Medford MA 02155 USA
| | - Vadim Backman
- Department of Biomedical Engineering Northwestern University Evanston IL 60208 USA
| | - Igor Sokolov
- Department of Mechanical Engineering Tufts University Medford MA 02155 USA
- Department of Biomedical Engineering Tufts University Medford MA 02155 USA
- Department of Physics Tufts University Medford MA 02155 USA
| |
|
47
|
Sánchez-Corrales YE, Pohle RVC, Castellano S, Giustacchini A. Taming Cell-to-Cell Heterogeneity in Acute Myeloid Leukaemia With Machine Learning. Front Oncol 2021; 11:666829. [PMID: 33996595 PMCID: PMC8117935 DOI: 10.3389/fonc.2021.666829] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/11/2021] [Accepted: 04/06/2021] [Indexed: 12/21/2022] Open
Abstract
Acute Myeloid Leukaemia (AML) is a phenotypically and genetically heterogeneous blood cancer characterised by very poor prognosis, with disease relapse being the primary cause of treatment failure. AML heterogeneity arises from different genetic and non-genetic sources, including its proposed hierarchical structure, with leukemic stem cells (LSCs) and progenitors giving origin to a variety of more mature leukemic subsets. Recent advances in single-cell molecular and phenotypic profiling have highlighted the intra- and inter-patient heterogeneous nature of AML, which has so far limited the success of cell-based immunotherapy approaches against single targets. Machine Learning (ML) can be uniquely used to find non-trivial patterns from high-dimensional datasets and identify rare sub-populations. Here we review some recent ML tools that, applied to single-cell data, could help disentangle cell heterogeneity in AML by identifying distinct core molecular signatures of leukemic cell subsets. We discuss the advantages and limitations of unsupervised and supervised ML approaches to cluster and classify cell populations in AML, for the identification of biomarkers and the design of personalised therapies.
Affiliation(s)
- Yara E. Sánchez-Corrales
- Genetics and Genomic Medicine Department, Great Ormond Street Institute of Child Health, University College London, London, United Kingdom
| | - Ruben V. C. Pohle
- Molecular and Cellular Immunology Section, Great Ormond Street Institute of Child Health, University College London, London, United Kingdom
| | - Sergi Castellano
- Genetics and Genomic Medicine Department, Great Ormond Street Institute of Child Health, University College London, London, United Kingdom
- University College London (UCL) Genomics, Great Ormond Street Institute of Child Health, University College London, London, United Kingdom
| | - Alice Giustacchini
- Molecular and Cellular Immunology Section, Great Ormond Street Institute of Child Health, University College London, London, United Kingdom
| |
|
48
|
Lin W, Fan J, Hu LF, Zhang Y, Ooi JD, Meng T, Jin P, Ding X, Peng LK, Song L, Tang R, Xiao Z, Ao X, Xiao XC, Zhou QL, Xiao P, Zhong Y. Single-cell analysis of angiotensin-converting enzyme II expression in human kidneys and bladders reveals a potential route of 2019 novel coronavirus infection. Chin Med J (Engl) 2021; 134:935-943. [PMID: 33879756 PMCID: PMC8078266 DOI: 10.1097/cm9.0000000000001439] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 11/16/2020] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Since 2019, a novel coronavirus named 2019 novel coronavirus (2019-nCoV) has emerged worldwide. Apart from fever and respiratory complications, acute kidney injury has been observed in a few patients with coronavirus disease 2019. Furthermore, according to recent findings, the virus has been detected in urine. Angiotensin-converting enzyme II (ACE2) has been proposed to serve as the receptor for the entry of 2019-nCoV, which is the same as that for the severe acute respiratory syndrome. This study aimed to investigate the possible cause of kidney damage and the potential route of 2019-nCoV infection in the urinary system. METHODS We used both published kidney and bladder cell atlas data and new independent kidney single-cell RNA sequencing data generated in-house to evaluate ACE2 gene expression in all cell types in healthy kidneys and bladders. The Pearson correlation coefficients between ACE2 and all other genes were first generated. Then, genes with r values larger than 0.1 and P values smaller than 0.01 were deemed significant co-expression genes with ACE2. RESULTS Our results showed the enriched expression of ACE2 in all subtypes of proximal tubule (PT) cells of the kidney. ACE2 expression was found in 5.12%, 5.80%, and 14.38% of the proximal convoluted tubule cells, PT cells, and proximal straight tubule cells, respectively, in three published kidney cell atlas datasets. In addition, ACE2 expression was also confirmed in 12.05%, 6.80%, and 10.20% of cells of the proximal convoluted tubule, PT, and proximal straight tubule, respectively, in our own two healthy kidney samples. For the analysis of public data from three bladder samples, ACE2 expression was low but detectable in bladder epithelial cells. Only 0.25% and 1.28% of intermediate cells and umbrella cells, respectively, had ACE2 expression. CONCLUSION This study has provided bioinformatics evidence of the potential route of 2019-nCoV infection in the urinary system.
Affiliation(s)
- Wei Lin
- Department of Pathology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Jue Fan
- Department of Bioinformatics and Data Science, Singleron Biotechnologies, Nanjing, Jiangsu 210032, China
| | - Long-Fei Hu
- Department of Bioinformatics and Data Science, Singleron Biotechnologies, Nanjing, Jiangsu 210032, China
| | - Yan Zhang
- Department of Bioinformatics and Data Science, Singleron Biotechnologies, Nanjing, Jiangsu 210032, China
| | - Joshua D. Ooi
- Centre for Inflammatory Diseases, Monash University Department of Medicine, Monash Medical Centre, Clayton, VIC 3168, Australia
| | - Ting Meng
- Department of Nephrology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Peng Jin
- Department of Organ Transplantation, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Xiang Ding
- Department of Organ Transplantation, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Long-Kai Peng
- Department of Kidney Transplantation, The Second Xiangya Hospital of Central South University, Changsha, Hunan 410011, China
| | - Lei Song
- Department of Kidney Transplantation, The Second Xiangya Hospital of Central South University, Changsha, Hunan 410011, China
| | - Rong Tang
- Department of Kidney Transplantation, The Second Xiangya Hospital of Central South University, Changsha, Hunan 410011, China
| | - Zhou Xiao
- Department of Nephrology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Xiang Ao
- Department of Nephrology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Xiang-Cheng Xiao
- Department of Nephrology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Qiao-Ling Zhou
- Department of Nephrology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Ping Xiao
- Department of Nephrology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Yong Zhong
- Department of Nephrology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| |
|
49
|
Huang Y, Zhang P. Evaluation of machine learning approaches for cell-type identification from single-cell transcriptomics data. Brief Bioinform 2021; 22:6145135. [PMID: 33611343 DOI: 10.1093/bib/bbab035] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/26/2020] [Revised: 01/22/2021] [Accepted: 01/22/2021] [Indexed: 11/14/2022] Open
Abstract
Single-cell transcriptomics is rapidly advancing our understanding of the cellular composition of complex tissues and organisms. A major limitation in most analysis pipelines is the reliance on manual annotations to determine cell identities, which is time-consuming and irreproducible, and canonical markers are sometimes lacking for certain cell types. There is a growing realization of the potential of machine learning models as a supervised classification approach that can significantly aid decision-making processes for cell-type identification. In this work, we performed a comprehensive and impartial evaluation of 10 machine learning models that automatically assign cell phenotypes. The performance of classification methods is estimated by using 20 publicly accessible single-cell RNA sequencing datasets with different sizes, technologies, species and levels of complexity. The performance of each model in within-dataset (intra-dataset) and across-dataset (inter-dataset) experiments is evaluated based on both classification accuracy and computation time. In addition, the sensitivity to the number of input features, different annotation levels and dataset complexity was also estimated. Results showed that most classifiers perform well on a variety of datasets with decreased accuracy for complex datasets, while the Linear Support Vector Machine (linear-SVM) and Logistic Regression classifier models have the best overall performance with remarkably fast computation time. Our work provides a guideline for researchers to select and apply suitable machine learning-based classification models in their analysis workflows and sheds some light on the potential direction of future improvement on automated cell phenotype classification tools based on the single-cell sequencing data.
Affiliation(s)
- Yixuan Huang
- George Washington University School of Business, Washington, DC, USA
| | - Peng Zhang
- Division of Immunotherapy and the Director of Bioinformatics Core at the Institute of Human Virology, University of Maryland School of Medicine, MD, USA
| |
|
50
|
Pasquini G, Rojo Arias JE, Schäfer P, Busskamp V. Automated methods for cell type annotation on scRNA-seq data. Comput Struct Biotechnol J 2021; 19:961-969. [PMID: 33613863 PMCID: PMC7873570 DOI: 10.1016/j.csbj.2021.01.015] [Citation(s) in RCA: 112] [Impact Index Per Article: 28.0] [Received: 10/30/2020] [Revised: 01/13/2021] [Accepted: 01/13/2021] [Indexed: 12/22/2022] Open
Abstract
The advent of single-cell sequencing started a new era of transcriptomic and genomic research, advancing our knowledge of the cellular heterogeneity and dynamics. Cell type annotation is a crucial step in analyzing single-cell RNA sequencing data, yet manual annotation is time-consuming and partially subjective. As an alternative, tools have been developed for automatic cell type identification. Different strategies have emerged to ultimately associate gene expression profiles of single cells with a cell type either by using curated marker gene databases, correlating reference expression data, or transferring labels by supervised classification. In this review, we present an overview of the available tools and the underlying approaches to perform automated cell type annotations on scRNA-seq data.
Affiliation(s)
- Giovanni Pasquini
- Technische Universität Dresden, Center for Molecular and Cellular Bioengineering (CMCB), Center for Regenerative Therapies Dresden (CRTD), Dresden 01307, Germany
- Universitäts-Augenklinik Bonn, University of Bonn, Department of Ophthalmology, Bonn 53127, Germany
| | - Jesus Eduardo Rojo Arias
- Wellcome-MRC Cambridge Stem Cell Institute, Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge, UK
| | - Patrick Schäfer
- Technische Universität Dresden, Center for Molecular and Cellular Bioengineering (CMCB), Center for Regenerative Therapies Dresden (CRTD), Dresden 01307, Germany
| | - Volker Busskamp
- Technische Universität Dresden, Center for Molecular and Cellular Bioengineering (CMCB), Center for Regenerative Therapies Dresden (CRTD), Dresden 01307, Germany
- Universitäts-Augenklinik Bonn, University of Bonn, Department of Ophthalmology, Bonn 53127, Germany
| |
|