1
Bavais J, Chevallier J, Spinelli L, van de Pavert S, Puthier D. SciGeneX: enhancing transcriptional analysis through gene module detection in single-cell and spatial transcriptomics data. NAR Genom Bioinform 2025; 7:lqaf043. [PMID: 40248490 PMCID: PMC12004220 DOI: 10.1093/nargab/lqaf043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2024] [Revised: 03/19/2025] [Accepted: 04/09/2025] [Indexed: 04/19/2025]
Abstract
The standard pipeline to analyze single-cell RNA-seq or spatial transcriptomics data focuses on a gene-centric approach that overlooks the collective behavior of genes. However, understanding cell populations necessitates recognizing intricate combinations of activated and repressed pathways. Therefore, a broader view of gene behavior offers more accurate insights into cellular heterogeneity in single-cell or spatial transcriptomics data. Here, we describe SciGeneX (Single-cell informative Gene eXplorer), an R package implementing a neighborhood analysis and a graph partitioning method to generate co-expression gene modules. These modules, whether shared or restricted to cell populations, collectively reflect cellular heterogeneity. Their combinations are able to highlight specific cell populations, even rare ones. SciGeneX uncovers rare and novel cell populations that were not observed before in human thymus spatial transcriptomics data. We show that SciGeneX outperforms existing methods on both artificial and experimental datasets. Overall, SciGeneX will aid in unravelling cellular and molecular diversity in single-cell and spatial transcriptomics studies.
Affiliation(s)
- Julie Bavais
- Aix-Marseille Univ, INSERM, TAGC, Turing Centre for Living systems, 13288 Marseille, France
- Aix-Marseille Univ, CNRS, INSERM, CIML, Turing Centre for Living systems, 13009 Marseille, France
- Jessica Chevallier
- Aix-Marseille Univ, INSERM, TAGC, Turing Centre for Living systems, 13288 Marseille, France
- Aix-Marseille Univ, CNRS, INSERM, CIML, Turing Centre for Living systems, 13009 Marseille, France
- Lionel Spinelli
- Aix-Marseille Univ, INSERM, TAGC, Turing Centre for Living systems, 13288 Marseille, France
- Aix-Marseille Univ, CNRS, INSERM, CIML, Turing Centre for Living systems, 13009 Marseille, France
- Serge A van de Pavert
- Aix-Marseille Univ, CNRS, INSERM, CIML, Turing Centre for Living systems, 13009 Marseille, France
- Denis Puthier
- Aix-Marseille Univ, INSERM, TAGC, MarMaRa Institute, Turing Centre for Living systems, Transcriptomics and Genomics Marseille Luminy (TGML), 13288 Marseille, France
2
M A Basher AR, Hallinan C, Lee K. Heterogeneity-preserving discriminative feature selection for disease-specific subtype discovery. Nat Commun 2025; 16:3593. [PMID: 40234411 PMCID: PMC12000357 DOI: 10.1038/s41467-025-58718-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 03/26/2025] [Indexed: 04/17/2025]
Abstract
Disease-specific subtype identification can deepen our understanding of disease progression and pave the way for personalized therapies, given the complexity of disease heterogeneity. Large-scale transcriptomic, proteomic, and imaging datasets create opportunities for discovering subtypes but also pose challenges due to their high dimensionality. To mitigate this, many feature selection methods focus on selecting features that distinguish known diseases or cell states, yet often miss features that preserve heterogeneity and reveal new subtypes. To overcome this gap, we develop Preserving Heterogeneity (PHet), a statistical methodology that employs iterative subsampling and differential analysis of interquartile range, in conjunction with Fisher's method, to identify a small set of features that enhance subtype clustering quality. Here, we show that this method can maintain sample heterogeneity while distinguishing known disease/cell states, with a tendency to outperform previous differential expression and outlier-based methods, indicating its potential to advance our understanding of disease mechanisms and cell differentiation.
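The two statistical ingredients named in this abstract, a between-class difference in interquartile range and Fisher's method for combining p-values, can be illustrated with a minimal Python sketch. This is not PHet's implementation: the function names `iqr`, `delta_iqr`, and `fisher_combine` are hypothetical, and the abstract does not specify how PHet derives the per-subsample p-values, so only the combination step is shown.

```python
import math
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of one feature across samples."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def delta_iqr(case, control):
    """Absolute IQR difference of one feature between two classes."""
    return abs(iqr(case) - iqr(control))


def fisher_combine(p_values):
    """Combine independent p-values (each > 0) with Fisher's method.

    X = -2 * sum(ln p_i) is chi-square distributed with 2k degrees of
    freedom under the null. For even degrees of freedom the survival
    function has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total
```

For example, `delta_iqr([1, 2, 3, 4, 5], [3, 3, 3, 3, 3])` is large because only the first class is spread out, even though both classes share the same median; that spread difference is the kind of signal the abstract associates with subtype information.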
Affiliation(s)
- Abdur Rahman M A Basher
- Vascular Biology Program, Boston Children's Hospital, Boston, MA, USA
- Department of Surgery, Harvard Medical School, Boston, MA, USA
- Caleb Hallinan
- Vascular Biology Program, Boston Children's Hospital, Boston, MA, USA
- Kwonmoo Lee
- Vascular Biology Program, Boston Children's Hospital, Boston, MA, USA.
- Department of Surgery, Harvard Medical School, Boston, MA, USA.
3
Basher ARMA, Hallinan C, Lee K. Heterogeneity-Preserving Discriminative Feature Selection for Disease-Specific Subtype Discovery. bioRxiv 2025:2023.05.14.540686. [PMID: 38187596 PMCID: PMC10769187 DOI: 10.1101/2023.05.14.540686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
The identification of disease-specific subtypes can provide valuable insights into disease progression and potential individualized therapies, important aspects of precision medicine given the complex nature of disease heterogeneity. The advent of high-throughput technologies has enabled the generation and analysis of various molecular data types, such as single-cell RNA-seq, proteomic, and imaging datasets, on a large scale. While these datasets offer opportunities for subtype discovery, they also pose challenges in finding subtype signatures due to their high dimensionality. Feature selection, a key step in the machine learning pipeline, involves selecting signatures that reduce feature size for more efficient downstream computational analysis. Although many existing methods focus on selecting features that differentiate known diseases or cell states, they often struggle to identify features that both preserve heterogeneity and reveal subtypes. To address this, we utilized deep metric learning-based feature embedding to explore the statistical properties of features crucial for preserving heterogeneity. Our analysis indicated that features with a notable difference in interquartile range (IQR) between classes hold important subtype information. Guided by this insight, we developed a statistical method called PHet (Preserving Heterogeneity), which employs iterative subsampling and differential analysis of IQR combined with Fisher's method to identify a small set of features that preserve heterogeneity and enhance subtype clustering quality. Validation on public single-cell RNA-seq and microarray datasets demonstrated PHet's ability to maintain sample heterogeneity while distinguishing known disease/cell states, with a tendency to outperform previous differential expression and outlier-based methods. 
Furthermore, an analysis of a single-cell RNA-seq dataset from mouse tracheal epithelial cells identified two distinct basal cell subtypes differentiating towards a luminal secretory phenotype using PHet-based features, demonstrating promising results in a real-data application. These results highlight PHet's potential to enhance our understanding of disease mechanisms and cell differentiation, contributing significantly to the field of personalized medicine.
Affiliation(s)
- Abdur Rahman M. A. Basher
- Vascular Biology Program, Boston Children’s Hospital, Boston, MA 02115, USA
- Department of Surgery, Harvard Medical School, Boston, MA 02115, USA
- Caleb Hallinan
- Vascular Biology Program, Boston Children’s Hospital, Boston, MA 02115, USA
- Kwonmoo Lee
- Vascular Biology Program, Boston Children’s Hospital, Boston, MA 02115, USA
- Department of Surgery, Harvard Medical School, Boston, MA 02115, USA
4
Sun F, Li H, Sun D, Fu S, Gu L, Shao X, Wang Q, Dong X, Duan B, Xing F, Wu J, Xiao M, Zhao F, Han JDJ, Liu Q, Fan X, Li C, Wang C, Shi T. Single-cell omics: experimental workflow, data analyses and applications. Science China Life Sciences 2025; 68:5-102. [PMID: 39060615 DOI: 10.1007/s11427-023-2561-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 04/18/2024] [Indexed: 07/28/2024]
Abstract
Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have advanced rapidly. These technologies have expanded horizontally to include single-cell genome, epigenome, proteome, and metabolome, while vertically, they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on the methodology section. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Affiliation(s)
- Fengying Sun
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Haoyan Li
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Dongqing Sun
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shaliu Fu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Lei Gu
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Shao
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
- Qinqin Wang
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Dong
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Bin Duan
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Feiyang Xing
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jun Wu
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Minmin Xiao
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China.
- Jing-Dong J Han
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China.
- Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China.
- Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China.
- Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
- Chen Li
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Chenfei Wang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Tieliu Shi
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China.
5
Song D, Chen S, Lee C, Li K, Ge X, Li JJ. Synthetic control removes spurious discoveries from double dipping in single-cell and spatial transcriptomics data analyses. bioRxiv 2024:2023.07.21.550107. [PMID: 37546812 PMCID: PMC10401959 DOI: 10.1101/2023.07.21.550107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
Double dipping is a well-known pitfall in single-cell and spatial transcriptomics data analysis: after a clustering algorithm finds clusters as putative cell types or spatial domains, statistical tests are applied to the same data to identify differentially expressed (DE) genes as potential cell-type or spatial-domain markers. Because the genes that contribute to clustering are inherently likely to be identified as DE genes, double dipping can result in false-positive cell-type or spatial-domain markers, especially when clusters are spurious, leading to ambiguously defined cell types or spatial domains. To address this challenge, we propose ClusterDE, a statistical method designed to identify post-clustering DE genes as reliable markers of cell types and spatial domains, while controlling the false discovery rate (FDR) regardless of clustering quality. The core of ClusterDE involves generating synthetic null data as an in silico negative control that contains only one cell type or spatial domain, allowing for the detection and removal of spurious discoveries caused by double dipping. We demonstrate that ClusterDE controls the FDR and identifies canonical cell-type and spatial-domain markers as top DE genes, distinguishing them from housekeeping genes. ClusterDE's ability to discover reliable markers, or the absence of such markers, can be used to determine whether two ambiguous clusters should be merged. Additionally, ClusterDE is compatible with state-of-the-art analysis pipelines like Seurat and Scanpy.
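The double-dipping problem described in this abstract can be reproduced with a small, seeded simulation. The sketch below is only an illustration of the pitfall, not of ClusterDE itself: the helper names `welch_t` and `double_dip_demo` are hypothetical, the "clustering" is a deliberately crude median split on a single gene, and the data come from one homogeneous Gaussian population with no true subgroups.

```python
import random
from statistics import mean, median, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se


def double_dip_demo(n_cells=200, seed=0):
    """Return (t after clustering on the same data, t with random labels)."""
    rng = random.Random(seed)
    # One homogeneous population: a single gene, no real cell types.
    expr = [rng.gauss(0.0, 1.0) for _ in range(n_cells)]

    # "Clustering" step: split the cells at the median of that same gene.
    cut = median(expr)
    high = [x for x in expr if x > cut]
    low = [x for x in expr if x <= cut]
    # "DE" step on the same data: the split itself creates the difference.
    dipped = welch_t(high, low)

    # Control: labels drawn independently of the expression values.
    labels = [rng.random() < 0.5 for _ in expr]
    group_a = [x for x, lab in zip(expr, labels) if lab]
    group_b = [x for x, lab in zip(expr, labels) if not lab]
    control = welch_t(group_a, group_b)
    return dipped, control
```

Calling `double_dip_demo()` gives a very large statistic for the clustered split and a much smaller one for the random labels, although no true cell types exist in the simulated data. That gap is the false-positive inflation the abstract attributes to testing on the data that defined the clusters.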
Affiliation(s)
- Dongyuan Song
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, CA 90095-7246
- Siqi Chen
- Department of Statistics and Data Science, University of California, Los Angeles, CA 90095-1554
- Christy Lee
- Department of Statistics and Data Science, University of California, Los Angeles, CA 90095-1554
- Kexin Li
- Department of Statistics and Data Science, University of California, Los Angeles, CA 90095-1554
- Xinzhou Ge
- Department of Statistics and Data Science, University of California, Los Angeles, CA 90095-1554
- Department of Statistics, Oregon State University, Corvallis, OR 97331-4606
- Jingyi Jessica Li
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, CA 90095-7246
- Department of Statistics and Data Science, University of California, Los Angeles, CA 90095-1554
- Department of Human Genetics, University of California, Los Angeles, CA 90095-7088
- Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766
- Department of Biostatistics, University of California, Los Angeles, CA 90095-1772
6
Li R, Qu R, Parisi F, Strino F, Lam H, Stanley JS, Cheng X, Myung P, Kluger Y. LMD: Cluster-Independent Multiscale Marker Identification in Single-cell RNA-seq Data. bioRxiv 2024:2023.11.12.566780. [PMID: 38014159 PMCID: PMC10680591 DOI: 10.1101/2023.11.12.566780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Identifying accurate cell markers in single-cell RNA-seq data is crucial for understanding cellular diversity and function. Localized Marker Detector (LMD) is a novel tool to identify "localized genes" - genes exclusively expressed in groups of highly similar cells - thereby characterizing cellular diversity in a multi-resolution and fine-grained manner. LMD constructs a cell-cell affinity graph, diffuses the gene expression value across the cell graph, and assigns a score to each gene based on its diffusion dynamics. LMD's candidate markers can be grouped into functional gene modules, which accurately reflect cell types, subtypes, and other sources of variation such as cell cycle status. We apply LMD to mouse bone marrow and hair follicle dermal condensate datasets, where LMD facilitates cross-sample comparisons, identifying shared and sample-specific gene signatures and novel cell populations without requiring batch effect correction or integration methods. Furthermore, we assessed the performance of LMD across nine single-cell RNA sequencing datasets, compared it with six other methods aimed at achieving similar objectives, and found that LMD outperforms the other methods evaluated.
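The diffusion step mentioned in this abstract (spreading a gene's expression over a cell-cell affinity graph) can be sketched in a few lines of Python. This is a generic random-walk diffusion under stated assumptions, not LMD's code: the function names `row_normalize` and `diffuse` are hypothetical, the affinity matrix is assumed to be given, every cell is assumed to have at least one neighbor, and the scoring of genes from their diffusion dynamics is not reproduced because the abstract does not define it.

```python
def row_normalize(affinity):
    """Turn a cell-cell affinity matrix into a random-walk transition matrix.

    Each row is divided by its sum, so every cell needs a nonzero affinity
    to at least one other cell.
    """
    return [[w / sum(row) for w in row] for row in affinity]


def diffuse(transition, expr, steps):
    """Spread one gene's expression over the cell graph for `steps` steps.

    The expression vector is first rescaled to sum to 1, so it can be read
    as a distribution over cells that the random walk redistributes.
    """
    total = sum(expr)
    state = [x / total for x in expr]
    n = len(state)
    for _ in range(steps):
        state = [
            sum(state[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return state
```

On a toy graph with two tightly connected pairs of cells, a gene expressed in a single cell stays concentrated within its pair for many steps, whereas a broadly expressed gene is already close to its long-run distribution. Differences of this kind are presumably what a score based on diffusion dynamics would capture.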
7
Wang Y, Pan Z, Mou M, Xia W, Zhang H, Zhang H, Liu J, Zheng L, Luo Y, Zheng H, Yu X, Lian X, Zeng Z, Li Z, Zhang B, Zheng M, Li H, Hou T, Zhu F. A task-specific encoding algorithm for RNAs and RNA-associated interactions based on convolutional autoencoder. Nucleic Acids Res 2023; 51:e110. [PMID: 37889083 PMCID: PMC10682500 DOI: 10.1093/nar/gkad929] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 05/10/2022] [Revised: 08/01/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023]
Abstract
RNAs play essential roles in diverse physiological and pathological processes by interacting with other molecules (RNA/protein/compound), and various computational methods are available for identifying these interactions. However, the encoding features provided by existing methods are limited, and the existing tools do not offer an effective way to integrate the interacting partners. In this study, a task-specific encoding algorithm for RNAs and RNA-associated interactions was therefore developed. This new algorithm was unique in (a) realizing comprehensive RNA feature encoding by introducing a great many novel features and (b) enabling task-specific integration of interacting partners using convolutional autoencoder-directed feature embedding. Compared with existing methods/tools, this novel algorithm demonstrated superior performance in diverse benchmark testing studies. This algorithm, together with its source code, can be readily accessed by all users at: https://idrblab.org/corain/ and https://github.com/idrblab/corain/.
Affiliation(s)
- Yunxia Wang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Ziqi Pan
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Minjie Mou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Weiqi Xia
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Hongning Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Hanyu Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Jin Liu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Lingyan Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Yongchao Luo
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Hanqi Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Xinyuan Yu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Xichen Lian
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Zhenyu Zeng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Zhaorong Li
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Bing Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Mingyue Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Honglin Li
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Tingjun Hou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
8
Choi Y, Jung K. Normalization of the tumor microenvironment by harnessing vascular and immune modulation to achieve enhanced cancer therapy. Exp Mol Med 2023; 55:2308-2319. [PMID: 37907742 PMCID: PMC10689787 DOI: 10.1038/s12276-023-01114-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/14/2023] [Revised: 08/07/2023] [Accepted: 08/12/2023] [Indexed: 11/02/2023]
Abstract
Solid tumors are complex entities that actively shape their microenvironment to create a supportive environment for their own growth. Angiogenesis and immune suppression are two key characteristics of this tumor microenvironment. Despite attempts to deplete tumor blood vessels using antiangiogenic drugs, extensive vessel pruning has shown limited efficacy. Instead, a targeted approach involving the judicious use of drugs at specific time points can normalize the function and structure of tumor vessels, leading to improved outcomes when combined with other anticancer therapies. Additionally, normalizing the immune microenvironment by suppressing immunosuppressive cells and activating immunostimulatory cells has shown promise in suppressing tumor growth and improving overall survival. Based on these findings, many studies have been conducted to normalize each component of the tumor microenvironment, leading to the development of a variety of strategies. In this review, we provide an overview of the concepts of vascular and immune normalization and discuss some of the strategies employed to achieve these goals.
Affiliation(s)
- Yechan Choi
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
- Keehoon Jung
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea.
- Department of Anatomy and Cell Biology, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea.
- Institute of Allergy and Clinical Immunology, Seoul National University Medical Research Center, Seoul, 03080, Republic of Korea.
9
Han X, Wang B, Situ C, Qi Y, Zhu H, Li Y, Guo X. scapGNN: A graph neural network-based framework for active pathway and gene module inference from single-cell multi-omics data. PLoS Biol 2023; 21:e3002369. [PMID: 37956172 PMCID: PMC10681325 DOI: 10.1371/journal.pbio.3002369] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/14/2023] [Revised: 11/27/2023] [Accepted: 10/07/2023] [Indexed: 11/15/2023]
Abstract
Although advances in single-cell technologies have enabled the characterization of multiple omics profiles in individual cells, extracting functional and mechanistic insights from such information remains a major challenge. Here, we present scapGNN, a graph neural network (GNN)-based framework that creatively transforms sparse single-cell profile data into the stable gene-cell association network for inferring single-cell pathway activity scores and identifying cell phenotype-associated gene modules from single-cell multi-omics data. Systematic benchmarking demonstrated that scapGNN was more accurate, robust, and scalable than state-of-the-art methods in various downstream single-cell analyses such as cell denoising, batch effect removal, cell clustering, cell trajectory inference, and pathway or gene module identification. scapGNN was developed as a systematic R package that can be flexibly extended and enhanced for existing analysis processes. It provides a new analytical platform for studying single cells at the pathway and network levels.
Affiliation(s)
- Xudong Han
- State Key Laboratory of Reproductive Medicine and Offspring Health, School of Medicine, Southeast University, Nanjing, China
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
- Bing Wang
- State Key Laboratory of Reproductive Medicine and Offspring Health, School of Medicine, Southeast University, Nanjing, China
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Chenghao Situ
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Yaling Qi
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Hui Zhu
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| | - Yan Li
- Department of Clinical Laboratory, Sir Run Run Hospital, Nanjing Medical University, Nanjing, China
| | - Xuejiang Guo
- State Key Laboratory of Reproductive Medicine and Offspring Health, School of Medicine, Southeast University, Nanjing, China
- Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing, China
| |
10
Song D, Li K, Ge X, Li JJ. ClusterDE: a post-clustering differential expression (DE) method robust to false-positive inflation caused by double dipping. Research Square 2023:rs.3.rs-3211191. [PMID: 37577698 PMCID: PMC10418557 DOI: 10.21203/rs.3.rs-3211191/v1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 08/15/2023]
Abstract
In typical single-cell RNA-seq (scRNA-seq) data analysis, a clustering algorithm is applied to find putative cell types as clusters, and then a statistical differential expression (DE) test is employed to identify the differentially expressed (DE) genes between the cell clusters. However, this common procedure uses the same data twice, an issue known as "double dipping": the same data is used twice to define cell clusters as potential cell types and DE genes as potential cell-type marker genes, leading to false-positive cell-type marker genes even when the cell clusters are spurious. To overcome this challenge, we propose ClusterDE, a post-clustering DE method for controlling the false discovery rate (FDR) of identified DE genes regardless of clustering quality, which can work as an add-on to popular pipelines such as Seurat. The core idea of ClusterDE is to generate real-data-based synthetic null data containing only one cluster, as contrast to the real data, for evaluating the whole procedure of clustering followed by a DE test. Using comprehensive simulation and real data analysis, we show that ClusterDE has not only solid FDR control but also the ability to identify cell-type marker genes as top DE genes and distinguish them from housekeeping genes. ClusterDE is fast, transparent, and adaptive to a wide range of clustering algorithms and DE tests. Besides scRNA-seq data, ClusterDE is generally applicable to post-clustering DE analysis, including single-cell multi-omics data analysis.
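The double-dipping problem named in this abstract can be illustrated with a small simulation. The sketch below is not ClusterDE and does not implement its synthetic-null remedy; it only shows, under toy assumptions, why testing between clusters that were defined on the same data inflates significance. The function names are hypothetical, a single Gaussian "gene" stands in for an expression matrix, and a median split stands in for a clustering algorithm.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def double_dip_t(n_cells=200, seed=0):
    """Return (t after double dipping, t with data-independent labels)."""
    rng = random.Random(seed)
    # One homogeneous population: there are no true cell types here.
    expr = [rng.gauss(0.0, 1.0) for _ in range(n_cells)]

    # "Clustering" (assumption: a median split on the same gene).
    cut = statistics.median(expr)
    high = [x for x in expr if x > cut]
    low = [x for x in expr if x <= cut]
    # "DE test" between the two clusters, reusing the same data.
    dipped = welch_t(high, low)

    # Control: labels assigned at random, without looking at expression.
    shuffled = expr[:]
    rng.shuffle(shuffled)
    half = n_cells // 2
    control = welch_t(shuffled[:half], shuffled[half:])
    return dipped, control


dipped, control = double_dip_t()
print(f"t after double dipping: {dipped:.1f}; t with random labels: {control:.1f}")
```

Even though the data contain a single population, the statistic computed after the data-driven split is very large, whereas the statistic for randomly assigned labels stays close to zero.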
Affiliation(s)
- Dongyuan Song
  - Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, CA 90095-7246
- Kexin Li
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554
- Xinzhou Ge
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554
- Jingyi Jessica Li
  - Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, CA 90095-7246
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554
  - Department of Human Genetics, University of California, Los Angeles, CA 90095-7088
  - Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766
  - Department of Biostatistics, University of California, Los Angeles, CA 90095-1772
  - Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA 02138
11
Zhu J, Yang Y. scMEB: a fast and clustering-independent method for detecting differentially expressed genes in single-cell RNA-seq data. BMC Genomics 2023; 24:280. [PMID: 37231345 DOI: 10.1186/s12864-023-09374-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2022] [Accepted: 05/11/2023] [Indexed: 05/27/2023] Open
Abstract
BACKGROUND Cell clustering is a prerequisite for identifying differentially expressed genes (DEGs) in single-cell RNA sequencing (scRNA-seq) data. Obtaining a perfect clustering result is of central importance for subsequent analyses, but not easy. Additionally, the increase in cell throughput due to the advancement of scRNA-seq protocols exacerbates many computational issues, especially regarding method runtime. To address these difficulties, a new, accurate, and fast method for detecting DEGs in scRNA-seq data is needed. RESULTS Here, we propose single-cell minimum enclosing ball (scMEB), a novel and fast method for detecting single-cell DEGs without prior cell clustering results. The proposed method utilizes a small part of known non-DEGs (stably expressed genes) to build a minimum enclosing ball and defines the DEGs based on the distance of a mapped gene to the center of the hypersphere in a feature space. CONCLUSIONS We compared scMEB to two different approaches that could be used to identify DEGs without cell clustering. The investigation of 11 real datasets revealed that scMEB outperformed rival methods in terms of cell clustering, predicting genes with biological functions, and identifying marker genes. Moreover, scMEB was much faster than the other methods, making it particularly effective for finding DEGs in high-throughput scRNA-seq data. We have developed a package scMEB for the proposed method, which could be available at https://github.com/FocusPaka/scMEB .
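The geometric idea in this abstract, enclosing known stable genes in a ball and flagging genes that fall far from its center, can be sketched as follows. This is not the scMEB implementation: it works directly on toy 2-D feature vectors rather than in a mapped feature space, it approximates the minimum enclosing ball with the Badoiu-Clarkson iteration, and every name and value in it is a hypothetical illustration.

```python
import math


def approx_meb(points, iters=1000):
    """Approximate the minimum enclosing ball of `points`.

    Badoiu-Clarkson iteration: repeatedly move the center a shrinking
    step toward the point that is currently farthest away.
    """
    center = list(points[0])
    for i in range(1, iters + 1):
        far = max(points, key=lambda p: math.dist(p, center))
        center = [c + (f - c) / (i + 1) for c, f in zip(center, far)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius


def flag_outside(center, radius, candidates):
    """Return names of candidates whose features lie outside the ball."""
    return [name for name, feat in candidates.items()
            if math.dist(feat, center) > radius]


# Toy features for assumed stably expressed genes (illustrative values).
stable = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5), (1.0, -0.5)]
center, radius = approx_meb(stable)

# Candidate genes to score against the ball (illustrative values).
candidates = {"geneA": (1.0, 0.2), "geneB": (4.0, 0.0)}
print(flag_outside(center, radius, candidates))
```

Genes inside the ball resemble the stable reference set, while genes outside it are the candidates for differential expression. No cell clustering is needed at any step, which is the property the abstract emphasizes.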
Affiliation(s)
- Jiadi Zhu
  - School of Mathematics and Statistics, Xidian University, Xi'an, China
- Youlong Yang
  - School of Mathematics and Statistics, Xidian University, Xi'an, China