Gao S, Li H, Wu Z, Mizumaki H, Kajigaya S, Young NS. GSNCASCR: An R Package to Identify Differentially Co-Expressed Curated Gene Sets with Single-Cell RNA-Seq Data.
Int J Mol Sci 2025;26:4771. [PMID: 40429912; PMCID: PMC12112291; DOI: 10.3390/ijms26104771]
[Received: 04/02/2025] [Revised: 05/06/2025] [Accepted: 05/13/2025] [Indexed: 05/29/2025]
Abstract
(1) Differential co-expression analysis of a known gene set between two phenotypes helps to uncover alterations in gene regulation. (2) GSNCASCR uses CSCORE to estimate gene-pair correlations for network reconstruction and GSNCA to quantify structural changes in the co-expression networks of predefined gene sets. It also ranks genes by their "importance" in the weighted network. The method is implemented as a free R package (version 0.1.0, available on GitHub), and demo vignettes included in the package help users analyze their own data. (3) Using both simulated and real datasets, we demonstrate that the statistical tests performed with GSNCASCR identify differentially co-expressed gene sets with higher precision than tests with Gene Set Co-Expression Analysis (GSCA, version 1.1.1) and Gene Sets Net Correlations Analysis (GSNCA, version 1.42.0). Specifically, GSNCASCR achieved an AUC of 0.985, compared with 0.817 for GSNCA and 0.893 for GSCA, when positive pathways in the simulated data are defined as having more than 40% co-expressed gene pairs and negative pathways as having less than 20%. Furthermore, across simulated data with varying noise levels, pathway sizes, and positive/negative pathway definitions, GSNCASCR performs best in over 90% of scenarios, as evaluated by AUC. With an available COVID-19 dataset, we show that CD4+ T cell dysfunction in severe COVID-19 is reflected in TNF-α/TNF receptor 1-dependent immune pathways. In the weighted network of an IFN-γ gene set, IFITM3 was identified as a hub gene, a finding supported by a genome-wide association study and functional studies. (4) We developed a bioinformatics tool, GSNCASCR, that analyzes differentially co-expressed pathways with single-cell RNA-sequencing data and also evaluates the importance of the genes within pathways.
This tool combines the advantages of two algorithms, enabling the quantification and examination of cell type-specific co-expression changes within pathways. The package allows for the analysis of shared and unique disease-affected pathways across different cell types.
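The AUC benchmark described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use the GSNCASCR package. The function names, the pathway names, and all numbers below are hypothetical; only the two thresholds (more than 40% co-expressed gene pairs for positives, less than 20% for negatives) come from the abstract. The sketch assumes that pathways falling between the two thresholds are left out of the evaluation, and it computes AUC as the probability that a positive pathway receives a higher test score than a negative one.

```python
def label_pathways(coexpr_fraction, pos_cut=0.40, neg_cut=0.20):
    """Label pathways by their fraction of co-expressed gene pairs.

    Returns {pathway: 1} for positives (> pos_cut) and {pathway: 0} for
    negatives (< neg_cut). Pathways in between are dropped (an assumption).
    """
    labels = {}
    for name, frac in coexpr_fraction.items():
        if frac > pos_cut:
            labels[name] = 1
        elif frac < neg_cut:
            labels[name] = 0
    return labels


def auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    A higher score is taken to mean stronger evidence of differential
    co-expression; ties count as half a win.
    """
    pos = [scores[n] for n, y in labels.items() if y == 1]
    neg = [scores[n] for n, y in labels.items() if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pathway")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical simulated pathways: fraction of co-expressed gene pairs.
fractions = {"A": 0.55, "B": 0.45, "C": 0.30, "D": 0.10, "E": 0.05}
# Hypothetical test scores (for example, 1 - p-value) from some method.
scores = {"A": 0.9, "B": 0.4, "C": 0.7, "D": 0.5, "E": 0.1}

labels = label_pathways(fractions)   # pathway C is excluded
print(auc(scores, labels))           # 3 of 4 positive/negative pairs ranked correctly: 0.75
```

Changing `pos_cut` and `neg_cut` corresponds to the varying positive/negative pathway definitions mentioned in the abstract; each method under comparison would supply its own `scores`.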