Reference Citation Analysis

For: Dinalankara W, Bravo HC. Gene Expression Signatures Based on Variability can Robustly Predict Tumor Progression and Prognosis. Cancer Inform 2015;14:71-81. [PMID: 26078586 PMCID: PMC4460970 DOI: 10.4137/cin.s23862] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 01/09/2015] [Revised: 03/22/2015] [Accepted: 03/29/2015] [Indexed: 01/11/2023] Open

Cited by Other Article(s)

Rahmatallah Y, Glazko G. Improving data interpretability with new differential sample variance gene set tests. BMC Bioinformatics 2025;26:103. [PMID: 40229677 PMCID: PMC11998189 DOI: 10.1186/s12859-025-06117-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Accepted: 03/20/2025] [Indexed: 04/16/2025] Open

Abstract

BACKGROUND

Gene set analysis methods have played a major role in generating biological interpretations of omics data such as gene expression datasets. However, most methods focus on detecting homogeneous pattern changes in mean expression, while methods detecting pattern changes in variance remain poorly explored. While a few studies have attempted to use gene-level variance analysis, such an approach remains under-utilized. When comparing two phenotypes, gene sets with distinct changes in subgroups under one phenotype are overlooked by available methods, although they reflect meaningful biological differences between the two phenotypes. Multivariate sample-level variance analysis methods are needed to detect such pattern changes.

RESULTS

We used ranking schemes based on the minimum spanning tree to generalize the Cramér-von Mises and Anderson-Darling univariate statistics into multivariate gene set analysis methods that detect differential sample variance or mean. Using simulations with different parameters, we characterized the detection power and Type I error rate of these methods alongside two methods developed earlier. We applied the developed methods to a microarray gene expression dataset of prednisolone-resistant and prednisolone-sensitive children diagnosed with B-lineage acute lymphoblastic leukemia and to a bulk RNA-sequencing gene expression dataset of benign hyperplastic polyps and potentially malignant sessile serrated adenoma/polyps. One or both of the two compared phenotypes in each of these datasets have distinct molecular subtypes that contribute to within-phenotype variability and to heterogeneous differences between the two compared phenotypes. Our results show that methods designed to detect differential sample variance provide meaningful biological interpretations by detecting specific hallmark gene sets associated with the two compared phenotypes, as documented in the available literature.

CONCLUSIONS

The results of this study demonstrate the usefulness of methods designed to detect differential sample variance in providing biological interpretations when biologically relevant but heterogeneous changes between two phenotypes are prevalent in specific signaling pathways. A software implementation of the methods is available with detailed documentation in the Bioconductor package GSAR. The methods are applicable to gene expression datasets in normalized matrix form and could be used with other omics datasets in normalized matrix form, given an available collection of feature sets.

Rahmatallah Y, Glazko G. Improving data interpretability with new differential sample variance gene set tests. Research Square 2024:rs.3.rs-4888767. [PMID: 39315246 PMCID: PMC11419169 DOI: 10.21203/rs.3.rs-4888767/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]

Abstract

Background

Gene set analysis methods have played a major role in generating biological interpretations from omics data such as gene expression datasets. However, most methods focus on detecting homogeneous pattern changes in mean expression, and methods detecting pattern changes in variance remain poorly explored. While a few studies have attempted to use gene-level variance analysis, such an approach remains under-utilized. When comparing two phenotypes, gene sets with distinct changes in subgroups under one phenotype are overlooked by available methods, although they reflect meaningful biological differences between the two phenotypes. Multivariate sample-level variance analysis methods are needed to detect such pattern changes.

Results

We use ranking schemes based on the minimum spanning tree to generalize the Cramér-von Mises and Anderson-Darling univariate statistics into multivariate gene set analysis methods that detect differential sample variance or mean. Using simulations with different parameters, we characterize these methods alongside two methods developed earlier. We apply the developed methods to a microarray gene expression dataset of prednisolone-resistant and prednisolone-sensitive children diagnosed with B-lineage acute lymphoblastic leukemia and to a bulk RNA-sequencing gene expression dataset of benign hyperplastic polyps and potentially malignant sessile serrated adenoma/polyps. One or both of the two compared phenotypes in each of these datasets have distinct molecular subtypes that contribute to heterogeneous differences. Our results show that methods designed to detect differential sample variance are able to detect specific hallmark signaling pathways associated with the two compared phenotypes, as documented in the available literature.

Conclusions

The results of this study demonstrate the usefulness of methods designed to detect differential sample variance in providing biological interpretations when biologically relevant but heterogeneous changes between two phenotypes are prevalent in specific signaling pathways. A software implementation of the developed methods is available with detailed documentation in the Bioconductor package GSAR. The methods are applicable to gene expression datasets in normalized matrix form and could be used with other omics datasets in normalized matrix form, given an available collection of feature sets.

Kim K, Kim S, Ahn T, Kim H, Shin SJ, Choi CH, Park S, Kim YB, No JH, Suh DH. A differential diagnosis between uterine leiomyoma and leiomyosarcoma using transcriptome analysis. BMC Cancer 2023;23:1215. [PMID: 38066476 PMCID: PMC10709939 DOI: 10.1186/s12885-023-11394-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 09/11/2023] [Indexed: 12/18/2023] Open

Abstract

BACKGROUND

The objective of this study was to estimate the accuracy of a transcriptome-based classifier in the differential diagnosis of uterine leiomyoma and leiomyosarcoma. We manually selected 114 normal uterine tissue samples and 31 leiomyosarcoma samples from publicly available transcriptome data in UCSC Xena as training/validation sets. We developed a pre-processing procedure and a gene selection method to sensitively find genes with larger variance in leiomyosarcoma than in normal uterine tissue. Through our method, 17 genes were selected to build the transcriptome-based classifier. The prediction accuracies of deep feedforward neural network (DNN), support vector machine (SVM), random forest (RF), and gradient boosting (GB) models were examined. We interpreted the biological functionality of the selected genes via network-based analysis using GeneMANIA. To validate the performance of the trained model, we additionally collected 35 clinical samples of leiomyosarcoma and leiomyoma as a test set (18 + 17 as 1st and 2nd test sets).

RESULTS

We discovered genes that are expressed in a highly variable way in leiomyosarcoma but in a conserved way in normal uterine samples. These genes were mainly associated with DNA replication. Because gene selection and model training were performed on leiomyosarcoma and normal uterine tissue, it was necessary to demonstrate the ability to discriminate between leiomyosarcoma and leiomyoma. Thus, further validation of the trained model was conducted on newly collected clinical samples of leiomyosarcoma and leiomyoma. The DNN classifier achieved sensitivities of 0.88 and 0.77 (8/9 and 7/9) and a specificity of 1.0 (8/8 and 8/8) on the two test sets, supporting the conclusion that the selected genes, in conjunction with the DNN classifier, discriminate well between leiomyosarcoma and leiomyoma in clinical samples.

CONCLUSION

The transcriptome-based classifier accurately distinguished uterine leiomyosarcoma from leiomyoma. Our method can be helpful in clinical practice through biopsy of a sample before surgery. Identifying leiomyosarcoma lets the doctor avoid laparoscopic surgery, thereby minimizing unwanted tumor spread.

Xin R, Cheng Q, Chi X, Feng X, Zhang H, Wang Y, Duan M, Xie T, Song X, Yu Q, Fan Y, Huang L, Zhou F. Computational Characterization of Undifferentially Expressed Genes with Altered Transcription Regulation in Lung Cancer. Genes (Basel) 2023;14:2169. [PMID: 38136991 PMCID: PMC10742656 DOI: 10.3390/genes14122169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 11/19/2023] [Accepted: 11/27/2023] [Indexed: 12/24/2023] Open

Affiliation(s)

Ruihao Xin: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; Jilin Institute of Chemical Technology, College of Information and Control Engineering, Jilin 132000, China
Qian Cheng: Jilin Institute of Chemical Technology, College of Information and Control Engineering, Jilin 132000, China
Xiaohang Chi: Jilin Institute of Chemical Technology, College of Information and Control Engineering, Jilin 132000, China
Xin Feng: School of Science, Jilin Institute of Chemical Technology, Jilin 132000, China; Department of Epidemiology and Biostatistics, School of Public Health, Jilin University, Changchun 130012, China
Hang Zhang: Jilin Institute of Chemical Technology, College of Information and Control Engineering, Jilin 132000, China
Yueying Wang: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
Meiyu Duan: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
Tunyang Xie: Centre for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK
Xiaonan Song: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Software, Jilin University, Changchun 130012, China
Qiong Yu: Department of Epidemiology and Biostatistics, School of Public Health, Jilin University, Changchun 130012, China
Yusi Fan: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Software, Jilin University, Changchun 130012, China
Lan Huang: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
Fengfeng Zhou: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; School of Biology and Engineering, Guizhou Medical University, Guiyang 550025, China

Roberts AGK, Catchpoole DR, Kennedy PJ. Identification of differentially distributed gene expression and distinct sets of cancer-related genes identified by changes in mean and variability. NAR Genom Bioinform 2022;4:lqab124. [PMID: 35047816 PMCID: PMC8759562 DOI: 10.1093/nargab/lqab124] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/25/2021] [Revised: 11/19/2021] [Accepted: 12/16/2021] [Indexed: 12/13/2022] Open

An R package for divergence analysis of omics data. PLoS One 2021;16:e0249002. [PMID: 33819273 PMCID: PMC8021195 DOI: 10.1371/journal.pone.0249002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/17/2020] [Accepted: 03/09/2021] [Indexed: 11/19/2022] Open

Davis-Marcisak EF, Sherman TD, Orugunta P, Stein-O'Brien GL, Puram SV, Roussos Torres ET, Hopkins AC, Jaffee EM, Favorov AV, Afsari B, Goff LA, Fertig EJ. Differential Variation Analysis Enables Detection of Tumor Heterogeneity Using Single-Cell RNA-Sequencing Data. Cancer Res 2019;79:5102-5112. [PMID: 31337651 PMCID: PMC6844448 DOI: 10.1158/0008-5472.can-18-3882] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 12/10/2018] [Revised: 05/13/2019] [Accepted: 07/19/2019] [Indexed: 12/20/2022]

Abstract

Tumor heterogeneity provides a complex challenge to cancer treatment and is a critical component of therapeutic response, disease recurrence, and patient survival. Single-cell RNA-sequencing (scRNA-seq) technologies have revealed the prevalence of intratumor and intertumor heterogeneity. Computational techniques are essential to quantify the differences in variation of these profiles between distinct cell types, tumor subtypes, and patients to fully characterize intratumor and intertumor molecular heterogeneity. In this study, we adapted our algorithm for pathway dysregulation, Expression Variation Analysis (EVA), to perform multivariate statistical analyses of differential variation of expression in gene sets for scRNA-seq. EVA has high sensitivity and specificity to detect pathways with true differential heterogeneity in simulated data. EVA was applied to several public domain scRNA-seq tumor datasets to quantify the landscape of tumor heterogeneity in several key applications in cancer genomics such as immunogenicity, metastasis, and cancer subtypes. Immune pathway heterogeneity of hematopoietic cell populations in breast tumors corresponded to the amount of diversity present in the T-cell repertoire of each individual. Cells from head and neck squamous cell carcinoma (HNSCC) primary tumors had significantly more heterogeneity across pathways than cells from metastases, consistent with a model of clonal outgrowth. Moreover, there were dramatic differences in pathway dysregulation across HNSCC basal primary tumors. Within the basal primary tumors, there was increased immune dysregulation in individuals with a high proportion of fibroblasts present in the tumor microenvironment. These results demonstrate the broad utility of EVA to quantify intertumor and intratumor heterogeneity from scRNA-seq data without reliance on low-dimensional visualization. 
SIGNIFICANCE: This study presents a robust statistical algorithm for evaluating gene expression heterogeneity within pathways or gene sets in single-cell RNA-seq data.

Affiliation(s)

Emily F Davis-Marcisak: McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland
Thomas D Sherman: Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland
Pranay Orugunta: Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland
Genevieve L Stein-O'Brien: McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland; Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, Maryland
Sidharth V Puram: Department of Otolaryngology-Head and Neck Surgery, Washington University School of Medicine, St. Louis, Missouri; Department of Genetics, Washington University School of Medicine, St. Louis, Missouri
Evanthia T Roussos Torres: Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland
Alexander C Hopkins: Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, Michigan
Elizabeth M Jaffee: Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland
Alexander V Favorov: Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland; Laboratory of Systems Biology and Computational Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
Bahman Afsari: Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland
Loyal A Goff: McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland; Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, Maryland
Elana J Fertig: Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland; Department of Applied Mathematics and Statistics, Johns Hopkins University Whiting School of Engineering, Baltimore, Maryland; Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, Maryland

Roberts AGK, Catchpoole DR, Kennedy PJ. Variance-based Feature Selection for Classification of Cancer Subtypes Using Gene Expression Data. 2018 International Joint Conference on Neural Networks (IJCNN) 2018:1-8. [DOI: 10.1109/ijcnn.2018.8489279] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/06/2025]

Digitizing omics profiles by divergence from a baseline. Proc Natl Acad Sci U S A 2018;115:4545-4552. [PMID: 29666255 PMCID: PMC5939095 DOI: 10.1073/pnas.1721628115] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022] Open

Abstract

Technological advances enable increasingly comprehensive profiling of the molecular landscapes of cells, and these data can inform the personalized treatment of complex diseases. Two major obstacles are the complexity of these data and the high degree of person-to-person heterogeneity. We develop a highly simplified, personalized data representation by comparing the profile of an individual to the range of landscapes in a baseline population, thereby mimicking basic clinical diagnostic testing for departures of selected variables from normal levels. Moreover, our method can be applied to any data modality and at any level of granularity, from single features to any subset of features treated as a single entity, for example the gene expression levels in a pathway. Experiments involve both healthy human tissues and various cancer subtypes.

Data collected from omics technologies have revealed pervasive heterogeneity and stochasticity of molecular states within and between phenotypes. A prominent example of such heterogeneity occurs between genome-wide mRNA, microRNA, and methylation profiles from one individual tumor to another, even within a cancer subtype. However, current methods in bioinformatics, such as detecting differentially expressed genes or CpG sites, are population-based and therefore do not effectively model intersample diversity. Here we introduce a unified theory to quantify sample-level heterogeneity that is applicable to a single omics profile. Specifically, we simplify an omics profile to a digital representation based on the omics profiles from a set of samples from a reference or baseline population (e.g., normal tissues). The state of any subprofile (e.g., expression vector for a subset of genes) is said to be “divergent” if it lies outside the estimated support of the baseline distribution and is consequently interpreted as “dysregulated” relative to that baseline. We focus on two cases: single features (e.g., individual genes) and distinguished subsets (e.g., regulatory pathways). Notably, since the divergence analysis is at the individual sample level, dysregulation can be analyzed probabilistically; for example, one can estimate the probability that a gene or pathway is divergent in some population. Finally, the reduction in complexity facilitates a more “personalized” and biologically interpretable analysis of variation, as illustrated by experiments involving tissue characterization, disease detection and progression, and disease–pathway associations.

Patange S, Girvan M, Larson DR. Single-cell systems biology: probing the basic unit of information flow. Curr Opin Syst Biol 2017;8:7-15. [PMID: 29552672 DOI: 10.1016/j.coisb.2017.11.011] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 01/12/2023]

Extracting the Strongest Signals from Omics Data: Differentially Expressed Pathways and Beyond. Methods Mol Biol 2017. [PMID: 28849561 DOI: 10.1007/978-1-4939-7027-8_7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/11/2025]

Rahmatallah Y, Zybailov B, Emmert-Streib F, Glazko G. GSAR: Bioconductor package for Gene Set analysis in R. BMC Bioinformatics 2017;18:61. [PMID: 28118818 PMCID: PMC5259853 DOI: 10.1186/s12859-017-1482-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 07/29/2016] [Accepted: 01/10/2017] [Indexed: 01/01/2023] Open

Feinberg AP, Koldobskiy MA, Göndör A. Epigenetic modulators, modifiers and mediators in cancer aetiology and progression. Nat Rev Genet 2016;17:284-99. [PMID: 26972587 DOI: 10.1038/nrg.2016.13] [Citation(s) in RCA: 624] [Impact Index Per Article: 69.3] [Indexed: 02/06/2023]