1
Liang H, Berger B, Singh R. Tracing the Shared Foundations of Gene Expression and Chromatin Structure. bioRxiv: the preprint server for biology 2025:2025.03.31.646349. [PMID: 40235997 PMCID: PMC11996408 DOI: 10.1101/2025.03.31.646349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2025]
Abstract
The three-dimensional organization of chromatin into topologically associating domains (TADs) may impact gene regulation by bringing distant genes into contact. However, many questions about TADs' function and their influence on transcription remain unresolved due to technical limitations in defining TAD boundaries and measuring the direct effect that TADs have on gene expression. Here, we develop consensus TAD maps for human and mouse with a novel "bag-of-genes" approach for defining the gene composition within TADs. This approach enables new functional interpretations of TADs by providing a way to capture species-level differences in chromatin organization. We also leverage a generative AI foundation model computed from 33 million transcriptomes to define contextual similarity, an embedding-based metric that is more powerful than co-expression at representing functional gene relationships. Our analytical framework directly leads to testable hypotheses about chromatin organization across cellular states. We find that TADs play an active role in facilitating gene co-regulation, possibly through a mechanism involving transcriptional condensates. We also discover that the TAD-linked enhancement of transcriptional context is strongest in early developmental stages and systematically declines with aging. Investigation of cancer cells shows distinct patterns of TAD usage that shift with chemotherapy treatment, suggesting specific roles for TAD-mediated regulation in cellular development and plasticity. Finally, we develop "TAD signatures" to improve statistical analysis of single-cell transcriptomic data sets in predicting cancer cell-line drug response. These findings reshape our understanding of cellular plasticity in development and disease, indicating that chromatin organization acts through probabilistic mechanisms rather than deterministic rules. Software availability: https://singhlab.net/tadmap.
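As a rough illustration of the comparison this abstract draws, the sketch below contrasts an embedding-based similarity with plain co-expression for gene pairs inside and outside a TAD. It assumes that contextual similarity can be approximated by cosine similarity between gene embedding vectors; the abstract does not give the exact definition, and the function names, TAD labels, and random toy data are all hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows (one row per gene)."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def coexpression_matrix(expression: np.ndarray) -> np.ndarray:
    """Pearson correlation between genes; expression is genes x samples."""
    return np.corrcoef(expression)


def within_vs_between(similarity: np.ndarray, tad_labels) -> tuple:
    """Mean similarity of gene pairs sharing a TAD vs. pairs in different TADs."""
    labels = np.asarray(tad_labels)
    same_tad = labels[:, None] == labels[None, :]
    upper = np.triu(np.ones(same_tad.shape, dtype=bool), k=1)  # count each pair once
    return (float(similarity[same_tad & upper].mean()),
            float(similarity[~same_tad & upper].mean()))


# Toy inputs (random, for shape only): 4 genes, 8-dim embeddings, 20 samples.
rng = np.random.default_rng(0)
gene_embeddings = rng.normal(size=(4, 8))      # stand-in for foundation-model embeddings
gene_expression = rng.poisson(lam=5.0, size=(4, 20)).astype(float)
tad_of_gene = [0, 0, 1, 1]                     # hypothetical "bag-of-genes" TAD membership

contextual = cosine_similarity_matrix(gene_embeddings)
coexpr = coexpression_matrix(gene_expression)
ctx_within, ctx_between = within_vs_between(contextual, tad_of_gene)
coexpr_within, coexpr_between = within_vs_between(coexpr, tad_of_gene)
```

With real embeddings and a real TAD map, a larger within-TAD than between-TAD mean would be the kind of signal the abstract describes; the random inputs here carry no such signal.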
2
Frost HR. Leveraging cell type-specificity for gene set analysis of single cell transcriptomics. bioRxiv: the preprint server for biology 2024:2024.09.25.615040. [PMID: 39386631 PMCID: PMC11463668 DOI: 10.1101/2024.09.25.615040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Although single cell RNA-sequencing (scRNA-seq) provides unprecedented insights into the biology of complex tissues, analyzing such data on a gene-by-gene basis is challenging due to the large number of tested hypotheses and consequent low statistical power and difficult interpretation. These issues are magnified by the increased noise, significant sparsity and multi-modal distributions characteristic of single cell data. One promising approach for addressing these challenges is gene set testing, or pathway analysis. Unfortunately, statistical and biological differences between single cell and bulk transcriptomic data make it challenging to use existing gene set collections, which were developed for bulk tissue analysis, on scRNA-seq data. In this paper, we describe a procedure for customizing gene set collections originally created for bulk tissue analysis to reflect the structure of gene activity within specific cell types. Our approach leverages information about mean gene expression in the 81 human cell types profiled via scRNA-seq by the Human Protein Atlas (HPA) Single Cell Type Atlas. This HPA information is used to compute cell type-specific gene and gene set weights that can be used to filter or weight gene set collections. As demonstrated through the analysis of immune cell scRNA-seq data using gene sets from the Molecular Signatures Database (MSigDB), accounting for cell type-specificity can significantly improve gene set testing power and interpretability. An example vignette along with gene and gene set weights for the 81 HPA SCTA cell types and the MSigDB collections are available at https://hrfrost.host.dartmouth.edu/SCGeneSetOpt/.
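The weighting idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the published procedure: it assumes a gene's cell type-specificity is its share of total mean expression across cell types, and that a gene set's weight is the average of its members' weights. The gene names, cell types, expression values, and the 0.4 cut-off are all hypothetical.

```python
import numpy as np


def cell_type_specificity(mean_expr) -> np.ndarray:
    """Share of each gene's total mean expression found in each cell type.

    mean_expr is genes x cell types; rows of the result sum to 1, or to 0
    for genes that are not expressed in any cell type.
    """
    mean_expr = np.asarray(mean_expr, dtype=float)
    totals = mean_expr.sum(axis=1, keepdims=True)
    return np.divide(mean_expr, totals,
                     out=np.zeros_like(mean_expr), where=totals > 0)


def gene_set_weights(gene_weights, genes, gene_sets, cell_type_index):
    """Average member weight per gene set for one cell type.

    Genes that are missing from the expression table are skipped.
    """
    row = {gene: i for i, gene in enumerate(genes)}
    weights = {}
    for name, members in gene_sets.items():
        idx = [row[g] for g in members if g in row]
        weights[name] = float(gene_weights[idx, cell_type_index].mean()) if idx else 0.0
    return weights


# Hypothetical toy table: 3 genes x 3 cell types (not HPA values).
genes = ["gene_a", "gene_b", "gene_c"]
cell_types = ["t_cell", "b_cell", "hepatocyte"]
mean_expr = [[8, 1, 1],
             [1, 8, 1],
             [0, 0, 10]]
gene_sets = {
    "set_t": ["gene_a"],
    "set_liver": ["gene_c"],
    "set_mixed": ["gene_a", "gene_b"],
}

gene_w = cell_type_specificity(mean_expr)
set_w = gene_set_weights(gene_w, genes, gene_sets, cell_types.index("t_cell"))
kept_sets = [name for name, w in set_w.items() if w >= 0.4]  # arbitrary cut-off
```

The same weights could be used to down-weight rather than drop gene sets; hard filtering is shown only because it is the simpler of the two options the abstract mentions.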
Affiliation(s)
- H. Robert Frost
- Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755
3
Ma X, Liu B, Gong Z, Wang J, Qu Z, Cai J. Comparative proteomic analysis across the developmental stages of the Eimeria tenella. Genomics 2024; 116:110792. [PMID: 38215860 DOI: 10.1016/j.ygeno.2024.110792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/07/2024] [Accepted: 01/09/2024] [Indexed: 01/14/2024]
Abstract
Eimeria tenella is the main pathogen responsible for coccidiosis in chickens. The life cycle of E. tenella is, arguably, the least complex of all Coccidia, with only one host. However, it presents different developmental stages, either in the environment or in the host and either intracellular or extracellular. Its signaling and metabolic pathways change with its different developmental stages. To date, little is known about the developmental regulation and transformation mechanisms of its life cycle. In this study, protein profiles from the five developmental stages, including unsporulated oocysts (USO), partially sporulated (7 h) oocysts (SO7h), sporulated oocysts (SO), sporozoites (S) and second-generation merozoites (M2), were obtained using a label-free quantitative proteomics approach. Then the differentially expressed proteins (DEPs) for these stages were identified. A total of 314, 432, 689, and 665 DEPs were identified from the comparison of SO7h vs USO, SO vs SO7h, S vs SO, and M2 vs S, respectively. By conducting weighted gene coexpression network analysis (WGCNA), six modules were dissected. Proteins in the blue and brown modules were significantly positively correlated with the E. tenella developmental stages of sporozoites (S) and second-generation merozoites (M2), respectively. In addition, hub proteins with high intra-module degree were identified. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses revealed that hub proteins in the blue module were involved in the electron transport chain and oxidative phosphorylation. Hub proteins in the brown module were involved in RNA splicing. These findings provide new clues and ideas to enhance our fundamental understanding of the molecular mechanisms underlying parasite development.
Affiliation(s)
- Xueting Ma
- State Key Laboratory for Animal Disease Control and Prevention, College of Veterinary Medicine, Lanzhou University, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730000, China; Gansu Province Research Center for Basic Disciplines of Pathogen Biology, Lanzhou 730046, China
- Baohong Liu
- State Key Laboratory for Animal Disease Control and Prevention, College of Veterinary Medicine, Lanzhou University, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730000, China; Gansu Province Research Center for Basic Disciplines of Pathogen Biology, Lanzhou 730046, China.
- Zhenxing Gong
- College of Animal Science and Technology, Ningxia University, Yinchuan, Ningxia Province 750021, China
- Jing Wang
- State Key Laboratory for Animal Disease Control and Prevention, College of Veterinary Medicine, Lanzhou University, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730000, China; Gansu Province Research Center for Basic Disciplines of Pathogen Biology, Lanzhou 730046, China
- Zigang Qu
- State Key Laboratory for Animal Disease Control and Prevention, College of Veterinary Medicine, Lanzhou University, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730000, China; Gansu Province Research Center for Basic Disciplines of Pathogen Biology, Lanzhou 730046, China
- Jianping Cai
- State Key Laboratory for Animal Disease Control and Prevention, College of Veterinary Medicine, Lanzhou University, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730000, China; Gansu Province Research Center for Basic Disciplines of Pathogen Biology, Lanzhou 730046, China.
4
Mah JL, Dunn CW. Cell type evolution reconstruction across species through cell phylogenies of single-cell RNA sequencing data. Nat Ecol Evol 2024; 8:325-338. [PMID: 38182680 DOI: 10.1038/s41559-023-02281-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 11/16/2023] [Indexed: 01/07/2024]
Abstract
The origin and evolution of cell types has emerged as a key topic in evolutionary biology. Driven by rapidly accumulating single-cell datasets, recent attempts to infer cell type evolution have largely been limited to pairwise comparisons because we lack approaches to build cell phylogenies using model-based approaches. Here we approach the challenges of applying explicit phylogenetic methods to single-cell data by using principal components as phylogenetic characters. We infer a cell phylogeny from a large, comparative single-cell dataset of eye cells from five distantly related mammals. Robust cell type clades enable us to provide a phylogenetic, rather than phenetic, definition of cell type, allowing us to forgo marker genes and phylogenetically classify cells by topology. We further observe evolutionary relationships between diverse vessel endothelia and identify the myelinating and non-myelinating Schwann cells as sister cell types. Finally, we examine principal component loadings and describe the gene expression dynamics underlying the function and identity of cell type clades that have been conserved across the five species. A cell phylogeny provides a rigorous framework towards investigating the evolutionary history of cells and will be critical to interpret comparative single-cell datasets that aim to ask fundamental evolutionary questions.
Affiliation(s)
- Jasmine L Mah
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA.
- Casey W Dunn
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
5
Li S, Schmid KT, de Vries DH, Korshevniuk M, Losert C, Oelen R, van Blokland IV, Groot HE, Swertz MA, van der Harst P, Westra HJ, van der Wijst MGP, Heinig M, Franke L. Identification of genetic variants that impact gene co-expression relationships using large-scale single-cell data. Genome Biol 2023; 24:80. [PMID: 37072791 PMCID: PMC10111756 DOI: 10.1186/s13059-023-02897-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/29/2022] [Accepted: 03/16/2023] [Indexed: 04/20/2023] Open
Abstract
BACKGROUND Expression quantitative trait loci (eQTL) studies show how genetic variants affect downstream gene expression. Single-cell data allows reconstruction of personalized co-expression networks and therefore the identification of SNPs altering co-expression patterns (co-expression QTLs, co-eQTLs) and the affected upstream regulatory processes using a limited number of individuals. RESULTS We conduct a co-eQTL meta-analysis across four scRNA-seq peripheral blood mononuclear cell datasets using a novel filtering strategy followed by a permutation-based multiple testing approach. Before the analysis, we evaluate the co-expression patterns required for co-eQTL identification using different external resources. We identify a robust set of cell-type-specific co-eQTLs for 72 independent SNPs affecting 946 gene pairs. These co-eQTLs are replicated in a large bulk cohort and provide novel insights into how disease-associated variants alter regulatory networks. One co-eQTL SNP, rs1131017, that is associated with several autoimmune diseases, affects the co-expression of RPS26 with other ribosomal genes. Interestingly, specifically in T cells, the SNP additionally affects co-expression of RPS26 and a group of genes associated with T cell activation and autoimmune disease. Among these genes, we identify enrichment for targets of five T-cell-activation-related transcription factors whose binding sites harbor rs1131017. This reveals a previously overlooked process and pinpoints potential regulators that could explain the association of rs1131017 with autoimmune diseases. CONCLUSION Our co-eQTL results highlight the importance of studying context-specific gene regulation to understand the biological implications of genetic variation. With the expected growth of sc-eQTL datasets, our strategy and technical guidelines will facilitate future co-eQTL identification, further elucidating unknown disease mechanisms.
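The core co-eQTL idea described here, testing whether a SNP changes how strongly two genes are co-expressed, can be sketched on simulated data. This is a simplified single-pair illustration, not the study's pipeline: it assumes per-individual Spearman correlation across cells as the co-expression measure, a linear fit on allele dosage, and a plain genotype permutation for one test rather than the paper's filtering and multiple-testing scheme. All names and simulation parameters are hypothetical.

```python
import numpy as np


def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman correlation via ranks (no tie correction; fine for continuous toy data)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


def coeqtl_slope(genotypes: np.ndarray, coexpr: np.ndarray) -> float:
    """Least-squares slope of per-individual co-expression on allele dosage."""
    slope, _intercept = np.polyfit(genotypes, coexpr, 1)
    return float(slope)


def permutation_p_value(genotypes, coexpr, n_perm: int = 200, seed: int = 0) -> float:
    """Two-sided permutation p-value obtained by shuffling genotypes across individuals."""
    rng = np.random.default_rng(seed)
    observed = abs(coeqtl_slope(genotypes, coexpr))
    hits = sum(
        abs(coeqtl_slope(rng.permutation(genotypes), coexpr)) >= observed
        for _ in range(n_perm)
    )
    return (1 + hits) / (1 + n_perm)


# Simulate 30 individuals with 200 cells each; the gene pair becomes more
# tightly coupled as the allele dosage (0, 1, 2) increases.
rng = np.random.default_rng(1)
genotypes = np.tile([0, 1, 2], 10).astype(float)
coexpr = []
for dosage in genotypes:
    gene_a = rng.normal(size=200)
    gene_b = 0.5 * dosage * gene_a + rng.normal(size=200)
    coexpr.append(spearman(gene_a, gene_b))
coexpr = np.array(coexpr)

slope = coeqtl_slope(genotypes, coexpr)
p_value = permutation_p_value(genotypes, coexpr)
```

In a real analysis this test would be repeated over many SNP and gene-pair combinations within each cell type, which is why the abstract stresses filtering and a permutation-based multiple-testing correction.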
Affiliation(s)
- Shuang Li
- Genetics Department, University Medical Center Groningen, Groningen, the Netherlands
- Genomics Coordination Center, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Katharina T Schmid
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Computer Science, School of Computation, Information and Technology, Technical University Munich, Munich, Germany
- Dylan H de Vries
- Genetics Department, University Medical Center Groningen, Groningen, the Netherlands
- Maryna Korshevniuk
- Genetics Department, University Medical Center Groningen, Groningen, the Netherlands
- Corinna Losert
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Computer Science, School of Computation, Information and Technology, Technical University Munich, Munich, Germany
- Roy Oelen
- Genetics Department, University Medical Center Groningen, Groningen, the Netherlands
- Irene V van Blokland
- Genetics Department, University Medical Center Groningen, Groningen, the Netherlands
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Hilde E Groot
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Morris A Swertz
- Genetics Department, University Medical Center Groningen, Groningen, the Netherlands
- Genomics Coordination Center, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Pim van der Harst
- Department of Cardiology, University Medical Center Utrecht, Utrecht, the Netherlands
- Harm-Jan Westra
- Genetics Department, University Medical Center Groningen, Groningen, the Netherlands
- Matthias Heinig
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
- Department of Computer Science, School of Computation, Information and Technology, Technical University Munich, Munich, Germany.
- Munich Heart Alliance, DZHK (German Center for Cardiovascular Research), Munich, Germany.
- Lude Franke
- Genetics Department, University Medical Center Groningen, Groningen, the Netherlands.
6
Choi Y, Li R, Quon G. siVAE: interpretable deep generative models for single-cell transcriptomes. Genome Biol 2023; 24:29. [PMID: 36803416 PMCID: PMC9940350 DOI: 10.1186/s13059-023-02850-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/08/2022] [Accepted: 01/06/2023] [Indexed: 02/22/2023] Open
Abstract
Neural networks such as variational autoencoders (VAE) perform dimensionality reduction for the visualization and analysis of genomic data, but are limited in their interpretability: it is unknown which data features are represented by each embedding dimension. We present siVAE, a VAE that is interpretable by design, thereby enhancing downstream analysis tasks. Through interpretation, siVAE also identifies gene modules and hubs without explicit gene network inference. We use siVAE to identify gene modules whose connectivity is associated with diverse phenotypes such as iPSC neuronal differentiation efficiency and dementia, showcasing the wide applicability of interpretable generative models for genomic data analysis.
Affiliation(s)
- Yongin Choi
- Graduate Group in Biomedical Engineering, University of California, Davis, Davis, CA, USA
- Genome Center, University of California, Davis, Davis, CA, USA
- Ruoxin Li
- Genome Center, University of California, Davis, Davis, CA, USA
- Graduate Group in Biostatistics, University of California, Davis, Davis, CA, USA
- Gerald Quon
- Graduate Group in Biomedical Engineering, University of California, Davis, Davis, CA, USA.
- Genome Center, University of California, Davis, Davis, CA, USA.
- Department of Molecular and Cellular Biology, University of California, Davis, Davis, CA, USA.
7
Afshar S, Braun PR, Han S, Lin Y. A multimodal deep learning model to infer cell-type-specific functional gene networks. BMC Bioinformatics 2023; 24:47. [PMID: 36788477 PMCID: PMC9926713 DOI: 10.1186/s12859-023-05146-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 01/11/2023] [Indexed: 02/16/2023] Open
Abstract
BACKGROUND Functional gene networks (FGNs) capture functional relationships among genes that vary across tissues and cell types. Construction of cell-type-specific FGNs enables the understanding of cell-type-specific functional gene relationships and insights into genetic mechanisms of human diseases in disease-relevant cell types. However, most existing FGNs were developed without consideration of specific cell types within tissues. RESULTS In this study, we created a multimodal deep learning model (MDLCN) to predict cell-type-specific FGNs in the human brain by integrating single-nuclei gene expression data with global protein interaction networks. We systematically evaluated the prediction performance of the MDLCN and showed its superior performance compared to two baseline models (boosting tree and convolutional neural network). Based on the predicted cell-type-specific FGNs, we observed that cell-type marker genes had a higher level of hubness than non-marker genes in their corresponding cell type. Furthermore, we showed that risk genes underlying autism and Alzheimer's disease were more strongly connected in disease-relevant cell types, supporting the cellular context of predicted cell-type-specific FGNs. CONCLUSIONS Our study proposes a powerful deep learning approach (MDLCN) to predict FGNs underlying a diverse set of cell types in human brain. The MDLCN model enhances prediction accuracy of cell-type-specific FGNs compared to single modality convolutional neural network (CNN) and boosting tree models, as shown by higher areas under both receiver operating characteristic (ROC) and precision-recall curves for different levels of independent test datasets. The predicted FGNs also show evidence for the cellular context and distinct topological features (i.e. higher hubness and topological score) of cell-type marker genes. Moreover, we observed stronger modularity among disease-associated risk genes in FGNs of disease-relevant cell types. For example, the strength of connectivity among autism risk genes was stronger in neurons, but risk genes underlying Alzheimer's disease were more connected in microglia.
Affiliation(s)
- Shiva Afshar
- Department of Industrial Engineering, University of Houston, Houston, TX 77204 USA
- Patricia R. Braun
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD 21287 USA
- Shizhong Han
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD 21287 USA; Lieber Institute for Brain Development, Baltimore, MD 21205 USA
- Ying Lin
- Department of Industrial Engineering, University of Houston, Houston, TX, 77204, USA.
8
Heidarzadehpilehrood R, Pirhoushiaran M, Binti Osman M, Abdul Hamid H, Ling KH. Weighted Gene Co-Expression Network Analysis (WGCNA) Discovered Novel Long Non-Coding RNAs for Polycystic Ovary Syndrome. Biomedicines 2023; 11:biomedicines11020518. [PMID: 36831054 PMCID: PMC9953234 DOI: 10.3390/biomedicines11020518] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/27/2022] [Revised: 02/03/2023] [Accepted: 02/06/2023] [Indexed: 02/15/2023] Open
Abstract
Polycystic ovary syndrome (PCOS) affects reproductive-age women. This condition causes infertility, insulin resistance, obesity, and heart difficulties. The molecular basis and mechanism of PCOS might potentially generate effective treatments. Long non-coding RNAs (lncRNAs) show control over multifactorial disorders' growth and incidence. Numerous studies have emphasized its significance and alterations in PCOS. We used bioinformatic methods to find novel dysregulated lncRNAs in PCOS. To achieve this objective, the gene expression profile of GSE48301, comprising PCOS patients and normal control tissue samples, was evaluated using the R limma package with the following cut-off criterion: p-value < 0.05. Firstly, weighted gene co-expression network analysis (WGCNA) was used to determine the co-expression genes of lncRNAs; subsequently, hub gene identification and pathway enrichment analysis were used. With the defined criteria, nine novel dysregulated lncRNAs were identified. In WGCNA, different colors represent different modules. In the current study, WGCNA resulted in turquoise, gray, blue, and black co-expression modules with dysregulated lncRNAs. The pathway enrichment analysis of these co-expressed modules revealed enrichment in PCOS-associated pathways, including gene expression, signal transduction, metabolism, and apoptosis. In addition, CCT7, EFTUD2, ESR1, JUN, NDUFAB1, CTTNB1, GRB2, and CTNNB1 were identified as hub genes, and some of them have been investigated in PCOS. This study uncovered nine novel PCOS-related lncRNAs. To confirm how these lncRNAs control translational modification in PCOS, functional studies are required.
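For readers unfamiliar with WGCNA, the two steps the abstract leans on (building a weighted co-expression network and picking hub genes) can be sketched in a few lines. This is a simplified illustration using the standard unsigned adjacency |cor|^beta, not the study's analysis: the soft-threshold power of 6 is a commonly used default rather than the value chosen for GSE48301, the module is given by hand instead of being derived by clustering the topological overlap matrix, and the data are simulated.

```python
import numpy as np


def soft_threshold_adjacency(expression: np.ndarray, beta: float = 6.0) -> np.ndarray:
    """Unsigned weighted adjacency a_ij = |cor(i, j)| ** beta (genes x samples input)."""
    adjacency = np.abs(np.corrcoef(expression)) ** beta
    np.fill_diagonal(adjacency, 0.0)  # ignore self-connections
    return adjacency


def intramodular_connectivity(adjacency: np.ndarray, module) -> np.ndarray:
    """Sum of adjacencies to the other genes of the same module."""
    module = np.asarray(module)
    return adjacency[np.ix_(module, module)].sum(axis=1)


# Simulated data: genes 0-2 share a latent factor (one module), genes 3-5 are noise.
rng = np.random.default_rng(2)
n_samples = 200
factor = rng.normal(size=n_samples)
expression = np.vstack([
    factor + 0.1 * rng.normal(size=n_samples),  # tracks the factor closely
    factor + 1.0 * rng.normal(size=n_samples),
    factor + 1.0 * rng.normal(size=n_samples),
    rng.normal(size=n_samples),
    rng.normal(size=n_samples),
    rng.normal(size=n_samples),
])

adjacency = soft_threshold_adjacency(expression)
module = [0, 1, 2]                                # hypothetical module assignment
k_within = intramodular_connectivity(adjacency, module)
hub_gene = module[int(np.argmax(k_within))]       # gene with highest intramodular connectivity
```

Raising correlations to a power suppresses weak, likely spurious links while keeping the network weighted, which is why the gene that tracks the shared factor most closely ends up with the highest intramodular connectivity.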
Affiliation(s)
- Roozbeh Heidarzadehpilehrood
- Department of Obstetrics & Gynaecology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang 43400, Malaysia
- Maryam Pirhoushiaran
- Department of Medical Genetics, School of Medicine, Tehran University of Medical Sciences, Tehran 1417613151, Iran
- Malina Binti Osman
- Department of Medical Microbiology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang 43400, Malaysia
- Habibah Abdul Hamid
- Department of Obstetrics & Gynaecology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang 43400, Malaysia
- Correspondence: (H.A.H.); (K.-H.L.)
- King-Hwa Ling
- Department of Biomedical Science, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang 43400, Malaysia
- Correspondence: (H.A.H.); (K.-H.L.)
9
Gut Microbiota Alterations in Trace Amine-Associated Receptor 9 (TAAR9) Knockout Rats. Biomolecules 2022; 12:biom12121823. [PMID: 36551251 PMCID: PMC9775382 DOI: 10.3390/biom12121823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 11/27/2022] [Accepted: 11/29/2022] [Indexed: 12/12/2022] Open
Abstract
Trace amine-associated receptors (TAAR1-TAAR9) are a family of G-protein-coupled monoaminergic receptors which might have great pharmacological potential. It has now been well established that TAAR1 plays an important role in the central nervous system. Interestingly, deletion of TAAR9 in rats leads to alterations in the periphery. Previously, we found that knockout of TAAR9 in rats (TAAR9-KO rats) decreased low-density lipoprotein cholesterol levels in the blood. TAAR9 was also identified in intestinal tissues, and it is known that it responds to polyamines. To elucidate the role of TAAR9 in the intestinal epithelium, we analyzed TAAR9-co-expressed gene clusters in public data for cecum samples. As identified by gene ontology enrichment analysis, in the intestine, TAAR9 is co-expressed with genes involved in intestinal mucosa homeostasis and function, including cell organization, differentiation, and death. Additionally, TAAR9 was co-expressed with genes implicated in dopamine signaling, which may suggest a role for this receptor in the regulation of peripheral dopaminergic transmission. To further investigate how TAAR9 might be involved in colonic mucosal homeostasis, we analyzed the fecal microbiome composition in TAAR9-KO rats and their wild-type littermates. We identified a significant difference in the number of observed taxa between the microbiome of TAAR9-KO and wild-type rats. In TAAR9-KO rats, the gut microbial community became more variable compared with the wild-type rats. Furthermore, it was found that the family Saccharimonadaceae, which is one of the top 10 most abundant families in TAAR9-KO rat feces, is almost completely absent in wild-type animal fecal samples. Taken together, these data indicate a role of TAAR9 in intestinal function.
10
Acharyya S, Zhou X, Baladandayuthapani V. SpaceX: gene co-expression network estimation for spatial transcriptomics. Bioinformatics 2022; 38:5033-5041. [PMID: 36179087 PMCID: PMC9665869 DOI: 10.1093/bioinformatics/btac645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Revised: 08/27/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION The analysis of spatially resolved transcriptomes enables the understanding of the spatial interactions between the cellular environment and transcriptional regulation. In particular, the characterization of gene-gene co-expression at distinct spatial locations or cell types in the tissue enables delineation of spatial co-regulatory patterns as opposed to standard differential single gene analyses. To enhance the ability and potential of spatial transcriptomics technologies to drive biological discovery, we develop a statistical framework to detect gene co-expression patterns in a spatially structured tissue consisting of different clusters in the form of cell classes or tissue domains. RESULTS We develop SpaceX (spatially dependent gene co-expression network), a Bayesian methodology to identify both shared and cluster-specific co-expression networks across genes. SpaceX uses an over-dispersed spatial Poisson model coupled with a high-dimensional factor model which is based on a dimension reduction technique for computational efficiency. We show, via simulations, accuracy gains in co-expression network estimation and structure by accounting for (increasing) spatial correlation and appropriate noise distributions. In-depth analysis of two spatial transcriptomics datasets in mouse hypothalamus and human breast cancer using SpaceX detected multiple hub genes which are related to cognitive abilities for the hypothalamus data and multiple cancer genes (e.g. collagen family) from the tumor region for the breast cancer data. AVAILABILITY AND IMPLEMENTATION The SpaceX R-package is available at github.com/bayesrx/SpaceX. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Satwik Acharyya
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
11
Algabri YA, Li L, Liu ZP. scGENA: A Single-Cell Gene Coexpression Network Analysis Framework for Clustering Cell Types and Revealing Biological Mechanisms. Bioengineering (Basel) 2022; 9:bioengineering9080353. [PMID: 36004879 PMCID: PMC9405199 DOI: 10.3390/bioengineering9080353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 07/27/2022] [Accepted: 07/27/2022] [Indexed: 11/16/2022] Open
Abstract
Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput technique that can measure gene expression, reveal cell heterogeneity and rare and complex cell populations, and discover cell types and their relationships. The analysis of scRNA-seq data is challenging because of transcript sparsity, replication noise, and outlier cell populations. A gene coexpression network (GCN) analysis effectively deciphers phenotypic differences in specific states by describing gene–gene pairwise relationships. The underlying gene modules with different coexpression patterns partially bridge the gap between genotype and phenotype. This study presents a new framework called scGENA (single-cell gene coexpression network analysis) for GCN analysis based on scRNA-seq data. Although there are several methods for scRNA-seq data analysis, we aim to build an integrative pipeline for several purposes that cover primary data preprocessing, including data exploration, quality control, normalization, imputation, and dimensionality reduction of clustering as downstream of GCN analysis. To demonstrate this integrated workflow, an scRNA-seq dataset of the human diabetic pancreas with 1600 cells and 39,851 genes was used to carry out all of these steps in practice. As a result, scGENA is demonstrated to uncover interesting gene modules behind complex diseases, which reveal biological mechanisms. scGENA provides a state-of-the-art method for gene coexpression analysis for scRNA-seq data.
12
Bhardwaj A, Josse C, Van Daele D, Poulet C, Chavez M, Struman I, Van Steen K. Deeper insights into long-term survival heterogeneity of pancreatic ductal adenocarcinoma (PDAC) patients using integrative individual- and group-level transcriptome network analyses. Sci Rep 2022; 12:11027. [PMID: 35773268 PMCID: PMC9247075 DOI: 10.1038/s41598-022-14592-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Accepted: 06/09/2022] [Indexed: 11/22/2022] Open
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is categorized as the leading cause of cancer mortality worldwide. However, its predictive markers for long-term survival are not well known. It is interesting to delineate individual-specific perturbed genes when comparing long-term (LT) and short-term (ST) PDAC survivors and integrate individual- and group-based transcriptome profiling. Using a discovery cohort of 19 PDAC patients from CHU-Liège (Belgium), we first performed differential gene expression analysis comparing LT to ST survivors. Second, we adopted systems biology approaches to obtain clinically relevant gene modules. Third, we created individual-specific perturbation profiles. Furthermore, we used the Degree-Aware disease gene prioritizing (DADA) method to develop PDAC disease modules, and Network-based Integration of Multi-omics Data (NetICS) to integrate group-based and individual-specific perturbed genes in relation to PDAC LT survival. We identified 173 differentially expressed genes (DEGs) in ST and LT survivors and five modules (including 38 DEGs) showing associations to clinical traits. Validation of DEGs in the molecular lab suggested a role of REG4 and TSPAN8 in PDAC survival. Via NetICS and DADA, we identified various known oncogenes such as CUL1 and TGFB1. Our proposed analytic workflow shows the advantages of combining clinical and omics data as well as individual- and group-level transcriptome profiling.
Affiliation(s)
- Archana Bhardwaj
- GIGA-R Centre, BIO3 - Medical Genomics, University of Liège, Avenue de L'Hôpital, 11, 4000, Liège, Belgium.
| | - Claire Josse
- Laboratory of Human Genetics, GIGA Research, University Hospital (CHU), Liège, Belgium
- Medical Oncology Department, CHU Liège, Liège, Belgium
| | - Daniel Van Daele
- Department of Gastro-Enterology, University Hospital (CHU), Liège, Belgium
| | - Christophe Poulet
- Laboratory of Human Genetics, GIGA Research, University Hospital (CHU), Liège, Belgium
- Laboratory of Rheumatology, GIGA-R, University Hospital (CHULiege), Liège, Belgium
| | - Marcela Chavez
- Department of Medicine, Division of Hematology, University Hospital (CHU), Liège, Belgium
| | - Ingrid Struman
- GIGA-R Centre, Laboratory of Molecular Angiogenesis, University of Liège, Liège, Belgium
| | - Kristel Van Steen
- GIGA-R Centre, BIO3 - Medical Genomics, University of Liège, Avenue de L'Hôpital, 11, 4000, Liège, Belgium
|
13
|
Risk subtyping and prognostic assessment of prostate cancer based on consensus genes. Commun Biol 2022; 5:233. [PMID: 35293897 PMCID: PMC8924191 DOI: 10.1038/s42003-022-03164-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/11/2021] [Accepted: 02/14/2022] [Indexed: 01/20/2023]
Abstract
Prostate cancer (PCa) is the most frequent malignancy of the male urogenital system worldwide. We performed molecular subtyping and prognostic assessment based on consensus genes in patients with PCa. Five cohorts containing 1,046 PCa patients with RNA expression profiles and recorded clinical follow-up information were included. Univariate and multivariate Cox regression analyses and least absolute shrinkage and selection operator (LASSO) Cox regression were used to select prognostic genes and establish the signature. Immunohistochemistry staining and cell proliferation, migration and invasion assays were used to assess the biological functions of key genes. Thirty-nine intersecting consensus prognostic genes from five independent cohorts were identified. Subsequently, an eleven-consensus-gene classifier was established. In addition, multivariate Cox regression analyses showed that the classifier served as an independent indicator of recurrence-free survival in three of the five cohorts. Combined receiver operating characteristic (ROC) analysis achieved synthesized effects by combining the classifier with clinicopathological features in four of five cohorts. SRD5A2 inhibits cell proliferation, while ITGA11 promotes cell migration and invasion, possibly through the PI3K/AKT signaling pathway. To conclude, we established and validated an eleven-consensus-gene classifier, which may add prognostic value to the currently available staging system. By analysis of gene expression profiles of prostate cancer patients from multiple platforms, an eleven-consensus-gene classifier is constructed to provide a robust tool for the prediction of recurrence-free survival.
|
14
|
Wu G, Li X, Guo W, Wei Z, Hu T, Shan Y, Gu J. JEBIN: analyzing gene co-expressions across multiple datasets by joint network embedding. Brief Bioinform 2022; 23:6519533. [PMID: 35134135 DOI: 10.1093/bib/bbab603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Revised: 12/15/2021] [Accepted: 12/27/2021] [Indexed: 11/13/2022]
Abstract
The inference of gene co-expression associations is one of the fundamental tasks for large-scale transcriptomic data analysis. Due to the high dimensionality and high noise in transcriptomic data, it is difficult to infer stable gene co-expression associations from a single dataset. Meta-analysis of multisource data can effectively tackle this problem. We proposed Joint Embedding of multiple BIpartite Networks (JEBIN) to learn the low-dimensional consensus representation for genes by integrating multiple expression datasets. JEBIN infers gene co-expression associations in a nonlinear and global similarity manner and can integrate datasets with different distributions with time complexity linear in the number of genes and the total sample size. The effectiveness and scalability of JEBIN were verified by simulation experiments, and its superiority over the commonly used integration methods was demonstrated by three indexes on real biological datasets. Then, JEBIN was applied to study the gene co-expression patterns of hepatocellular carcinoma (HCC) based on multiple expression datasets of HCC and adjacent normal tissues, and further on the latest HCC single-cell RNA-seq data. Results show that gene co-expressions are highly different between bulk and single-cell datasets. Finally, many differentially co-expressed ligand-receptor pairs were discovered by comparing HCC with adjacent normal data, providing candidate HCC targets for abnormal cell-cell communications.
Affiliation(s)
- Guiying Wu
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
| | - Xiangyu Li
- School of Software Engineering, Beijing Jiaotong University, Beijing 100044, China
| | - Wenbo Guo
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
| | - Zheng Wei
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
| | - Tao Hu
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
| | - Yiran Shan
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
| | - Jin Gu
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
|
15
|
Kumar R, Ojha KK, Yadav HN, Singh VK. Linking co-expression modules with phenotypes. Bioinformation 2022; 18:438-441. [PMID: 36909689 PMCID: PMC9997497 DOI: 10.6026/97320630018438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Revised: 04/30/2022] [Accepted: 04/30/2022] [Indexed: 11/23/2022]
Abstract
The method for quantifying the association between a co-expression module and a clinical trait of interest requires applying dimensionality reduction to summarize modules as a one-dimensional (1D) vector. However, these methods are often linked with information loss. The amount of information lost depends upon the percentage of variance captured by the reduced 1D vector. Therefore, it is of interest to describe a method using analysis of rank (AOR) to assess the association between a module and a clinical trait of interest. This method works with clinical traits represented as binary class labels and can be adopted for clinical traits measured on a continuous scale by dividing samples into two groups around the median value. Application of the AOR method to test data for muscle gene expression profiles identifies modules significantly associated with diabetes status.
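The median split mentioned in this abstract can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: it is not the AOR method itself (the abstract does not specify its details), and the function names `median_split` and `mean_rank_by_group` as well as the idea of a per-sample "module score" are hypothetical. It dichotomizes a continuous trait around its median and then compares the mean rank of a score between the two resulting groups.

```python
import numpy as np

def median_split(trait):
    """Turn a continuous trait into binary labels: 1 if above the median, else 0."""
    trait = np.asarray(trait, dtype=float)
    return (trait > np.median(trait)).astype(int)

def mean_rank_by_group(scores, labels):
    """Rank the scores (1 = smallest) and return the mean rank in group 0 and group 1.

    Assumption: scores are distinct, so ties are not handled here.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    order = scores.argsort()
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks[labels == 0].mean(), ranks[labels == 1].mean()

# Hypothetical data: a continuous clinical trait and one score per sample.
trait = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [0.1, 0.3, 0.2, 0.9, 0.7, 0.8]
labels = median_split(trait)
low_rank, high_rank = mean_rank_by_group(scores, labels)
print(labels, low_rank, high_rank)
```

A large gap between the two mean ranks hints at an association between the score and the dichotomized trait; a formal test would still be needed to judge significance.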
Affiliation(s)
- Rakesh Kumar
- Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar 824236, India
| | - Krishna Kumar Ojha
- Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar 824236, India
| | - Harlokesh Narayan Yadav
- Department of Pharmacology, All India Institute of Medical Sciences, Ansari Nagar, New Delhi - 110029, India
| | - Vijay Kumar Singh
- Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar 824236, India
|
16
|
Wu G, Li Y. Distinct characteristics of correlation analysis at the single-cell and the population level. Stat Appl Genet Mol Biol 2022; 21:sagmb-2022-0015. [PMID: 35918809 DOI: 10.1515/sagmb-2022-0015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Accepted: 06/13/2022] [Indexed: 11/15/2022]
Abstract
Correlation analysis is widely used in biological studies to infer molecular relationships within biological networks. Recently, single-cell analysis has drawn tremendous interest for its ability to obtain high-resolution molecular phenotypes. It turns out that there is little overlap between the co-expressed genes identified in single-cell level investigations and those identified in population level investigations. However, the nature of the relationship of correlations between single-cell and population levels remains unclear. In this manuscript, we aimed to unveil the origin of the differences between the correlation coefficients at the single-cell level and those at the population level, and bridge the gap between them. Through developing formulations to link correlations at the single-cell and the population level, we illustrated that aggregated correlations could be stronger than, weaker than or equal to the corresponding individual correlations, depending on the variations and the correlations within the population. When the correlation within the population is weaker than the individual correlation, the aggregated correlation is stronger than the corresponding individual correlation. Besides, our data indicated that aggregated correlation is more likely to be stronger than the corresponding individual correlation, and it was rare to find gene-pairs exclusively strongly correlated at the single-cell level. Through a bottom-up approach to model interactions between molecules in a signaling cascade or a multi-regulator-controlled gene expression, we surprisingly found that the existence of interaction between two components could not be excluded simply based on their low correlation coefficients, suggesting a reconsideration of connectivity within biological networks which was derived solely from correlation analysis. We also investigated the impact of technical random measurement errors on the correlation coefficients for the single-cell level and the population level. The results indicate that the aggregated correlation is relatively robust and less affected. Because of the heterogeneity among single cells, correlation coefficients calculated based on data of the single-cell level might be different from those of the population level. Depending on the specific question we are asking, proper sampling and normalization procedures should be applied before we draw any conclusions.
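The general point of this abstract, that a correlation computed on aggregated profiles can differ from one computed on single cells, can be illustrated with a small simulation. This is a minimal sketch under assumed settings, not the authors' formulation: the two-level Gaussian model, the function name `simulate`, and all parameter values are hypothetical. Each sample has its own mean for two genes (correlated across samples by `rho_between`), and cells scatter around that mean with correlation `rho_within`.

```python
import numpy as np

def simulate(rho_between, rho_within, n_samples=300, n_cells=100, seed=0):
    """Return (pooled single-cell correlation, correlation of per-sample averages)."""
    rng = np.random.default_rng(seed)
    cov_b = np.array([[1.0, rho_between], [rho_between, 1.0]])
    cov_w = np.array([[1.0, rho_within], [rho_within, 1.0]])
    # Per-sample mean expression of the two genes: shape (n_samples, 2)
    means = rng.multivariate_normal([0.0, 0.0], cov_b, size=n_samples)
    # Cell-level deviations around each sample mean: shape (n_samples, n_cells, 2)
    noise = rng.multivariate_normal([0.0, 0.0], cov_w, size=(n_samples, n_cells))
    cells = means[:, None, :] + noise
    # Single-cell level: correlate the two genes across all cells pooled together
    flat = cells.reshape(-1, 2)
    r_single = np.corrcoef(flat[:, 0], flat[:, 1])[0, 1]
    # Aggregated level: average the cells of each sample, then correlate across samples
    agg = cells.mean(axis=1)
    r_agg = np.corrcoef(agg[:, 0], agg[:, 1])[0, 1]
    return r_single, r_agg

# Strong coupling between samples, weak coupling among cells within a sample
print(simulate(rho_between=0.9, rho_within=0.1))
# The reverse: no coupling between samples, strong coupling among cells
print(simulate(rho_between=0.0, rho_within=0.9))
```

In this toy model averaging shrinks the within-sample component by a factor of `n_cells`, so the aggregated correlation is dominated by `rho_between`, while the pooled single-cell correlation mixes both components. Aggregation can therefore either strengthen or weaken the observed correlation.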
Affiliation(s)
- Guoyu Wu
- School of Clinical Pharmacy, Guangdong Pharmaceutical University, Guangzhou, China
- Key Specialty of Clinical Pharmacy, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- NMPA Key Laboratory for Technology Research and Evaluation of Pharmacovigilance, Guangdong Pharmaceutical University, Guangzhou, China
| | - Yuchao Li
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- MegaLab, MegaRobo Technologies Co., Ltd, Beijing, China
|
17
|
Wang Z, Chai C, Wang R, Feng Y, Huang L, Zhang Y, Xiao X, Yang S, Zhang Y, Zhang X. Single-cell transcriptome atlas of human mesenchymal stem cells exploring cellular heterogeneity. Clin Transl Med 2021; 11:e650. [PMID: 34965030 PMCID: PMC8715893 DOI: 10.1002/ctm2.650] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 05/06/2021] [Revised: 10/24/2021] [Accepted: 10/30/2021] [Indexed: 02/05/2023]
Abstract
BACKGROUND The heterogeneity of mesenchymal stem cells (MSCs) is poorly understood, thus limiting clinical application and basic research reproducibility. Advanced single-cell RNA sequencing (scRNA-seq) is a robust tool for dissecting cellular heterogeneity. However, a comprehensive single-cell atlas for human MSCs has not been achieved. METHODS This study used massive parallel multiplexing scRNA-seq to construct an atlas of > 130 000 single-MSC transcriptomes across multiple tissues and donors to assess their heterogeneity. The most widely clinically utilised tissue resources for MSCs were collected, including normal bone marrow (n = 3), adipose (n = 3), umbilical cord (n = 2), and dermis (n = 3). RESULTS Seven tissue-specific and five conserved MSC subpopulations with distinct gene-expression signatures were identified from multiple tissue origins based on the high-quality data, which has not been achieved previously. This study showed that extracellular matrix (ECM) contributes strongly to MSC heterogeneity. Notably, tissue-specific MSC subpopulations were substantially heterogeneous on ECM-associated immune regulation, antigen processing/presentation, and senescence, thus promoting inter-donor and intra-tissue heterogeneity. The variable dynamics of ECM-associated genes had discrete trajectory patterns across multiple tissues. Additionally, the conserved and tissue-specific transcriptomic-regulons and protein-protein interactions were identified, potentially representing common or tissue-specific MSC functional roles. Furthermore, the umbilical-cord-specific subpopulation possessed advantages in immunosuppressive properties. CONCLUSION In summary, this work provides timely and great insights into MSC heterogeneity at multiple levels. This MSC atlas taxonomy also provides a comprehensive understanding of cellular heterogeneity, thus revealing the potential improvements in MSC-based therapeutic efficacy.
Affiliation(s)
- Zheng Wang
- Medical Center of Hematology, the Second Affiliated Hospital, Army Medical University, Chongqing, China
- State Key Laboratory of Trauma, Burn and Combined Injury, Army Medical University, Chongqing, China
| | - Chengyan Chai
- Medical Center of Hematology, the Second Affiliated Hospital, Army Medical University, Chongqing, China
- State Key Laboratory of Trauma, Burn and Combined Injury, Army Medical University, Chongqing, China
| | - Rui Wang
- Medical Center of Hematology, the Second Affiliated Hospital, Army Medical University, Chongqing, China
- State Key Laboratory of Trauma, Burn and Combined Injury, Army Medical University, Chongqing, China
| | - Yimei Feng
- Medical Center of Hematology, the Second Affiliated Hospital, Army Medical University, Chongqing, China
- State Key Laboratory of Trauma, Burn and Combined Injury, Army Medical University, Chongqing, China
| | - Lei Huang
- Department of Urology, the Second Affiliated Hospital, Army Military Medical University, Chongqing, China
| | - Yiming Zhang
- Department of Plastic and Cosmetic Surgery, the Second Affiliated Hospital, Army Medical University, Chongqing, China
| | - Xia Xiao
- Time Plastic Surgery Hospital, Chongqing, China
| | - Shijie Yang
- Medical Center of Hematology, the Second Affiliated Hospital, Army Medical University, Chongqing, China
- State Key Laboratory of Trauma, Burn and Combined Injury, Army Medical University, Chongqing, China
| | - Yunfang Zhang
- Medical Center of Hematology, the Second Affiliated Hospital, Army Medical University, Chongqing, China
- State Key Laboratory of Trauma, Burn and Combined Injury, Army Medical University, Chongqing, China
| | - Xi Zhang
- Medical Center of Hematology, the Second Affiliated Hospital, Army Medical University, Chongqing, China
- State Key Laboratory of Trauma, Burn and Combined Injury, Army Medical University, Chongqing, China
- National Clinical Research Center for Hematologic Diseases, the First Affiliated Hospital of Soochow University, Suzhou, China
|
18
|
Li X, Zhang S, Wong KC. Evolving Transcriptomic Profiles From Single-Cell RNA-Seq Data Using Nature-Inspired Multiobjective Optimization. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2445-2458. [PMID: 32031947 DOI: 10.1109/tcbb.2020.2971993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Transcriptomic profiling plays an important role in post-genomic analysis. Especially, the single-cell RNA-seq technology has advanced our understanding of gene expression from cell population level into individual cell level. Many computational methods have been proposed to decipher transcriptomic profiles from those RNA-seq data. However, most of the related algorithms suffer from realistic restrictions such as high dimensionality and premature convergence. In this paper, we propose and formulate an evolutionary multiobjective blind compressed sensing (EMOBCS) to address those problems for evolving transcriptomic profiles from single-cell RNA-seq data. In the proposed framework, to characterize various gene expression profile models, two objective functions including chi-squared kernel score and euclidean distance of different gene expression profiles are formulated. After that, multiobjective blind compressed sensing based on artificial bee colony is designed to optimize the two objective functions on single-cell RNA-seq data by proposing a rank probability model and two new search strategies into the cooperative convolution framework in an unbiased manner. To demonstrate its effectiveness, extensive experiments have been conducted, comparing the proposed algorithm with 14 algorithms including eight state-of-the-art algorithms and six different EMOBCS algorithms under different search strategies on 10 single-cell RNA-seq datasets and one case study. The experimental results reveal that the proposed algorithm is better than or comparable with those compared algorithms. Furthermore, we also conduct the time complexity analysis, convergence analysis, and parameter analysis to demonstrate various properties of EMOBCS.
|
19
|
Comprehensive Characterization of Multitissue Expression Landscape, Co-Expression Networks and Positive Selection in Pikeperch. Cells 2021; 10:cells10092289. [PMID: 34571938 PMCID: PMC8471114 DOI: 10.3390/cells10092289] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/09/2021] [Revised: 08/27/2021] [Accepted: 08/29/2021] [Indexed: 11/19/2022]
Abstract
Promising efforts are ongoing to extend genomics resources for pikeperch (Sander lucioperca), a species of high interest for the sustainable European aquaculture sector. Although previous work, including reference genome assembly, transcriptome sequence, and single-nucleotide polymorphism genotyping, added a great wealth of genomic tools, a comprehensive characterization of gene expression across major tissues in pikeperch still remains an unmet research need. Here, we used deep RNA-Sequencing of ten vital tissues collected in eight animals to build a high-confidence, annotated transcriptome atlas, to detect the tissue-specificity of gene expression and co-expression network modules, and to investigate genome-wide selective signatures in the Percidae fish family. Pathway enrichment and protein–protein interaction network analyses were performed to characterize the unique biological functions of tissue-specific genes and co-expression modules. We detected strong functional correlations and similarities of tissues with respect to their expression patterns, but also significant differences in the complexity and composition of their transcriptomes. Moreover, functional analyses revealed that tissue-specific genes essentially play key roles in the specific physiological functions of the respective tissues. Identified network modules were also functionally coherent with tissues’ main physiological functions. Although tissue specificity was not associated with positive selection, several genes under selection were found to be involved in hypoxia, immunity, and gene regulation processes, which are crucial for fish adaptation and welfare. Overall, these new resources and insights will not only enhance the understanding of mechanisms of organ biology in pikeperch, but also complement the genomic resources available for this commercial species.
|
20
|
Ma D, Zhan D, Fu Y, Wei S, Lal B, Wang J, Li Y, Lopez-Bertoni H, Yalcin F, Dzaye O, Eberhart CG, Laterra J, Wilson MA, Ying M, Xia S. Mutant IDH1 promotes phagocytic function of microglia/macrophages in gliomas by downregulating ICAM1. Cancer Lett 2021; 517:35-45. [PMID: 34098063 DOI: 10.1016/j.canlet.2021.05.038] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/20/2021] [Revised: 05/27/2021] [Accepted: 05/28/2021] [Indexed: 11/15/2022]
Abstract
Tumor-associated microglia/macrophages (TAMs) are the main innate immune effector cells in malignant gliomas and have both pro- and anti-tumor functions. The plasticity of TAMs is partially dictated by oncogenic mutations in tumor cells. Heterozygous IDH1 mutation is a cancer driver gene prevalent in grade II/III gliomas, and IDH1 mutant gliomas have relatively favorable clinical outcomes. It is largely unknown how IDH mutation alters TAM phenotypes to influence glioma growth. Here we established clinically relevant isogenic glioma models carrying monoallelic IDH1 R132H mutation (IDH1R132H/WT) and found that IDH1R132H/WT significantly downregulated immune response-related pathways in glioma cells, indicating an immunomodulation role of mutant IDH1. Co-culturing IDH1R132H/WT glioma cells with human macrophages promoted anti-tumor phenotypes of macrophages and increased macrophage migration and phagocytic capacity. In orthotopic xenografts, IDH1R132H/WT decreased tumor growth and prolonged animal survival, accompanied by increased TAM recruitment and upregulated phagocytosis markers, suggesting the induction of anti-tumor TAM functions. Using human cytokine arrays that query 36 proteins, we identified significant downregulation of ICAM-1/CD54 in IDH1R132H/WT gliomas, which was further confirmed by ELISA and immunoblotting analyses. ICAM1 gain-of-function studies revealed that ICAM1 downregulation in IDH1R132H/WT cells played a mechanistic role to mediate the immunomodulation function of IDH1R132H/WT. ICAM-1 silencing in IDH1 wild-type glioma cells decreased tumor growth and increased the anti-tumor function of TAMs. Together, our studies support a new TAM-mediated phagocytic function within IDH1 mutant gliomas, and improved understanding of this process may uncover novel approaches to targeting IDH1 wild type gliomas.
Affiliation(s)
- Ding Ma
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Blood and Cell Therapy Institute, University of Science and Technology of China, Anhui Provincial Hospital, Hefei, Anhui, China.
| | - Daqian Zhan
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Respiratory and Critical Care Medicine, Tongji Hospital, Huazhong University of Science and Technology, Wuhan, China
| | - Yi Fu
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Shuang Wei
- Department of Respiratory and Critical Care Medicine, Tongji Hospital, Huazhong University of Science and Technology, Wuhan, China
| | - Bachchu Lal
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Jie Wang
- Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Yunqing Li
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Hernando Lopez-Bertoni
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Fatih Yalcin
- Department of Radiology and Neuroradiology, Charité, Berlin, Germany; University Hospital Center Schleswig Holstein, Department of Neurosurgery, Kiel, Schleswig-Holstein, Germany; Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Omar Dzaye
- Department of Radiology and Neuroradiology, Charité, Berlin, Germany; Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Charles G Eberhart
- Departments of Pathology, Oncology, Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - John Laterra
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Departments of Oncology and Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Mary Ann Wilson
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Mingyao Ying
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| | - Shuli Xia
- Hugo W. Moser Research Institute at Kennedy Krieger, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
|
21
|
Gao S, Wu Z, Feng X, Kajigaya S, Wang X, Young NS. Comprehensive network modeling from single cell RNA sequencing of human and mouse reveals well conserved transcription regulation of hematopoiesis. BMC Genomics 2020; 21:849. [PMID: 33372598 PMCID: PMC7771096 DOI: 10.1186/s12864-020-07241-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/03/2020] [Accepted: 11/18/2020] [Indexed: 12/17/2022]
Abstract
Background Presently, there is no comprehensive analysis of the transcription regulation network in hematopoiesis. Comparison of networks arising from gene co-expression across species can facilitate an understanding of the conservation of functional gene modules in hematopoiesis. Results We used single-cell RNA sequencing to profile bone marrow from human and mouse, and inferred transcription regulatory networks in each species in order to characterize transcriptional programs governing hematopoietic stem cell differentiation. We designed an algorithm for network reconstruction to conduct comparative transcriptomic analysis of hematopoietic gene co-expression and transcription regulation in human and mouse bone marrow cells. Co-expression network connectivity of hematopoiesis-related genes was found to be well conserved between mouse and human. The co-expression network showed “small-world” and “scale-free” architecture. The gene regulatory network formed a hierarchical structure, and hematopoiesis transcription factors localized to the hierarchy’s middle level. Conclusions Transcriptional regulatory networks are well conserved between human and mouse. The hierarchical organization of transcription factors may provide insights into hematopoietic cell lineage commitment, and to signal processing, cell survival and disease initiation. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-020-07241-2.
Affiliation(s)
- Shouguo Gao
- Hematopoiesis and Bone Marrow Failure Laboratory, Hematology Branch, NHLBI, National Institutes of Health, Bethesda, MD, 20892, USA.
| | - Zhijie Wu
- Hematopoiesis and Bone Marrow Failure Laboratory, Hematology Branch, NHLBI, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Xingmin Feng
- Hematopoiesis and Bone Marrow Failure Laboratory, Hematology Branch, NHLBI, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Sachiko Kajigaya
- Hematopoiesis and Bone Marrow Failure Laboratory, Hematology Branch, NHLBI, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Xujing Wang
- Division of Diabetes, Endocrinology, and Metabolic Diseases (DEM), NIDDK, National Institutes of Health, Bethesda, MD, 20817, USA
| | - Neal S Young
- Hematopoiesis and Bone Marrow Failure Laboratory, Hematology Branch, NHLBI, National Institutes of Health, Bethesda, MD, 20892, USA
|
22
|
Tarbier M, Mackowiak SD, Frade J, Catuara-Solarz S, Biryukova I, Gelali E, Menéndez DB, Zapata L, Ossowski S, Bienko M, Gallant CJ, Friedländer MR. Nuclear gene proximity and protein interactions shape transcript covariations in mammalian single cells. Nat Commun 2020; 11:5445. [PMID: 33116115 PMCID: PMC7595044 DOI: 10.1038/s41467-020-19011-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/20/2019] [Accepted: 09/15/2020] [Indexed: 01/19/2023]
Abstract
Single-cell RNA sequencing studies on gene co-expression patterns could yield important regulatory and functional insights, but have so far been limited by the confounding effects of differentiation and cell cycle. We apply a tailored experimental design that eliminates these confounders, and report thousands of intrinsically covarying gene pairs in mouse embryonic stem cells. These covariations form a network with biological properties, outlining known and novel gene interactions. We provide the first evidence that miRNAs naturally induce transcriptome-wide covariations and compare the relative importance of nuclear organization, transcriptional and post-transcriptional regulation in defining covariations. We find that nuclear organization has the greatest impact, and that genes encoding for physically interacting proteins specifically tend to covary, suggesting importance for protein complex formation. Our results lend support to the concept of post-transcriptional RNA operons, but we further present evidence that nuclear proximity of genes may provide substantial functional regulation in mammalian single cells. Gene expression covariation can be studied by single-cell RNA sequencing. Here the authors analyze intrinsically covarying gene pairs by eliminating the confounding effects in single-cell experiments and observe covariation of proximal genes and miRNA-induced covariation of target mRNAs.
Affiliation(s)
- Marcel Tarbier
  - Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Sebastian D Mackowiak
  - Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- João Frade
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Silvina Catuara-Solarz
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Inna Biryukova
  - Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Eleni Gelali
  - Science for Life Laboratory, Department of Medical Biochemistry and Biophysics, Karolinska Institute, Stockholm, Sweden
- Diego Bárcena Menéndez
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Luis Zapata
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
  - Center for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Stephan Ossowski
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
  - Department of Experimental and Health Sciences, University Pompeu Fabra, Barcelona, Spain
  - Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Magda Bienko
  - Science for Life Laboratory, Department of Medical Biochemistry and Biophysics, Karolinska Institute, Stockholm, Sweden
- Caroline J Gallant
  - Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Marc R Friedländer
  - Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
|
23
|
Sekula M, Gaskins J, Datta S. A sparse Bayesian factor model for the construction of gene co-expression networks from single-cell RNA sequencing count data. BMC Bioinformatics 2020; 21:361. [PMID: 32811424 PMCID: PMC7437941 DOI: 10.1186/s12859-020-03707-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2020] [Accepted: 08/04/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: Gene co-expression networks (GCNs) are powerful tools that enable biologists to examine associations between genes during different biological processes. With the advancement of new technologies, such as single-cell RNA sequencing (scRNA-seq), there is a need for developing novel network methods appropriate for new types of data. RESULTS: We present a novel sparse Bayesian factor model to explore the network structure associated with genes in scRNA-seq data. Latent factors impact the gene expression values for each cell and provide flexibility to account for common features of scRNA-seq: high proportions of zero values, increased cell-to-cell variability, and overdispersion due to abnormally large expression counts. From our model, we construct a GCN by analyzing the positive and negative associations of the factors that are shared between each pair of genes. CONCLUSIONS: Simulation studies demonstrate that our methodology has high power in identifying gene-gene associations while maintaining a nominal false discovery rate. In real data analyses, our model identifies more known and predicted protein-protein interactions than other competing network models.
Affiliation(s)
- Michael Sekula
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, USA
- Jeremy Gaskins
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, USA
- Susmita Datta
  - Department of Biostatistics, University of Florida, Gainesville, FL, USA
|
24
|
A lncRNA landscape in breast cancer reveals a potential role for AC009283.1 in proliferation and apoptosis in HER2-enriched subtype. Sci Rep 2020; 10:13146. [PMID: 32753692 PMCID: PMC7403317 DOI: 10.1038/s41598-020-69905-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2019] [Accepted: 07/19/2020] [Indexed: 12/23/2022] Open
Abstract
Breast cancer is the most commonly diagnosed neoplasm in women worldwide with a well-recognized heterogeneous pathology, classified into four molecular subtypes: Luminal A, Luminal B, HER2-enriched and Basal-like, each one with different biological and clinical characteristics. Long non-coding RNAs (lncRNAs) represent 33% of the human transcriptome and play critical roles in breast carcinogenesis, but most of their functions are still unknown. Therefore, cancer research could benefit from continued exploration into the biology of lncRNAs in this neoplasm. We characterized lncRNA expression portraits in 74 breast tumors belonging to the four molecular subtypes using transcriptome microarrays. To infer the biological role of the deregulated lncRNAs in the molecular subtypes, we performed co-expression analysis of lncRNA-mRNA and gene ontology analysis. We identified 307 deregulated lncRNAs in tumor compared to normal tissue and 354 deregulated lncRNAs among the different molecular subtypes. Through co-expression analysis between lncRNAs and protein-coding genes, along with gene enrichment analysis, we inferred the potential function of the most deregulated lncRNAs in each molecular subtype, and independently validated our results taking advantage of TCGA data. Overexpression of AC009283.1 was observed in the HER2-enriched subtype, and it is localized in an amplification zone at chromosome 17q12, suggesting it to be a potential tumorigenic lncRNA. The functional role of lncRNA AC009283.1 was examined through loss-of-function assays in vitro and by determining its impact on global gene expression. These studies revealed that AC009283.1 regulates genes involved in proliferation, cell cycle and apoptosis in a HER2 cellular model. We further confirmed these findings through ssGSEA and CEMITool analysis in an independent HER2-amplified breast cancer cohort.
Our findings suggest a wide range of biological functions for lncRNAs in each breast cancer molecular subtype and provide a basis for their biological and functional study, as was conducted for AC009283.1, showing it to be a potential regulator of proliferation and apoptosis in the HER2-enriched subtype.
|
25
|
Cardozo LE, Russo PST, Gomes-Correia B, Araujo-Pereira M, Sepúlveda-Hermosilla G, Maracaja-Coutinho V, Nakaya HI. webCEMiTool: Co-expression Modular Analysis Made Easy. Front Genet 2019; 10:146. [PMID: 30894872 PMCID: PMC6414412 DOI: 10.3389/fgene.2019.00146] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2018] [Accepted: 02/12/2019] [Indexed: 12/31/2022] Open
Abstract
Co-expression analysis has been widely used to elucidate the functional architecture of genes under different biological processes. Such analysis, however, requires substantial knowledge about programming languages and/or bioinformatics skills. We present webCEMiTool, a unique online tool that performs comprehensive modular analyses in a fully automated manner. The webCEMiTool not only identifies co-expression gene modules but also performs several functional analyses on them. In addition, webCEMiTool integrates transcriptomic data with interactome information (i.e., protein-protein interactions) and identifies potential hubs on each network. The tool generates user-friendly HTML reports that allow users to search for specific genes in each module, as well as check if a module contains genes overrepresented in specific pathways or altered in a specific sample phenotype. We used webCEMiTool to perform a modular analysis of single-cell RNA-seq data of human cells infected with either Zika virus or dengue virus.
Affiliation(s)
- Lucas E Cardozo
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
- Pedro S T Russo
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
- Bruno Gomes-Correia
  - Advanced Center for Chronic Diseases-ACCDiS, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
- Mariana Araujo-Pereira
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
- Vinicius Maracaja-Coutinho
  - Advanced Center for Chronic Diseases-ACCDiS, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
- Helder I Nakaya
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
|
26
|
Chiu YC, Hsiao TH, Wang LJ, Chen Y, Shao YHJ. scdNet: a computational tool for single-cell differential network analysis. BMC SYSTEMS BIOLOGY 2018; 12:124. [PMID: 30577836 PMCID: PMC6302455 DOI: 10.1186/s12918-018-0652-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
Abstract
Background: Single-cell RNA sequencing (scRNA-Seq) is an emerging technology that has revolutionized research on tumor heterogeneity. However, the highly sparse data matrices generated by the technology have posed an obstacle to the analysis of differential gene regulatory networks. Results: To address these challenges, this study presents, as far as we know, the first bioinformatics tool for scRNA-Seq-based differential network analysis (scdNet). The tool features a sample size adjustment of gene-gene correlation, comparison of inter-state correlations, and construction of differential networks. A simulation analysis demonstrated the power of scdNet in the analysis of sparse scRNA-Seq data matrices, with a low sample size requirement, high computational efficiency, and tolerance of sequencing noise. Applying the tool to two datasets, single circulating tumor cells (CTCs) of prostate cancer and early mouse embryos, we demonstrated that differential gene regulation plays crucial roles in anti-androgen resistance and early embryonic development. Conclusions: Overall, the tool is widely applicable to datasets generated by the emerging technology to bring biological insights into tumor heterogeneity and other studies. A MATLAB implementation of scdNet is available at https://github.com/ChenLabGCCRI/scdNet.
Affiliation(s)
- Yu-Chiao Chiu
  - Greehey Children's Cancer Research Institute, University of Texas Health Science Center at San Antonio, San Antonio, TX, 78229, USA
- Tzu-Hung Hsiao
  - Department of Medical Research, Taichung Veterans General Hospital, Taichung, 40705, Taiwan
- Li-Ju Wang
  - Greehey Children's Cancer Research Institute, University of Texas Health Science Center at San Antonio, San Antonio, TX, 78229, USA
- Yidong Chen
  - Greehey Children's Cancer Research Institute, University of Texas Health Science Center at San Antonio, San Antonio, TX, 78229, USA
  - Department of Epidemiology and Biostatistics, University of Texas Health Science Center at San Antonio, San Antonio, TX, 78229, USA
- Yu-Hsuan Joni Shao
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, 10675, Taiwan
|
27
|
Brain Cell Type Specific Gene Expression and Co-expression Network Architectures. Sci Rep 2018; 8:8868. [PMID: 29892006 PMCID: PMC5995803 DOI: 10.1038/s41598-018-27293-5] [Citation(s) in RCA: 295] [Impact Index Per Article: 42.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2017] [Accepted: 05/31/2018] [Indexed: 01/08/2023] Open
Abstract
Elucidating brain cell type specific gene expression patterns is critical towards a better understanding of how cell-cell communications may influence brain functions and dysfunctions. We set out to compare and contrast five human and murine cell type-specific transcriptome-wide RNA expression data sets that were generated within the past several years. We defined three measures of brain cell type-relative expression including specificity, enrichment, and absolute expression and identified corresponding consensus brain cell “signatures,” which were well conserved across data sets. We validated that the relative expression of top cell type markers is associated with proxies for cell type proportions in bulk RNA expression data from postmortem human brain samples. We further validated novel marker genes using an orthogonal ATAC-seq dataset. We performed multiscale coexpression network analysis of the single cell data sets and identified robust cell-specific gene modules. To facilitate the use of the cell type-specific genes for cell type proportion estimation and deconvolution from bulk brain gene expression data, we developed an R package, BRETIGEA. In summary, we identified a set of novel brain cell consensus signatures and robust networks from the integration of multiple datasets and therefore transcend limitations related to technical issues characteristic of each individual study.
|