1
López-Paleta N, Moreno-Barbosa E, Velázquez-Castro J. A fast validation test of gene regulatory network models via the Fokker-Planck equation. J Biol Phys 2025; 51:16. [PMID: 40388063 PMCID: PMC12089004 DOI: 10.1007/s10867-025-09681-x] [Received: 02/18/2025] [Accepted: 04/23/2025] [Indexed: 05/20/2025]
Abstract
Since Waddington proposed the concept of the "epigenetic landscape" in 1957, researchers have developed various methodologies to represent it in diverse processes. Studying the epigenetic landscape provides valuable qualitative information regarding cell development and the stability of phenotypic and morphogenetic patterns. Although Waddington's original idea was a visual metaphor, a contemporary perspective relates it to the landscape formed by the basins of attraction of a dynamical system describing the temporal evolution of protein concentrations driven by a gene regulatory network. Transitions among these attractors can be driven by stochastic perturbations, with the cell state more likely to transition to the nearest attractor or to the one that presents the path of least resistance. In this study, we define the epigenetic landscape using the free energy potential obtained from the solution of the Fokker-Planck equation on the regulatory network. Specifically, we obtained a numerical approximate solution of the Fokker-Planck equation describing the Arabidopsis thaliana flower morphogenesis process. We observed good agreement between the coexpression matrix obtained from the Fokker-Planck equation and the experimental coexpression matrix. This paper proposes a method for obtaining this landscape by solving the Fokker-Planck equation (FPE) associated with a dynamical system describing the temporal evolution of protein concentrations involved in the process of interest. As these systems are high-dimensional and analytical solutions are often unfeasible, we propose a gamma mixture model to solve the FPE, transforming this problem into an optimization problem. This methodology can enhance the analysis of gene regulatory networks by directly relating theoretical mathematical models with experimental observations of coexpression matrices, thus providing a discriminating technique for competing models.
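For orientation, the kind of equation the abstract refers to can be written down in its textbook form. This is a sketch under stated assumptions, not the paper's exact formulation: the drift $F_i$, the constant isotropic diffusion coefficient $D$, the potential $U$, and the mixture weights $w_k$ and components $g_k$ are symbols introduced here for illustration. How the gamma components are parameterized is not stated in the abstract and is left unspecified.

```latex
% Assumed Langevin dynamics for protein concentrations x = (x_1, ..., x_n):
%   dx_i/dt = F_i(x) + noise of strength D
% Corresponding Fokker-Planck equation for the probability density P(x, t):
\frac{\partial P(\mathbf{x},t)}{\partial t}
  = -\sum_{i=1}^{n} \frac{\partial}{\partial x_i}\bigl[F_i(\mathbf{x})\,P(\mathbf{x},t)\bigr]
    + D \sum_{i=1}^{n} \frac{\partial^{2} P(\mathbf{x},t)}{\partial x_i^{2}}

% A common definition of the landscape from the stationary solution P_ss:
U(\mathbf{x}) = -\ln P_{\mathrm{ss}}(\mathbf{x})

% Mixture ansatz (g_k built from gamma densities; weights sum to one):
P_{\mathrm{ss}}(\mathbf{x}) \approx \sum_{k=1}^{K} w_k\, g_k(\mathbf{x}),
\qquad \sum_{k=1}^{K} w_k = 1,\quad w_k \ge 0
```

Under this ansatz, solving the equation reduces to choosing the weights and component parameters that make the right-hand side of the stationary equation as close to zero as possible, which is the optimization problem the abstract mentions.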
Affiliation(s)
- Natalia López-Paleta
  - Facultad de Ciencias Físico Matemáticas, Benemérita Universidad Autónoma de Puebla, Puebla, 72570, Puebla, México
- Eduardo Moreno-Barbosa
  - Facultad de Ciencias Físico Matemáticas, Benemérita Universidad Autónoma de Puebla, Puebla, 72570, Puebla, México
- Jorge Velázquez-Castro
  - Facultad de Ciencias Físico Matemáticas, Benemérita Universidad Autónoma de Puebla, Puebla, 72570, Puebla, México.
2
Morin A, Chu CP, Pavlidis P. Identifying reproducible transcription regulator coexpression patterns with single cell transcriptomics. PLoS Comput Biol 2025; 21:e1012962. [PMID: 40257984 PMCID: PMC12011263 DOI: 10.1371/journal.pcbi.1012962] [Received: 11/06/2024] [Accepted: 03/13/2025] [Indexed: 04/23/2025]
Abstract
The proliferation of single cell transcriptomics has potentiated our ability to unveil patterns that reflect dynamic cellular processes such as the regulation of gene transcription. In this study, we leverage a broad collection of single cell RNA-seq data to identify the gene partners whose expression is most coordinated with each human and mouse transcription regulator (TR). We assembled 120 human and 103 mouse scRNA-seq datasets from the literature (>28 million cells), constructing a single cell coexpression network for each. We aimed to understand the consistency of TR coexpression profiles across a broad sampling of biological contexts, rather than examine the preservation of context-specific signals. Our workflow therefore explicitly prioritizes the patterns that are most reproducible across cell types. Towards this goal, we characterize the similarity of each TR's coexpression within and across species. We create single cell coexpression rankings for each TR, demonstrating that this aggregated information recovers literature curated targets on par with ChIP-seq data. We then combine the coexpression and ChIP-seq information to identify candidate regulatory interactions supported across methods and species. Finally, we highlight interactions for the important neural TR ASCL1 to demonstrate how our compiled information can be adopted for community use.
Affiliation(s)
- Alexander Morin
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia, Canada
- Ching Pan Chu
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia, Canada
- Paul Pavlidis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
3
Jin Z, Zhou X, Fang Z. DelaySSA: stochastic simulation of biochemical systems and gene regulatory networks with or without time delays. PLoS Comput Biol 2025; 21:e1012919. [PMID: 40198732 PMCID: PMC11977973 DOI: 10.1371/journal.pcbi.1012919] [Received: 10/08/2024] [Accepted: 02/26/2025] [Indexed: 04/10/2025]
Abstract
The Stochastic Simulation Algorithm (SSA) is crucial for modeling biochemical reactions and gene regulatory networks. The traditional SSA is characterized by the Markovian property and cannot naturally model systems with time delays. Several algorithms have already been designed to handle delayed reactions, yet few easy-to-use implementations exist. To address these challenges, we have developed DelaySSA, an R package that implements currently available algorithms for SSA with or without delays. We also provide Matlab and Python versions to support wider applications. We demonstrated its accuracy and validity by simulating two classical models: the Bursty model and the Refractory model. We then tested its capability to simulate the RNA Velocity model, where it successfully reproduced both the up- and down-regulation stages in the phase portrait. Finally, we extended its application to simulate a gene regulatory network of lung cancer adeno-to-squamous transition (AST) and qualitatively analyzed its bistability behavior by approximating Waddington's landscape. When the therapeutic intervention of a SOX2 degrader is modeled as a delayed degradation reaction, AST is effectively blocked and reprogrammed back to the adenocarcinoma state, providing a useful clue for targeting drug-resistant AST in the future. Taken together, DelaySSA is a powerful and easy-to-use software suite, facilitating accurate modeling of various kinds of biological systems and broadening the scope of stochastic simulations in systems biology.
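As background for readers unfamiliar with the SSA, the following is a minimal sketch of the classical direct method without delays, applied to a simple birth-death process. It is not the DelaySSA package or its API: the function name `gillespie_birth_death`, the rate names `k_prod` and `k_deg`, and the chosen parameter values are all hypothetical and exist only for this illustration.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, rng):
    """Direct-method SSA for two reactions (illustrative only):
         0 -> X  with propensity k_prod
         X -> 0  with propensity k_deg * x
    Returns the jump times and the molecule count after each jump."""
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


rng = random.Random(0)  # fixed seed so the run is repeatable
times, counts = gillespie_birth_death(
    k_prod=10.0, k_deg=1.0, x0=0, t_end=50.0, rng=rng
)
```

Because each waiting time depends only on the current state, this scheme is Markovian. Delayed reactions need extra bookkeeping for events that were started earlier and have yet to complete, which is what the delay-aware algorithms mentioned in the abstract provide.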
Affiliation(s)
- Ziyan Jin
  - Department of Colorectal Surgery and Oncology of the Second Affiliated Hospital, and Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, Hangzhou, China
- Xinyi Zhou
  - State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai, China
- Zhaoyuan Fang
  - Department of Colorectal Surgery and Oncology of the Second Affiliated Hospital, and Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, Hangzhou, China
  - Edinburgh Medical School, College of Medicine and Veterinary Medicine, The University of Edinburgh, Edinburgh, United Kingdom
  - Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, China
  - Biomedical and Health Translational Research Center of Zhejiang Province, Haining, China
4
Cha J, Lee I. Single-cell network biology enabling cell-type-resolved disease genetics. Genomics Inform 2025; 23:10. [PMID: 40148916 PMCID: PMC11951680 DOI: 10.1186/s44342-025-00042-7] [Received: 12/30/2024] [Accepted: 03/12/2025] [Indexed: 03/29/2025]
Abstract
Gene network models provide a foundation for graph theory approaches, aiding in the novel discovery of drug targets, disease genes, and genetic mechanisms for various biological functions. Disease genetics must be interpreted within the cellular context of disease-associated cell types, which cannot be achieved with datasets consisting solely of organism-level samples. Single-cell RNA sequencing (scRNA-seq) technology allows computational distinction of cell states which provides a unique opportunity to understand cellular biology that drives disease processes. Importantly, the abundance of cell samples with their transcriptome-wide profile allows the modeling of systemic cell-type-specific gene networks (CGNs), offering insights into gene-cell-disease relationships. In this review, we present reference-based and de novo inference of gene functional interaction networks that we have recently developed using scRNA-seq datasets. We also introduce a compendium of CGNs as a useful resource for cell-type-resolved disease genetics. By leveraging these advances, we envision single-cell network biology as the key approach for mapping the gene-cell-disease axis.
Affiliation(s)
- Junha Cha
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea.
- Insuk Lee
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea.
5
Chen L, Dautle M, Gao R, Zhang S, Chen Y. Inferring gene regulatory networks from time-series scRNA-seq data via GRANGER causal recurrent autoencoders. Brief Bioinform 2025; 26:bbaf089. [PMID: 40062616 PMCID: PMC11891664 DOI: 10.1093/bib/bbaf089] [Received: 10/31/2024] [Revised: 01/26/2025] [Accepted: 02/18/2025] [Indexed: 05/13/2025]
Abstract
The development of single-cell RNA sequencing (scRNA-seq) technology provides valuable data resources for inferring gene regulatory networks (GRNs), enabling deeper insights into cellular mechanisms and diseases. While many methods exist for inferring GRNs from static scRNA-seq data, current approaches face challenges in accurately handling time-series scRNA-seq data due to high noise levels and data sparsity. The temporal dimension introduces additional complexity by requiring models to capture dynamic changes, increasing sensitivity to noise, and exacerbating data sparsity across time points. In this study, we introduce GRANGER, an unsupervised deep learning-based method that integrates multiple advanced techniques, including a recurrent variational autoencoder, GRANGER causality, sparsity-inducing penalties, and negative binomial (NB)-based loss functions, to infer GRNs. GRANGER was evaluated using multiple popular benchmarking datasets, where it demonstrated superior performance compared to eight well-known GRN inference methods. The integration of a NB-based loss function and sparsity-inducing penalties in GRANGER significantly enhanced its capacity to address dropout noise and sparsity in scRNA-seq data. Additionally, GRANGER exhibited robustness against high levels of dropout noise. We applied GRANGER to scRNA-seq data from the whole mouse brain obtained through the BRAIN Initiative project and identified GRNs for five transcription regulators: E2f7, Gbx1, Sox10, Prox1, and Onecut2, which play crucial roles in diverse brain cell types. The inferred GRNs not only recalled many known regulatory relationships but also revealed sets of novel regulatory interactions with functional potential. These findings demonstrate that GRANGER is a highly effective tool for real-world applications in discovering novel gene regulatory relationships.
Affiliation(s)
- Liang Chen
  - College of Computer and Information Engineering, Tianjin Normal University, 393 Binshui W Ave, Tianjin, Tianjin 300387, China
- Madison Dautle
  - Department of Biological and Biomedical Sciences, Rowan University, 201 Mullica Hill Road, Glassboro, NJ 08028, United States
- Ruoying Gao
  - College of Computer and Information Engineering, Tianjin Normal University, 393 Binshui W Ave, Tianjin, Tianjin 300387, China
- Shaoqiang Zhang
  - College of Computer and Information Engineering, Tianjin Normal University, 393 Binshui W Ave, Tianjin, Tianjin 300387, China
- Yong Chen
  - Department of Biological and Biomedical Sciences, Rowan University, 201 Mullica Hill Road, Glassboro, NJ 08028, United States
6
Morin A, Chu CP, Pavlidis P. Identifying Reproducible Transcription Regulator Coexpression Patterns with Single Cell Transcriptomics. bioRxiv 2025:2024.02.15.580581. [PMID: 38559016 PMCID: PMC10979919 DOI: 10.1101/2024.02.15.580581] [Indexed: 04/04/2024]
Abstract
The proliferation of single cell transcriptomics has potentiated our ability to unveil patterns that reflect dynamic cellular processes such as the regulation of gene transcription. In this study, we leverage a broad collection of single cell RNA-seq data to identify the gene partners whose expression is most coordinated with each human and mouse transcription regulator (TR). We assembled 120 human and 103 mouse scRNA-seq datasets from the literature (>28 million cells), constructing a single cell coexpression network for each. We aimed to understand the consistency of TR coexpression profiles across a broad sampling of biological contexts, rather than examine the preservation of context-specific signals. Our workflow therefore explicitly prioritizes the patterns that are most reproducible across cell types. Towards this goal, we characterize the similarity of each TR's coexpression within and across species. We create single cell coexpression rankings for each TR, demonstrating that this aggregated information recovers literature curated targets on par with ChIP-seq data. We then combine the coexpression and ChIP-seq information to identify candidate regulatory interactions supported across methods and species. Finally, we highlight interactions for the important neural TR ASCL1 to demonstrate how our compiled information can be adopted for community use.
Affiliation(s)
- Alexander Morin
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- C. Pan Chu
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Paul Pavlidis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
7
Larsson I, Held F, Popova G, Koc A, Kundu S, Jörnsten R, Nelander S. Reconstructing the regulatory programs underlying the phenotypic plasticity of neural cancers. Nat Commun 2024; 15:9699. [PMID: 39516198 PMCID: PMC11549355 DOI: 10.1038/s41467-024-53954-3] [Received: 12/04/2023] [Accepted: 10/22/2024] [Indexed: 11/16/2024]
Abstract
Nervous system cancers exhibit diverse transcriptional cell states influenced by normal development, injury response, and growth. However, the understanding of these states' regulation and pharmacological relevance remains limited. Here we present "single-cell regulatory-driven clustering" (scregclust), a method that reconstructs cellular regulatory programs from extensive collections of single-cell RNA sequencing (scRNA-seq) data from both tumors and developing tissues. The algorithm efficiently divides target genes into modules, predicting key transcription factors and kinases with minimal computational time. Applying this method to adult and childhood brain cancers, we identify critical regulators and suggest interventions that could improve temozolomide treatment in glioblastoma. Additionally, our integrative analysis reveals a meta-module regulated by SPI1 and IRF8 linked to an immune-mediated mesenchymal-like state. Finally, scregclust's flexibility is demonstrated across 15 tumor types, uncovering both pan-cancer and specific regulators. The algorithm is provided as an easy-to-use R package that facilitates the exploration of regulatory programs underlying cell plasticity.
Affiliation(s)
- Ida Larsson
  - Department of Immunology, Genetics and Pathology, Uppsala University, SE-751 85, Uppsala, Sweden
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Felix Held
  - Mathematical Sciences, Chalmers University of Technology, SE-412 96, Gothenburg, Sweden
- Gergana Popova
  - Department of Immunology, Genetics and Pathology, Uppsala University, SE-751 85, Uppsala, Sweden
- Alper Koc
  - Department of Immunology, Genetics and Pathology, Uppsala University, SE-751 85, Uppsala, Sweden
- Soumi Kundu
  - Department of Immunology, Genetics and Pathology, Uppsala University, SE-751 85, Uppsala, Sweden
- Rebecka Jörnsten
  - Mathematical Sciences, Chalmers University of Technology, SE-412 96, Gothenburg, Sweden
- Sven Nelander
  - Department of Immunology, Genetics and Pathology, Uppsala University, SE-751 85, Uppsala, Sweden.
8
Karamveer, Uzun Y. Approaches for Benchmarking Single-Cell Gene Regulatory Network Methods. Bioinform Biol Insights 2024; 18:11779322241287120. [PMID: 39502448 PMCID: PMC11536393 DOI: 10.1177/11779322241287120] [Received: 05/17/2024] [Accepted: 09/10/2024] [Indexed: 11/08/2024]
Abstract
Gene regulatory networks are powerful tools for modeling genetic interactions that control the expression of genes driving cell differentiation, and single-cell sequencing offers a unique opportunity to build these networks with high-resolution genomic data. There are many proposed computational methods to build these networks using single-cell data, and different approaches are used to benchmark these methods. However, a comprehensive discussion specifically focusing on benchmarking approaches is missing. In this article, we lay out the GRN terminology, present an overview of common gold-standard studies and data sets, and define the performance metrics for benchmarking network construction methodologies. We also point out the advantages and limitations of different benchmarking approaches, suggest alternative ground truth data sets that can be used for benchmarking, and specify additional considerations in this context.
Affiliation(s)
- Karamveer
  - Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Yasin Uzun
  - Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Penn State Cancer Institute, The Pennsylvania State University College of Medicine, Hershey, PA, USA
9
Lodi MK, Chernikov A, Ghosh P. COFFEE: consensus single cell-type specific inference for gene regulatory networks. Brief Bioinform 2024; 25:bbae457. [PMID: 39311699 PMCID: PMC11418232 DOI: 10.1093/bib/bbae457] [Received: 06/03/2024] [Revised: 07/22/2024] [Accepted: 09/02/2024] [Indexed: 09/26/2024]
Abstract
The inference of gene regulatory networks (GRNs) is crucial to understanding the regulatory mechanisms that govern biological processes. GRNs may be represented as edges in a graph, and hence they have been inferred computationally for scRNA-seq data. A wisdom of crowds approach to integrate edges from several GRNs to create one composite GRN has demonstrated improved performance when compared with individual algorithm implementations on bulk RNA-seq and microarray data. In an effort to extend this approach to scRNA-seq data, we present COFFEE (COnsensus single cell-type speciFic inFerence for gEnE regulatory networks), a Borda voting-based consensus algorithm that integrates information from 10 established GRN inference methods. We conclude that COFFEE has improved performance across synthetic, curated, and experimental datasets when compared with baseline methods. Additionally, we show that a modified version of COFFEE can be leveraged to improve performance on newer cell-type specific GRN inference methods. Overall, our results demonstrate that consensus-based methods with pertinent modifications continue to be valuable for GRN inference at the single cell level. While COFFEE is benchmarked on 10 algorithms, it is a flexible strategy that can incorporate any set of GRN inference algorithms according to user preference. A Python implementation of COFFEE may be found on GitHub: https://github.com/lodimk2/coffee.
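To make the idea of Borda voting over edge rankings concrete, here is a minimal sketch of a plain Borda count. It is not the COFFEE implementation, and COFFEE's actual scoring, normalization, and tie handling may differ. The function name `borda_consensus`, the toy gene names, and the rule that an edge missing from a ranking scores zero are assumptions made for this example.

```python
from collections import defaultdict


def borda_consensus(rankings):
    """Combine several ranked edge lists into one consensus ranking.

    rankings: list of lists of (regulator, target) tuples, best edge first.
    With n distinct edges overall, the edge at position p (0-based) in a
    ranking earns n - p points; an edge absent from a ranking earns 0
    (an assumption of this sketch).
    """
    all_edges = {edge for ranking in rankings for edge in ranking}
    n = len(all_edges)
    scores = defaultdict(int)
    for ranking in rankings:
        for position, edge in enumerate(ranking):
            scores[edge] += n - position
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(all_edges, key=lambda e: (-scores[e], e))


# Three hypothetical inference methods ranking edges among toy genes A, B, C.
rankings = [
    [("A", "B"), ("A", "C"), ("B", "C")],
    [("A", "C"), ("A", "B")],
    [("A", "B"), ("B", "C"), ("A", "C")],
]
consensus = borda_consensus(rankings)
# Totals: ("A","B") = 3+2+3 = 8, ("A","C") = 2+3+1 = 6, ("B","C") = 1+0+2 = 3
```

An edge ranked highly by most methods ends up near the top even when a single method ranks it poorly, which is the intuition behind the wisdom of crowds approach described in the abstract.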
Affiliation(s)
- Musaddiq K Lodi
  - Integrative Life Sciences, Virginia Commonwealth University, 1000 W Cary St, Richmond, VA 23284, United States
- Anna Chernikov
  - Center for Biological Data Science, Virginia Commonwealth University, 1015 Floyd Ave, Richmond, VA 23284, United States
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, 401 W Main St, Richmond, VA 23284, United States
10
Bishop A, Romero JC, Tonapi S, Parihar M, Loranc E, Miller H, Lawrence L, Bassani N, Robledo D, Cao L, Nie J, Kanda K, Stoja A, Garcia N, Gorthi A, Stoveken B, Lane A, Fan T, Cassel T, Zha S, Musi N. ATM phosphorylation of CD98HC increases antiporter membrane localization and prevents chronic toxic glutamate accumulation in Ataxia telangiectasia. Research Square 2024:rs.3.rs-4947457. [PMID: 39281865 PMCID: PMC11398575 DOI: 10.21203/rs.3.rs-4947457/v1] [Indexed: 09/18/2024]
Abstract
Ataxia telangiectasia (A-T) is a rare genetic disorder characterized by neurological defects, immunodeficiency, cancer predisposition, radiosensitivity, decreased blood vessel integrity, and diabetes. ATM, the protein mutated in A-T, responds to DNA damage and oxidative stress, but its functional relationship to the progressive clinical manifestation of A-T is not understood. CD98HC chaperones cystine/glutamate (xc−) and cationic/neutral amino acid (y+L) antiporters to the cell membrane, and CD98HC phosphorylation by ATM accelerates membrane localization to acutely increase amino acid transport. Loss of ATM impacts tissues reliant on SLC family antiporters relevant to A-T phenotypes, such as endothelial cells (telangiectasia) and pancreatic α-cells (fatty liver and diabetes) with toxic glutamate accumulation. Bypassing the antiporters restores intracellular metabolic balance both in ATM-deficient cells and mouse models. These findings provide new insight into the long-known benefits of N-acetyl cysteine to A-T cells beyond oxidative stress through removing excess glutamate by production of glutathione.
11
Mohammed M, Dziedziech A, Macedo D, Huppertz F, Veith Y, Postel Z, Christ E, Scheytt R, Slotte T, Henriksson J, Ankarklev J. Single-cell transcriptomics reveal transcriptional programs underlying male and female cell fate during Plasmodium falciparum gametocytogenesis. Nat Commun 2024; 15:7177. [PMID: 39187486 PMCID: PMC11347709 DOI: 10.1038/s41467-024-51201-3] [Received: 06/02/2023] [Accepted: 08/01/2024] [Indexed: 08/28/2024]
Abstract
The Plasmodium falciparum life cycle includes obligate transition between a human and mosquito host. Gametocytes are responsible for transmission from the human to the mosquito vector where gamete fusion followed by meiosis occurs. To elucidate how male and female gametocytes differentiate in the absence of sex chromosomes, we perform FACS-based cell enrichment of a P. falciparum gametocyte reporter line followed by single-cell RNA-seq. In our analyses we define the transcriptional programs and predict candidate driver genes underlying male and female development, including genes from the ApiAP2 family of transcription factors. A motif-driven, gene regulatory network analysis indicates that AP2-G5 specifically modulates male development. Additionally, genes linked to the inner membrane complex, involved in morphological changes, are uniquely expressed in the female lineage. The transcriptional programs of male and female development detailed herein allow for further exploration of the evolution of sex in eukaryotes and provide targets for future development of transmission blocking therapies.
Affiliation(s)
- Mubasher Mohammed
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden.
- Alexis Dziedziech
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
  - Department of Global Health, Institut Pasteur, 25-28 Rue du Docteur Roux, Paris, France
- Diego Macedo
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Frederik Huppertz
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Ylva Veith
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Zoé Postel
  - Department of Ecology, Environment and Plant Science, Stockholm University, Stockholm, Sweden
- Elena Christ
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Richard Scheytt
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Tanja Slotte
  - Department of Ecology, Environment and Plant Science, Stockholm University, Stockholm, Sweden
- Johan Henriksson
  - Laboratory for Molecular Infection Medicine Sweden (MIMS), Department of Molecular Biology, Umeå Centre for Microbial Research (UCMR), Integrated Science Lab, Umeå University, Umeå, Sweden
- Johan Ankarklev
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden.
  - Microbial Single Cell Genomics Facility, SciLifeLab, Biomedical Center (BMC) Uppsala University, Uppsala, Sweden.
12
Loers JU, Vermeirssen V. A single-cell multimodal view on gene regulatory network inference from transcriptomics and chromatin accessibility data. Brief Bioinform 2024; 25:bbae382. [PMID: 39207727 PMCID: PMC11359808 DOI: 10.1093/bib/bbae382] [Received: 04/05/2024] [Revised: 06/27/2024] [Accepted: 07/23/2024] [Indexed: 09/04/2024]
Abstract
Eukaryotic gene regulation is a combinatorial, dynamic, and quantitative process that plays a vital role in development and disease and can be modeled at a systems level in gene regulatory networks (GRNs). The wealth of multi-omics data measured on the same samples and even on the same cells has lifted the field of GRN inference to the next stage. Combinations of (single-cell) transcriptomics and chromatin accessibility allow the prediction of fine-grained regulatory programs that go beyond mere correlation of transcription factor and target gene expression, with enhancer GRNs (eGRNs) modeling molecular interactions between transcription factors, regulatory elements, and target genes. In this review, we highlight the key components for successful (e)GRN inference from (sc)RNA-seq and (sc)ATAC-seq data exemplified by state-of-the-art methods as well as open challenges and future developments. Moreover, we address preprocessing strategies, metacell generation and computational omics pairing, transcription factor binding site detection, and linear and three-dimensional approaches to identify chromatin interactions as well as dynamic and causal eGRN inference. We believe that the integration of transcriptomics together with epigenomics data at a single-cell level is the new standard for mechanistic network inference, and that it can be further advanced with integrating additional omics layers and spatiotemporal data, as well as with shifting the focus towards more quantitative and causal modeling strategies.
Affiliation(s)
- Jens Uwe Loers
  - Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Corneel Heymanslaan 10, 9000 Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Zwijnaarde-Technologiepark 71, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, 9000 Ghent, Belgium
- Vanessa Vermeirssen
  - Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Corneel Heymanslaan 10, 9000 Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Zwijnaarde-Technologiepark 71, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, 9000 Ghent, Belgium
13
Vaparanta K, Merilahti JAM, Ojala VK, Elenius K. De Novo Multi-Omics Pathway Analysis Designed for Prior Data Independent Inference of Cell Signaling Pathways. Mol Cell Proteomics 2024; 23:100780. [PMID: 38703893 PMCID: PMC11259815 DOI: 10.1016/j.mcpro.2024.100780] [Received: 07/07/2023] [Revised: 04/07/2024] [Accepted: 04/30/2024] [Indexed: 05/06/2024]
Abstract
New tools for cell signaling pathway inference from multi-omics data that are independent of previous knowledge are needed. Here, we propose a new de novo method, the de novo multi-omics pathway analysis (DMPA), to model and combine omics data into network modules and pathways. DMPA was validated with published omics data and was found accurate in discovering reported molecular associations in transcriptome, interactome, phosphoproteome, methylome, and metabolomics data, and signaling pathways in multi-omics data. DMPA was benchmarked against module discovery and multi-omics integration methods and outperformed previous methods in module and pathway discovery especially when applied to datasets of relatively low sample sizes. Transcription factor, kinase, subcellular location, and function prediction algorithms were devised for transcriptome, phosphoproteome, and interactome modules and pathways, respectively. To apply DMPA in a biologically relevant context, interactome, phosphoproteome, transcriptome, and proteome data were collected from analyses carried out using melanoma cells to address gamma-secretase cleavage-dependent signaling characteristics of the receptor tyrosine kinase TYRO3. The pathways modeled with DMPA reflected the predicted function and its direction in validation experiments.
Affiliation(s)
- Katri Vaparanta
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland; Medicity Research Laboratories, University of Turku, Turku, Finland; Institute of Biomedicine, University of Turku, Turku, Finland
- Johannes A M Merilahti
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland; Medicity Research Laboratories, University of Turku, Turku, Finland; Institute of Biomedicine, University of Turku, Turku, Finland
- Veera K Ojala
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland; Medicity Research Laboratories, University of Turku, Turku, Finland; Institute of Biomedicine, University of Turku, Turku, Finland
- Klaus Elenius
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland; Medicity Research Laboratories, University of Turku, Turku, Finland; Institute of Biomedicine, University of Turku, Turku, Finland; Department of Oncology, Turku University Hospital, Turku, Finland
|
14
|
Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024; 46:e2300210. [PMID: 38715516 PMCID: PMC11444527 DOI: 10.1002/bies.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
Understanding the influence of cis-regulatory elements on gene regulation poses numerous challenges given complexities stemming from variations in transcription factor (TF) binding, chromatin accessibility, structural constraints, and cell-type differences. This review discusses the role of gene regulatory networks in enhancing understanding of transcriptional regulation and covers construction methods ranging from expression-based approaches to supervised machine learning. Additionally, key experimental methods, including MPRAs and CRISPR-Cas9-based screening, which have significantly contributed to understanding TF binding preferences and cis-regulatory element functions, are explored. Lastly, the potential of machine learning and artificial intelligence to unravel cis-regulatory logic is analyzed. These computational advances have far-reaching implications for precision medicine, therapeutic target discovery, and the study of genetic variations in health and disease.
Affiliation(s)
- Camille Moeckel
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ioannis Mouratidis
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
- Nikol Chantzi
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Yasin Uzun
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
  - Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
|
15
|
Adeoye T, Shah SI, Ullah G. Systematic Analysis of Biological Processes Reveals Gene Co-expression Modules Driving Pathway Dysregulation in Alzheimer's Disease. Aging Dis 2024; 16:1598-1625. [PMID: 38913039 DOI: 10.14336/ad.2024.0429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Accepted: 06/12/2024] [Indexed: 06/25/2024] Open
Abstract
Alzheimer's disease (AD) manifests as a complex systems pathology with intricate interplay among various genes and biological processes. Traditional differential gene expression (DEG) analysis, while commonly employed to characterize AD-driven perturbations, does not sufficiently capture the full spectrum of underlying biological processes. Utilizing single-nucleus RNA-sequencing data from postmortem brain samples across key regions (middle temporal gyrus, superior frontal gyrus, and entorhinal cortex), we provide a comprehensive systematic analysis of disrupted processes in AD. We go beyond the DEG-centric analysis by integrating pathway activity analysis with weighted gene co-expression patterns to comprehensively map gene interconnectivity, identifying region- and cell-type-specific drivers of biological processes associated with AD. Our analysis reveals profound modular heterogeneity in neurons and glia as well as extensive AD-related functional disruptions. Co-expression networks highlighted the extended involvement of astrocytes and microglia in biological processes beyond neuroinflammation, such as calcium homeostasis, glutamate regulation, lipid metabolism, vesicle-mediated transport, and TOR signaling. We find limited representation of DEGs within dysregulated pathways across neurons and glial cells, suggesting that differential gene expression alone may not adequately represent the disease complexity. Further dissection of inferred gene modules revealed distinct dynamics of hub DEGs in neurons versus glia, suggesting that DEGs exert more impact on neurons than on glial cells in driving modular dysregulations underlying perturbed biological processes. Interestingly, we observe an overall downregulation of astrocyte and microglia modules across all brain regions in AD, indicating a prevailing trend of functional repression in glial cells across these regions.
Notable genes from the CALM and HSP90 families emerged as hub genes across neuronal modules in all brain regions, suggesting conserved roles as drivers of synaptic dysfunction in AD. Our findings demonstrate the importance of an integrated, systems-oriented approach combining pathway and network analysis to comprehensively understand the cell-type-specific roles of genes in AD-related biological processes.
|
16
|
Taylor MA, Kandyba E, Halliwill K, Delrosario R, Khoroshkin M, Goodarzi H, Quigley D, Li YR, Wu D, Bollam SR, Mirzoeva OK, Akhurst RJ, Balmain A. Stem-cell states converge in multistage cutaneous squamous cell carcinoma development. Science 2024; 384:eadi7453. [PMID: 38815020 DOI: 10.1126/science.adi7453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 04/05/2024] [Indexed: 06/01/2024]
Abstract
Stem cells play a critical role in cancer development by contributing to cell heterogeneity, lineage plasticity, and drug resistance. We created gene expression networks from hundreds of mouse tissue samples (both normal and tumor) and integrated these with lineage tracing and single-cell RNA-seq, to identify convergence of cell states in premalignant tumor cells expressing markers of lineage plasticity and drug resistance. Two of these cell states representing multilineage plasticity or proliferation were inversely correlated, suggesting a mutually exclusive relationship. Treatment of carcinomas in vivo with chemotherapy repressed the proliferative state and activated multilineage plasticity whereas inhibition of differentiation repressed plasticity and potentiated responses to cell cycle inhibitors. Manipulation of this cell state transition point may provide a source of potential combinatorial targets for cancer therapy.
Affiliation(s)
- Mark A Taylor
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
  - Clinical Research Centre, Medical University of Bialystok, Bialystok 15-089, Poland
- Eve Kandyba
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
- Kyle Halliwill
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
  - AbbVie, South San Francisco, CA 94080, USA
- Reyno Delrosario
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
- Matvei Khoroshkin
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
- Hani Goodarzi
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94518, USA
  - Department of Urology, University of California San Francisco, San Francisco, CA 94518, USA
  - Arc Institute, Palo Alto, CA 94304, USA
- David Quigley
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Urology, University of California San Francisco, San Francisco, CA 94518, USA
  - Department of Epidemiology & Biostatistics, University of California San Francisco, San Francisco, CA 94518, USA
- Yun Rose Li
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Radiation Oncology, City of Hope National Medical Center, Duarte, CA 91010, USA
  - Department of Cancer Genetics & Epigenetics, City of Hope National Medical Center, Duarte, CA 91010, USA
  - Division of Quantitative Medicine & Systems Biology, Translational Genomics Research Institute, Phoenix, AZ 85004, USA
- Di Wu
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
- Saumya R Bollam
  - Biomedical Sciences Graduate Program, University of California San Francisco, San Francisco, CA 94518, USA
- Olga K Mirzoeva
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
- Rosemary J Akhurst
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Anatomy, University of California San Francisco, San Francisco, CA 94518, USA
- Allan Balmain
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94518, USA
|
17
|
Zhang D, Gao S, Liu ZP, Gao R. LogicGep: Boolean networks inference using symbolic regression from time-series transcriptomic profiling data. Brief Bioinform 2024; 25:bbae286. [PMID: 38886006 PMCID: PMC11182660 DOI: 10.1093/bib/bbae286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 05/09/2024] [Accepted: 06/06/2024] [Indexed: 06/20/2024] Open
Abstract
Reconstructing the topology of gene regulatory networks from gene expression data has been extensively studied. With the abundance of functional transcriptomic data available, it is now feasible to systematically decipher regulatory interaction dynamics in a logic form such as a Boolean network (BN) framework, which qualitatively indicates how multiple regulators aggregate to affect a common target gene. However, inferring both the network topology and gene interaction dynamics simultaneously is still a challenging problem since gene expression data are typically noisy and data discretization is prone to information loss. We propose a new method for BN inference from time-series transcriptional profiles, called LogicGep. LogicGep formulates the identification of Boolean functions as a symbolic regression problem that learns the Boolean function expression and solves it efficiently through multi-objective optimization using an improved gene expression programming algorithm. To avoid overly emphasizing dynamic characteristics at the expense of topological ones, as traditional methods often do, a set of promising Boolean formulas for each target gene is first evolved, and a feed-forward neural network trained with continuous expression data is subsequently employed to select the final solution. We validated the efficacy of LogicGep using multiple datasets including both synthetic and real-world experimental data. The results show that LogicGep infers accurate BN models, outperforming other representative BN inference algorithms in both network topology reconstruction and the identification of Boolean functions. Moreover, LogicGep runs hundreds of times faster than other methods, especially in the case of large network inference.
Affiliation(s)
- Dezhen Zhang
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Shuhua Gao
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Zhi-Ping Liu
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Rui Gao
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
|
18
|
Maizels RJ, Snell DM, Briscoe J. Reconstructing developmental trajectories using latent dynamical systems and time-resolved transcriptomics. Cell Syst 2024; 15:411-424.e9. [PMID: 38754365 DOI: 10.1016/j.cels.2024.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 02/01/2024] [Accepted: 04/17/2024] [Indexed: 05/18/2024]
Abstract
The snapshot nature of single-cell transcriptomics presents a challenge for studying the dynamics of cell fate decisions. Metabolic labeling and splicing can provide temporal information at single-cell level, but current methods have limitations. Here, we present a framework that overcomes these limitations: experimentally, we developed sci-FATE2, an optimized method for metabolic labeling with increased data quality, which we used to profile 45,000 embryonic stem (ES) cells differentiating into neural tube identities. Computationally, we developed a two-stage framework for dynamical modeling: VelvetVAE, a variational autoencoder (VAE) for velocity inference that outperforms all other tools tested, and VelvetSDE, a neural stochastic differential equation (nSDE) framework for simulating trajectory distributions. These recapitulate underlying dataset distributions and capture features such as decision boundaries between alternative fates and fate-specific gene expression. These methods recast single-cell analyses from descriptions of observed data to models of the dynamics that generated them, providing a framework for investigating developmental fate decisions.
Affiliation(s)
- Rory J Maizels
  - The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK; University College, London, UK
- Daniel M Snell
  - The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- James Briscoe
  - The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
|
19
|
Zinati Y, Takiddeen A, Emad A. GRouNdGAN: GRN-guided simulation of single-cell RNA-seq data using causal generative adversarial networks. Nat Commun 2024; 15:4055. [PMID: 38744843 PMCID: PMC11525796 DOI: 10.1038/s41467-024-48516-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 05/01/2024] [Indexed: 05/16/2024] Open
Abstract
We introduce GRouNdGAN, a gene regulatory network (GRN)-guided reference-based causal implicit generative model for simulating single-cell RNA-seq data, in silico perturbation experiments, and benchmarking GRN inference methods. Through the imposition of a user-defined GRN in its architecture, GRouNdGAN simulates steady-state and transient-state single-cell datasets where genes are causally expressed under the control of their regulating transcription factors (TFs). Training on six experimental reference datasets, we show that our model captures non-linear TF-gene dependencies and preserves gene identities, cell trajectories, pseudo-time ordering, and technical and biological noise, with no user manipulation and only implicit parameterization. GRouNdGAN can synthesize cells under new conditions to perform in silico TF knockout experiments. Benchmarking various GRN inference algorithms reveals that GRouNdGAN effectively bridges the existing gap between simulated and biological data benchmarks of GRN inference algorithms, providing gold standard ground truth GRNs and realistic cells corresponding to the biological system of interest.
Affiliation(s)
- Yazdan Zinati
  - Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada
- Abdulrahman Takiddeen
  - Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada
- Amin Emad
  - Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada
  - Mila, Quebec AI Institute, Montreal, QC, Canada
  - The Rosalind and Morris Goodman Cancer Institute, Montreal, QC, Canada
|
20
|
Stock M, Popp N, Fiorentino J, Scialdone A. Topological benchmarking of algorithms to infer gene regulatory networks from single-cell RNA-seq data. Bioinformatics 2024; 40:btae267. [PMID: 38627250 PMCID: PMC11096270 DOI: 10.1093/bioinformatics/btae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 02/28/2024] [Accepted: 04/16/2024] [Indexed: 05/18/2024] Open
Abstract
MOTIVATION: In recent years, many algorithms for inferring gene regulatory networks from single-cell transcriptomic data have been published. Several studies have evaluated their accuracy in estimating the presence of an interaction between pairs of genes. However, these benchmarking analyses do not quantify the algorithms' ability to capture structural properties of networks, which are fundamental, e.g., for studying the robustness of a gene network to external perturbations. Here, we devise a three-step benchmarking pipeline called STREAMLINE that quantifies the ability of algorithms to capture topological properties of networks and identify hubs. RESULTS: To this aim, we use data simulated from different types of networks as well as experimental data from three different organisms. We apply our benchmarking pipeline to four inference algorithms and provide guidance on which algorithm should be used depending on the global network property of interest. AVAILABILITY AND IMPLEMENTATION: STREAMLINE is available at https://github.com/ScialdoneLab/STREAMLINE. The data generated in this study are available at https://doi.org/10.5281/zenodo.10710444.
Affiliation(s)
- Marco Stock
  - Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
  - Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich 85354, Germany
- Niclas Popp
  - Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
  - Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Jonathan Fiorentino
  - Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
  - Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Antonio Scialdone
  - Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
  - Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
|
21
|
Maizels RJ. A dynamical perspective: moving towards mechanism in single-cell transcriptomics. Philos Trans R Soc Lond B Biol Sci 2024; 379:20230049. [PMID: 38432314 PMCID: PMC10909508 DOI: 10.1098/rstb.2023.0049] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/10/2023] [Accepted: 10/31/2023] [Indexed: 03/05/2024] Open
Abstract
As the field of single-cell transcriptomics matures, research is shifting focus from phenomenological descriptions of cellular phenotypes to a mechanistic understanding of the gene regulation underneath. This perspective considers the value of capturing dynamical information at single-cell resolution for gaining mechanistic insight; reviews the available technologies for recording and inferring temporal information in single cells; and explores whether better dynamical resolution is sufficient to adequately capture the causal relationships driving complex biological systems. This article is part of a discussion meeting issue 'Causes and consequences of stochastic processes in development and disease'.
Affiliation(s)
- Rory J. Maizels
  - The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
  - University College London, London WC1E 6BT, UK
|
22
|
Zhu L, Wang J. Quantifying Landscape-Flux via Single-Cell Transcriptomics Uncovers the Underlying Mechanism of Cell Cycle. Adv Sci (Weinh) 2024; 11:e2308879. [PMID: 38353329 DOI: 10.1002/advs.202308879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 01/23/2024] [Indexed: 04/25/2024]
Abstract
Recent developments in single-cell sequencing technology enable the acquisition of entire transcriptome data. Understanding the underlying mechanism and identifying the driving force of transcriptional regulation governing cell function directly from these data remains challenging. This study reconstructs a continuous vector field of the cell cycle based on discrete single-cell RNA velocity to quantify the single-cell global nonequilibrium dynamic landscape-flux. It reveals that large fluctuations disrupt the global landscape and genetic perturbations alter landscape-flux, thus identifying key genes in maintaining cell cycle dynamics and predicting associated functional effects. Additionally, it quantifies the fundamental energy cost of cell cycle initiation and shows that sustaining the cell cycle requires curl flux and dissipation to maintain the oscillatory phase coherence. This study enables the inference of the cell cycle gene regulatory networks directly from the single-cell transcriptomic data, including the feedback mechanisms and interaction intensity. This provides a golden opportunity to experimentally verify the landscape-flux theory and also obtain its associated quantifications. It also offers a unique framework for combining the landscape-flux theory and single-cell high-throughput sequencing experiments for understanding the underlying mechanisms of the cell cycle and can be extended to other nonequilibrium biological processes, such as differentiation, development, and disease pathogenesis.
Affiliation(s)
- Ligang Zhu
  - College of Physics, Jilin University, Changchun, 130021, P. R. China
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, 130022, P. R. China
- Jin Wang
  - Center for Theoretical Interdisciplinary Sciences, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, P. R. China
  - Department of Chemistry, Physics and Astronomy, Stony Brook University, Stony Brook, NY, 11794, USA
|
23
|
Adeoye T, Shah SI, Ullah G. Systematic Analysis of Biological Processes Reveals Gene Co-expression Modules Driving Pathway Dysregulation in Alzheimer's Disease. bioRxiv 2024:2024.03.15.585267. [PMID: 38559218 PMCID: PMC10980062 DOI: 10.1101/2024.03.15.585267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Alzheimer's disease (AD) manifests as a complex systems pathology with intricate interplay among various genes and biological processes. Traditional differential gene expression (DEG) analysis, while commonly employed to characterize AD-driven perturbations, does not sufficiently capture the full spectrum of underlying biological processes. Utilizing single-nucleus RNA-sequencing data from postmortem brain samples across key regions (middle temporal gyrus, superior frontal gyrus, and entorhinal cortex), we provide a comprehensive systematic analysis of disrupted processes in AD. We go beyond the DEG-centric analysis by integrating pathway activity analysis with weighted gene co-expression patterns to comprehensively map gene interconnectivity, identifying region- and cell-type-specific drivers of biological processes associated with AD. Our analysis reveals profound modular heterogeneity in neurons and glia as well as extensive AD-related functional disruptions. Co-expression networks highlighted the extended involvement of astrocytes and microglia in biological processes beyond neuroinflammation, such as calcium homeostasis, glutamate regulation, lipid metabolism, vesicle-mediated transport, and TOR signaling. We find limited representation of DEGs within dysregulated pathways across neurons and glial cells, indicating that differential gene expression alone may not adequately represent the disease complexity. Further dissection of inferred gene modules revealed distinct dynamics of hub DEGs in neurons versus glia, highlighting the differential impact of DEGs on neurons compared to glial cells in driving modular dysregulations underlying perturbed biological processes. Interestingly, we note an overall downregulation of both astrocyte and microglia modules in AD across all brain regions, suggesting a prevailing trend of functional repression in glial cells across these regions.
Notable genes, including those of the CALM and HSP90 families, emerged as hub genes across neuronal modules in all brain regions, indicating conserved roles as drivers of synaptic dysfunction in AD. Our findings demonstrate the importance of an integrated, systems-oriented approach combining pathway and network analysis for a comprehensive understanding of the cell-type-specific roles of genes in AD-related biological processes.
Affiliation(s)
- Temitope Adeoye
  - Department of Physics, University of South Florida, Tampa, FL 33620
- Syed I Shah
  - Department of Physics, University of South Florida, Tampa, FL 33620
- Ghanim Ullah
  - Department of Physics, University of South Florida, Tampa, FL 33620
|
24
|
Schäfer PSL, Dimitrov D, Villablanca EJ, Saez-Rodriguez J. Integrating single-cell multi-omics and prior biological knowledge for a functional characterization of the immune system. Nat Immunol 2024; 25:405-417. [PMID: 38413722 DOI: 10.1038/s41590-024-01768-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 01/16/2024] [Indexed: 02/29/2024]
Abstract
The immune system comprises diverse specialized cell types that cooperate to defend the host against a wide range of pathogenic threats. Recent advancements in single-cell and spatial multi-omics technologies provide rich information about the molecular state of immune cells. Here, we review how the integration of single-cell and spatial multi-omics data with prior knowledge, gathered from decades of detailed biochemical studies, allows us to obtain functional insights, focusing on gene regulatory processes and cell-cell interactions. We present diverse applications in immunology and critically assess underlying assumptions and limitations. Finally, we offer a perspective on the ongoing technological and algorithmic developments that promise to get us closer to a systemic mechanistic understanding of the immune system.
Collapse
Affiliation(s)
- Philipp Sven Lars Schäfer
  - Institute for Computational Bioscience, Faculty of Medicine and Heidelberg University Hospital, Heidelberg University, Heidelberg, Germany
- Daniel Dimitrov
  - Institute for Computational Bioscience, Faculty of Medicine and Heidelberg University Hospital, Heidelberg University, Heidelberg, Germany
- Eduardo J Villablanca
  - Division of Immunology and Allergy, Department of Medicine Solna, Karolinska Institute and Karolinska University Hospital, Stockholm, Sweden
  - Center of Molecular Medicine, Stockholm, Sweden
- Julio Saez-Rodriguez
  - Institute for Computational Bioscience, Faculty of Medicine and Heidelberg University Hospital, Heidelberg University, Heidelberg, Germany.
25
da Silva JEH, de Carvalho PC, Camata JJ, de Oliveira IL, Bernardino HS. A Data-Distribution and Successive Spline Points based discretization approach for evolving gene regulatory networks from scRNA-Seq time-series data using Cartesian Genetic Programming. Biosystems 2024; 236:105126. [PMID: 38278505 DOI: 10.1016/j.biosystems.2024.105126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 11/18/2023] [Accepted: 01/19/2024] [Indexed: 01/28/2024]
Abstract
The inference of gene regulatory networks (GRNs) is a widely addressed problem in Systems Biology. GRNs can be modeled as Boolean networks, which is the simplest approach for this task. However, Boolean models need binarized data. Several approaches have been developed for the discretization of gene expression data (GED). Also, the advance of data extraction technologies, such as single-cell RNA-Sequencing (scRNA-Seq), provides a new vision of gene expression and brings new challenges for dealing with its specificities, such as a large occurrence of zero data. This work proposes a new discretization approach for dealing with scRNA-Seq time-series data, named Distribution and Successive Spline Points Discretization (DSSPD), which considers the data distribution and a proper preprocessing step. Here, Cartesian Genetic Programming (CGP) is used to infer GRNs using the results of DSSPD. The proposal is compared with CGP with the standard data handling and five state-of-the-art algorithms on curated models and experimental data. The results show that the proposal improves the results of CGP in all tested cases and outperforms the state-of-the-art algorithms in most cases.
Affiliation(s)
- José J Camata
  - Universidade Federal de Juiz de Fora, Juiz de Fora, MG, Brazil.
26
Alanis-Lobato G, Bartlett TE, Huang Q, Simon CS, McCarthy A, Elder K, Snell P, Christie L, Niakan KK. MICA: a multi-omics method to predict gene regulatory networks in early human embryos. Life Sci Alliance 2024; 7:e202302415. [PMID: 37879938 PMCID: PMC10599980 DOI: 10.26508/lsa.202302415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 10/12/2023] [Accepted: 10/13/2023] [Indexed: 10/27/2023] Open
Abstract
Recent advances in single-cell omics have transformed characterisation of cell types in challenging-to-study biological contexts. In contexts with limited single-cell samples, such as the early human embryo, inference of transcription factor-gene regulatory network (GRN) interactions is especially difficult. Here, we assessed application of different linear or non-linear GRN predictions to single-cell simulated and human embryo transcriptome datasets. We also compared how expression normalisation impacts on GRN predictions, finding that transcripts per million reads outperformed alternative methods. GRN inferences were more reproducible using a non-linear method based on mutual information (MI) applied to single-cell transcriptome datasets refined with chromatin accessibility (CA) (called MICA), compared with alternative network prediction methods tested. MICA captures complex non-monotonic dependencies and feedback loops. Using MICA, we generated the first GRN inferences in early human development. MICA predicted co-localisation of the AP-1 transcription factor subunit proto-oncogene JUND and the TFAP2C transcription factor AP-2γ in early human embryos. Overall, our comparative analysis of GRN prediction methods defines a pipeline that can be applied to single-cell multi-omics datasets in especially challenging contexts to infer interactions between transcription factor expression and target gene regulation.
Affiliation(s)
- Qiulin Huang
  - Human Embryo and Stem Cell Laboratory, The Francis Crick Institute, London, UK
  - Department of Physiology, Development and Neuroscience, The Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
- Claire S Simon
  - Human Embryo and Stem Cell Laboratory, The Francis Crick Institute, London, UK
- Afshan McCarthy
  - Human Embryo and Stem Cell Laboratory, The Francis Crick Institute, London, UK
- Kathy K Niakan
  - Human Embryo and Stem Cell Laboratory, The Francis Crick Institute, London, UK
  - Department of Physiology, Development and Neuroscience, The Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
  - Wellcome - Medical Research Council Cambridge Stem Cell Institute, Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge, UK
  - Epigenetics Programme, Babraham Institute, Cambridge, UK
27
Cheng J, Cheng M, Lusis AJ, Yang X. Gene Regulatory Networks in Coronary Artery Disease. Curr Atheroscler Rep 2023; 25:1013-1023. [PMID: 38008808 PMCID: PMC11466510 DOI: 10.1007/s11883-023-01170-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/09/2023] [Indexed: 11/28/2023]
Abstract
PURPOSE OF REVIEW: Coronary artery disease is a complex disorder and the leading cause of mortality worldwide. As technologies for the generation of high-throughput multiomics data have advanced, gene regulatory network modeling has become an increasingly powerful tool in understanding coronary artery disease. This review summarizes recent and novel gene regulatory network tools for bulk tissue and single cell data, existing databases for network construction, and applications of gene regulatory networks in coronary artery disease.
RECENT FINDINGS: New gene regulatory network tools can integrate multiomics data to elucidate complex disease mechanisms at unprecedented cellular and spatial resolutions. At the same time, updates to coronary artery disease expression data in existing databases have enabled researchers to build gene regulatory networks to study novel disease mechanisms. Gene regulatory networks have proven extremely useful in understanding CAD heritability beyond what is explained by GWAS loci and in identifying mechanisms and key driver genes underlying disease onset and progression. Gene regulatory networks can holistically and comprehensively address the complex nature of coronary artery disease. In this review, we discuss key algorithmic approaches to construct gene regulatory networks and highlight state-of-the-art methods that model specific modes of gene regulation. We also explore recent applications of these tools in coronary artery disease patient data repositories to understand disease heritability and shared and distinct disease mechanisms and key driver genes across tissues, between sexes, and between species.
Affiliation(s)
- Jenny Cheng
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA, 90095, USA
  - Molecular, Cellular and Integrative Physiology Interdepartmental Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA, 90095, USA
- Michael Cheng
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA, 90095, USA
- Aldons J Lusis
  - Department of Medicine, Division of Cardiology, University of California, Los Angeles, 650 Charles E Young Drive South, Los Angeles, CA, 90095, USA.
  - Departments of Human Genetics & Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, 650 Charles E. Young Drive South, Los Angeles, CA, 90095, USA.
- Xia Yang
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA, 90095, USA.
  - Molecular, Cellular and Integrative Physiology Interdepartmental Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA, 90095, USA.
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA, 90095, USA.
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA, 90095, USA.
28
Paas-Oliveros E, Hernández-Lemus E, de Anda-Jáuregui G. Computational single cell oncology: state of the art. Front Genet 2023; 14:1256991. [PMID: 38028624 PMCID: PMC10663273 DOI: 10.3389/fgene.2023.1256991] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/11/2023] [Accepted: 10/24/2023] [Indexed: 12/01/2023] Open
Abstract
Single cell computational analysis has emerged as a powerful tool in the field of oncology, enabling researchers to decipher the complex cellular heterogeneity that characterizes cancer. By leveraging computational algorithms and bioinformatics approaches, this methodology provides insights into the underlying genetic, epigenetic and transcriptomic variations among individual cancer cells. In this paper, we present a comprehensive overview of single cell computational analysis in oncology, discussing the key computational techniques employed for data processing, analysis, and interpretation. We explore the challenges associated with single cell data, including data quality control, normalization, dimensionality reduction, clustering, and trajectory inference. Furthermore, we highlight the applications of single cell computational analysis, including the identification of novel cell states, the characterization of tumor subtypes, the discovery of biomarkers, and the prediction of therapy response. Finally, we address the future directions and potential advancements in the field, including the development of machine learning and deep learning approaches for single cell analysis. Overall, this paper aims to provide a roadmap for researchers interested in leveraging computational methods to unlock the full potential of single cell analysis in understanding cancer biology with the goal of advancing precision oncology. For this purpose, we also include a notebook that instructs on how to apply the recommended tools in the Preprocessing and Quality Control section.
Affiliation(s)
- Ernesto Paas-Oliveros
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Guillermo de Anda-Jáuregui
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Investigadores por Mexico, Conahcyt, Mexico City, Mexico
29
Kim D, Tran A, Kim HJ, Lin Y, Yang JYH, Yang P. Gene regulatory network reconstruction: harnessing the power of single-cell multi-omic data. NPJ Syst Biol Appl 2023; 9:51. [PMID: 37857632 PMCID: PMC10587078 DOI: 10.1038/s41540-023-00312-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/02/2023] [Indexed: 10/21/2023] Open
Abstract
Inferring gene regulatory networks (GRNs) is a fundamental challenge in biology that aims to unravel the complex relationships between genes and their regulators. Deciphering these networks plays a critical role in understanding the underlying regulatory crosstalk that drives many cellular processes and diseases. Recent advances in sequencing technology have led to the development of state-of-the-art GRN inference methods that exploit matched single-cell multi-omic data. By employing diverse mathematical and statistical methodologies, these methods aim to reconstruct more comprehensive and precise gene regulatory networks. In this review, we give a brief overview of the statistical and methodological foundations commonly used in GRN inference methods. We then compare and contrast the latest state-of-the-art GRN inference methods for single-cell matched multi-omics data, and discuss their assumptions, limitations and opportunities. Finally, we discuss the challenges and future directions that hold promise for further advancements in this rapidly developing field.
Affiliation(s)
- Daniel Kim
  - School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia
  - Computational Systems Biology Unit, Children's Medical Research Institute, University of Sydney, Camperdown, NSW, Australia
  - Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia
- Andy Tran
  - School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia
  - Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia
  - Charles Perkins Centre, University of Sydney, Camperdown, NSW, Australia
- Hani Jieun Kim
  - Computational Systems Biology Unit, Children's Medical Research Institute, University of Sydney, Camperdown, NSW, Australia
  - Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia
- Yingxin Lin
  - School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia
  - Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia
  - Charles Perkins Centre, University of Sydney, Camperdown, NSW, Australia
- Jean Yee Hwa Yang
  - School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia.
  - Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia.
  - Charles Perkins Centre, University of Sydney, Camperdown, NSW, Australia.
- Pengyi Yang
  - School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia.
  - Computational Systems Biology Unit, Children's Medical Research Institute, University of Sydney, Camperdown, NSW, Australia.
  - Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia.
  - Charles Perkins Centre, University of Sydney, Camperdown, NSW, Australia.
30
Shi Q, Chen X, Zhang Z. Decoding Human Biology and Disease Using Single-cell Omics Technologies. Genomics, Proteomics & Bioinformatics 2023; 21:926-949. [PMID: 37739168 PMCID: PMC10928380 DOI: 10.1016/j.gpb.2023.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 05/22/2023] [Accepted: 06/08/2023] [Indexed: 09/24/2023]
Abstract
Over the past decade, advances in single-cell omics (SCO) technologies have enabled the investigation of cellular heterogeneity at an unprecedented resolution and scale, opening a new avenue for understanding human biology and disease. In this review, we summarize the developments of sequencing-based SCO technologies and computational methods, and focus on considerable insights acquired from SCO sequencing studies to understand normal and diseased properties, with a particular emphasis on cancer research. We also discuss the technological improvements of SCO and its possible contribution to fundamental research of the human, as well as its great potential in clinical diagnoses and personalized therapies of human disease.
Affiliation(s)
- Qiang Shi
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
- Xueyan Chen
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
- Zemin Zhang
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Changping Laboratory, Beijing 102206, China.
31
Groves SM, Quaranta V. Quantifying cancer cell plasticity with gene regulatory networks and single-cell dynamics. Frontiers in Network Physiology 2023; 3:1225736. [PMID: 37731743 PMCID: PMC10507267 DOI: 10.3389/fnetp.2023.1225736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 08/25/2023] [Indexed: 09/22/2023]
Abstract
Phenotypic plasticity of cancer cells can lead to complex cell state dynamics during tumor progression and acquired resistance. Highly plastic stem-like states may be inherently drug-resistant. Moreover, cell state dynamics in response to therapy allow a tumor to evade treatment. In both scenarios, quantifying plasticity is essential for identifying high-plasticity states or elucidating transition paths between states. Currently, methods to quantify plasticity tend to focus on 1) quantification of quasi-potential based on the underlying gene regulatory network dynamics of the system; or 2) inference of cell potency based on trajectory inference or lineage tracing in single-cell dynamics. Here, we explore both of these approaches and associated computational tools. We then discuss implications of each approach to plasticity metrics, and relevance to cancer treatment strategies.
Affiliation(s)
- Sarah M. Groves
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, United States
- Vito Quaranta
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, United States
  - Department of Biochemistry, Vanderbilt University, Nashville, TN, United States
32
Vanheer L, Fantuzzi F, To SK, Schiavo A, Van Haele M, Ostyn T, Haesen T, Yi X, Janiszewski A, Chappell J, Rihoux A, Sawatani T, Roskams T, Pattou F, Kerr-Conte J, Cnop M, Pasque V. Inferring regulators of cell identity in the human adult pancreas. NAR Genom Bioinform 2023; 5:lqad068. [PMID: 37435358 PMCID: PMC10331937 DOI: 10.1093/nargab/lqad068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 06/17/2023] [Accepted: 06/28/2023] [Indexed: 07/13/2023] Open
Abstract
Cellular identity during development is under the control of transcription factors that form gene regulatory networks. However, the transcription factors and gene regulatory networks underlying cellular identity in the human adult pancreas remain largely unexplored. Here, we integrate multiple single-cell RNA-sequencing datasets of the human adult pancreas, totaling 7393 cells, and comprehensively reconstruct gene regulatory networks. We show that a network of 142 transcription factors forms distinct regulatory modules that characterize pancreatic cell types. We present evidence that our approach identifies regulators of cell identity and cell states in the human adult pancreas. We predict that HEYL, BHLHE41 and JUND are active in acinar, beta and alpha cells, respectively, and show that these proteins are present in the human adult pancreas as well as in human induced pluripotent stem cell (hiPSC)-derived islet cells. Using single-cell transcriptomics, we found that JUND represses beta cell genes in hiPSC-alpha cells. BHLHE41 depletion induced apoptosis in primary pancreatic islets. The comprehensive gene regulatory network atlas can be explored interactively online. We anticipate our analysis to be the starting point for a more sophisticated dissection of how transcription factors regulate cell identity and cell states in the human adult pancreas.
Affiliation(s)
- Lotte Vanheer
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Federica Fantuzzi
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- San Kit To
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Andrea Schiavo
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Matthias Van Haele
  - Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
- Tessa Ostyn
  - Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
- Tine Haesen
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Xiaoyan Yi
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Adrian Janiszewski
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Joel Chappell
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Adrien Rihoux
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Toshiaki Sawatani
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Tania Roskams
  - Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
- Francois Pattou
  - University of Lille, Inserm, CHU Lille, Institute Pasteur Lille, U1190-EGID, F-59000 Lille, France
  - European Genomic Institute for Diabetes, F-59000 Lille, France
  - University of Lille, F-59000 Lille, France
- Julie Kerr-Conte
  - University of Lille, Inserm, CHU Lille, Institute Pasteur Lille, U1190-EGID, F-59000 Lille, France
  - European Genomic Institute for Diabetes, F-59000 Lille, France
  - University of Lille, F-59000 Lille, France
- Miriam Cnop
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
  - Division of Endocrinology; Erasmus Hospital, Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Vincent Pasque
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
33
Marku M, Pancaldi V. From time-series transcriptomics to gene regulatory networks: A review on inference methods. PLoS Comput Biol 2023; 19:e1011254. [PMID: 37561790 PMCID: PMC10414591 DOI: 10.1371/journal.pcbi.1011254] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 08/12/2023] Open
Abstract
Inference of gene regulatory networks has been an active area of research for around 20 years, leading to the development of sophisticated inference algorithms based on a variety of assumptions and approaches. With the ever increasing demand for more accurate and powerful models, the inference problem remains of broad scientific interest. The abstract representation of biological systems through gene regulatory networks represents a powerful method to study such systems, encoding different amounts and types of information. In this review, we summarize the different types of inference algorithms specifically based on time-series transcriptomics, giving an overview of the main applications of gene regulatory networks in computational biology. This review is intended to give an updated reference of regulatory networks inference tools to biologists and researchers new to the topic and guide them in selecting the appropriate inference method that best fits their questions, aims, and experimental data.
Affiliation(s)
- Malvina Marku
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
- Vera Pancaldi
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
  - Barcelona Supercomputing Center, Barcelona, Spain
34
Androvic P, Schifferer M, Perez Anderson K, Cantuti-Castelvetri L, Jiang H, Ji H, Liu L, Gouna G, Berghoff SA, Besson-Girard S, Knoferle J, Simons M, Gokce O. Spatial Transcriptomics-correlated Electron Microscopy maps transcriptional and ultrastructural responses to brain injury. Nat Commun 2023; 14:4115. [PMID: 37433806 DOI: 10.1038/s41467-023-39447-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 06/05/2022] [Accepted: 06/14/2023] [Indexed: 07/13/2023] Open
Abstract
Understanding the complexity of cellular function within a tissue necessitates the combination of multiple phenotypic readouts. Here, we developed a method that links spatially-resolved gene expression of single cells with their ultrastructural morphology by integrating multiplexed error-robust fluorescence in situ hybridization (MERFISH) and large area volume electron microscopy (EM) on adjacent tissue sections. Using this method, we characterized in situ ultrastructural and transcriptional responses of glial cells and infiltrating T-cells after demyelinating brain injury in male mice. We identified a population of lipid-loaded "foamy" microglia located in the center of remyelinating lesion, as well as rare interferon-responsive microglia, oligodendrocytes, and astrocytes that co-localized with T-cells. We validated our findings using immunocytochemistry and lipid staining-coupled single-cell RNA sequencing. Finally, by integrating these datasets, we detected correlations between full-transcriptome gene expression and ultrastructural features of microglia. Our results offer an integrative view of the spatial, ultrastructural, and transcriptional reorganization of single cells after demyelinating brain injury.
Affiliation(s)
- Peter Androvic
  - Institute for Stroke and Dementia Research, University Hospital of Munich, LMU Munich, Munich, Germany
- Martina Schifferer
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
  - Munich Cluster of Systems Neurology (SyNergy), Munich, Germany
- Katrin Perez Anderson
  - Institute for Stroke and Dementia Research, University Hospital of Munich, LMU Munich, Munich, Germany
- Ludovico Cantuti-Castelvetri
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
  - Institute of Neuronal Cell Biology, Technical University Munich, Munich, Germany
- Hanyi Jiang
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
  - Munich Cluster of Systems Neurology (SyNergy), Munich, Germany
- Hao Ji
  - Institute for Stroke and Dementia Research, University Hospital of Munich, LMU Munich, Munich, Germany
- Lu Liu
  - Institute for Stroke and Dementia Research, University Hospital of Munich, LMU Munich, Munich, Germany
- Garyfallia Gouna
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
  - Institute of Neuronal Cell Biology, Technical University Munich, Munich, Germany
- Stefan A Berghoff
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
  - Institute of Neuronal Cell Biology, Technical University Munich, Munich, Germany
- Simon Besson-Girard
  - Institute for Stroke and Dementia Research, University Hospital of Munich, LMU Munich, Munich, Germany
- Johanna Knoferle
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
  - Institute of Neuronal Cell Biology, Technical University Munich, Munich, Germany
  - Department of Neurodegenerative Diseases and Geriatric Psychiatry, University Hospital Bonn, Bonn, Germany
- Mikael Simons
  - Institute for Stroke and Dementia Research, University Hospital of Munich, LMU Munich, Munich, Germany
  - German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
  - Munich Cluster of Systems Neurology (SyNergy), Munich, Germany
  - Institute of Neuronal Cell Biology, Technical University Munich, Munich, Germany
- Ozgun Gokce
  - Institute for Stroke and Dementia Research, University Hospital of Munich, LMU Munich, Munich, Germany.
  - Munich Cluster of Systems Neurology (SyNergy), Munich, Germany.
  - Department of Neurodegenerative Diseases and Geriatric Psychiatry, University Hospital Bonn, Bonn, Germany.
35
Kamal A, Arnold C, Claringbould A, Moussa R, Servaas NH, Kholmatov M, Daga N, Nogina D, Mueller‐Dott S, Reyes‐Palomares A, Palla G, Sigalova O, Bunina D, Pabst C, Zaugg JB. GRaNIE and GRaNPA: inference and evaluation of enhancer-mediated gene regulatory networks. Mol Syst Biol 2023; 19:e11627. [PMID: 37073532 PMCID: PMC10258561 DOI: 10.15252/msb.202311627] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 02/27/2023] [Revised: 04/01/2023] [Accepted: 04/03/2023] [Indexed: 04/20/2023] Open
Abstract
Enhancers play a vital role in gene regulation and are critical in mediating the impact of noncoding genetic variants associated with complex traits. Enhancer activity is a cell-type-specific process regulated by transcription factors (TFs), epigenetic mechanisms and genetic variants. Despite the strong mechanistic link between TFs and enhancers, we currently lack a framework for jointly analysing them in cell-type-specific gene regulatory networks (GRN). Equally important, we lack an unbiased way of assessing the biological significance of inferred GRNs since no complete ground truth exists. To address these gaps, we present GRaNIE (Gene Regulatory Network Inference including Enhancers) and GRaNPA (Gene Regulatory Network Performance Analysis). GRaNIE (https://git.embl.de/grp-zaugg/GRaNIE) builds enhancer-mediated GRNs based on covariation of chromatin accessibility and RNA-seq across samples (e.g. individuals), while GRaNPA (https://git.embl.de/grp-zaugg/GRaNPA) assesses the performance of GRNs for predicting cell-type-specific differential expression. We demonstrate their power by investigating gene regulatory mechanisms underlying the response of macrophages to infection, cancer and common genetic traits including autoimmune diseases. Finally, our methods identify the TF PURA as a putative regulator of pro-inflammatory macrophage polarisation.
Affiliation(s)
- Aryan Kamal
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
  - Faculty of Biosciences, Collaboration for Joint PhD Degree between EMBL and Heidelberg University, Heidelberg, Germany
- Christian Arnold
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Annique Claringbould
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Rim Moussa
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Nila H Servaas
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Maksim Kholmatov
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Neha Daga
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Daria Nogina
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Sophia Mueller‐Dott
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Armando Reyes‐Palomares
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
  - Present address: Department of Biochemistry and Molecular Biology, Complutense University of Madrid, Madrid, Spain
- Giovanni Palla
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
  - Present address: Institute of Computational Biology, Helmholtz Center Munich, Oberschleißheim, Germany
- Olga Sigalova
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
  - Faculty of Biosciences, Collaboration for Joint PhD Degree between EMBL and Heidelberg University, Heidelberg, Germany
- Daria Bunina
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
- Caroline Pabst
  - Department of Medicine V, Hematology, Oncology and Rheumatology, University Hospital Heidelberg, Heidelberg, Germany
  - Molecular Medicine Partnership Unit, University of Heidelberg, Heidelberg, Germany
- Judith B Zaugg
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
  - Molecular Medicine Partnership Unit, University of Heidelberg, Heidelberg, Germany
36
Dong M, He Y, Jiang Y, Zou F. Joint gene network construction by single-cell RNA sequencing data. Biometrics 2023; 79:915-925. [PMID: 35184277 PMCID: PMC10548400 DOI: 10.1111/biom.13645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2021] [Revised: 11/30/2021] [Accepted: 02/07/2022] [Indexed: 11/26/2022]
Abstract
In contrast to differential gene expression analysis at the single-gene level, gene regulatory network (GRN) analysis depicts complex transcriptomic interactions among genes for better understandings of underlying genetic architectures of human diseases and traits. Recent advances in single-cell RNA sequencing (scRNA-seq) allow constructing GRNs at a much finer resolution than bulk RNA-seq and microarray data. However, scRNA-seq data are inherently sparse, which hinders the direct application of the popular Gaussian graphical models (GGMs). Furthermore, most existing approaches for constructing GRNs with scRNA-seq data only consider gene networks under one condition. To better understand GRNs across different but related conditions at single-cell resolution, we propose to construct Joint Gene Networks with scRNA-seq data (JGNsc) under the GGMs framework. To facilitate the use of GGMs, JGNsc first proposes a hybrid imputation procedure that combines a Bayesian zero-inflated Poisson model with an iterative low-rank matrix completion step to efficiently impute zero-inflated counts resulted from technical artifacts. JGNsc then transforms the imputed data via a nonparanormal transformation, based on which joint GGMs are constructed. We demonstrate JGNsc and assess its performance using synthetic data. The application of JGNsc on two cancer clinical studies of medulloblastoma and glioblastoma gains novel insights in addition to confirming well-known biological results.
Affiliation(s)
- Meichen Dong
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Yiping He
  - Department of Pathology, School of Medicine, Duke University, Durham, North Carolina, USA
- Yuchao Jiang
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Fei Zou
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
37
Schiffthaler B, van Zalen E, Serrano AR, Street NR, Delhomme N. Seiðr: Efficient calculation of robust ensemble gene networks. Heliyon 2023; 9:e16811. [PMID: 37313140 PMCID: PMC10258422 DOI: 10.1016/j.heliyon.2023.e16811] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/13/2022] [Revised: 05/22/2023] [Accepted: 05/29/2023] [Indexed: 06/15/2023] Open
Abstract
Gene regulatory and gene co-expression networks are powerful research tools for identifying biological signal within high-dimensional gene expression data. In recent years, research has focused on addressing shortcomings of these techniques with regard to the low signal-to-noise ratio, non-linear interactions and dataset dependent biases of published methods. Furthermore, it has been shown that aggregating networks from multiple methods provides improved results. Despite this, few useable and scalable software tools have been implemented to perform such best-practice analyses. Here, we present Seidr (stylized Seiðr), a software toolkit designed to assist scientists in gene regulatory and gene co-expression network inference. Seidr creates community networks to reduce algorithmic bias and utilizes noise corrected network backboning to prune noisy edges in the networks. Using benchmarks in real-world conditions across three eukaryotic model organisms, Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana, we show that individual algorithms are biased toward functional evidence for certain gene-gene interactions. We further demonstrate that the community network is less biased, providing robust performance across different standards and comparisons for the model organisms. Finally, we apply Seidr to a network of drought stress in Norway spruce (Picea abies (L.) H. Krast) as an example application in a non-model species. We demonstrate the use of a network inferred using Seidr for identifying key components, communities and suggesting gene function for non-annotated genes.
Affiliation(s)
- Bastian Schiffthaler
  - Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
- Elena van Zalen
  - Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
- Alonso R. Serrano
  - Department of Plant Physiology, Umea Plant Science Center, Swedish University of Agricultural Sciences, Umea, Sweden
- Nathaniel R. Street
  - Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
- Nicolas Delhomme
  - Department of Plant Physiology, Umea Plant Science Center, Swedish University of Agricultural Sciences, Umea, Sweden
38
Merchant JP, Zhu K, Henrion MYR, Zaidi SSA, Lau B, Moein S, Alamprese ML, Pearse RV, Bennett DA, Ertekin-Taner N, Young-Pearse TL, Chang R. Predictive network analysis identifies JMJD6 and other potential key drivers in Alzheimer's disease. Commun Biol 2023; 6:503. [PMID: 37188718 PMCID: PMC10185548 DOI: 10.1038/s42003-023-04791-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/30/2021] [Accepted: 03/31/2023] [Indexed: 05/17/2023] Open
Abstract
Despite decades of genetic studies on late-onset Alzheimer's disease, the underlying molecular mechanisms remain unclear. To better comprehend its complex etiology, we use an integrative approach to build robust predictive (causal) network models using two large human multi-omics datasets. We delineate bulk-tissue gene expression into single cell-type gene expression and integrate clinical and pathologic traits, single nucleotide variation, and deconvoluted gene expression for the construction of cell type-specific predictive network models. Here, we focus on neuron-specific network models and prioritize 19 predicted key drivers modulating Alzheimer's pathology, which we then validate by knockdown in human induced pluripotent stem cell-derived neurons. We find that neuronal knockdown of 10 of the 19 targets significantly modulates levels of amyloid-beta and/or phosphorylated tau peptides, most notably JMJD6. We also confirm our network structure by RNA sequencing in the neurons following knockdown of each of the 10 targets, which additionally predicts that they are upstream regulators of REST and VGF. Our work thus identifies robust neuronal key drivers of the Alzheimer's-associated network state which may represent therapeutic targets with relevance to both amyloid and tau pathology in Alzheimer's disease.
Affiliation(s)
- Julie P Merchant
  - Ann Romney Center for Neurologic Diseases, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Neuroscience Graduate Group, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Kuixi Zhu
  - The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
- Marc Y R Henrion
  - Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
  - Malawi-Liverpool-Wellcome Trust Clinical Research Programme, PO Box 30096, Blantyre, Malawi
- Syed S A Zaidi
  - The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
- Branden Lau
  - The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
  - Arizona Research Labs, Genetics Core, University of Arizona, Tucson, AZ, USA
- Sara Moein
  - The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
- Melissa L Alamprese
  - The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
- Richard V Pearse
  - Ann Romney Center for Neurologic Diseases, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- David A Bennett
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Nilüfer Ertekin-Taner
  - Department of Neuroscience, Mayo Clinic Florida, Jacksonville, FL, USA
  - Department of Neurology, Mayo Clinic Florida, Jacksonville, FL, USA
- Tracy L Young-Pearse
  - Ann Romney Center for Neurologic Diseases, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Harvard Stem Cell Institute, Harvard University, Boston, MA, USA
- Rui Chang
  - The Center for Innovation in Brain Sciences, University of Arizona, Tucson, AZ, USA
  - Department of Neurology, University of Arizona, Tucson, AZ, USA
  - INTelico Therapeutics LLC, Tucson, AZ, USA
  - PATH Biotech LLC, Tucson, AZ, USA
39
Price PD, Parkus SM, Wright AE. Recent progress in understanding the genomic architecture of sexual conflict. Curr Opin Genet Dev 2023; 80:102047. [PMID: 37163877 DOI: 10.1016/j.gde.2023.102047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 04/02/2023] [Accepted: 04/02/2023] [Indexed: 05/12/2023]
Abstract
Genomic conflict between the sexes over shared traits is widely assumed to be resolved through the evolution of sex-biased expression and the subsequent emergence of sexually dimorphic phenotypes. However, while there is support for a broad relationship between genome-wide patterns of expression level and sexual conflict, recent studies suggest that sex differences in the nature and strength of interactions between loci are instead key to conflict resolution. Furthermore, the advent of new technologies for measuring and perturbing expression means we now have much more power to detect genomic signatures of sexual conflict. Here, we review our current understanding of the genomic architecture of sexual conflict in the light of these new studies and highlight the potential for novel approaches to address outstanding knowledge gaps.
Affiliation(s)
- Peter D Price
  - Ecology and Evolutionary Biology, School of Biosciences, University of Sheffield, United Kingdom. https://twitter.com/@PeterDPrice
- Sylvie M Parkus
  - Ecology and Evolutionary Biology, School of Biosciences, University of Sheffield, United Kingdom
- Alison E Wright
  - Ecology and Evolutionary Biology, School of Biosciences, University of Sheffield, United Kingdom
40
Li L, Sun L, Chen G, Wong CW, Ching WK, Liu ZP. LogBTF: gene regulatory network inference using Boolean threshold network model from single-cell gene expression data. Bioinformatics 2023; 39:btad256. [PMID: 37079737 PMCID: PMC10172039 DOI: 10.1093/bioinformatics/btad256] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/28/2022] [Revised: 02/25/2023] [Accepted: 04/13/2023] [Indexed: 04/22/2023] Open
Abstract
MOTIVATION: From a systematic perspective, it is crucial to infer and analyze gene regulatory network (GRN) from high-throughput single-cell RNA sequencing data. However, most existing GRN inference methods mainly focus on the network topology, only few of them consider how to explicitly describe the updated logic rules of regulation in GRNs to obtain their dynamics. Moreover, some inference methods also fail to deal with the over-fitting problem caused by the noise in time series data.
RESULTS: In this article, we propose a novel embedded Boolean threshold network method called LogBTF, which effectively infers GRN by integrating regularized logistic regression and Boolean threshold function. First, the continuous gene expression values are converted into Boolean values and the elastic net regression model is adopted to fit the binarized time series data. Then, the estimated regression coefficients are applied to represent the unknown Boolean threshold function of the candidate Boolean threshold network as the dynamical equations. To overcome the multi-collinearity and over-fitting problems, a new and effective approach is designed to optimize the network topology by adding a perturbation design matrix to the input data and thereafter setting sufficiently small elements of the output coefficient vector to zeros. In addition, the cross-validation procedure is implemented into the Boolean threshold network model framework to strengthen the inference capability. Finally, extensive experiments on one simulated Boolean value dataset, dozens of simulation datasets, and three real single-cell RNA sequencing datasets demonstrate that the LogBTF method can infer GRNs from time series data more accurately than some other alternative methods for GRN inference.
AVAILABILITY AND IMPLEMENTATION: The source data and code are available at https://github.com/zpliulab/LogBTF.
Affiliation(s)
- Lingyu Li
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
  - Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
- Liangjie Sun
  - Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
- Guangyi Chen
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
- Chi-Wing Wong
  - Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
- Wai-Ki Ching
  - Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
- Zhi-Ping Liu
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
41
Karaaslanli A, Saha S, Maiti T, Aviyente S. Kernelized multiview signed graph learning for single-cell RNA sequencing data. BMC Bioinformatics 2023; 24:127. [PMID: 37016281 PMCID: PMC10071725 DOI: 10.1186/s12859-023-05250-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2022] [Accepted: 03/22/2023] [Indexed: 04/06/2023] Open
Abstract
BACKGROUND: Characterizing the topology of gene regulatory networks (GRNs) is a fundamental problem in systems biology. The advent of single cell technologies has made it possible to construct GRNs at finer resolutions than bulk and microarray datasets. However, cellular heterogeneity and sparsity of the single cell datasets render void the application of regular Gaussian assumptions for constructing GRNs. Additionally, most GRN reconstruction approaches estimate a single network for the entire data. This could cause potential loss of information when single cell datasets are generated from multiple treatment conditions/disease states.
RESULTS: To better characterize single cell GRNs under different but related conditions, we propose the joint estimation of multiple networks using multiple signed graph learning (scMSGL). The proposed method is based on recently developed graph signal processing (GSP) based graph learning, where GRNs and gene expressions are modeled as signed graphs and graph signals, respectively. scMSGL learns multiple GRNs by optimizing the total variation of gene expressions with respect to GRNs while ensuring that the learned GRNs are similar to each other through regularization with respect to a learned signed consensus graph. We further kernelize scMSGL with the kernel selected to suit the structure of single cell data.
CONCLUSIONS: scMSGL is shown to have superior performance over existing state of the art methods in GRN recovery on simulated datasets. Furthermore, scMSGL successfully identifies well-established regulators in a mouse embryonic stem cell differentiation study and a cancer clinical study of medulloblastoma.
Affiliation(s)
- Abdullah Karaaslanli
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, USA
- Satabdi Saha
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Tapabrata Maiti
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI, USA
- Selin Aviyente
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, USA
42
McCalla SG, Fotuhi Siahpirani A, Li J, Pyne S, Stone M, Periyasamy V, Shin J, Roy S. Identifying strengths and weaknesses of methods for computational network inference from single-cell RNA-seq data. G3 (Bethesda) 2023; 13:jkad004. [PMID: 36626328 PMCID: PMC9997554 DOI: 10.1093/g3journal/jkad004] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/09/2022] [Revised: 11/09/2022] [Accepted: 12/16/2022] [Indexed: 01/11/2023]
Abstract
Single-cell RNA-sequencing (scRNA-seq) offers unparalleled insight into the transcriptional programs of different cellular states by measuring the transcriptome of thousands of individual cells. An emerging problem in the analysis of scRNA-seq is the inference of transcriptional gene regulatory networks and a number of methods with different learning frameworks have been developed to address this problem. Here, we present an expanded benchmarking study of eleven recent network inference methods on seven published scRNA-seq datasets in human, mouse, and yeast considering different types of gold standard networks and evaluation metrics. We evaluate methods based on their computing requirements as well as on their ability to recover the network structure. We find that, while most methods have a modest recovery of experimentally derived interactions based on global metrics such as Area Under the Precision Recall curve, methods are able to capture targets of regulators that are relevant to the system under study. Among the top performing methods that use only expression were SCENIC, PIDC, MERLIN or Correlation. Addition of prior biological knowledge and the estimation of transcription factor activities resulted in the best overall performance with the Inferelator and MERLIN methods that use prior knowledge outperforming methods that use expression alone. We found that imputation for network inference did not improve network inference accuracy and could be detrimental. Comparisons of inferred networks for comparable bulk conditions showed that the networks inferred from scRNA-seq datasets are often better or at par with the networks inferred from bulk datasets. Our analysis should be beneficial in selecting methods for network inference. At the same time, this highlights the need for improved methods and better gold standards for regulatory network inference from scRNAseq datasets.
Affiliation(s)
- Sunnie Grace McCalla
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Jiaxin Li
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Saptarshi Pyne
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Matthew Stone
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Viswesh Periyasamy
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
- Junha Shin
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Sushmita Roy
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
43
Franchini M, Pellecchia S, Viscido G, Gambardella G. Single-cell gene set enrichment analysis and transfer learning for functional annotation of scRNA-seq data. NAR Genom Bioinform 2023; 5:lqad024. [PMID: 36879897 PMCID: PMC9985338 DOI: 10.1093/nargab/lqad024] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 10/19/2022] [Revised: 01/16/2023] [Accepted: 02/20/2023] [Indexed: 03/07/2023] Open
Abstract
Although an essential step, cell functional annotation often proves particularly challenging from single-cell transcriptional data. Several methods have been developed to accomplish this task. However, in most cases, these rely on techniques initially developed for bulk RNA sequencing or simply make use of marker genes identified from cell clustering followed by supervised annotation. To overcome these limitations and automatize the process, we have developed two novel methods, the single-cell gene set enrichment analysis (scGSEA) and the single-cell mapper (scMAP). scGSEA combines latent data representations and gene set enrichment scores to detect coordinated gene activity at single-cell resolution. scMAP uses transfer learning techniques to re-purpose and contextualize new cells into a reference cell atlas. Using both simulated and real datasets, we show that scGSEA effectively recapitulates recurrent patterns of pathways' activity shared by cells from different experimental conditions. At the same time, we show that scMAP can reliably map and contextualize new single-cell profiles on a breast cancer atlas we recently released. Both tools are provided in an effective and straightforward workflow providing a framework to determine cell function and significantly improve annotation and interpretation of scRNA-seq data.
Affiliation(s)
- Melania Franchini
  - Telethon Institute of Genetics and Medicine, Pozzuoli 80078 Naples, Italy
  - Department of Electrical Engineering and Information Technologies, University of Naples Federico II, 80125 Naples, Italy
- Simona Pellecchia
  - Telethon Institute of Genetics and Medicine, Pozzuoli 80078 Naples, Italy
- Gaetano Viscido
  - Telethon Institute of Genetics and Medicine, Pozzuoli 80078 Naples, Italy
- Gennaro Gambardella
  - Telethon Institute of Genetics and Medicine, Pozzuoli 80078 Naples, Italy
  - Department of Chemical Materials and Industrial Engineering, University of Naples Federico II, 80125 Naples, Italy
44
Oubounyt M, Elkjaer ML, Laske T, Grønning AB, Moeller M, Baumbach J. De-novo reconstruction and identification of transcriptional gene regulatory network modules differentiating single-cell clusters. NAR Genom Bioinform 2023; 5:lqad018. [PMID: 36879901 PMCID: PMC9985332 DOI: 10.1093/nargab/lqad018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 01/16/2023] [Accepted: 02/09/2023] [Indexed: 03/07/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) technology provides an unprecedented opportunity to understand gene functions and interactions at single-cell resolution. While computational tools for scRNA-seq data analysis to decipher differential gene expression profiles and differential pathway expression exist, we still lack methods to learn differential regulatory disease mechanisms directly from the single-cell data. Here, we provide a new methodology, named DiNiro, to unravel such mechanisms de novo and report them as small, easily interpretable transcriptional regulatory network modules. We demonstrate that DiNiro is able to uncover novel, relevant, and deep mechanistic models that not just predict but explain differential cellular gene expression programs. DiNiro is available at https://exbio.wzw.tum.de/diniro/.
Affiliation(s)
- Mhaned Oubounyt
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Maria L Elkjaer
  - Department of Neurology, Odense University Hospital, Odense, Denmark
  - Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
  - Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Tanja Laske
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Alexander G B Grønning
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Marcus J Moeller
  - Heisenberg Chair of Preventive and Translational Nephrology, Department of Nephrology, Rheumatology and Clinical Immunology, RWTH Aachen University, Aachen, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
45
van der Sande M, Frölich S, van Heeringen SJ. Computational approaches to understand transcription regulation in development. Biochem Soc Trans 2023; 51:1-12. [PMID: 36695505 PMCID: PMC9988001 DOI: 10.1042/bst20210145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 01/07/2023] [Accepted: 01/13/2023] [Indexed: 01/26/2023]
Abstract
Gene regulatory networks (GRNs) serve as useful abstractions to understand transcriptional dynamics in developmental systems. Computational prediction of GRNs has been successfully applied to genome-wide gene expression measurements with the advent of microarrays and RNA-sequencing. However, these inferred networks are inaccurate and mostly based on correlative rather than causative interactions. In this review, we highlight three approaches that significantly impact GRN inference: (1) moving from one genome-wide functional modality, gene expression, to multi-omics, (2) single cell sequencing, to measure cell type-specific signals and predict context-specific GRNs, and (3) neural networks as flexible models. Together, these experimental and computational developments have the potential to significantly impact the quality of inferred GRNs. Ultimately, accurately modeling the regulatory interactions between transcription factors and their target genes will be essential to understand the role of transcription factors in driving developmental gene expression programs and to derive testable hypotheses for validation.
Affiliation(s)
- Simon J. van Heeringen
  - Radboud University, Department of Molecular Developmental Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, 6525GA Nijmegen, The Netherlands
46
Escorcia-Rodríguez JM, Gaytan-Nuñez E, Hernandez-Benitez EM, Zorro-Aranda A, Tello-Palencia MA, Freyre-González JA. Improving gene regulatory network inference and assessment: The importance of using network structure. Front Genet 2023; 14:1143382. [PMID: 36926589 PMCID: PMC10012345 DOI: 10.3389/fgene.2023.1143382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 02/20/2023] [Indexed: 03/03/2023] Open
Abstract
Gene regulatory networks are graph models representing cellular transcription events. Networks are far from complete due to time and resource consumption for experimental validation and curation of the interactions. Previous assessments have shown the modest performance of the available network inference methods based on gene expression data. Here, we study several caveats on the inference of regulatory networks and methods assessment through the quality of the input data and gold standard, and the assessment approach with a focus on the global structure of the network. We used synthetic and biological data for the predictions and experimentally-validated biological networks as the gold standard (ground truth). Standard performance metrics and graph structural properties suggest that methods inferring co-expression networks should no longer be assessed equally with those inferring regulatory interactions. While methods inferring regulatory interactions perform better in global regulatory network inference than co-expression-based methods, the latter is better suited to infer function-specific regulons and co-regulation networks. When merging expression data, the size increase should outweigh the noise inclusion and graph structure should be considered when integrating the inferences. We conclude with guidelines to take advantage of inference methods and their assessment based on the applications and available expression datasets.
Affiliation(s)
- Juan M Escorcia-Rodríguez
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
- Estefani Gaytan-Nuñez
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
  - Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
- Ericka M Hernandez-Benitez
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
  - Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
- Andrea Zorro-Aranda
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
  - Department of Chemical Engineering, Universidad de Antioquia, Medellín, Colombia
- Marco A Tello-Palencia
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
  - Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
- Julio A Freyre-González
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
47
Su M, Pan T, Chen QZ, Zhou WW, Gong Y, Xu G, Yan HY, Li S, Shi QZ, Zhang Y, He X, Jiang CJ, Fan SC, Li X, Cairns MJ, Wang X, Li YS. Data analysis guidelines for single-cell RNA-seq in biomedical studies and clinical applications. Mil Med Res 2022; 9:68. [PMID: 36461064 PMCID: PMC9716519 DOI: 10.1186/s40779-022-00434-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022] Open
Abstract
The application of single-cell RNA sequencing (scRNA-seq) in biomedical research has advanced our understanding of the pathogenesis of disease and provided valuable insights into new diagnostic and therapeutic strategies. With the expansion of capacity for high-throughput scRNA-seq, including clinical samples, the analysis of these huge volumes of data has become a daunting prospect for researchers entering this field. Here, we review the workflow for typical scRNA-seq data analysis, covering raw data processing and quality control, basic data analysis applicable for almost all scRNA-seq data sets, and advanced data analysis that should be tailored to specific scientific questions. While summarizing the current methods for each analysis step, we also provide an online repository of software and wrapped-up scripts to support the implementation. Recommendations and caveats are pointed out for some specific analysis tasks and approaches. We hope this resource will be helpful to researchers engaging with scRNA-seq, in particular for emerging clinical applications.
Affiliation(s)
- Min Su
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Tao Pan
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Qiu-Zhen Chen
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Wei-Wei Zhou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
- Yi Gong
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
  - Department of Immunology, Nanjing Medical University, Nanjing, 211166 China
- Gang Xu
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Huan-Yu Yan
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Si Li
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Qiao-Zhen Shi
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Ya Zhang
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Xiao He
  - Department of Laboratory Medicine, Women and Children’s Hospital of Chongqing Medical University, Chongqing, 401174 China
- Shi-Cai Fan
  - Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110 Guangdong China
- Xia Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
- Murray J. Cairns
  - School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, the University of Newcastle, University Drive, Callaghan, NSW 2308 Australia
  - Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Xi Wang
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Yong-Sheng Li
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
48
Athieniti E, Spyrou GM. A guide to multi-omics data collection and integration for translational medicine. Comput Struct Biotechnol J 2022; 21:134-149. [PMID: 36544480 PMCID: PMC9747357 DOI: 10.1016/j.csbj.2022.11.050] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 06/23/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/02/2022] Open
Abstract
The emerging high-throughput technologies have led to the shift in the design of translational medicine projects towards collecting multi-omics patient samples and, consequently, their integrated analysis. However, the complexity of integrating these datasets has triggered new questions regarding the appropriateness of the available computational methods. Currently, there is no clear consensus on the best combination of omics to include and the data integration methodologies required for their analysis. This article aims to guide the design of multi-omics studies in the field of translational medicine regarding the types of omics and the integration method to choose. We review articles that perform the integration of multiple omics measurements from patient samples. We identify five objectives in translational medicine applications: (i) detect disease-associated molecular patterns, (ii) subtype identification, (iii) diagnosis/prognosis, (iv) drug response prediction, and (v) understand regulatory processes. We describe common trends in the selection of omic types combined for different objectives and diseases. To guide the choice of data integration tools, we group them into the scientific objectives they aim to address. We describe the main computational methods adopted to achieve these objectives and present examples of tools. We compare tools based on how they deal with the computational challenges of data integration and comment on how they perform against predefined objective-specific evaluation criteria. Finally, we discuss examples of tools for downstream analysis and further extraction of novel insights from multi-omics datasets.
Affiliation(s)
- Efi Athieniti
  - Department of Bioinformatics, The Cyprus Institute of Neurology and Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus
- George M. Spyrou
  - Department of Bioinformatics, The Cyprus Institute of Neurology and Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus
49
Luo Q, Maity AK, Teschendorff AE. Distance covariance entropy reveals primed states and bifurcation dynamics in single-cell RNA-Seq data. iScience 2022; 25:105709. [PMID: 36578319 PMCID: PMC9791356 DOI: 10.1016/j.isci.2022.105709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Revised: 11/08/2022] [Accepted: 11/29/2022] [Indexed: 12/03/2022] Open
Abstract
Cell-fate transitions are fundamental to development and differentiation. Studying them with single-cell omic data is important to advance our understanding of the cell-fate commitment process, yet this remains challenging. Here we present a computational method called DICE, which analyzes the entropy of expression covariation patterns and which is applicable to static and dynamically changing cell populations. Using only single-cell RNA-Seq data, DICE is able to predict multipotent primed states and their regulatory factors, which we subsequently validate with single-cell epigenomic data. DICE reveals that primed states are often defined by epigenetic regulators or pioneer factors alongside lineage-specific transcription factors. In developmental time course single-cell RNA-Seq datasets, DICE can pinpoint the timing of bifurcations more precisely than lineage-trajectory inference algorithms or competing variance-based methods. In summary, by studying the dynamic changes of expression covariation entropy, DICE can help elucidate primed states and bifurcation dynamics without the need for single-cell epigenomic data.
Affiliation(s)
- Qi Luo
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Alok K. Maity
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Andrew E. Teschendorff (corresponding author)
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
50
Zeng L, Yang K, Zhang T, Zhu X, Hao W, Chen H, Ge J. Research progress of single-cell transcriptome sequencing in autoimmune diseases and autoinflammatory disease: A review. J Autoimmun 2022; 133:102919. [PMID: 36242821 DOI: 10.1016/j.jaut.2022.102919] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/13/2022] [Revised: 09/16/2022] [Accepted: 09/19/2022] [Indexed: 12/07/2022]
Abstract
Autoimmunity refers to the phenomenon in which the body's immune system produces antibodies or sensitized lymphocytes against its own tissues, causing an immune response. Immune disorders caused by autoimmunity can mediate autoimmune diseases. Autoimmune diseases have a complicated pathogenesis because of the many cell types involved, and the mechanism is still unclear. Single-cell technology addresses a limitation of conventional transcriptome technology, which cannot resolve expression at the level of cell type. It provides unbiased results through independent analysis of cells in tissues and yields more mRNA information for identifying cell subpopulations, offering a novel approach to studying the disruption of immune tolerance and the disturbance of pro-inflammatory pathways on a cellular basis. It may fundamentally change the understanding of molecular pathways in the pathogenesis of autoimmune diseases and aid the development of targeted drugs. Single-cell transcriptome sequencing (scRNA-seq) has been widely applied in autoimmune diseases, where it is a powerful tool for demonstrating the cellular heterogeneity of tissues involved in various immune inflammations, identifying pathogenic cell populations, and revealing the mechanisms of disease occurrence and development. This review describes the principles of scRNA-seq, introduces common sequencing platforms and practical procedures, and focuses on the progress of scRNA-seq in 41 autoimmune diseases, which include 9 systemic autoimmune and autoinflammatory diseases (rheumatoid arthritis, systemic lupus erythematosus, etc.) and 32 organ-specific autoimmune diseases (5 skin diseases; 3 nervous system diseases; 4 eye diseases; 2 respiratory system diseases; 2 circulatory system diseases; 6 liver, gallbladder, and pancreas diseases; 2 gastrointestinal system diseases; 3 muscle, bone, and joint diseases; 3 urinary system diseases; 2 reproductive system diseases).
This review also discusses prospective molecular targets in autoimmune diseases, drawing on multi-molecular-level and multi-dimensional analysis combined with single-cell multi-omics sequencing technologies (such as scRNA-seq, single-cell ATAC-seq, and single-cell immune repertoire sequencing), as a reference for further exploring the pathogenesis and marker screening of autoimmune and autoinflammatory diseases.
Affiliation(s)
- Liuting Zeng
  - Department of Rheumatology, Peking Union Medical College Hospital, Chinese Academy of Medical Science & Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases, State Key Laboratory of Complex Severe and Rare Diseases, Beijing, China
- Kailin Yang
  - Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China
- Tianqing Zhang
  - Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China
- Xiaofei Zhu
  - Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China
- Wensa Hao
  - Institute of Materia Medica, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Hua Chen
  - Department of Rheumatology, Peking Union Medical College Hospital, Chinese Academy of Medical Science & Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases, State Key Laboratory of Complex Severe and Rare Diseases, Beijing, China
- Jinwen Ge
  - Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China
  - Hunan Academy of Chinese Medicine, Changsha, China