1
Prater KE, Lin KZ. All the single cells: Single-cell transcriptomics/epigenomics experimental design and analysis considerations for glial biologists. Glia 2025; 73:451-473. [PMID: 39558887 PMCID: PMC11809281 DOI: 10.1002/glia.24633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 09/18/2024] [Accepted: 10/10/2024] [Indexed: 11/20/2024]
Abstract
Single-cell transcriptomics, epigenomics, and other 'omics applied at single-cell resolution can significantly advance hypotheses and understanding of glial biology. Omics technologies are revealing a large and growing number of new glial cell subtypes, defined by their gene expression profile. These subtypes have significant implications for understanding glial cell function, cell-cell communications, and glia-specific changes between homeostasis and conditions such as neurological disease. For many, the training in how to analyze, interpret, and understand these large datasets has been through reading and understanding literature from other fields like biostatistics. Here, we provide a primer for glial biologists on experimental design and analysis of single-cell RNA-seq datasets. Our goal is to further the understanding of why decisions are made about datasets and to enhance biologists' ability to interpret and critique their work and the work of others. We review the steps involved in single-cell analysis with a focus on decision points and particular notes for glia. The goal of this primer is to ensure that single-cell 'omics experiments continue to advance glial biology in a rigorous and replicable way.
Affiliation(s)
- Katherine E. Prater
- Department of Neurology, University of Washington School of Medicine, Seattle 98195
- Kevin Z. Lin
- Department of Biostatistics, University of Washington, Seattle 98195
2
Gill J, Dasgupta A, Manry B, Markuzon N. Combining single-cell ATAC and RNA sequencing for supervised cell annotation. BMC Bioinformatics 2025; 26:67. [PMID: 40011801 PMCID: PMC11863512 DOI: 10.1186/s12859-025-06084-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2024] [Accepted: 02/13/2025] [Indexed: 02/28/2025] [Open Access]
Abstract
MOTIVATION Single-cell analysis offers insights into cellular heterogeneity and individual cell function. Cell type annotation is the first and critical step in such an analysis. Current methods mostly use single-cell RNA sequencing data. Several studies have demonstrated improved unsupervised annotation when combining RNA with single-cell ATAC sequencing, but improvements in supervised methods have not been explored. RESULTS Single-cell 10x Genomics Multiome datasets containing paired ATAC and RNA from human peripheral blood mononuclear cells (PBMC) and neuronal cells with Alzheimer's disease were used for supervised annotation. Using linear and nonlinear dimensionality reduction methods and random forest, support vector machine, and logistic regression classification models, we demonstrate improved supervised annotation and prediction confidence in PBMC data when using a combination of RNA-seq and ATAC-seq data. No such improvement was observed when annotating neuronal cells. Specifically, F1 scores improved when using scVI embeddings to annotate PBMC subtypes. CD4 T effector memory cells showed the largest improvement in F1 score.
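The abstract above reports its gains as per-subtype F1 scores. As a reminder of what that metric measures, here is a minimal sketch in plain Python; the labels and predictions are made-up toy values, not data from the paper, and the function names are hypothetical.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[t] += 1  # true label gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Toy annotation task (hypothetical labels): one effector memory cell is
# mistaken for a naive cell.
truth = ["CD4 TEM", "CD4 TEM", "CD4 Naive", "CD4 Naive", "B"]
predicted = ["CD4 Naive", "CD4 TEM", "CD4 Naive", "CD4 Naive", "B"]
print(per_class_f1(truth, predicted))
print(macro_f1(truth, predicted))
```

Reporting F1 per class rather than overall accuracy is what lets a study single out one subtype, such as CD4 T effector memory cells, as benefiting most.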
Affiliation(s)
- Jaidip Gill
- School of Public Health, Imperial College London, London, England
- Brychan Manry
- Oncology Data Science, AstraZeneca, Gaithersburg, MD, USA
3
Sun J, Philpott M, Loi D, Hoffman G, Robson J, Mehta N, Calcutt E, Gamble V, Brown T, Brown T, Oppermann U, Cribbs AP. Enhancing single-cell transcriptomics using interposed anchor oligonucleotide sequences. Commun Biol 2025; 8:67. [PMID: 39819888 PMCID: PMC11739374 DOI: 10.1038/s42003-025-07474-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Accepted: 01/07/2025] [Indexed: 01/19/2025] [Open Access]
Abstract
Single-cell transcriptomics, which utilises barcodes and unique molecular identifiers (UMIs) for polyA+ mRNA capture, is compromised by oligonucleotide synthesis errors. To address this, we modified the oligonucleotide capture design and integrated an interposed anchor between the barcode and the UMI. This design significantly reduces the need to discard reads due to synthesis inaccuracies. Our results demonstrate that this anchor-enhanced design substantially improves gene expression profiles in droplet-based single-cell sequencing analyses.
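To illustrate why an anchor between the barcode and the UMI can rescue reads, the sketch below compares position-only parsing with anchor-based parsing. It is an illustration under stated assumptions, not the authors' pipeline: the anchor sequence, the barcode and UMI lengths, the function names, and the choice of a one-base deletion as the synthesis error are all hypothetical.

```python
ANCHOR = "ACGTACGT"   # hypothetical anchor sequence
BARCODE_LEN = 12      # hypothetical design lengths
UMI_LEN = 8

def split_fixed(read):
    """Position-only parsing for a barcode+UMI oligo with no anchor."""
    return read[:BARCODE_LEN], read[BARCODE_LEN:BARCODE_LEN + UMI_LEN]

def split_with_anchor(read, max_shift=2):
    """Find the anchor within max_shift bases of its expected start.

    Returns (barcode, umi), or None if the anchor or a full UMI is missing.
    """
    lo = max(0, BARCODE_LEN - max_shift)
    hi = BARCODE_LEN + max_shift
    pos = read.find(ANCHOR, lo, hi + len(ANCHOR))
    if pos == -1:
        return None
    umi_start = pos + len(ANCHOR)
    umi = read[umi_start:umi_start + UMI_LEN]
    if len(umi) < UMI_LEN:
        return None
    return read[:pos], umi

# A barcode that lost one base during synthesis (11 bases instead of 12).
short_barcode, umi, tail = "AAAACCCGGGG", "TTTTGGGG", "TTTT"

# Without an anchor the UMI frame shifts by one base.
print(split_fixed(short_barcode + umi + tail))
# With an anchor the UMI is recovered intact.
print(split_with_anchor(short_barcode + ANCHOR + umi + tail))
```

In the anchored case the UMI is read relative to the anchor, so a shortened barcode no longer corrupts it; the shortened barcode itself would still need separate handling.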
Affiliation(s)
- Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, National Institute of Health Research Oxford Biomedical Research Unit (BRU), University of Oxford, Oxford, UK
- Martin Philpott
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, National Institute of Health Research Oxford Biomedical Research Unit (BRU), University of Oxford, Oxford, UK
- Danson Loi
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, National Institute of Health Research Oxford Biomedical Research Unit (BRU), University of Oxford, Oxford, UK
- Neelam Mehta
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, National Institute of Health Research Oxford Biomedical Research Unit (BRU), University of Oxford, Oxford, UK
- Eleanor Calcutt
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, National Institute of Health Research Oxford Biomedical Research Unit (BRU), University of Oxford, Oxford, UK
- Vicki Gamble
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, National Institute of Health Research Oxford Biomedical Research Unit (BRU), University of Oxford, Oxford, UK
- Tom Brown
- ATDBio Ltd (now part of Biotage), Oxford, UK
- Tom Brown
- Chemistry Research Laboratory, Department of Chemistry, University of Oxford, Oxford, UK
- Udo Oppermann
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, National Institute of Health Research Oxford Biomedical Research Unit (BRU), University of Oxford, Oxford, UK
- Oxford Centre for Translational Myeloma Research, University of Oxford, Oxford, UK
- Adam P Cribbs
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, National Institute of Health Research Oxford Biomedical Research Unit (BRU), University of Oxford, Oxford, UK.
- Oxford Centre for Translational Myeloma Research, University of Oxford, Oxford, UK.
4
Khetan N, Zuckerman B, Calia GP, Chen X, Garcia Arceo X, Weinberger LS. Single-cell RNA sequencing algorithms underestimate changes in transcriptional noise compared to single-molecule RNA imaging. Cell Rep Methods 2024; 4:100933. [PMID: 39662473 PMCID: PMC11704610 DOI: 10.1016/j.crmeth.2024.100933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Revised: 08/07/2024] [Accepted: 11/15/2024] [Indexed: 12/13/2024]
Abstract
Stochastic fluctuations (noise) in transcription generate substantial cell-to-cell variability. However, how best to quantify genome-wide noise remains unclear. Here, we utilize a small-molecule perturbation (5'-iodo-2'-deoxyuridine [IdU]) to amplify noise and assess noise quantification from numerous single-cell RNA sequencing (scRNA-seq) algorithms on human and mouse datasets and then compare it to noise quantification from single-molecule RNA fluorescence in situ hybridization (smFISH) for a panel of representative genes. We find that various scRNA-seq analyses report amplified noise, without altered mean expression levels, for ∼90% of genes and that smFISH analysis verifies noise amplification for the vast majority of tested genes. Collectively, the analyses suggest that most scRNA-seq algorithms (including a simple normalization approach) are appropriate for quantifying noise, although all algorithms appear to systematically underestimate noise changes compared to smFISH. For practical purposes, this analysis further argues that IdU noise enhancement is globally penetrant (i.e., it homeostatically increases noise without altering mean expression levels) and could enable investigations of the physiological impacts of transcriptional noise.
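The phrase "amplified noise without altered mean expression" can be made concrete with two standard summary statistics: the squared coefficient of variation (variance divided by mean squared) and the Fano factor (variance divided by mean). The sketch below computes both for one gene; the counts are invented toy values and the function name is hypothetical, and this is not a reimplementation of any algorithm benchmarked in the paper.

```python
from statistics import mean, pvariance

def noise_metrics(counts):
    """Mean, squared coefficient of variation, and Fano factor for one gene."""
    m = mean(counts)
    if m == 0:
        raise ValueError("noise metrics are undefined for a gene with zero mean")
    v = pvariance(counts, mu=m)
    return {"mean": m, "cv2": v / m**2, "fano": v / m}

# Toy per-cell counts for one gene (hypothetical values): same mean,
# much wider spread after perturbation.
control = [8, 10, 12, 9, 11, 10]
perturbed = [2, 18, 5, 15, 1, 19]

print(noise_metrics(control))
print(noise_metrics(perturbed))
```

Both toy conditions have a mean of 10, so any difference in CV² or Fano factor reflects a change in cell-to-cell variability alone.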
Affiliation(s)
- Neha Khetan
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, San Francisco, CA 94158, USA
- Binyamin Zuckerman
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, San Francisco, CA 94158, USA
- Giuliana P Calia
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, San Francisco, CA 94158, USA
- Xinyue Chen
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, San Francisco, CA 94158, USA
- Ximena Garcia Arceo
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, San Francisco, CA 94158, USA
- Leor S Weinberger
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94158, USA; Institute for Evolvable Medicines, Oakland, CA, USA; Autonomous Therapeutics, Inc., Rockville, MD, USA.
5
Yuan L, Sun S, Jiang Y, Zhang Q, Ye L, Zheng CH, Huang DS. scRGCL: a cell type annotation method for single-cell RNA-seq data using residual graph convolutional neural network with contrastive learning. Brief Bioinform 2024; 26:bbae662. [PMID: 39708840 DOI: 10.1093/bib/bbae662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Revised: 11/13/2024] [Accepted: 12/04/2024] [Indexed: 12/23/2024] [Open Access]
Abstract
Cell type annotation is a critical step in analyzing single-cell RNA sequencing (scRNA-seq) data. A large number of deep learning (DL)-based methods have been proposed to annotate cell types in scRNA-seq data and have achieved impressive results. However, these methods have several limitations. First, they do not fully exploit cell-to-cell differential features. Second, they are built on shallow features and lack flexibility in integrating high-order features in the data. Finally, the low-dimensional gene features may lead to overfitting in neural networks. To overcome these limitations, we propose scRGCL, a novel DL-based model for cell type annotation of scRNA-seq data that combines a residual graph convolutional neural network with contrastive learning. scRGCL mainly consists of a residual graph convolutional neural network, contrastive learning, and weight freezing. The residual graph convolutional neural network extracts complex high-order features from the data. Contrastive learning helps the model learn meaningful cell-to-cell differential features. Weight freezing helps avoid overfitting and helps the model discover the impact of specific gene expression on cell type annotation. To verify the effectiveness of scRGCL, we compared its performance with six methods (three shallow learning algorithms and three state-of-the-art DL-based methods) on eight single-cell benchmark datasets from two species (seven human and one mouse). Experimental results not only show that scRGCL outperforms competing methods but also demonstrate its generalizability for cell type annotation. scRGCL is available at https://github.com/nathanyl/scRGCL.
Affiliation(s)
- Lin Yuan
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, 3501 Daxue Road, 250353, Shandong, China
- Shengguo Sun
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, 3501 Daxue Road, 250353, Shandong, China
- Yufeng Jiang
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China
- Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, 3501 Daxue Road, 250353, Shandong, China
- Qinhu Zhang
- Ningbo Institute of Digital Twin, Eastern Institute of Technology, 568 Tongxin Road, 315201, Zhejiang, China
- Lan Ye
- Cancer Center, The Second Hospital of Shandong University, 247 Beiyuan Street, 250033, Shandong, China
- Chun-Hou Zheng
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- De-Shuang Huang
- Ningbo Institute of Digital Twin, Eastern Institute of Technology, 568 Tongxin Road, 315201, Zhejiang, China
6
Khetan N, Zuckerman B, Calia GP, Chen X, Arceo XG, Weinberger LS. Quantitative comparison of single-cell RNA sequencing versus single-molecule RNA imaging for quantifying transcriptional noise. bioRxiv 2024:2024.08.09.607289. [PMID: 39149226 PMCID: PMC11326230 DOI: 10.1101/2024.08.09.607289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Stochastic fluctuations (noise) in transcription generate substantial cell-to-cell variability. However, how best to quantify genome-wide noise remains unclear. Here we utilize a small-molecule perturbation (IdU) to amplify noise and assess noise quantification from numerous scRNA-seq algorithms on human and mouse datasets, and then compare it to noise quantification from single-molecule RNA FISH (smFISH) for a panel of representative genes. We find that various scRNA-seq analyses report amplified noise, without altered mean-expression levels, for ~90% of genes and that smFISH analysis verifies noise amplification for the vast majority of genes tested. Collectively, the analyses suggest that most scRNA-seq algorithms, including a simple normalization approach, are appropriate for quantifying noise, although all of these systematically underestimate noise compared to smFISH. From a practical standpoint, this analysis argues that IdU is a globally penetrant noise-enhancer molecule (amplifying noise without altering mean-expression levels), which could enable investigations of the physiological impacts of transcriptional noise.
Affiliation(s)
- Neha Khetan
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Binyamin Zuckerman
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Giuliana P. Calia
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Xinyue Chen
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Ximena Garcia Arceo
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Leor S. Weinberger
- Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Department of Biochemistry and Biophysics, University of California, San Francisco, CA 94158
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158
- Lead contact
7
Mohammadi H, Baranpouyan M, Thirunarayan K, Chen L. HyperCell: Advancing Cell Type Classification with Hyperdimensional Computing. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-4. [PMID: 40039180 DOI: 10.1109/embc53108.2024.10782122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Single-cell RNA sequencing (scRNA-seq) has revolutionized genomics, enabling the exploration of cellular heterogeneity at an unprecedented resolution. However, scRNA-seq data poses challenges, including high dimensionality, inherent noise, and sparse gene expression. In this paper, we propose a novel approach, utilizing hyperdimensional computing, to enhance cell type classification accuracy in scRNA-seq datasets. We use the QuantHD method for high-dimensional hypervector encoding and iterative training. Experiments on diverse datasets subjected to both split by batch and random split settings demonstrate the superiority of our proposed model in handling noise and outperforming established classification methods such as XGBoost, Seurat reference mapping, and scANVI. Our findings highlight the potential of hyperdimensional computing to advance single-cell data analysis, yielding deep insights into cellular dynamics, tissue functions, and disease mechanisms. This work paves the way for more accurate cell type annotation and brings new opportunities for biomedical research and personalized medicine.
8
Radley A, Boeing S, Smith A. Branching topology of the human embryo transcriptome revealed by Entropy Sort Feature Weighting. Development 2024; 151:dev202832. [PMID: 38691188 PMCID: PMC11213519 DOI: 10.1242/dev.202832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 04/24/2024] [Indexed: 05/03/2024]
Abstract
Analysis of single cell transcriptomics (scRNA-seq) data is typically performed after subsetting to highly variable genes (HVGs). Here, we show that Entropy Sorting provides an alternative mathematical framework for feature selection. On synthetic datasets, continuous Entropy Sort Feature Weighting (cESFW) outperforms HVG selection in distinguishing cell-state-specific genes. We apply cESFW to six merged scRNA-seq datasets spanning human early embryo development. Without smoothing or augmenting the raw counts matrices, cESFW generates a high-resolution embedding displaying coherent developmental progression from eight-cell to post-implantation stages and delineating 15 distinct cell states. The embedding highlights sequential lineage decisions during blastocyst development, while unsupervised clustering identifies branch point populations obscured in previous analyses. The first branching region, where morula cells become specified for inner cell mass or trophectoderm, includes cells previously asserted to lack a developmental trajectory. We quantify the relatedness of different pluripotent stem cell cultures to distinct embryo cell types and identify marker genes of naïve and primed pluripotency. Finally, by revealing genes with dynamic lineage-specific expression, we provide markers for staging progression from morula to blastocyst.
Affiliation(s)
- Arthur Radley
- Living Systems Institute, University of Exeter, Stocker Road, Exeter EX4 4QD, UK
- Stefan Boeing
- Bioinformatics and Biostatistics Science Technology Platform, The Francis Crick Institute, London NW1 1AT, UK
- Austin Smith
- Living Systems Institute, University of Exeter, Stocker Road, Exeter EX4 4QD, UK
9
Shu C, Street K, Breton CV, Bastain TM, Wilson ML. A review of single-cell transcriptomics and epigenomics studies in maternal and child health. Epigenomics 2024; 16:775-793. [PMID: 38709139 PMCID: PMC11318716 DOI: 10.1080/17501911.2024.2343276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 04/11/2024] [Indexed: 05/07/2024] [Open Access]
Abstract
Single-cell sequencing technologies enhance our understanding of cellular dynamics throughout pregnancy. We outlined the workflow of single-cell sequencing techniques and reviewed single-cell studies in maternal and child health. We conducted a literature review of single-cell studies on maternal and child health using PubMed. We summarized the findings from 16 single-cell atlases of the human and mammalian placenta across gestational stages and 31 single-cell studies on maternal exposures and complications including infection, obesity, diet, gestational diabetes, pre-eclampsia, environmental exposure, and preterm birth. Single-cell studies provide insights into novel cell types in the placenta and cell type-specific marks associated with maternal exposures and complications.
Affiliation(s)
- Chang Shu
- Center for Genetic Epidemiology, Division of Epidemiology & Genetics, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Kelly Street
- Division of Biostatistics, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Carrie V Breton
- Division of Environmental Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Theresa M Bastain
- Division of Environmental Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Melissa L Wilson
- Division of Disease Prevention, Policy, & Global Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
10
Park Y, Hauschild AC. The effect of data transformation on low-dimensional integration of single-cell RNA-seq. BMC Bioinformatics 2024; 25:171. [PMID: 38689234 PMCID: PMC11059821 DOI: 10.1186/s12859-024-05788-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 04/16/2024] [Indexed: 05/02/2024] [Open Access]
Abstract
BACKGROUND Recent developments in single-cell RNA sequencing have opened up a multitude of possibilities to study tissues at the level of cellular populations. However, the heterogeneity in single-cell sequencing data necessitates appropriate procedures to adjust for technological limitations and various sources of noise when integrating datasets from different studies. While many analysis procedures employ various preprocessing steps, they often overlook the importance of selecting and optimizing the data transformation methods they use. RESULTS This work investigates data transformation approaches used in single-cell clustering analysis tools and their effects on batch integration analysis. In particular, we compare 16 transformations and their impact on the low-dimensional representations, aiming to reduce the batch effect and integrate multiple single-cell sequencing datasets. Our results show that data transformations strongly influence the results of single-cell clustering in low-dimensional data spaces, such as those generated by UMAP or PCA. Moreover, these changes in low-dimensional space significantly affect trajectory analysis using multiple datasets. However, the performance of the data transformations varies greatly across datasets, and the optimal method was different for each dataset. Additionally, we explored how data transformation impacts the analysis of deep feature encodings using deep neural network-based models, including autoencoder-based models and prototypical networks. Data transformation also strongly affects the outcome of deep neural network models. CONCLUSIONS Our findings suggest that the batch effect and noise in integrative analysis are highly influenced by data transformation. Low-dimensional features can integrate different batches well when proper data transformation is applied. Furthermore, we found that the batch mixing score in low-dimensional space can guide the selection of the optimal data transformation. In conclusion, data preprocessing is one of the most crucial analysis steps and needs to be cautiously considered in the integrative analysis of multiple scRNA-seq datasets.
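As a concrete picture of what a "data transformation" is in this context, the sketch below applies two widely used ones to a single cell's counts: library-size normalization followed by log1p, and a square-root transform. These two are chosen here only for illustration; the abstract does not list the 16 transformations compared, and the function names and scale factor are assumptions.

```python
import math

def log_normalize(counts, scale=10_000):
    """Scale a cell's counts to a fixed library size, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cell has no counts")
    return [math.log1p(c / total * scale) for c in counts]

def sqrt_transform(counts):
    """Square-root transform, a simple variance-stabilizing choice for counts."""
    return [math.sqrt(c) for c in counts]

# Two toy cells with the same expression proportions but a 10x difference
# in sequencing depth (hypothetical values).
shallow = [1, 3, 6]
deep = [10, 30, 60]

print(log_normalize(shallow))
print(log_normalize(deep))
print(sqrt_transform(shallow), sqrt_transform(deep))
```

After log-normalization the two cells coincide, whereas the square-root transform alone leaves the depth difference in place. Differences of this kind are one way a choice of transformation can change downstream embeddings and batch mixing.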
Affiliation(s)
- Youngjun Park
- Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
- International Max Planck Research Schools for Genome Science, Georg-August-Universität Göttingen, Göttingen, Germany
- Anne-Christin Hauschild
- Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany.
- Campus-Institute Data Science (CIDAS), Georg-August-Universität Göttingen, Göttingen, Germany.
11
Karakurt HU, Pir P. SUMA: a lightweight machine learning model-powered shared nearest neighbour-based clustering application interface for scRNA-Seq data. Turk J Biol 2023; 47:413-422. [PMID: 38681777 PMCID: PMC11045205 DOI: 10.55730/1300-0152.2675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/28/2023] [Accepted: 12/18/2023] [Indexed: 05/01/2024] [Open Access]
Abstract
Background/aim Single-cell transcriptomics (scRNA-Seq) explores cellular diversity at the gene expression level. Due to the inherent sparsity and noise in scRNA-Seq data and the uncertainty about the types of sequenced cells, effective clustering and cell type annotation are essential. Graph-based clustering of scRNA-Seq data is a simple yet powerful approach that represents the data as a "shared nearest neighbour" graph and clusters the cells using graph clustering algorithms. These algorithms depend on several user-defined parameters. Here we present SUMA, a lightweight tool that uses a random forest model to predict the optimum number of neighbours to obtain the optimum clustering results. Moreover, we integrated our method with other commonly used methods in an RShiny application. SUMA can be used in a local environment (https://github.com/hkarakurt8742/SUMA) or as a browser tool (https://hkarakurt.shinyapps.io/suma/). Materials and methods Publicly available scRNA-Seq datasets and 3 different graph-based clustering algorithms were used to develop SUMA, and a wide range of values for the number of neighbours and variant genes was taken into consideration. The quality of clustering was assessed using the adjusted Rand index (ARI) and the true labels of each dataset. The data were split into training and test datasets, and the model was built and optimised using the Scikit-learn (Python) and randomForest (R) libraries. Results The accuracy of our machine learning model was 0.96, while the AUC of the ROC curve was 0.98. The model indicated that the number of cells in scRNA-Seq data is the most important feature when deciding the number of neighbours. Conclusion We developed and evaluated the SUMA model and implemented the method in the SUMAShiny app, which integrates SUMA with different clustering methods and enables nonbioinformatician users to cluster and visualise their scRNA-Seq data easily. The SUMAShiny app is available both for desktop and browser use.
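The adjusted Rand index (ARI) used above to score clusterings against true labels can be computed from pair counts alone. The following is a minimal plain-Python sketch of the standard formula; it is not SUMA's code, and the toy labels are made up.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two partitions of the same cells."""
    n = len(labels_true)
    if n != len(labels_pred) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    # Pairs of cells co-clustered in both partitions, in truth, and in prediction.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = rows * cols / comb(n, 2)
    max_index = (rows + cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (both - expected) / (max_index - expected)

truth = [0, 0, 0, 1, 1, 1]
print(adjusted_rand_index(truth, ["a", "a", "a", "b", "b", "b"]))  # same partition, renamed
print(adjusted_rand_index(truth, [0, 0, 1, 1, 2, 2]))              # over-split clustering
```

Because ARI depends only on which cells are grouped together, cluster names do not matter, which suits comparing clusterings obtained with different numbers of neighbours.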
Affiliation(s)
- Hamza Umut Karakurt
- Department of Bioengineering, Faculty of Engineering, Gebze Technical University, Kocaeli, Turkiye
- Idea Technology Solutions R&D Center, İstanbul, Turkiye
- Pınar Pir
- Department of Bioengineering, Faculty of Engineering, Gebze Technical University, Kocaeli, Turkiye
- Idea Technology Solutions R&D Center, İstanbul, Turkiye
12
Lee Y, Song J, Jeong Y, Choi E, Ahn C, Jang W. Meta-analysis of single-cell RNA-sequencing data for depicting the transcriptomic landscape of chronic obstructive pulmonary disease. Comput Biol Med 2023; 167:107685. [PMID: 37976829 DOI: 10.1016/j.compbiomed.2023.107685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 10/17/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023]
Abstract
Chronic obstructive pulmonary disease (COPD) is a respiratory disease characterized by airflow limitation and chronic inflammation of the lungs that is a leading cause of death worldwide. Since the complete pathological mechanisms at the single-cell level are not fully understood yet, an integrative approach to characterizing the single-cell-resolution landscape of COPD is required. To identify the cell types and mechanisms associated with the development of COPD, we conducted a meta-analysis using three single-cell RNA-sequencing datasets of COPD. Among the 154,011 cells from 16 COPD patients and 18 healthy subjects, 17 distinct cell types were observed. Of the 17 cell types, monocytes, mast cells, and alveolar type 2 cells (AT2 cells) were found to be etiologically implicated in COPD based on genetic and transcriptomic features. The most transcriptomically diversified states of the three etiological cell types showed significant enrichment in immune/inflammatory responses (monocytes and mast cells) and/or mitochondrial dysfunction (monocytes and AT2 cells). We then identified three chemical candidates that may potentially induce COPD by modulating gene expression patterns in the three etiological cell types. Overall, our study suggests the single-cell level mechanisms underlying the pathogenesis of COPD and may provide information on toxic compounds that could be potential risk factors for COPD.
Affiliation(s)
- Yubin Lee
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
- Jaeseung Song
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
- Yeonbin Jeong
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
- Eunyoung Choi
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
- Chulwoo Ahn
  - Department of Internal Medicine, Yonsei University College of Medicine, Seoul, 03722, Republic of Korea.
- Wonhee Jang
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
13
Luo J, Wu X, Cheng Y, Chen G, Wang J, Song X. Expression quantitative trait locus studies in the era of single-cell omics. Front Genet 2023; 14:1182579. [PMID: 37284065 PMCID: PMC10239882 DOI: 10.3389/fgene.2023.1182579] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/10/2023] [Accepted: 04/26/2023] [Indexed: 06/08/2023] Open
Abstract
Genome-wide association studies have revealed that the regulation of gene expression bridges genetic variants and complex phenotypes. Profiling of the bulk transcriptome coupled with linkage analysis (expression quantitative trait locus (eQTL) mapping) has advanced our understanding of the relationship between genetic variants and gene regulation in the context of complex phenotypes. However, bulk transcriptomics has inherent limitations, as the regulation of gene expression tends to be cell-type-specific. The advent of single-cell RNA-seq technology now enables the identification of the cell-type-specific regulation of gene expression through a single-cell eQTL (sc-eQTL). In this review, we first provide an overview of sc-eQTL studies, including data processing and the mapping procedure of the sc-eQTL. We then discuss the benefits and limitations of sc-eQTL analyses. Finally, we present an overview of the current and future applications of sc-eQTL discoveries.
Affiliation(s)
- Jie Luo
  - State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Xinyi Wu
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Yuan Cheng
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Guang Chen
  - State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Jian Wang
  - State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Xijiao Song
  - State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
14
Yu X, Liu Z, Sun X. Single-cell and spatial multi-omics in the plant sciences: Technical advances, applications, and perspectives. Plant Communications 2023; 4:100508. [PMID: 36540021 DOI: 10.1016/j.xplc.2022.100508] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 07/10/2022] [Revised: 11/09/2022] [Accepted: 12/16/2022] [Indexed: 05/11/2023]
Abstract
Plants contain a large number of cell types and exhibit complex regulatory mechanisms. Studies at the single-cell level have gradually become more common in plant science. Single-cell transcriptomics, spatial transcriptomics, and spatial metabolomics techniques have been combined to analyze plant development. These techniques have been used to study the transcriptomes and metabolomes of plant tissues at the single-cell level, enabling the systematic investigation of gene expression and metabolism in specific tissues and cell types during defined developmental stages. In this review, we present an overview of significant breakthroughs in spatial multi-omics in plants, and we discuss how these approaches may soon play essential roles in plant research.
Affiliation(s)
- Xiaole Yu
  - State Key Laboratory of Cotton Biology, State Key Laboratory of Crop Stress Adaptation and Improvement, Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, 85 Minglun Street, Kaifeng 475001, P.R. China
- Zhixin Liu
  - State Key Laboratory of Cotton Biology, State Key Laboratory of Crop Stress Adaptation and Improvement, Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, 85 Minglun Street, Kaifeng 475001, P.R. China
- Xuwu Sun
  - State Key Laboratory of Cotton Biology, State Key Laboratory of Crop Stress Adaptation and Improvement, Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, 85 Minglun Street, Kaifeng 475001, P.R. China.
15
Calia GP, Chen X, Zuckerman B, Weinberger LS. Comparative analysis between single-cell RNA-seq and single-molecule RNA FISH indicates that the pyrimidine nucleobase idoxuridine (IdU) globally amplifies transcriptional noise. bioRxiv 2023:2023.03.14.532632. [PMID: 36993609 PMCID: PMC10055090 DOI: 10.1101/2023.03.14.532632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
Stochastic fluctuations (noise) in transcription generate substantial cell-to-cell variability, but the physiological roles of noise have remained difficult to determine in the absence of generalized noise-modulation approaches. Previous single-cell RNA-sequencing (scRNA-seq) suggested that the pyrimidine-base analog (5'-iodo-2'-deoxyuridine, IdU) could generally amplify noise without substantially altering mean-expression levels but scRNA-seq technical drawbacks potentially obscured the penetrance of IdU-induced transcriptional noise amplification. Here we quantify global-vs.-partial penetrance of IdU-induced noise amplification by assessing scRNA-seq data using numerous normalization algorithms and directly quantifying noise using single-molecule RNA FISH (smFISH) for a panel of genes from across the transcriptome. Alternate scRNA-seq analyses indicate IdU-induced noise amplification for ~90% of genes, and smFISH data verified noise amplification for ~90% of tested genes. Collectively, this analysis indicates which scRNA-seq algorithms are appropriate for quantifying noise and argues that IdU is a globally penetrant noise-enhancer molecule that could enable investigations of the physiological impacts of transcriptional noise.
Affiliation(s)
- Giuliana P. Calia
  - Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Xinyue Chen
  - Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Binyamin Zuckerman
  - Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
- Leor S. Weinberger
  - Gladstone|UCSF Center for Cell Circuitry, University of California, San Francisco, CA 94158
  - Department of Biochemistry and Biophysics, University of California, San Francisco, CA 94158
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158
16
Du JH, Cai Z, Roeder K. Robust probabilistic modeling for single-cell multimodal mosaic integration and imputation via scVAEIT. Proc Natl Acad Sci U S A 2022; 119:e2214414119. [PMID: 36459654 PMCID: PMC9894175 DOI: 10.1073/pnas.2214414119] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/22/2022] [Accepted: 11/03/2022] [Indexed: 12/04/2022] Open
Abstract
Recent advances in single-cell technologies enable joint profiling of multiple omics. These profiles can reveal the complex interplay of different regulatory layers in single cells; still, new challenges arise when integrating datasets with some features shared across experiments and others exclusive to a single source; combining information across these sources is called mosaic integration. The difficulties lie in imputing missing molecular layers to build a self-consistent atlas, finding a common latent space, and transferring learning to new data sources robustly. Existing mosaic integration approaches based on matrix factorization cannot efficiently adapt to nonlinear embeddings for the latent cell space and are not designed for accurate imputation of missing molecular layers. By contrast, we propose a probabilistic variational autoencoder model, scVAEIT, to integrate and impute multimodal datasets with mosaic measurements. A key advance is the use of a missing mask for learning the conditional distribution of unobserved modalities and features, which makes scVAEIT flexible to combine different panels of measurements from multimodal datasets accurately and in an end-to-end manner. Imputing the masked features serves as a supervised learning procedure while preventing overfitting by regularization. Focusing on gene expression, protein abundance, and chromatin accessibility, we validate that scVAEIT robustly imputes the missing modalities and features of cells biologically different from the training data. scVAEIT also adjusts for batch effects while maintaining the biological variation, which provides better latent representations for the integrated datasets. We demonstrate that scVAEIT significantly improves integration and imputation across unseen cell types, different technologies, and different tissues.
Affiliation(s)
- Jin-Hong Du
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213
- Zhanrui Cai
  - Department of Statistics, Iowa State University, Ames, IA 50011
- Kathryn Roeder
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213
17
Chattopadhyay P, Khare K, Kumar M, Mishra P, Anand A, Maurya R, Gupta R, Sahni S, Gupta A, Wadhwa S, Yadav A, Devi P, Tardalkar K, Joshi M, Sethi T, Pandey R. Single-cell multiomics revealed the dynamics of antigen presentation, immune response and T cell activation in the COVID-19 positive and recovered individuals. Front Immunol 2022; 13:1034159. [PMID: 36532041 PMCID: PMC9755500 DOI: 10.3389/fimmu.2022.1034159] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/01/2022] [Accepted: 11/16/2022] [Indexed: 12/05/2022] Open
Abstract
Introduction: Despite numerous efforts to describe COVID-19's immunological landscape, there is still a gap in our understanding of the after-effects of infection with the virus, especially in recovered patients. This is important to understand because a huge number of people worldwide have now been infected by SARS-CoV-2, and because of variables including VOCs, reinfections, and vaccination breakthroughs. Furthermore, the single-cell transcriptome alone is often insufficient to understand the complex human host immune landscape underlying differential disease severity and clinical outcome.
Methods: By combining single-cell multi-omics (Whole Transcriptome Analysis plus Antibody-seq) and machine learning-based analysis, we aim to better understand the functional aspects of cellular and immunological heterogeneity in COVID-19 positive, recovered, and healthy individuals.
Results: Based on a single-cell transcriptome and surface marker study of 163,197 cells (124,726 cells after data QC) from 33 individuals (healthy=4, COVID-19 positive=16, and COVID-19 recovered=13), we observed reduced MHC Class-I-mediated antigen presentation and dysregulated MHC Class-II-mediated antigen presentation in the COVID-19 patients, with restoration of the process in the recovered individuals. The B-cell maturation process was also impaired in the positive and the recovered individuals. Importantly, we discovered that a subset of the naive T-cells from the healthy individuals was absent from the recovered individuals, suggesting a post-infection inflammatory stage. Both COVID-19 positive patients and the recovered individuals exhibited a CD40-CD40LG-mediated inflammatory response in the monocytes and T-cell subsets. T-cell-, NK-cell-, and monocyte-mediated elevation of immunological, stress, and antiviral responses was also seen in the COVID-19 positive and the recovered individuals, along with abnormal T-cell activation, inflammatory response, and faster cellular transition of T cell subtypes in the COVID-19 patients. Importantly, the above immune findings were used for a Bayesian network model, which significantly revealed FOS, CXCL8, IL1β, CST3, PSAP, CD45 and CD74 as COVID-19 severity predictors.
Discussion: In conclusion, COVID-19 recovered individuals exhibited a hyper-activated inflammatory response with the loss of B cell maturation, suggesting an impeded post-infection stage and necessitating further research to delineate the dynamic immune response associated with COVID-19. To our knowledge, this is the first multi-omic study trying to understand the differential and dynamic immune response underlying the sample subtypes.
Affiliation(s)
- Partha Chattopadhyay
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Kriti Khare
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Manish Kumar
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
- Pallavi Mishra
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
- Alok Anand
  - Indraprastha Institute of Information Technology Delhi, New Delhi, India
- Ranjeet Maurya
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Rohit Gupta
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
- Shweta Sahni
  - CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
- Ayushi Gupta
  - Indraprastha Institute of Information Technology Delhi, New Delhi, India
- Saruchi Wadhwa
  - CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
- Aanchal Yadav
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Priti Devi
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Kishore Tardalkar
  - Department of Stem Cells and Regenerative Medicine, Dr. D. Y. Patil Medical College, Hospital and Research Institute, Kolhapur, Maharashtra, India
- Meghnad Joshi
  - Department of Stem Cells and Regenerative Medicine, Dr. D. Y. Patil Medical College, Hospital and Research Institute, Kolhapur, Maharashtra, India
- Tavpritesh Sethi
  - Indraprastha Institute of Information Technology Delhi, New Delhi, India
- Rajesh Pandey
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
18
Carangelo G, Magi A, Semeraro R. From multitude to singularity: An up-to-date overview of scRNA-seq data generation and analysis. Front Genet 2022; 13:994069. [PMID: 36263428 PMCID: PMC9575985 DOI: 10.3389/fgene.2022.994069] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/14/2022] [Accepted: 09/15/2022] [Indexed: 11/23/2022] Open
Abstract
Single cell RNA sequencing (scRNA-seq) is today a common and powerful technology in biomedical research settings, making it possible to profile the whole transcriptome of a very large number of individual cells and reveal the heterogeneity of complex clinical samples. Traditionally, cells have been classified by their morphology or by expression of certain proteins in functionally distinct settings. The advent of next generation sequencing (NGS) technologies paved the way for the detection and quantitative analysis of cellular content. In this context, transcriptome quantification techniques emerged, starting from bulk RNA sequencing, which is unable to dissect the heterogeneity of a sample, moving to the first single cell techniques capable of analyzing a small number of cells (1-100), and arriving at the current single cell techniques able to profile hundreds of thousands of cells. As experimental protocols have improved rapidly, computational workflows for processing the data have also been refined, opening the way to novel methods whose computational times scale more favorably with dataset size and making scRNA-seq much better suited for biomedical research. In this perspective, we will highlight the key technological and computational developments which have enabled the analysis of this growing data, making scRNA-seq a handy tool in clinical applications.
Affiliation(s)
- Giulia Carangelo
  - Department of Experimental and Clinical Biomedical Sciences “Mario Serio”, University of Florence, Florence, Italy
- Alberto Magi
  - Department of Information Engineering, University of Florence, Florence, Italy
- Roberto Semeraro
  - Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
19
Cho H, Kuo YH, Rockne RC. Comparison of cell state models derived from single-cell RNA sequencing data: graph versus multi-dimensional space. Mathematical Biosciences and Engineering 2022; 19:8505-8536. [PMID: 35801475 PMCID: PMC9308174 DOI: 10.3934/mbe.2022395] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
Single-cell sequencing technologies have revolutionized molecular and cellular biology and stimulated the development of computational tools to analyze the data generated from these technology platforms. However, despite the recent explosion of computational analysis tools, relatively few mathematical models have been developed to utilize these data. Here we compare and contrast two cell state geometries for building mathematical models of cell state-transitions with single-cell RNA-sequencing data, with hematopoiesis as a model system: (i) by using partial differential equations on a graph representing intermediate cell states between known cell types, and (ii) by using the equations on a multi-dimensional continuous cell state-space. As an application of our approach, we demonstrate how the calibrated models may be used to mathematically perturb normal hematopoiesis to simulate, predict, and study the emergence of novel cell states during the pathogenesis of acute myeloid leukemia. We particularly focus on comparing the strengths and weaknesses of the graph model and the multi-dimensional model.
Affiliation(s)
- Heyrim Cho
  - Department of Mathematics, University of California Riverside, Riverside, CA, USA
  - Interdisciplinary Center for Quantitative Modeling in Biology, University of California Riverside, Riverside, CA, USA
- Ya-Huei Kuo
  - Department of Hematologic Malignancies Translational Science, City of Hope, Duarte, CA, USA
- Russell C. Rockne
  - Department of Computational and Quantitative Medicine, Division of Mathematical Oncology, City of Hope, Duarte, CA, USA
  - Interdisciplinary Center for Quantitative Modeling in Biology, University of California Riverside, Riverside, CA, USA
20
Maria M, Pouyanfar N, Örd T, Kaikkonen MU. The Power of Single-Cell RNA Sequencing in eQTL Discovery. Genes (Basel) 2022; 13:502. [PMID: 35328055 PMCID: PMC8949403 DOI: 10.3390/genes13030502] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/09/2022] [Revised: 03/07/2022] [Accepted: 03/10/2022] [Indexed: 02/05/2023] Open
Abstract
Genome-wide association studies have successfully mapped thousands of loci associated with complex traits. During the last decade, functional genomics approaches combining genotype information with bulk RNA-sequencing data have identified genes regulated by GWAS loci through expression quantitative trait locus (eQTL) analysis. Single-cell RNA-Sequencing (scRNA-Seq) technologies have created new exciting opportunities for spatiotemporal assessment of changes in gene expression at the single-cell level in complex and inherited conditions. A growing number of studies have demonstrated the power of scRNA-Seq in eQTL mapping across different cell types, developmental stages and stimuli that could be obscured when using bulk RNA-Seq methods. In this review, we outline the methodological principles, advantages, limitations and the future experimental and analytical considerations of single-cell eQTL studies. We look forward to the explosion of single-cell eQTL studies applied to large-scale population genetics to take us one step closer to understanding the molecular mechanisms of disease.
Affiliation(s)
- Minna U. Kaikkonen
  - A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland; (M.M.); (N.P.); (T.Ö.)