1
González-Velasco O, Simon M, Yilmaz R, Parlato R, Weishaupt J, Imbusch C, Brors B. Identifying similar populations across independent single cell studies without data integration. NAR Genom Bioinform 2025; 7:lqaf042. [PMID: 40276039 PMCID: PMC12019640 DOI: 10.1093/nargab/lqaf042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2024] [Revised: 03/13/2025] [Accepted: 03/26/2025] [Indexed: 04/26/2025] Open
Abstract
Supervised and unsupervised methods have emerged to address the complexity of single cell data analysis in the context of large pools of independent studies. Here, we present ClusterFoldSimilarity (CFS), a novel statistical method designed to quantify the similarity between cell groups across any number of independent datasets, without the need for data correction or integration. By bypassing these processes, CFS avoids the introduction of artifacts and loss of information, offering a simple, efficient, and scalable solution. This method matches groups of cells that exhibit conserved phenotypes across datasets, including different tissues and species, and in a multimodal scenario, including single-cell RNA-Seq, ATAC-Seq, single-cell proteomics, or, more broadly, data exhibiting differential abundance effects among groups of cells. Additionally, CFS performs feature selection, obtaining cross-dataset markers of the similar phenotypes observed, providing an inherent interpretability of relationships between cell populations. To showcase the effectiveness of our methodology, we generated single-nuclei RNA-Seq data from the motor cortex and spinal cord of adult mice. By using CFS, we identified three distinct sub-populations of astrocytes conserved in both tissues. CFS includes various visualization methods for the interpretation of the similarity scores and similar cell populations.
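The idea of matching cell groups across studies without first merging the data can be illustrated with a toy sketch. It is not the CFS algorithm, whose statistic the abstract does not specify. It assumes, purely for illustration, that each group is summarised by its log fold change against the other groups of the same study and that profiles are compared by cosine similarity. All function names, the pseudocount, and the numbers are hypothetical.

```python
import math

def log_fold_changes(means, pseudocount=1.0):
    """Per group: log2 fold change of each feature versus the mean of the
    other groups in the SAME study (no cross-study correction is applied)."""
    profiles = {}
    for group, values in means.items():
        others = [v for g, v in means.items() if g != group]
        profile = []
        for j, value in enumerate(values):
            rest = sum(o[j] for o in others) / len(others)
            profile.append(math.log2((value + pseudocount) / (rest + pseudocount)))
        profiles[group] = profile
    return profiles

def cosine(u, v):
    """Cosine similarity of two equally long vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_matches(study_a, study_b):
    """For every group in study A, return the most similar group in study B."""
    fa, fb = log_fold_changes(study_a), log_fold_changes(study_b)
    return {ga: max(fb, key=lambda gb: cosine(fa[ga], fb[gb])) for ga in fa}

# Hypothetical mean abundances of three shared features per group.
# Study B is on a different scale, which within-study fold changes absorb.
study_a = {"A1": [10.0, 1.0, 1.0], "A2": [1.0, 10.0, 1.0]}
study_b = {"B1": [2.0, 40.0, 3.0], "B2": [50.0, 2.0, 3.0]}
matches = best_matches(study_a, study_b)
print(matches)
```

Because every profile is computed within its own study, the two datasets never need to share a scale, which is the property the abstract emphasises.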
Affiliation(s)
- Oscar González-Velasco
  - Division Applied Bioinformatics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Division of Neurodegenerative Disorders, Department of Neurology, Medical Faculty Mannheim, Mannheim Center for Translational Neurosciences, Heidelberg University, 68167 Mannheim, Germany
- Malte Simon
  - Division Applied Bioinformatics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Leibniz Institute for Immunotherapy, 93053 Regensburg, Germany
- Rüstem Yilmaz
  - Division of Neurodegenerative Disorders, Department of Neurology, Medical Faculty Mannheim, Mannheim Center for Translational Neurosciences, Heidelberg University, 68167 Mannheim, Germany
- Rosanna Parlato
  - Division of Neurodegenerative Disorders, Department of Neurology, Medical Faculty Mannheim, Mannheim Center for Translational Neurosciences, Heidelberg University, 68167 Mannheim, Germany
- Jochen Weishaupt
  - Division of Neurodegenerative Disorders, Department of Neurology, Medical Faculty Mannheim, Mannheim Center for Translational Neurosciences, Heidelberg University, 68167 Mannheim, Germany
- Charles D Imbusch
  - Division Applied Bioinformatics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Institute of Immunology, University Medical Center Mainz, 55131 Mainz, Germany
  - Research Center for Immunotherapy, University Medical Center Mainz, 55131 Mainz, Germany
- Benedikt Brors
  - Division Applied Bioinformatics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - German Cancer Consortium (DKTK), Core Center Heidelberg, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
  - Medical Faculty Heidelberg and Faculty of Biosciences, Heidelberg University, 69120 Heidelberg, Germany
2
Zhang N, Sun Q, Zhang J, Zhang R, Liu S, Zhao X, Ma J, Li X. Intrapancreatic adipocytes and beta cell dedifferentiation in human type 2 diabetes. Diabetologia 2025; 68:1242-1260. [PMID: 40072535 DOI: 10.1007/s00125-025-06392-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Accepted: 01/20/2025] [Indexed: 03/14/2025]
Abstract
AIMS/HYPOTHESIS Fat deposition in the pancreas is implicated in beta cell dysfunction and the progression of type 2 diabetes. However, there is limited evidence to confirm the correlation and explore how pancreatic fat links with beta cell dysfunction in human type 2 diabetes. This study aimed to examine the spatial relationship between pancreatic fat and islets in human pancreases. METHODS Histological analysis of pancreatic specimens from 50 organ donors (15 with type 2 diabetes, 35 without) assessed pancreatic fat content variation among individuals with diabetes and its correlation with estimated beta cell mass and cell distribution within islets. Bioinformatic analysis of single-cell RNA-seq of 11 type 2 diabetic donors (from the Human Pancreatic Analysis Project database) explored the impact of high pancreatic fat content on beta cell gene expression and cell fate. Validation of bioinformatic results was performed with the above diabetic pancreases. RESULTS Pancreatic fat content was higher in individuals with type 2 diabetes (10.24% [3.29-13.89%] vs 0.74% [0.34-5.11%], p<0.001), negatively correlated with estimated beta cell mass (r=-0.675, p=0.006) and positively correlated with alpha-to-beta cell ratio (r=0.608, p=0.016). Enrichment analysis indicated that in diabetic donors with higher pancreatic fat content, the expression of ALDH1A3, a beta cell dedifferentiation marker, was significantly increased in both alpha and beta cells, and the expression of NPY was decreased in beta cells. Pseudotime analysis revealed beta cell dedifferentiation and transdifferentiation towards alpha cells in diabetic donors with higher pancreatic fat content, with decreased expression of genes related to beta cell maturation and function, including INSM1, MafA and NPY. Concurrently, pathways related to inflammation and immune response were activated. Histologically, pancreatic fat content correlated positively with the percentage of beta cells positive for aldehyde dehydrogenase 1 family member A3 (ALDH1A3) within the islets (r=0.594, p=0.020) and the ALDH1A3 positivity rate in beta cells (r=0.615, p=0.015). The number of T cells adjacent to adipocytes was related to the distribution pattern of adipocytes and the dedifferentiation phenotype in islets. CONCLUSIONS/INTERPRETATION Higher pancreatic fat content was accompanied by increased beta cell dedifferentiation in the individuals with diabetes. Clusters of adipocytes significantly contribute to higher pancreatic fat content and immune cell recruitment. Overall, the interactions among adipocytes, immune cells and beta cells in the pancreas microenvironment might contribute to beta cell failure and dedifferentiation in type 2 diabetes.
Affiliation(s)
- Na Zhang
  - Department of Endocrinology and Metabolism, Zhongshan Hospital, Fudan University, Shanghai, China
- Qiman Sun
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, China
- Jiaxin Zhang
  - Department of Endocrinology and Metabolism, Zhongshan Hospital, Fudan University, Shanghai, China
- Ruonan Zhang
  - Department of Endocrinology and Metabolism, Zhongshan Hospital, Fudan University, Shanghai, China
- Siyi Liu
  - Fudan University, Shanghai, China
- Xuelian Zhao
  - Department of Pathology, Zhongshan Hospital, Fudan University, Shanghai, China
- Jing Ma
  - Department of Endocrinology and Metabolism, Zhongshan Hospital, Fudan University, Shanghai, China
- Xiaomu Li
  - Department of Endocrinology and Metabolism, Zhongshan Hospital, Fudan University, Shanghai, China
3
Lam VK, Byers JM, Robitaille MC, Kaler L, Christodoulides JA, Raphael MP. A self-supervised learning approach for high throughput and high content cell segmentation. Commun Biol 2025; 8:780. [PMID: 40399569 PMCID: PMC12095644 DOI: 10.1038/s42003-025-08190-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Accepted: 05/07/2025] [Indexed: 05/23/2025] Open
Abstract
In principle, ML/AI-based algorithms should enable rapid and accurate cell segmentation in high-throughput settings. However, reliance on large training datasets, human input, and computational expertise, together with limited generalizability, has prevented this goal of completely automated, high-throughput segmentation from being achieved. To overcome these roadblocks, we introduce an innovative self-supervised learning (SSL) method for pixel classification that does not require parameter tuning or curated data sets, and instead trains itself on the end-users' own data in a completely automated fashion, thus providing a more efficient cell segmentation approach for high-throughput, high-content image analysis. We demonstrate that our algorithm meets the criteria of being fully automated with versatility across various magnifications, optical modalities, and cell types. Moreover, our SSL algorithm is capable of identifying complex cellular structures and organelles, which are otherwise easily missed, thereby broadening the machine learning applications to high-content imaging. Our SSL technique displayed consistently high F1 scores across segmented cell images, with scores ranging from 0.771 to 0.888, matching or outperforming the popular Cellpose algorithm, which showed a wider F1 range of 0.454 to 0.882, primarily due to more false negatives.
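To make the F1 comparison concrete, the following minimal sketch computes F1 from two binary masks. It assumes a pixel-wise comparison, which the abstract does not state, and the masks are made-up illustrative data. It shows why missed foreground pixels (false negatives) pull the score down even when there are no false positives.

```python
def f1_score(pred, truth):
    """F1 = 2*TP / (2*TP + FP + FN) for two flat binary masks of equal length.

    Returning 1.0 when both masks are empty is a convention chosen here,
    not something taken from the paper.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0

# Hypothetical flattened masks: 1 = cell pixel, 0 = background.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
under_segmented = [1, 1, 0, 0, 0, 0, 0, 0]  # two cell pixels missed, no false positives

f1 = f1_score(under_segmented, truth)
print(round(f1, 3))
```

Here TP = 2, FP = 0 and FN = 2, so F1 = 4/6: precision is perfect, yet the score is dragged down entirely by the missed pixels.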
Affiliation(s)
- Van K Lam
  - US Naval Research Laboratory, Washington, DC, USA
- Jeff M Byers
  - US Naval Research Laboratory, Washington, DC, USA
- Logan Kaler
  - US Naval Research Laboratory, Washington, DC, USA
4
Yang R, Celino-Brady FT, Dunleavy JEM, Vigh-Conrad KA, Atkins GR, Hvasta RL, Pombar CRX, Yatsenko AN, Orwig KE, O'Bryan MK, Lima AC, Conrad DF. SATINN v2: automated image analysis for mouse testis histology with multi-laboratory data integration. Biol Reprod 2025; 112:996-1014. [PMID: 39961022 DOI: 10.1093/biolre/ioaf033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Revised: 11/08/2024] [Accepted: 02/16/2025] [Indexed: 03/21/2025] Open
Abstract
Analysis of testis histology is fundamental to the study of male fertility, but it is a slow task with a high skill threshold. Here, we describe new neural network models for the automated classification of cell types and tubule stages from whole-slide brightfield images of mouse testis. The cell type classifier recognizes 14 cell types, including multiple steps of meiosis I prophase, with an external validation accuracy of 96%. The tubule stage classifier distinguishes all 12 canonical tubule stages with an external validation accuracy of 63%, which increases to 96% when allowing for ±1 stage tolerance. We addressed the generalizability of SATINN through extensive training diversification and testing on external (non-training population) wildtype and mutant datasets. This allowed us to use SATINN to successfully process data generated in multiple laboratories. We used SATINN to analyze testis images from eight different mutant lines, generated in three different labs with a range of tissue processing protocols. Finally, we show that it is possible to use SATINN output to cluster histology images in latent space, which, when applied to the eight mutant lines, reveals known relationships in their pathology. This work represents significant progress towards a tool for robust, automated testis histopathology that can be used by multiple labs.
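The gap between exact and ±1 stage accuracy can be reproduced with a small sketch. The labels below are invented, and the `cyclic` option, which treats stage 12 as adjacent to stage 1, is an assumption added for illustration. The abstract does not say whether the reported tolerance wraps around.

```python
def tolerant_accuracy(pred, truth, tolerance=0, n_stages=12, cyclic=False):
    """Fraction of predictions within `tolerance` stages of the true stage.

    With cyclic=True, stage distance wraps around (stage 12 is one step from
    stage 1); this wrap-around is a hypothetical option, not a paper detail.
    """
    hits = 0
    for p, t in zip(pred, truth):
        distance = abs(p - t)
        if cyclic:
            distance = min(distance, n_stages - distance)
        if distance <= tolerance:
            hits += 1
    return hits / len(truth)

# Hypothetical stage labels (1..12) for five tubules.
truth = [1, 5, 7, 12, 3]
pred = [1, 6, 9, 1, 3]

exact = tolerant_accuracy(pred, truth)
within_one = tolerant_accuracy(pred, truth, tolerance=1)
within_one_cyclic = tolerant_accuracy(pred, truth, tolerance=1, cyclic=True)
print(exact, within_one, within_one_cyclic)
```

Neighbouring stages are the most likely confusions for a staging classifier, so a tolerance of one stage separates near misses from gross errors.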
Affiliation(s)
- Ran Yang
  - Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Portland, OR, United States
- Fritzie T Celino-Brady
  - Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Portland, OR, United States
- Jessica E M Dunleavy
  - School of Biosciences and Bio21 Molecular Science and Biotechnology Institute, Faculty of Science, The University of Melbourne, Melbourne, VIC, Australia
- Katinka A Vigh-Conrad
  - Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Portland, OR, United States
- Georgia R Atkins
  - Department of Obstetrics, Gynecology and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
  - Molecular Genetics and Developmental Biology Graduate Program, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Rachel L Hvasta
  - Department of Obstetrics, Gynecology and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Christopher R X Pombar
  - Department of Obstetrics, Gynecology and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Alexander N Yatsenko
  - Department of Obstetrics, Gynecology and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Kyle E Orwig
  - Department of Obstetrics, Gynecology and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Moira K O'Bryan
  - School of Biosciences and Bio21 Molecular Science and Biotechnology Institute, Faculty of Science, The University of Melbourne, Melbourne, VIC, Australia
- Ana C Lima
  - Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Portland, OR, United States
- Donald F Conrad
  - Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Portland, OR, United States
5
Emili E, Pérez-Posada A, Vanni V, Salamanca-Díaz D, Ródriguez-Fernández D, Christodoulou MD, Solana J. Allometry of cell types in planarians by single-cell transcriptomics. Sci Adv 2025; 11:eadm7042. [PMID: 40333969 PMCID: PMC12057665 DOI: 10.1126/sciadv.adm7042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 04/02/2025] [Indexed: 05/09/2025]
Abstract
Allometry explores the relationship between an organism's body size and its various components, offering insights into ecology, physiology, metabolism, and disease. The cell is the basic unit of biological systems, and yet cell-type allometry remains relatively unexplored. Single-cell RNA sequencing (scRNA-seq) provides a promising tool for investigating cell-type allometry. Planarians, capable of growing and degrowing following allometric scaling rules, serve as an excellent model for these studies. We used scRNA-seq to examine cell-type allometry in asexual planarians of different sizes, revealing that they consist of the same basic cell types but in varying proportions. Notably, the gut basal cells are the most responsive to changes in size, suggesting a role in energy storage. We capture the regulated gene modules of distinct cell types in response to body size. This research sheds light on the molecular and cellular aspects of cell-type allometry in planarians and underscores the utility of scRNA-seq in these investigations.
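Allometric scaling is conventionally written as a power law, y = a·x^b, which becomes a straight line after taking logarithms. The sketch below fits a and b by ordinary least squares on log-transformed values. The body sizes and cell counts are synthetic, and the function name is hypothetical. It illustrates the general scaling relationship, not the analysis performed in the paper.

```python
import math

def fit_allometry(sizes, counts):
    """Fit counts = a * sizes**b by least squares on log(counts) vs log(sizes)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # scaling exponent (slope in log-log space)
    a = math.exp(mean_y - b * mean_x)   # prefactor (exponentiated intercept)
    return a, b

# Synthetic data generated from a = 3, b = 1.5 (illustrative values only).
sizes = [1.0, 2.0, 4.0, 8.0]
counts = [3.0 * s ** 1.5 for s in sizes]

a, b = fit_allometry(sizes, counts)
print(round(a, 3), round(b, 3))
```

An exponent b above 1 means the component grows faster than the body as a whole, b below 1 means it grows more slowly, and b equal to 1 corresponds to proportional (isometric) scaling.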
Affiliation(s)
- Elena Emili
  - Department of Biological and Medical Sciences, Oxford Brookes University, Oxford, UK
- Alberto Pérez-Posada
  - Department of Biological and Medical Sciences, Oxford Brookes University, Oxford, UK
  - Living Systems Institute, University of Exeter, Exeter, UK
  - Department of Biosciences, University of Exeter, Exeter, UK
- Virginia Vanni
  - Department of Biological and Medical Sciences, Oxford Brookes University, Oxford, UK
  - Living Systems Institute, University of Exeter, Exeter, UK
  - Department of Biosciences, University of Exeter, Exeter, UK
- David Salamanca-Díaz
  - Living Systems Institute, University of Exeter, Exeter, UK
  - Department of Biosciences, University of Exeter, Exeter, UK
- Jordi Solana
  - Department of Biological and Medical Sciences, Oxford Brookes University, Oxford, UK
  - Living Systems Institute, University of Exeter, Exeter, UK
  - Department of Biosciences, University of Exeter, Exeter, UK
6
Marshall L, Raychaudhuri S, Viatte S. Understanding rheumatic disease through continuous cell state analysis. Nat Rev Rheumatol 2025:10.1038/s41584-025-01253-6. [PMID: 40335652 DOI: 10.1038/s41584-025-01253-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/03/2025] [Indexed: 05/09/2025]
Abstract
Autoimmune rheumatic diseases are a heterogeneous group of conditions, including rheumatoid arthritis (RA) and systemic lupus erythematosus. With the increasing availability of large single-cell datasets, novel disease-associated cell types continue to be identified and characterized at multiple omics layers, for example, 'T peripheral helper' (TPH) (CXCR5- PD-1hi) cells in RA and systemic lupus erythematosus and MerTK+ myeloid cells in RA. Despite efforts to define disease-relevant cell atlases, the very definition of a 'cell type' or 'lineage' has proven controversial as higher resolution assays emerge. This Review explores the cell types and states involved in disease pathogenesis, with a focus on the shifting perspectives on immune and stromal cell taxonomy. These understandings of cell identity are closely related to the computational methods adopted for analysis, with implications for the interpretation of single-cell data. Understanding the underlying cellular architecture of disease is also crucial for therapeutic research as ambiguity hinders translation to the clinical setting. We discuss the implications of different frameworks for cell identity for disease treatment and the discovery of predictive biomarkers for stratified medicine - an unmet clinical need for autoimmune rheumatic diseases.
Affiliation(s)
- Lysette Marshall
  - Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK
- Soumya Raychaudhuri
  - Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK
  - Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Divisions of Rheumatology, Inflammation and Immunity and Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Broad Institute, Cambridge, MA, USA
- Sebastien Viatte
  - Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK
  - NIHR Manchester Musculoskeletal Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK
  - Lydia Becker Institute of Immunology and Inflammation, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
7
Mocking TR, van de Loosdrecht AA, Cloos J, Bachas C. Applications of machine learning for immunophenotypic measurable residual disease assessment in acute myeloid leukemia. Hemasphere 2025; 9:e70138. [PMID: 40400510 PMCID: PMC12093103 DOI: 10.1002/hem3.70138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2025] [Revised: 03/16/2025] [Accepted: 04/08/2025] [Indexed: 05/23/2025] Open
Abstract
Immunophenotypic detection and quantification of residual leukemic cells by multiparameter flow cytometry is increasingly adopted in the clinical practice of acute myeloid leukemia (AML) to assess measurable residual disease (MRD). However, MRD levels quantified by manual gating analysis can differ based on differences in gating strategy between trained operators and clinical centers. Manual gating requires extensive training, is time-consuming in daily practice, and faces a significant hurdle in analyzing data from next-generation cytometry platforms. To address these challenges, several computational approaches involving machine learning and artificial intelligence algorithms have been proposed to automate or aid the assessment of MRD. However, the immunophenotypic variability between patients and the relatively low proportions of residual leukemic cells in AML challenge most algorithms and require innovative approaches. This review provides an overview of recent efforts in using computational methods for immunophenotypic AML-MRD assessment. We first explain the technical and conceptual background of the different algorithms that have been explored. Next, we discuss their strengths and limitations in the disease-specific context of AML. Finally, we highlight how computational approaches offer a unique opportunity to standardize or even outperform current manual gating analyses, and ultimately, improve the treatment of AML patients.
Affiliation(s)
- Tim R. Mocking
  - Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Arjan A. van de Loosdrecht
  - Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Jacqueline Cloos
  - Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Costa Bachas
  - Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
8
Russo CJ, Husain K, Murugan A. Soft Modes as a Predictive Framework for Low-Dimensional Biological Systems Across Scales. Annu Rev Biophys 2025; 54:401-426. [PMID: 39971349 PMCID: PMC12079786 DOI: 10.1146/annurev-biophys-081624-030543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
All biological systems are subject to perturbations arising from thermal fluctuations, external environments, or mutations. Yet, while biological systems consist of thousands of interacting components, recent high-throughput experiments have shown that their response to perturbations is surprisingly low dimensional: confined to only a few stereotyped changes out of the many possible. In this review, we explore a unifying dynamical systems framework (soft modes) to explain and analyze low dimensionality in biology, from molecules to ecosystems. We argue that this soft mode framework makes nontrivial predictions that generalize classic ideas from developmental biology to disparate systems, namely phenocopying, dual buffering, and global epistasis. While some of these predictions have been borne out in experiments, we discuss how soft modes allow for a surprisingly far-reaching and unifying framework in which to analyze data from protein biophysics to microbial ecology.
Affiliation(s)
- Christopher Joel Russo
  - James Franck Institute, University of Chicago, Chicago, Illinois, USA
  - Program in Biophysical Sciences, University of Chicago, Chicago, Illinois, USA
- Kabir Husain
  - James Franck Institute, University of Chicago, Chicago, Illinois, USA
  - Department of Physics, University College London, London, United Kingdom
- Arvind Murugan
  - James Franck Institute, University of Chicago, Chicago, Illinois, USA
  - Department of Physics, University of Chicago, Chicago, Illinois, USA
9
Ruiz Daniels R, Salisbury SJ, Sveen L, Villamayor PR, Taylor RS, Vaadal M, Tengs T, Krasnov A, Monaghan SJ, Ballantyne M, Penaloza C, Fast MD, Bron JE, Houston R, Robinson N, Robledo D. Transcriptomic characterization of transitioning cell types in the skin of Atlantic salmon. BMC Biol 2025; 23:109. [PMID: 40289111 PMCID: PMC12036301 DOI: 10.1186/s12915-025-02196-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2024] [Accepted: 03/21/2025] [Indexed: 04/30/2025] Open
Abstract
BACKGROUND The skin maintains the body's integrity and serves as the first line of defence against pathogens, stressors and mechanical injuries. Despite the global significance of salmon in aquaculture, how the transcriptomic profile of cells varies during wound healing remains unexplored. Teleost skin contains adult pluripotent cells that differentiate into various tissues, including bone, cartilage, tendon, ligament, adipose, dermis, muscle and connective tissue within the skin. These cells are pivotal for preserving the integrity of skin tissue throughout an organism's lifespan and actively participate in the wound healing processes. In this study, we characterize the transcriptomic profiles of putative mesenchymal stromal cells (fibroblast-like adult stem cells) in healthy Atlantic salmon tissue and during the wound healing process. RESULTS Single-nucleus sequencing and spatial transcriptomics were used to detect transcriptomic changes occurring during wound healing that are commonly associated with mesenchymal stromal cells. We followed the transcriptomic activity of these cells during an in vivo wound healing time course study, showing that these cells become more transcriptionally active during the remodelling stage of wound healing. The changes detected give insights into the potential differentiation pathways leading to osteogenic and fibroblast lineages in the skin of Atlantic salmon. CONCLUSIONS We chart the transcriptomic activity of subclusters of putative differentiating stromal cells during the process of wound healing for the first time, revealing different spatial niches of the various putative MSC subclusters, and setting the stage for further investigation of the manipulation of transitioning cell types to improve fish health.
Affiliation(s)
- R Ruiz Daniels
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
  - Institute of Aquaculture, University of Stirling, Stirling, UK
- S J Salisbury
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
- P R Villamayor
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
- R S Taylor
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
- S J Monaghan
  - Institute of Aquaculture, University of Stirling, Stirling, UK
- M Ballantyne
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
- C Penaloza
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
  - Benchmark Genetics, Penicuik, UK
- M D Fast
  - Hoplite Research Lab, Department of Pathology and Microbiology, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, PEI, Canada
- J E Bron
  - Institute of Aquaculture, University of Stirling, Stirling, UK
- N Robinson
  - Nofima AS, Ås, Norway
  - Sustainable Aquaculture Laboratory, Deakin University, Victoria, Australia
- D Robledo
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
  - University of Santiago de Compostela, Santiago de Compostela, Spain
10
Olzhabaev T, Müller L, Krause D, Schwudke D, Torda AE. Lipidome visualisation, comparison, and analysis in a vector space. PLoS Comput Biol 2025; 21:e1012892. [PMID: 40233092 PMCID: PMC12058142 DOI: 10.1371/journal.pcbi.1012892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Revised: 05/07/2025] [Accepted: 02/20/2025] [Indexed: 04/17/2025] Open
Abstract
A shallow neural network was used to embed lipid structures in a 2- or 3-dimensional space with the goal that structurally similar species have similar vectors. Tests on complete lipid databanks show that the method automatically produces distributions which follow conventional lipid classifications. The embedding is accompanied by the web-based software, Lipidome Projector. This displays user lipidomes as 2D or 3D scatterplots for quick exploratory analysis, quantitative comparison and interpretation at a structural level. Examples of published data sets were used for a qualitative comparison with literature interpretation.
Affiliation(s)
- Timur Olzhabaev
  - Centre for Bioinformatics, University of Hamburg, Hamburg, Germany
  - Bioanalytical Chemistry, Research Center Borstel Leibniz Lung Center, Borstel, Germany
- Lukas Müller
  - Centre for Bioinformatics, University of Hamburg, Hamburg, Germany
  - Bioanalytical Chemistry, Research Center Borstel Leibniz Lung Center, Borstel, Germany
- Daniel Krause
  - Bioanalytical Chemistry, Research Center Borstel Leibniz Lung Center, Borstel, Germany
- Dominik Schwudke
  - Bioanalytical Chemistry, Research Center Borstel Leibniz Lung Center, Borstel, Germany
  - German Center for Infection Research, Thematic Translational Unit Tuberculosis, Borstel, Germany
  - German Center for Lung Research (DZL), Airway Research Center North (ARCN), Borstel, Germany
11
Pickard J, Sturgess VE, McDonald KO, Rossiter N, Arnold KB, Shah YM, Rajapakse I, Beard DA. A Hands-On Introduction to Data Analytics for Biomedical Research. Function 2025; 6:zqaf015. [PMID: 40199731 PMCID: PMC11999024 DOI: 10.1093/function/zqaf015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2024] [Revised: 03/07/2025] [Accepted: 03/12/2025] [Indexed: 04/10/2025] Open
Abstract
Artificial intelligence (AI) applications are having increasing impacts in the biomedical sciences. Modern AI tools enable uncovering hidden patterns in large datasets, forecasting outcomes, and numerous other applications. Despite the availability and power of these tools, the rapid expansion and complexity of AI applications can be daunting, and there is a conspicuous absence of consensus on their ethical and responsible use. Misapplication of AI can result in invalid, unclear, or biased outcomes, exacerbated by the unfamiliarity of many biomedical researchers with the underlying mathematical and computational principles. To address these challenges, this review and tutorial paper aims to achieve three primary objectives: (1) highlight prevalent data science applications in biomedical research, including data visualization, dimensionality reduction, missing data imputation, and predictive model training and evaluation; (2) provide comprehensible explanations of the mathematical foundations underpinning these methodologies; and (3) guide readers on the effective use and interpretation of software tools for implementing these methods in biomedical contexts. While introductory, this guide covers core principles essential for understanding advanced applications, empowering readers to critically interpret results, assess tools, and explore the potential and limitations of machine learning in their research. Ultimately, this paper serves as a practical foundation for biomedical researchers to confidently navigate the growing intersection of AI and biomedicine.
Affiliation(s)
- Joshua Pickard
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48105, USA
- Victoria E Sturgess
  - Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48105, USA
- Katherine O McDonald
  - Department of Molecular and Integrative Physiology, University of Michigan, Ann Arbor, MI 48105, USA
- Nicholas Rossiter
  - Cellular and Molecular Biology Program, University of Michigan, Ann Arbor, MI 48105, USA
- Kelly B Arnold
  - Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48105, USA
- Yatrik M Shah
  - Department of Molecular and Integrative Physiology, University of Michigan, Ann Arbor, MI 48105, USA
- Indika Rajapakse
  - Department of Molecular and Integrative Physiology, University of Michigan, Ann Arbor, MI 48105, USA
- Daniel A Beard
  - Department of Molecular and Integrative Physiology, University of Michigan, Ann Arbor, MI 48105, USA
|
12
|
De Vries M, Dent LG, Curry N, Rowe-Brown L, Bousgouni V, Fourkioti O, Naidoo R, Sparks H, Tyson A, Dunsby C, Bakal C. Geometric deep learning and multiple-instance learning for 3D cell-shape profiling. Cell Syst 2025; 16:101229. [PMID: 40112779 DOI: 10.1016/j.cels.2025.101229] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2024] [Revised: 10/23/2024] [Accepted: 02/13/2025] [Indexed: 03/22/2025]
Abstract
The three-dimensional (3D) morphology of cells emerges from complex cellular and environmental interactions, serving as an indicator of cell state and function. In this study, we used deep learning to discover morphology representations and understand cell states. This study introduced MorphoMIL, a computational pipeline combining geometric deep learning and attention-based multiple-instance learning to profile 3D cell and nuclear shapes. We used 3D point-cloud input and captured morphological signatures at single-cell and population levels, accounting for phenotypic heterogeneity. We applied these methods to over 95,000 melanoma cells treated with clinically relevant and cytoskeleton-modulating chemical and genetic perturbations. The pipeline accurately predicted drug perturbations and cell states. Our framework revealed subtle morphological changes associated with perturbations, key shapes correlating with signaling activity, and interpretable insights into cell-state heterogeneity. MorphoMIL demonstrated superior performance and generalized across diverse datasets, paving the way for scalable, high-throughput morphological profiling in drug discovery. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Matt De Vries
  - Department of Cancer Biology, Institute of Cancer Research, London, UK; Department of Physics, Imperial College London, London, UK; Sentinal4D, London, UK
- Lucas G Dent
  - Department of Cancer Biology, Institute of Cancer Research, London, UK
- Nathan Curry
  - Department of Physics, Imperial College London, London, UK
- Leo Rowe-Brown
  - Department of Physics, Imperial College London, London, UK
- Vicky Bousgouni
  - Department of Cancer Biology, Institute of Cancer Research, London, UK
- Olga Fourkioti
  - Department of Cancer Biology, Institute of Cancer Research, London, UK
- Reed Naidoo
  - Department of Cancer Biology, Institute of Cancer Research, London, UK
- Hugh Sparks
  - Department of Physics, Imperial College London, London, UK
- Adam Tyson
  - Gatsby Computational Neuroscience Unit, University College London, London, UK
- Chris Dunsby
  - Department of Physics, Imperial College London, London, UK
- Chris Bakal
  - Department of Cancer Biology, Institute of Cancer Research, London, UK; Sentinal4D, London, UK
|
13
|
Peled O, Greenbaum G, Bloch G. Diversification of social complexity following a major evolutionary transition in bees. Curr Biol 2025; 35:981-993.e5. [PMID: 39933519 DOI: 10.1016/j.cub.2025.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Revised: 10/16/2024] [Accepted: 01/07/2025] [Indexed: 02/13/2025]
Abstract
How social complexity evolved remains a long-standing enigma. In most animal groups, social complexity is typically classified into a few discrete classes. This approach is oversimplified and constrains our inference of social evolution to a narrow trajectory consisting of transitions between classes. Such categorical classifications also limit quantitative studies on the molecular and environmental drivers of social complexity. The recent accumulation of relevant quantitative data has set the stage to overcome these limitations. Here, we propose a data-driven, high-dimensional approach for studying the full diversity of social phenotypes. We curated and analyzed a comprehensive dataset encompassing 17 social traits across 80 species and studied the evolution of social complexity in bees. We found that honey bees, stingless bees, and bumble bees underwent a major evolutionary transition ∼80 mya, inconsistent with the stepwise progression of the social ladder conceptual framework. This major evolutionary transition was followed by a phase of substantial phenotypic diversification of social complexity. Other bee lineages display a continuum of social complexity, ranging from solitary to simple societies, but do not reach the levels of social complexity seen in honey bees, stingless bees, and bumble bees. Bee evolution, therefore, provides a remarkable demonstration of a macroevolutionary process in which a major transition removed biological constraints and opened novel evolutionary opportunities, driving the exploration of the landscape of social phenotypes. Our approach can be extended to incorporate additional data types and readily applied to illuminate the evolution of social complexity in other animal groups.
Affiliation(s)
- Ohad Peled
  - Department of Ecology, Evolution, and Behavior, The Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel
- Gili Greenbaum
  - Department of Ecology, Evolution, and Behavior, The Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel
- Guy Bloch
  - Department of Ecology, Evolution, and Behavior, The Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel
|
14
|
Laidlaw RF, Briggs EM, Matthews KR, Madany Mamlouk A, McCulloch R, Otto TD. TrAGEDy-trajectory alignment of gene expression dynamics. Bioinformatics 2025; 41:btaf073. [PMID: 40065693 PMCID: PMC11908647 DOI: 10.1093/bioinformatics/btaf073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 01/17/2025] [Accepted: 03/06/2025] [Indexed: 03/16/2025] Open
Abstract
MOTIVATION: Single-cell transcriptomics sequencing is used to compare different biological processes. However, those processes are often asymmetric, which makes them difficult to integrate. Current approaches often rely on integrating samples from each condition before either cluster-based comparisons or analysis of an inferred shared trajectory.
RESULTS: We present Trajectory Alignment of Gene Expression Dynamics (TrAGEDy), which allows the alignment of independent trajectories to avoid the need for error-prone integration steps. Across simulated datasets, TrAGEDy returns the correct underlying alignment of the datasets, outperforming current tools, which fail to capture the complexity of asymmetric alignments. When applied to real datasets on the development of T cells and of the bloodstream forms of Trypanosoma brucei affected by genetic knockouts, TrAGEDy captures more biologically relevant genes and processes, which other differential expression methods fail to detect.
AVAILABILITY AND IMPLEMENTATION: TrAGEDy is implemented in R and is freely available at https://github.com/No2Ross/TrAGEDy.
Affiliation(s)
- Ross F Laidlaw
  - Centre for Parasitology, University of Glasgow, Glasgow, G12 8QQ, United Kingdom
- Emma M Briggs
  - Centre for Parasitology, University of Glasgow, Glasgow, G12 8QQ, United Kingdom
  - Institute for Immunology and Infection Research, University of Edinburgh, Edinburgh, EH8 9YL, United Kingdom
  - Biosciences Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU, United Kingdom
- Keith R Matthews
  - Institute for Immunology and Infection Research, University of Edinburgh, Edinburgh, EH8 9YL, United Kingdom
- Amir Madany Mamlouk
  - Institute for Neuro- and Bioinformatics, University of Lübeck, Lübeck, 23562, Germany
- Richard McCulloch
  - Centre for Parasitology, University of Glasgow, Glasgow, G12 8QQ, United Kingdom
- Thomas D Otto
  - Centre for Parasitology, University of Glasgow, Glasgow, G12 8QQ, United Kingdom
  - Laboratory of Pathogens and Host Immunity, Universite de Montpellier, Montpellier, 34090, France
|
15
|
Willem T, Shitov VA, Luecken MD, Kilbertus N, Bauer S, Piraud M, Buyx A, Theis FJ. Biases in machine-learning models of human single-cell data. Nat Cell Biol 2025; 27:384-392. [PMID: 39972066 DOI: 10.1038/s41556-025-01619-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 01/09/2025] [Indexed: 02/21/2025]
Abstract
Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
Affiliation(s)
- Theresa Willem
  - TUM School for Medicine and Health, Institute of History and Ethics in Medicine, Technical University of Munich, Munich, Germany
  - Helmholtz Munich, Munich, Germany
- Vladimir A Shitov
  - Department of Computational Health, Institute of Computational Biology, Helmholtz Munich, Munich, Germany
  - Comprehensive Pneumology Center (CPC) with the CPC-M bioArchive and Institute of Lung Health and Immunity (LHI), Helmholtz Munich; Member of the German Center for Lung Research (DZL), Munich, Germany
- Malte D Luecken
  - Department of Computational Health, Institute of Computational Biology, Helmholtz Munich, Munich, Germany
  - Comprehensive Pneumology Center (CPC) with the CPC-M bioArchive and Institute of Lung Health and Immunity (LHI), Helmholtz Munich; Member of the German Center for Lung Research (DZL), Munich, Germany
- Niki Kilbertus
  - Helmholtz Munich, Munich, Germany
  - School for Computation, Information and Technology, Technical University of Munich, Munich, Germany
  - Munich Center for Machine Learning (MCML), Munich, Germany
- Stefan Bauer
  - Helmholtz Munich, Munich, Germany
  - School for Computation, Information and Technology, Technical University of Munich, Munich, Germany
  - Munich Center for Machine Learning (MCML), Munich, Germany
- Alena Buyx
  - TUM School for Medicine and Health, Institute of History and Ethics in Medicine, Technical University of Munich, Munich, Germany
- Fabian J Theis
  - Helmholtz Munich, Munich, Germany
  - School for Computation, Information and Technology, Technical University of Munich, Munich, Germany
  - School of Life Sciences, Technical University of Munich, Munich, Germany
|
16
|
Prater KE, Lin KZ. All the single cells: Single-cell transcriptomics/epigenomics experimental design and analysis considerations for glial biologists. Glia 2025; 73:451-473. [PMID: 39558887 PMCID: PMC11809281 DOI: 10.1002/glia.24633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 09/18/2024] [Accepted: 10/10/2024] [Indexed: 11/20/2024]
Abstract
Single-cell transcriptomics, epigenomics, and other 'omics applied at single-cell resolution can significantly advance hypotheses and understanding of glial biology. Omics technologies are revealing a large and growing number of new glial cell subtypes, defined by their gene expression profile. These subtypes have significant implications for understanding glial cell function, cell-cell communications, and glia-specific changes between homeostasis and conditions such as neurological disease. For many, the training in how to analyze, interpret, and understand these large datasets has been through reading and understanding literature from other fields like biostatistics. Here, we provide a primer for glial biologists on experimental design and analysis of single-cell RNA-seq datasets. Our goal is to further the understanding of why decisions are made about datasets and to enhance biologists' ability to interpret and critique their work and the work of others. We review the steps involved in single-cell analysis with a focus on decision points and particular notes for glia. The goal of this primer is to ensure that single-cell 'omics experiments continue to advance glial biology in a rigorous and replicable way.
Affiliation(s)
- Katherine E. Prater
  - Department of Neurology, University of Washington School of Medicine, Seattle 98195
- Kevin Z. Lin
  - Department of Biostatistics, University of Washington, Seattle 98195
|
17
|
Tarozo MM, Pessa AAB, Zunino L, Rosso OA, Perc M, Ribeiro HV. Two-by-two ordinal patterns in art paintings. PNAS Nexus 2025; 4:pgaf092. [PMID: 40144776 PMCID: PMC11937958 DOI: 10.1093/pnasnexus/pgaf092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2024] [Accepted: 02/06/2025] [Indexed: 03/28/2025]
Abstract
Quantitative analysis of visual arts has recently expanded to encompass a more extensive array of artworks due to the availability of large-scale digitized art collections. Consistent with formal analyses by art historians, many of these studies highlight the significance of encoding spatial structures within artworks to enhance our understanding of visual arts. However, defining universally applicable, interpretable, and sufficiently simple units that capture the essence of paintings and their artistic styles remains challenging. Here, we examine ordering patterns in pixel intensities within two-by-two partitions of images from nearly 140,000 paintings created over the past 1,000 years. These patterns, categorized into 11 types based on arguments of continuity and symmetry, are both universally applicable and detailed enough to correlate with low-level visual features of paintings. We uncover a universal distribution of these patterns, with consistent prevalence within groups, yet modulated across groups by a nontrivial interplay between pattern smoothness and the likelihood of identical pixel intensities. This finding provides a standardized metric for comparing paintings and styles, further establishing a scale to measure deviations from the average prevalence. Our research also shows that these simple patterns carry valuable information for identifying painting styles, though styles generally exhibit considerable variability in the prevalence of ordinal patterns. Moreover, shifts in the prevalence of these patterns reveal a trend in which artworks increasingly diverge from the average incidence over time; however, this evolution is neither smooth nor uniform, with substantial variability in pattern prevalence, particularly after the 1930s.
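As an illustration of the idea summarized in this abstract, the following minimal sketch counts ordinal patterns over two-by-two blocks of a grayscale image. It is not the authors' implementation. Several choices are assumptions made only for this example: the function name `two_by_two_ordinal_patterns` is hypothetical, blocks are taken as non-overlapping, ties between equal intensities are broken by pixel position, and the raw permutations are not grouped into the 11 types described in the abstract.

```python
from collections import Counter

import numpy as np


def two_by_two_ordinal_patterns(image):
    """Return the relative frequency of ordinal patterns in 2x2 blocks.

    Assumptions for this sketch: `image` is a 2D array of pixel
    intensities, blocks do not overlap, and equal intensities are
    ordered by position (stable sort).
    """
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    counts = Counter()
    for i in range(0, rows - 1, 2):
        for j in range(0, cols - 1, 2):
            # Flatten the block in the order: top-left, top-right,
            # bottom-left, bottom-right.
            block = img[i:i + 2, j:j + 2].ravel()
            # The pattern is the permutation of positions that sorts
            # the four intensities in ascending order.
            pattern = tuple(int(k) for k in np.argsort(block, kind="stable"))
            counts[pattern] += 1
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}


if __name__ == "__main__":
    example = [[1, 2, 4, 3],
               [3, 4, 2, 1]]
    print(two_by_two_ordinal_patterns(example))
```

The resulting dictionary maps each observed permutation to its share of all blocks, which is the kind of prevalence distribution the abstract compares across paintings and styles.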
Affiliation(s)
- Mateus M Tarozo
  - Departamento de Física, Universidade Estadual de Maringá, Maringá, PR 87020-900, Brazil
- Arthur A B Pessa
  - Departamento de Física, Universidade Estadual de Maringá, Maringá, PR 87020-900, Brazil
- Luciano Zunino
  - Centro de Investigaciones Ópticas (CONICET La Plata - CIC - UNLP), Gonnet, La Plata 1897, Argentina
  - Departamento de Ciencias Básicas, Facultad de Ingeniería, Universidad Nacional de La Plata (UNLP), La Plata 1900, Argentina
- Osvaldo A Rosso
  - Instituto de Física, Universidade Federal de Alagoas, Maceió 57072-900, Brazil
- Matjaž Perc
  - Faculty of Natural Sciences and Mathematics, University of Maribor, Koroška cesta 160, Maribor 2000, Slovenia
  - Community Healthcare Center Dr. Adolf Drolc Maribor, Ulica talcev 9, Maribor 2000, Slovenia
  - Department of Physics, Kyung Hee University, 26 Kyungheedae-ro, Dongdaemun-gu, Seoul 02447, Republic of Korea
  - Complexity Science Hub, Metternichgasse 8, Vienna 1030, Austria
  - University College, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Haroldo V Ribeiro
  - Departamento de Física, Universidade Estadual de Maringá, Maringá, PR 87020-900, Brazil
|
18
|
Hu S, Lu Y, Yu G, Zheng Z, Wang W, Ni K, Giri A, Zhang J, Zhang Y, Watanabe K, Yao G, Xing J. Epithelial-mesenchymal transition couples with cell cycle arrest at various stages. bioRxiv 2025:2025.02.24.639880. [PMID: 40060597 PMCID: PMC11888286 DOI: 10.1101/2025.02.24.639880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2025]
Abstract
Numerous computational approaches have been developed to infer cell state transition trajectories from snapshot single-cell data. Most approaches first require projecting high-dimensional data onto a low-dimensional representation, raising the question of whether the dynamics of the system become distorted. Using epithelial-to-mesenchymal transition (EMT) as a test system, we show that both biology-guided low-dimensional representations and stochastic trajectory simulations in high-dimensional state space, not representations obtained with brute force dimension-reduction methods, reveal multiple distinct paths of TGF-β-induced EMT. The paths arise from coupling between EMT and cell cycle arrest at either the G1/S, G2/M or M checkpoints, contributing to cell-cycle related EMT heterogeneity. The present study emphasizes that caution should be taken when inferring transition dynamics from snapshot single-cell data in two- or three-dimensional representations, and that incorporating dynamical information can improve prediction accuracy.
Affiliation(s)
- Sophia Hu
  - Department of Computational and Systems Biology, University of Pittsburgh, USA
  - Joint CMU-Pitt Ph.D. Program in Computational Biology, University of Pittsburgh, USA
- Yong Lu
  - Department of Computational and Systems Biology, University of Pittsburgh, USA
- Gaohan Yu
  - Department of Physics and Astronomy, University of Pittsburgh, USA
- Zhiqian Zheng
  - Department of Computational and Systems Biology, University of Pittsburgh, USA
- Weikang Wang
  - CAS Key Laboratory for Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China
  - School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Ke Ni
  - Department of Computational and Systems Biology, University of Pittsburgh, USA
  - Joint CMU-Pitt Ph.D. Program in Computational Biology, University of Pittsburgh, USA
- Amitava Giri
  - Department of Computational and Systems Biology, University of Pittsburgh, USA
- Jingyu Zhang
  - Department of Computational and Systems Biology, University of Pittsburgh, USA
  - Joint CMU-Pitt Ph.D. Program in Computational Biology, University of Pittsburgh, USA
- Yan Zhang
  - Department of Computational and Systems Biology, University of Pittsburgh, USA
- Guang Yao
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
  - Arizona Cancer Center, University of Arizona, Tucson, AZ 85719, USA
- Jianhua Xing
  - Department of Computational and Systems Biology, University of Pittsburgh, USA
  - Department of Physics and Astronomy, University of Pittsburgh, USA
  - UPMC-Hillman Cancer Center, University of Pittsburgh, USA
|
19
|
Ng-Kee-Kwong J, Philps B, Smith FNC, Sobieska A, Chen N, Alabert C, Bilen H, Buonomo SCB. Supervised and unsupervised deep learning-based approaches for studying DNA replication spatiotemporal dynamics. Commun Biol 2025; 8:311. [PMID: 40011665 PMCID: PMC11865476 DOI: 10.1038/s42003-025-07744-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2025] [Accepted: 02/14/2025] [Indexed: 02/28/2025] Open
Abstract
In eukaryotic cells, DNA replication is organised both spatially and temporally, as evidenced by the stage-specific spatial distribution of replication foci in the nucleus. Despite the genetic association of aberrant DNA replication with numerous human diseases, the labour-intensive methods employed to study DNA replication have hindered large-scale analyses of its roles in pathological processes. In this study, we employ two distinct methodologies. We first apply supervised machine learning, successfully classifying S-phase patterns in wild-type mouse embryonic stem cells (mESCs), while additionally identifying altered replication dynamics in Rif1-deficient mESCs. Given the constraints imposed by a classification-based approach, we then develop an unsupervised method for large-scale detection of aberrant S-phase cells. Such a method, which does not aim to classify patterns based on pre-defined categories but rather detects differences autonomously, closely recapitulates expected differences across genotypes. We therefore extend our approach to a well-characterised cellular model of inducible deregulated origin firing, involving cyclin E overexpression. Through parallel EdU- and PCNA-based analyses, we demonstrate the potential applicability of our method to patient samples, offering a means to identify the contribution of deregulated DNA replication to a plethora of pathogenic processes.
Affiliation(s)
- Julian Ng-Kee-Kwong
  - Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Roger Land Building, Alexander Crum Brown Road, Edinburgh, EH9 3FF, UK
- Ben Philps
  - School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
- Fiona N C Smith
  - School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
- Naiming Chen
  - Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Roger Land Building, Alexander Crum Brown Road, Edinburgh, EH9 3FF, UK
- Constance Alabert
  - Division of Molecular, Cell & Developmental Biology, School of Life Sciences, University of Dundee, Dundee, DD1 5EH, UK
- Hakan Bilen
  - School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
- Sara C B Buonomo
  - Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Roger Land Building, Alexander Crum Brown Road, Edinburgh, EH9 3FF, UK
|
20
|
van Dorp CH, Gray JI, Paik DH, Farber DL, Yates AJ. A variational deep-learning approach to modeling memory T cell dynamics. bioRxiv 2025:2024.07.08.602409. [PMID: 40060443 PMCID: PMC11888226 DOI: 10.1101/2024.07.08.602409] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/15/2025]
Abstract
Mechanistic models of dynamic, interacting cell populations have yielded many insights into the growth and resolution of immune responses. Historically these models have described the behavior of pre-defined cell types based on small numbers of phenotypic markers. The ubiquity of deep phenotyping therefore presents a new challenge: how do we confront tractable and interpretable mathematical models with high-dimensional data? To tackle this problem, we studied the development and persistence of lung-resident memory CD4 and CD8 T cells (TRM) in mice infected with influenza virus. We developed an approach in which dynamical model parameters and the population structure are inferred simultaneously. This method uses deep learning and stochastic variational inference and is trained on the single-cell flow-cytometry data directly, rather than on the kinetics of pre-identified clusters. We show that during the resolution phase of the immune response, memory CD4 and CD8 T cells within the lung are phenotypically diverse, with subsets exhibiting highly distinct and time-dependent dynamics. TRM heterogeneity is maintained long-term by ongoing differentiation of relatively persistent Bcl-2hi CD4 and CD8 TRM subsets, which resolve into distinct functional populations. Our approach yields new insights into the dynamics of tissue-localized immune memory, and is a novel basis for interpreting time series of high-dimensional data, broadly applicable to diverse biological systems.
Affiliation(s)
- Christiaan H van Dorp
  - Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York City, USA
- Joshua I Gray
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York City, USA
- Daniel H Paik
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York City, USA
- Donna L Farber
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York City, USA
- Andrew J Yates
  - Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York City, USA
|
21
|
Sun ED, Zhou OY, Hauptschein M, Rappoport N, Xu L, Navarro Negredo P, Liu L, Rando TA, Zou J, Brunet A. Spatial transcriptomic clocks reveal cell proximity effects in brain ageing. Nature 2025; 638:160-171. [PMID: 39695234 PMCID: PMC11798877 DOI: 10.1038/s41586-024-08334-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 11/01/2024] [Indexed: 12/20/2024]
Abstract
Old age is associated with a decline in cognitive function and an increase in neurodegenerative disease risk. Brain ageing is complex and is accompanied by many cellular changes. Furthermore, the influence that aged cells have on neighbouring cells and how this contributes to tissue decline is unknown. More generally, the tools to systematically address this question in ageing tissues have not yet been developed. Here we generate a spatially resolved single-cell transcriptomics brain atlas of 4.2 million cells from 20 distinct ages across the adult lifespan and across two rejuvenating interventions: exercise and partial reprogramming. We build spatial ageing clocks, machine learning models trained on this spatial transcriptomics atlas, to identify spatial and cell-type-specific transcriptomic fingerprints of ageing, rejuvenation and disease, including for rare cell types. Using spatial ageing clocks and deep learning, we find that T cells, which increasingly infiltrate the brain with age, have a marked pro-ageing proximity effect on neighbouring cells. Surprisingly, neural stem cells have a strong pro-rejuvenating proximity effect on neighbouring cells. We also identify potential mediators of the pro-ageing effect of T cells and the pro-rejuvenating effect of neural stem cells on their neighbours. These results suggest that rare cell types can have a potent influence on their neighbours and could be targeted to counter tissue ageing. Spatial ageing clocks represent a useful tool for studying cell-cell interactions in spatial contexts and should allow scalable assessment of the efficacy of interventions for ageing and disease.
Affiliation(s)
- Eric D Sun
  - Biomedical Data Science Graduate Program, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Olivia Y Zhou
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Biophysics Graduate Program, Stanford University, Stanford, CA, USA
  - Medical Scientist Training Program, Stanford University, Stanford, CA, USA
- Max Hauptschein
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Lucy Xu
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Biology Graduate Program, Stanford University, Stanford, CA, USA
- Ling Liu
  - Department of Neurology, Stanford University, Stanford, CA, USA
  - Department of Neurology, UCLA, Los Angeles, CA, USA
  - Eli and Edythe Broad Center for Regenerative Medicine and Stem Cell Biology, UCLA, Los Angeles, CA, USA
- Thomas A Rando
  - Department of Neurology, Stanford University, Stanford, CA, USA
  - Department of Neurology, UCLA, Los Angeles, CA, USA
  - Eli and Edythe Broad Center for Regenerative Medicine and Stem Cell Biology, UCLA, Los Angeles, CA, USA
- James Zou
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Anne Brunet
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Glenn Center for the Biology of Aging, Stanford University, Stanford, CA, USA
  - Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
  - The Phil & Penny Knight Initiative for Brain Resilience at the Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
|
22
|
Klein D, Palla G, Lange M, Klein M, Piran Z, Gander M, Meng-Papaxanthos L, Sterr M, Saber L, Jing C, Bastidas-Ponce A, Cota P, Tarquis-Medina M, Parikh S, Gold I, Lickert H, Bakhti M, Nitzan M, Cuturi M, Theis FJ. Mapping cells through time and space with moscot. Nature 2025; 638:1065-1075. [PMID: 39843746 PMCID: PMC11864987 DOI: 10.1038/s41586-024-08453-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 11/25/2024] [Indexed: 01/24/2025]
Abstract
Single-cell genomic technologies enable the multimodal profiling of millions of cells across temporal and spatial dimensions. However, experimental limitations hinder the comprehensive measurement of cells under native temporal dynamics and in their native spatial tissue niche. Optimal transport has emerged as a powerful tool to address these constraints and has facilitated the recovery of the original cellular context. Yet, most optimal transport applications are unable to incorporate multimodal information or scale to single-cell atlases. Here we introduce multi-omics single-cell optimal transport (moscot), a scalable framework for optimal transport in single-cell genomics that supports multimodality across all applications. We demonstrate the capability of moscot to efficiently reconstruct developmental trajectories of 1.7 million cells from mouse embryos across 20 time points. To illustrate the capability of moscot in space, we enrich spatial transcriptomic datasets by mapping multimodal information from single-cell profiles in a mouse liver sample and align multiple coronal sections of the mouse brain. We present moscot.spatiotemporal, an approach that leverages gene-expression data across both spatial and temporal dimensions to uncover the spatiotemporal dynamics of mouse embryogenesis. We also resolve endocrine-lineage relationships of delta and epsilon cells in a previously unpublished mouse, time-resolved pancreas development dataset using paired measurements of gene expression and chromatin accessibility. Our findings are confirmed through experimental validation of NEUROD2 as a regulator of epsilon progenitor cells in a model of human induced pluripotent stem cell islet cell differentiation. Moscot is available as open-source software, accompanied by extensive documentation.
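For readers unfamiliar with optimal transport, the following minimal sketch shows the generic entropy-regularized formulation solved with Sinkhorn iterations, coupling two small sets of cells. It does not use or reproduce the moscot API; the function name `sinkhorn_plan`, the regularization strength `epsilon`, and the toy cost matrix are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_plan(cost, a, b, epsilon=0.1, n_iters=500):
    """Entropy-regularized optimal transport plan via Sinkhorn iterations.

    cost: (n, m) pairwise cost between source and target cells.
    a, b: source and target marginals (each sums to 1).
    epsilon and n_iters are illustrative defaults, not tuned values.
    """
    cost = np.asarray(cost, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    kernel = np.exp(-cost / epsilon)  # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


if __name__ == "__main__":
    # Toy example: two source cells and two target cells on a line,
    # with squared Euclidean distance as the cost.
    source = np.array([0.0, 1.0])
    target = np.array([0.0, 1.0])
    cost = (source[:, None] - target[None, :]) ** 2
    plan = sinkhorn_plan(cost, a=[0.5, 0.5], b=[0.5, 0.5])
    print(plan)
```

Each entry of the returned plan is the amount of mass moved from a source cell to a target cell; in this toy case most mass stays on the diagonal because matching cells have zero cost.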
Affiliation(s)
- Dominik Klein
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Garching, Germany
- Giovanni Palla
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Marius Lange
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Garching, Germany
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Zoe Piran
  - School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Manuel Gander
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
- Michael Sterr
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Lama Saber
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
  - School of Medicine, Technical University of Munich, Munich, Germany
- Changying Jing
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
  - Munich Medical Research School (MMRS), Ludwig Maximilian University (LMU), Munich, Germany
- Aimée Bastidas-Ponce
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Perla Cota
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
  - School of Medicine, Technical University of Munich, Munich, Germany
- Marta Tarquis-Medina
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Shrey Parikh
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
- Ilan Gold
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
- Heiko Lickert
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
  - School of Medicine, Technical University of Munich, Munich, Germany
- Mostafa Bakhti
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Mor Nitzan
  - School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
  - Racah Institute of Physics, The Hebrew University of Jerusalem, Jerusalem, Israel
  - Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Garching, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
23
Jansma A, Yao Y, Wolfe J, Del Debbio L, Beentjes SV, Ponting CP, Khamseh A. High order expression dependencies finely resolve cryptic states and subtypes in single cell data. Mol Syst Biol 2025; 21:173-207. [PMID: 39748128 PMCID: PMC11790937 DOI: 10.1038/s44320-024-00074-1] [Received: 01/09/2024] [Revised: 10/24/2024] [Accepted: 10/31/2024] [Indexed: 01/04/2025] Open
Abstract
Single cells are typically typed by clustering into discrete locations in reduced dimensional transcriptome space. Here we introduce Stator, a data-driven method that identifies cell (sub)types and states without relying on cells' local proximity in transcriptome space. Stator labels the same single cell multiply, not just by type and subtype, but also by state such as activation, maturity or cell cycle sub-phase, through deriving higher-order gene expression dependencies from a sparse gene-by-cell expression matrix. Stator's finer resolution is clear from analyses of mouse embryonic brain, and human healthy or diseased liver. Rather than only coarse-scale labels of cell type, Stator further resolves cell types into subtypes, and these subtypes into stages of maturity and/or cell cycle phases, and yet further into portions of these phases. Among cryptically homogeneous embryonic cells, for example, Stator finds 34 distinct radial glia states whose gene expression forecasts their future GABAergic or glutamatergic neuronal fate. Further, Stator's fine resolution of liver cancer states reveals expression programmes that predict patient survival. We provide Stator as a Nextflow pipeline and Shiny App.
Affiliation(s)
- Abel Jansma
  - MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
  - Higgs Centre for Theoretical Physics, School of Physics & Astronomy, University of Edinburgh, Edinburgh, EH9 3FD, UK
- Yuelin Yao
  - MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
  - School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
- Jareth Wolfe
  - MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Luigi Del Debbio
  - Higgs Centre for Theoretical Physics, School of Physics & Astronomy, University of Edinburgh, Edinburgh, EH9 3FD, UK
- Sjoerd V Beentjes
  - MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
  - School of Mathematics, University of Edinburgh, Edinburgh, EH9 3FD, UK
- Chris P Ponting
  - MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Ava Khamseh
  - MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
  - Higgs Centre for Theoretical Physics, School of Physics & Astronomy, University of Edinburgh, Edinburgh, EH9 3FD, UK
  - School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
24
Wei J, Zhang B, Wang Q, Zhou T, Tian T, Chen L. Diffusive topology preserving manifold distances for single-cell data analysis. Proc Natl Acad Sci U S A 2025; 122:e2404860121. [PMID: 39854240 PMCID: PMC11789025 DOI: 10.1073/pnas.2404860121] [Received: 03/07/2024] [Accepted: 11/25/2024] [Indexed: 01/26/2025] Open
Abstract
Manifold learning techniques have emerged as crucial tools for uncovering latent patterns in high-dimensional single-cell data. However, most existing dimensionality reduction methods primarily rely on 2D visualization, which can distort true data relationships and fail to extract reliable biological information. Here, we present DTNE (diffusive topology neighbor embedding), a dimensionality reduction framework that faithfully approximates manifold distance to enhance cellular relationships and dynamics. DTNE constructs a manifold distance matrix using a modified personalized PageRank algorithm, thereby preserving topological structure while enabling diverse single-cell analyses. This approach facilitates distribution-based cellular relationship analysis, pseudotime inference, and clustering within a unified framework. Extensive benchmarking against mainstream algorithms on diverse datasets demonstrates DTNE's superior performance in maintaining geodesic distances and revealing significant biological patterns. Our results establish DTNE as a powerful tool for high-dimensional data analysis in uncovering meaningful biological insights.
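As background for the graph-diffusion idea mentioned in this abstract, here is a minimal sketch of standard personalized PageRank on a small neighbor graph. It is not DTNE's modified algorithm, which the abstract does not specify. The function name, the ring-shaped toy graph, and the restart probability `alpha` are assumptions for illustration only.

```python
def personalized_pagerank(adj, seed, alpha=0.15, n_iter=200):
    """Power iteration for personalized PageRank with restarts at `seed`.

    adj   : square weighted adjacency matrix (every row must have a nonzero sum)
    seed  : index of the node the random walk restarts from
    alpha : restart probability (illustrative default)
    """
    n = len(adj)
    # Row-normalise the adjacency matrix into random-walk transition probabilities.
    trans = [[w / sum(row) for w in row] for row in adj]
    scores = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(n_iter):
        # One random-walk step from the current distribution.
        walked = [sum(scores[i] * trans[i][j] for i in range(n)) for j in range(n)]
        # Mix the walk with a restart at the seed node.
        scores = [
            alpha * (1.0 if j == seed else 0.0) + (1.0 - alpha) * walked[j]
            for j in range(n)
        ]
    return scores


# Toy graph (hypothetical): six cells arranged in a ring, each linked to two neighbors.
n_cells = 6
ring = [
    [1.0 if abs(i - j) in (1, n_cells - 1) else 0.0 for j in range(n_cells)]
    for i in range(n_cells)
]

ppr = personalized_pagerank(ring, seed=0)
```

On this regular graph the scores fall off with graph distance from the seed cell, which is the property that makes diffusion scores usable as a proxy for distance along a manifold.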
Affiliation(s)
- Jiangyong Wei
  - Guangdong Institute of Intelligence Science and Technology, 519031 Hengqin, Zhuhai, Guangdong, China
- Bin Zhang
  - Guangdong Institute of Intelligence Science and Technology, 519031 Hengqin, Zhuhai, Guangdong, China
- Qiu Wang
  - Guangdong Institute of Intelligence Science and Technology, 519031 Hengqin, Zhuhai, Guangdong, China
- Tianshou Zhou
  - School of Mathematics and Statistics, Sun Yat-sen University, 510275 Guangzhou, China
- Tianhai Tian
  - School of Mathematics, Monash University, Melbourne, VIC 3800, Australia
- Luonan Chen
  - Guangdong Institute of Intelligence Science and Technology, 519031 Hengqin, Zhuhai, Guangdong, China
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 310024 Hangzhou, China
  - Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
25
Xiao R, Baptista RP, Agyabeng-Dadzie F, Li Y, Dong Y, Schmitz RJ, Glenn TC, Kissinger JC. Deciphering Transcription in Cryptosporidium parvum: Polycistronic Gene Expression and Chromatin Accessibility. bioRxiv 2025:2025.01.17.633476. [PMID: 39868316 PMCID: PMC11761812 DOI: 10.1101/2025.01.17.633476] [Indexed: 01/28/2025]
Abstract
Once considered rare in eukaryotes, polycistronic mRNA expression has been identified in kinetoplastids and, more recently, green algae, red algae, and certain fungi. This study provides comprehensive evidence supporting the existence of polycistronic mRNA expression in the apicomplexan parasite Cryptosporidium parvum. Leveraging long-read RNA-seq data from different parasite strains and using multiple long-read technologies, we demonstrate the existence of defined polycistronic transcripts containing 2-4 protein encoding genes, several validated with RT-PCR. Some polycistrons exhibit differential expression profiles, usually involving the generation of internal monocistronic transcripts at different times during development. ATAC-seq in sporozoites reveals that polycistronic transcripts usually have a single open chromatin peak at their 5-prime ends, which contains a single E2F binding site motif. Polycistronic genes do not appear enriched for either male or female exclusive genes. This study elucidates a potentially complex layer of gene regulation with distinct chromatin accessibility akin to monocistronic transcripts. This is the first report of polycistronic transcription in an apicomplexan and expands our understanding of gene expression strategies in this medically important organism.
Affiliation(s)
- Rui Xiao
  - Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
  - University of Pennsylvania, Philadelphia, PA, 19104, USA
- Rodrigo P. Baptista
  - Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
  - Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA, 30602, USA
  - Houston Methodist Research Hospital, Houston, TX, 77030, USA
- Yiran Li
  - Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
  - St. Jude Children’s Research Hospital, Memphis, TN, 38105, USA
- Yinxin Dong
  - Department of Genetics, University of Georgia, Athens, GA, 30602, USA
- Robert J. Schmitz
  - Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
  - Department of Genetics, University of Georgia, Athens, GA, 30602, USA
- Travis C. Glenn
  - Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
  - Department of Genetics, University of Georgia, Athens, GA, 30602, USA
  - Department of Environmental Health Science, University of Georgia, Athens, GA, 30602, USA
- Jessica C. Kissinger
  - Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
  - Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA, 30602, USA
  - Department of Genetics, University of Georgia, Athens, GA, 30602, USA
26
Kretschmer F, Seipp J, Ludwig M, Klau GW, Böcker S. Coverage bias in small molecule machine learning. Nat Commun 2025; 16:554. [PMID: 39788952 PMCID: PMC11718084 DOI: 10.1038/s41467-024-55462-w] [Received: 09/11/2023] [Accepted: 12/12/2024] [Indexed: 01/12/2025] Open
Abstract
Small molecule machine learning aims to predict chemical, biochemical, or biological properties from molecular structures, with applications such as toxicity prediction, ligand binding, and pharmacokinetics. A recent trend is developing end-to-end models that avoid explicit domain knowledge. These models assume no coverage bias in training and evaluation data, meaning the data are representative of the true distribution. However, the domain of applicability is rarely considered in such models. Here, we investigate how well large-scale datasets cover the space of known biomolecular structures. To do so, we propose a distance measure based on solving the Maximum Common Edge Subgraph (MCES) problem, which aligns well with chemical similarity. Although this problem is computationally hard, we introduce an efficient approach combining Integer Linear Programming and heuristic bounds. Our findings reveal that many widely used datasets lack uniform coverage of biomolecular structures, limiting the predictive power of models trained on them. We propose two additional methods to assess whether training datasets diverge from known molecular distributions, potentially guiding future dataset creation to improve model performance.
Affiliation(s)
- Fleming Kretschmer
  - Chair for Bioinformatics, Institute for Computer Science, Friedrich Schiller University Jena, Jena, Germany
- Jan Seipp
  - Algorithmic Bioinformatics, Institute for Computer Science, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Marcus Ludwig
  - Chair for Bioinformatics, Institute for Computer Science, Friedrich Schiller University Jena, Jena, Germany
  - Currently at Bright Giant, Jena, Germany
- Gunnar W Klau
  - Algorithmic Bioinformatics, Institute for Computer Science, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Sebastian Böcker
  - Chair for Bioinformatics, Institute for Computer Science, Friedrich Schiller University Jena, Jena, Germany
27
Liu X, Chapple RH, Bennett D, Wright WC, Sanjali A, Culp E, Zhang Y, Pan M, Geeleher P. CSI-GEP: A GPU-based unsupervised machine learning approach for recovering gene expression programs in atlas-scale single-cell RNA-seq data. Cell Genomics 2025; 5:100739. [PMID: 39788105 PMCID: PMC11770216 DOI: 10.1016/j.xgen.2024.100739] [Received: 08/13/2024] [Revised: 11/06/2024] [Accepted: 12/13/2024] [Indexed: 01/12/2025]
Abstract
Exploratory analysis of single-cell RNA sequencing (scRNA-seq) typically relies on hard clustering over two-dimensional projections like uniform manifold approximation and projection (UMAP). However, such methods can severely distort the data and have many arbitrary parameter choices. Methods that can model scRNA-seq data as non-discrete "gene expression programs" (GEPs) can better preserve the data's structure, but currently, they are often not scalable, not consistent across repeated runs, and lack an established method for choosing key parameters. Here, we developed a GPU-based unsupervised learning approach, "consensus and scalable inference of gene expression programs" (CSI-GEP). We show that CSI-GEP can recover ground truth GEPs in real and simulated atlas-scale scRNA-seq datasets, significantly outperforming cutting-edge methods, including GPT-based neural networks. We applied CSI-GEP to a whole mouse brain atlas of 2.2 million cells, disentangling endothelial cell types missed by other methods, and to an integrated scRNA-seq atlas of human tumors and cell lines, discovering mesenchymal-like GEPs unique to cancer cells growing in culture.
Affiliation(s)
- Xueying Liu
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Richard H Chapple
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Declan Bennett
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- William C Wright
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Ankita Sanjali
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Erielle Culp
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Yinwen Zhang
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Min Pan
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Paul Geeleher
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
28
Aeschbach S, Mata R, Wulff DU. Mapping Mental Representations With Free Associations: A Tutorial Using the R Package associatoR. J Cogn 2025; 8:3. [PMID: 39803181 PMCID: PMC11720478 DOI: 10.5334/joc.407] [Received: 03/22/2024] [Accepted: 10/03/2024] [Indexed: 01/16/2025] Open
Abstract
People's understanding of topics and concepts such as risk, sustainability, and intelligence can be important for psychological researchers and policymakers alike. One underexplored way of accessing this information is to use free associations to map people's mental representations. In this tutorial, we describe how free association responses can be collected, processed, mapped, and compared across groups using the R package associatoR. We discuss study design choices and different approaches to uncovering the structure of mental representations using natural language processing, including the use of embeddings from large language models. We posit that free association analysis presents a powerful approach to revealing how people and machines represent key social and technological issues.
Affiliation(s)
- Samuel Aeschbach
  - Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
  - Center for Cognitive and Decision Sciences, University of Basel, Basel, Switzerland
- Rui Mata
  - Center for Cognitive and Decision Sciences, University of Basel, Basel, Switzerland
- Dirk U. Wulff
  - Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
  - Center for Cognitive and Decision Sciences, University of Basel, Basel, Switzerland
29
Miao Z, Wang J, Park K, Kuang D, Kim J. Depth-corrected multi-factor dissection of chromatin accessibility for scATAC-seq data with PACS. Nat Commun 2025; 16:401. [PMID: 39757254 DOI: 10.1038/s41467-024-55580-5] [Received: 02/28/2024] [Accepted: 12/10/2024] [Indexed: 01/07/2025] Open
Abstract
Single cell ATAC-seq (scATAC-seq) experimental designs have become increasingly complex, with multiple factors that might affect chromatin accessibility, including genotype, cell type, tissue of origin, sample location, batch, etc., whose compound effects are difficult to test by existing methods. In addition, current scATAC-seq data present statistical difficulties due to their sparsity and variations in individual sequence capture. To address these problems, we present a zero-adjusted statistical model, Probability model of Accessible Chromatin of Single cells (PACS), that allows complex hypothesis testing of accessibility-modulating factors while accounting for sparse and incomplete data. For differential accessibility analysis, PACS controls the false positive rate and achieves a 17% to 122% higher power on average than existing tools. We demonstrate the effectiveness of PACS through several analysis tasks, including supervised cell type annotation, compound hypothesis testing, batch effect correction, and spatiotemporal modeling. We apply PACS to datasets from various tissues and show its ability to reveal previously undiscovered insights in scATAC-seq data.
Affiliation(s)
- Zhen Miao
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- Jianqiao Wang
  - Department of Biostatistics, Harvard T.H. Chan School of Health, Boston, MA, USA
  - Department of Statistics and Data Science, Tsinghua University, Beijing, China
- Kernyu Park
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- Da Kuang
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Junhyong Kim
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
30
Uno M, Nono M, Takahashi C, Kishimoto S, Okabe E, Yamamoto T, Nishida E. A Transition From Interindividual Uniformity to Diversity in Appearance and Transcriptional Features at Midlife in Caenorhabditis elegans. Genes Cells 2025; 30:e13187. [PMID: 39743742 DOI: 10.1111/gtc.13187] [Received: 11/11/2024] [Revised: 11/21/2024] [Accepted: 11/30/2024] [Indexed: 01/04/2025]
Abstract
During embryogenesis, organisms function as a robust system that ensures uniformity within individuals, but they lose robustness and develop variations at advanced ages. However, when and how organisms lose this robustness remains largely elusive. Here, we identified a sharp transition from interindividual uniformity to diversity in the appearance and transcriptional features of age-matched Caenorhabditis elegans in midlife. Convolutional neural network analysis of individual appearance alterations revealed that the transition occurs in midlife, which coincides with the cessation of egg-laying activity and increased motility defects. This period represents the transition from the young state, marked by shared homogeneous features among same-age individuals, to the old state, marked by shared among old individuals. Transcriptional coherence within the age-matched individuals shows essentially the same transition, coinciding with the appearance features. These findings provide a new framework for understanding the aging trajectory in C. elegans, demonstrating the occurrence of the loss of robust control over appearance and transcriptional homeostasis in midlife.
Affiliation(s)
- Masaharu Uno
  - Laboratory for Molecular Biology of Aging, RIKEN Center for Biosystems Dynamics Research (BDR), Hyogo, Japan
- Masanori Nono
  - Laboratory for Molecular Biology of Aging, RIKEN Center for Biosystems Dynamics Research (BDR), Hyogo, Japan
- Chika Takahashi
  - Laboratory for Molecular Biology of Aging, RIKEN Center for Biosystems Dynamics Research (BDR), Hyogo, Japan
- Saya Kishimoto
  - Laboratory for Molecular Biology of Aging, RIKEN Center for Biosystems Dynamics Research (BDR), Hyogo, Japan
- Emiko Okabe
  - Laboratory for Molecular Biology of Aging, RIKEN Center for Biosystems Dynamics Research (BDR), Hyogo, Japan
- Takuya Yamamoto
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
  - Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
  - Medical-Risk Avoidance Based on iPS Cells Team, RIKEN Center for Advanced Intelligence Project (AIP), Kyoto, Japan
- Eisuke Nishida
  - Laboratory for Molecular Biology of Aging, RIKEN Center for Biosystems Dynamics Research (BDR), Hyogo, Japan
31
Stringer C, Zhong L, Syeda A, Du F, Kesa M, Pachitariu M. Rastermap: a discovery method for neural population recordings. Nat Neurosci 2025; 28:201-212. [PMID: 39414974 PMCID: PMC11706777 DOI: 10.1038/s41593-024-01783-4] [Received: 08/07/2023] [Accepted: 09/11/2024] [Indexed: 10/18/2024]
Abstract
Neurophysiology has long progressed through exploratory experiments and chance discoveries. Anecdotes abound of researchers listening to spikes in real time and noticing patterns of activity related to ongoing stimuli or behaviors. With the advent of large-scale recordings, such close observation of data has become difficult. To find patterns in large-scale neural data, we developed 'Rastermap', a visualization method that displays neurons as a raster plot after sorting them along a one-dimensional axis based on their activity patterns. We benchmarked Rastermap on realistic simulations and then used it to explore recordings of tens of thousands of neurons from mouse cortex during spontaneous, stimulus-evoked and task-evoked epochs. We also applied Rastermap to whole-brain zebrafish recordings; to wide-field imaging data; to electrophysiological recordings in rat hippocampus, monkey frontal cortex and various cortical and subcortical regions in mice; and to artificial neural networks. Finally, we illustrate high-dimensional scenarios where Rastermap and similar algorithms cannot be used effectively.
Affiliation(s)
- Carsen Stringer
  - Howard Hughes Medical Institute Janelia Research Campus, Ashburn, VA, USA
- Lin Zhong
  - Howard Hughes Medical Institute Janelia Research Campus, Ashburn, VA, USA
- Atika Syeda
  - Howard Hughes Medical Institute Janelia Research Campus, Ashburn, VA, USA
- Fengtong Du
  - Howard Hughes Medical Institute Janelia Research Campus, Ashburn, VA, USA
- Maria Kesa
  - Howard Hughes Medical Institute Janelia Research Campus, Ashburn, VA, USA
- Marius Pachitariu
  - Howard Hughes Medical Institute Janelia Research Campus, Ashburn, VA, USA
32
Hrovatin K, Sikkema L, Shitov VA, Heimberg G, Shulman M, Oliver AJ, Mueller MF, Ibarra IL, Wang H, Ramírez-Suástegui C, He P, Schaar AC, Teichmann SA, Theis FJ, Luecken MD. Considerations for building and using integrated single-cell atlases. Nat Methods 2025; 22:41-57. [PMID: 39672979 DOI: 10.1038/s41592-024-02532-y] [Received: 10/31/2023] [Accepted: 10/22/2024] [Indexed: 12/15/2024]
Abstract
The rapid adoption of single-cell technologies has created an opportunity to build single-cell 'atlases' integrating diverse datasets across many laboratories. Such atlases can serve as a reference for analyzing and interpreting current and future data. However, it has become apparent that atlasing approaches differ, and the impact of these differences is often unclear. Here we review the current atlasing literature and present considerations for building and using atlases. Importantly, we find that no one-size-fits-all protocol for atlas building exists, but rather we discuss context-specific considerations and workflows, including atlas conceptualization, data collection, curation and integration, atlas evaluation and atlas sharing. We further highlight the benefits of integrated atlases for analyses of new datasets and deriving biological insights beyond what is possible from individual datasets. Our overview of current practices and associated recommendations will improve the quality of atlases to come, facilitating the shift to a unified, reference-based understanding of single-cell biology.
Affiliation(s)
- Karin Hrovatin
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Lisa Sikkema
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Vladimir A Shitov
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Comprehensive Pneumology Center (CPC) with the CPC-M bioArchive / Institute of Lung Health and Immunity (LHI), Helmholtz Zentrum München; Member of the German Center for Lung Research (DZL), Munich, Germany
| | - Graham Heimberg
- Department of OMNI Bioinformatics, Genentech, South San Francisco, CA, USA
- Department of Biological Research | AI Development, Genentech, South San Francisco, CA, USA
| | - Maiia Shulman
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Amanda J Oliver
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Michaela F Mueller
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
| | - Ignacio L Ibarra
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
| | - Hanchen Wang
- Department of Biological Research | AI Development, Genentech, South San Francisco, CA, USA
- Department of Computer Science, Stanford University, Palo Alto, CA, USA
| | - Ciro Ramírez-Suástegui
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Peng He
- Department of Pathology, University of California, San Francisco, San Francisco, CA, USA
| | - Anna C Schaar
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Sarah A Teichmann
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Theory of Condensed Matter Group, Department of Physics, Cavendish Laboratory, University of Cambridge, Cambridge, UK
- Cambridge Stem Cell Institute and Department of Medicine, University of Cambridge, Cambridge, UK
- CIFAR MacMillan Multiscale Human Programme, Toronto, Ontario, Canada
| | - Fabian J Theis
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Department of Mathematics, Technical University of Munich, Garching, Germany
| | - Malte D Luecken
- Department of Computational Health, Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Comprehensive Pneumology Center (CPC) with the CPC-M bioArchive / Institute of Lung Health and Immunity (LHI), Helmholtz Zentrum München; Member of the German Center for Lung Research (DZL), Munich, Germany
| |
|
33
|
Mocking TR, Kelder A, Reuvekamp T, Ngai LL, Rutten P, Gradowska P, van de Loosdrecht AA, Cloos J, Bachas C. Computational assessment of measurable residual disease in acute myeloid leukemia using mixture models. Communications Medicine 2024; 4:271. [PMID: 39702555 DOI: 10.1038/s43856-024-00700-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2024] [Accepted: 12/05/2024] [Indexed: 12/21/2024] Open
Abstract
BACKGROUND: The proportion of residual leukemic blasts after chemotherapy, assessed by multiparameter flow cytometry, is an important prognostic factor for the risk of relapse and overall survival in acute myeloid leukemia (AML). This measurable residual disease (MRD) is used in clinical trials to stratify patients for more or less intensive consolidation therapy. However, an objective and reproducible analysis method to assess MRD status from flow cytometry data is lacking, yet is highly anticipated for broader implementation of MRD testing. METHODS: We propose a computational pipeline based on Gaussian mixture modeling that allows a fully automated assessment of MRD status while remaining completely interpretable for clinical diagnostic experts. Our pipeline requires limited training data, which makes it easily transferable to other medical centers and cytometry platforms. RESULTS: We identify all healthy and leukemic immature myeloid cells with high concordance (Spearman's rho = 0.974) and classification performance (median F-score = 0.861) compared to manual analysis. Using control samples (n = 18), we calculate a computational MRD percentage with high concordance to expert gating (Spearman's rho = 0.823) and predict MRD status in a cohort of 35 AML follow-up measurements with high accuracy (97%). CONCLUSIONS: We demonstrate that our pipeline provides a powerful tool for fast (~3 s) and objective automated MRD assessment in AML.
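As a rough illustration of the Gaussian mixture modeling mentioned in this abstract, the sketch below fits a two-component, one-dimensional mixture with plain expectation-maximization and reads off the share of events in the brighter component. It is not the authors' pipeline: the function names, the toy intensities, and the interpretation of the second component as a population of interest are all assumptions made for the example.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return exp(-((x - mean) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


def fit_two_component_gmm(values, n_iter=50, min_var=1e-3):
    """Fit a two-component 1D Gaussian mixture with plain EM.

    Returns (weights, means, variances). Component 0 is initialised at the
    smallest value and component 1 at the largest (an illustrative choice).
    """
    means = [min(values), max(values)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, min_var)  # floor keeps a component from collapsing
    return weights, means, variances


# Hypothetical marker intensities: six "background" events and four "bright" events.
toy_intensities = [-0.5, -0.2, 0.0, 0.1, 0.3, 0.4, 9.5, 9.8, 10.0, 10.2]
weights, means, variances = fit_two_component_gmm(toy_intensities)
bright_percentage = 100.0 * weights[1]
print(f"bright component: mean={means[1]:.2f}, share={bright_percentage:.1f}%")
```

Because each component is summarised by a weight, a mean, and a variance, the fitted model can be inspected directly, which is one plausible reading of why a mixture model stays interpretable for diagnostic experts. Real cytometry data are multivariate and would need a full-covariance mixture rather than this 1D toy.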
Affiliation(s)
- Tim R Mocking
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
| | - Angèle Kelder
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
| | - Tom Reuvekamp
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Department of Hematology, Amsterdam UMC, Universiteit van Amsterdam, Amsterdam, The Netherlands
| | - Lok Lam Ngai
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
| | - Philip Rutten
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Department of Epidemiology and Data Science, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Patrycja Gradowska
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- HOVON Foundation, Rotterdam, The Netherlands
| | - Arjan A van de Loosdrecht
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
| | - Jacqueline Cloos
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
| | - Costa Bachas
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
| |
|
34
|
Russo CJ, Husain K, Murugan A. Soft Modes as a Predictive Framework for Low Dimensional Biological Systems across Scales. arXiv 2024:arXiv:2412.13637v1. [PMID: 39764393 PMCID: PMC11702803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2025]
Abstract
All biological systems are subject to perturbations: due to thermal fluctuations, external environments, or mutations. Yet, while biological systems are composed of thousands of interacting components, recent high-throughput experiments show that their response to perturbations is surprisingly low-dimensional: confined to only a few stereotyped changes out of the many possible. Here, we explore a unifying dynamical systems framework - soft modes - to explain and analyze low-dimensionality in biology, from molecules to eco-systems. We argue that this one framework of soft modes makes non-trivial predictions that generalize classic ideas from developmental biology to disparate systems, namely: phenocopying, dual buffering, and global epistasis. While some of these predictions have been borne out in experiments, we discuss how soft modes allow for a surprisingly far-reaching and unifying framework in which to analyze data from protein biophysics to microbial ecology.
Affiliation(s)
- Christopher Joel Russo
- James Franck Institute, University of Chicago, Chicago, United States
- Program in Biophysical Sciences, University of Chicago, Chicago, United States
| | - Kabir Husain
- James Franck Institute, University of Chicago, Chicago, United States
- Department of Physics, University College London, London, United Kingdom
| | - Arvind Murugan
- James Franck Institute, University of Chicago, Chicago, United States
- Department of Physics, University of Chicago, Chicago, United States
| |
|
35
|
Vardaman D, Ali MA, Siam MHB, Bolding C, Tidwell H, Stephens H, Patil M, Tyrrell DJ. Development of a Spectral Flow Cytometry Analysis Pipeline for High-dimensional Immune Cell Characterization. Journal of Immunology (Baltimore, Md. : 1950) 2024; 213:1713-1724. [PMID: 39451039 PMCID: PMC11573633 DOI: 10.4049/jimmunol.2400370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Accepted: 10/02/2024] [Indexed: 10/26/2024]
Abstract
Flow cytometry is used for immune cell analysis for cell composition and function. Spectral flow cytometry allows for high-dimensional analysis of immune cells, overcoming limitations of conventional flow cytometry. However, analyzing data from large Ab panels is challenging using traditional biaxial gating strategies. We present, to our knowledge, a novel analysis pipeline to improve analysis of spectral flow cytometry. We employ this method to identify rare T cell populations in aging. We isolated splenocytes from young (2-3 mo old) and aged (18-19 mo old) female C57BL/6N mice and then stained these with a panel of 20 fluorescently labeled Abs. We performed spectral flow cytometry and then data processing and analysis using Python within a Jupyter Notebook environment to perform dimensionality reduction, batch correction, unsupervised clustering, and differential expression analysis. Our analysis of 3,776,804 T cells from 11 spleens revealed 35 distinct T cell clusters identified by surface marker expression. We observed significant differences between young and aged mice, with clusters enriched in one age group over the other. Naive, effector memory, and central memory CD8+ and CD4+ T cell subsets exhibited age-associated changes in abundance and marker expression. We also demonstrate the utility of our pipeline in a human PBMC dataset that used a 50-fluorescent color panel. By leveraging high-dimensional analysis methods, we provide insights into the immune aging process. This approach offers a robust and easily implemented analysis pipeline for spectral flow cytometry data that may facilitate the discovery of novel therapeutic targets for age-related immune dysfunction.
Affiliation(s)
- Donald Vardaman
- Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
| | - Md Akkas Ali
- Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
- Biochemistry and Structural Biology Theme, Graduate Biomedical Sciences, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
| | - Md Hasanul Banna Siam
- Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
- Microbiology Theme, Graduate Biomedical Sciences, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
| | - Chase Bolding
- Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
| | - Harrison Tidwell
- Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
| | - Holly Stephens
- Department of Nutrition Sciences, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
- Immunology Theme, Graduate Biomedical Sciences, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
| | - Mallikarjun Patil
- Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
| | - Daniel J. Tyrrell
- Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, 35205 USA
| |
|
36
|
Aplakidou E, Vergoulidis N, Chasapi M, Venetsianou NK, Kokoli M, Panagiotopoulou E, Iliopoulos I, Karatzas E, Pafilis E, Georgakopoulos-Soares I, Kyrpides NC, Pavlopoulos GA, Baltoumas FA. Visualizing metagenomic and metatranscriptomic data: A comprehensive review. Comput Struct Biotechnol J 2024; 23:2011-2033. [PMID: 38765606 PMCID: PMC11101950 DOI: 10.1016/j.csbj.2024.04.060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Revised: 04/25/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024] Open
Abstract
The fields of metagenomics and metatranscriptomics involve the examination of complete nucleotide sequences, gene identification, and analysis of potential biological functions within diverse organisms or environmental samples. Despite the vast opportunities for discovery in metagenomics, the sheer volume and complexity of sequence data often present challenges in processing, analysis, and visualization. This article highlights the critical role of advanced visualization tools in enabling effective exploration, querying, and analysis of these complex datasets. Emphasizing the importance of accessibility, the article categorizes various visualizers based on their intended applications and highlights their utility in empowering bioinformaticians and non-bioinformaticians to interpret and derive insights from meta-omics data effectively.
Affiliation(s)
- Eleni Aplakidou
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
| | - Nikolaos Vergoulidis
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
| | - Maria Chasapi
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
| | - Nefeli K. Venetsianou
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
| | - Maria Kokoli
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
| | - Eleni Panagiotopoulou
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
| | - Ioannis Iliopoulos
- Department of Basic Sciences, School of Medicine, University of Crete, 71003 Heraklion, Greece
| | - Evangelos Karatzas
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Evangelos Pafilis
- Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
| | - Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Nikos C. Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Georgios A. Pavlopoulos
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Center of New Biotechnologies & Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Greece
- Hellenic Army Academy, 16673 Vari, Greece
| | - Fotis A. Baltoumas
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
| |
|
37
|
Chrysinas P, Venkatesan S, Ang I, Ghosh V, Chen C, Neelamegham S, Gunawan R. Cell- and tissue-specific glycosylation pathways informed by single-cell transcriptomics. NAR Genom Bioinform 2024; 6:lqae169. [PMID: 39703423 PMCID: PMC11655298 DOI: 10.1093/nargab/lqae169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Revised: 11/06/2024] [Accepted: 11/21/2024] [Indexed: 12/21/2024] Open
Abstract
While single-cell studies have made significant impacts in various subfields of biology, they lag in the Glycosciences. To address this gap, we analyzed single-cell glycogene expressions in the Tabula Sapiens dataset of human tissues and cell types using a recent glycosylation-specific gene ontology (GlycoEnzOnto). At the median sequencing (count) depth, ∼40-50 out of 400 glycogenes were detected in individual cells. Upon increasing the sequencing depth, the number of detectable glycogenes saturates at ∼200 glycogenes, suggesting that the average human cell expresses about half of the glycogene repertoire. Hierarchies in glycogene and glycopathway expressions emerged from our analysis: nucleotide-sugar synthesis and transport exhibited the highest gene expressions, followed by genes for core enzymes, glycan modification and extensions, and finally terminal modifications. Interestingly, the same cell types showed variable glycopathway expressions based on their organ or tissue origin, suggesting nuanced cell- and tissue-specific glycosylation patterns. Probing deeper into the transcription factors (TFs) of glycogenes, we identified distinct groupings of TFs controlling different aspects of glycosylation: core biosynthesis, terminal modifications, etc. We present webtools to explore the interconnections across glycogenes, glycopathways and TFs regulating glycosylation in human cell/tissue types. Overall, the study presents an overview of glycosylation across multiple human organ systems.
Affiliation(s)
- Panagiotis Chrysinas
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
| | - Shriramprasad Venkatesan
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
| | - Isaac Ang
- Department of Computer Science, University of Illinois Urbana-Champaign, 201 North Goodwin Avenue, Urbana, IL 61801, USA
| | - Vishnu Ghosh
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
| | - Changyou Chen
- Department of Computer Science and Engineering, University at Buffalo-SUNY, 338 Davis Hall, Buffalo, NY 14260, USA
| | - Sriram Neelamegham
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
| | - Rudiyanto Gunawan
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
| |
|
38
|
Lederer AR, Leonardi M, Talamanca L, Bobrovskiy DM, Herrera A, Droin C, Khven I, Carvalho HJF, Valente A, Dominguez Mantes A, Mulet Arabí P, Pinello L, Naef F, La Manno G. Statistical inference with a manifold-constrained RNA velocity model uncovers cell cycle speed modulations. Nat Methods 2024; 21:2271-2286. [PMID: 39482463 PMCID: PMC11621032 DOI: 10.1038/s41592-024-02471-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/18/2023] [Accepted: 09/15/2024] [Indexed: 11/03/2024]
Abstract
Across biological systems, cells undergo coordinated changes in gene expression, resulting in transcriptome dynamics that unfold within a low-dimensional manifold. While low-dimensional dynamics can be extracted using RNA velocity, these algorithms can be fragile and rely on heuristics lacking statistical control. Moreover, the estimated vector field is not dynamically consistent with the traversed gene expression manifold. To address these challenges, we introduce a Bayesian model of RNA velocity that couples velocity field and manifold estimation in a reformulated, unified framework, identifying the parameters of an explicit dynamical system. Focusing on the cell cycle, we implement VeloCycle to study gene regulation dynamics on one-dimensional periodic manifolds and validate its ability to infer cell cycle periods using live imaging. We also apply VeloCycle to reveal speed differences in regionally defined progenitors and Perturb-seq gene knockdowns. Overall, VeloCycle expands the single-cell RNA sequencing analysis toolkit with a modular and statistically consistent RNA velocity inference framework.
Affiliation(s)
- Alex R Lederer
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Maxine Leonardi
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Lorenzo Talamanca
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Daniil M Bobrovskiy
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Antonio Herrera
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Colas Droin
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Irina Khven
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Hugo J F Carvalho
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Alessandro Valente
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Albert Dominguez Mantes
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Laboratory of Bioimage Analysis and Computational Microscopy, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Pau Mulet Arabí
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Luca Pinello
- Molecular Pathology Unit, Massachusetts General Research Institute, Charlestown, MA, USA
- Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Felix Naef
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Gioele La Manno
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| |
|
39
|
Peng Q, Qiu X, Li T. Storm: Incorporating transient stochastic dynamics to infer the RNA velocity with metabolic labeling information. PLoS Comput Biol 2024; 20:e1012606. [PMID: 39556617 DOI: 10.1371/journal.pcbi.1012606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 11/03/2024] [Indexed: 11/20/2024] Open
Abstract
Time-resolved scRNA-seq (tscRNA-seq) makes it possible to infer physically meaningful kinetic parameters, e.g., the transcription, splicing, or RNA degradation rate constants with correct magnitudes, and RNA velocities by incorporating temporal information. Previous approaches that rely on deterministic dynamics and a steady-state assumption on gene expression states are insufficient to achieve favorable results for data involving transient processes. We present a dynamical approach, Storm (Stochastic models of RNA metabolic-labeling), to overcome these limitations by solving stochastic differential equations of gene expression dynamics. The derivation reveals that the new mRNA sequencing data obey different types of cell-specific Poisson distributions when jointly considering both biological and cell-specific technical noise. Storm deals with measured count data directly and extends the RNA velocity methodology based on metabolic labeling scRNA-seq data to transient stochastic systems. Furthermore, we relax the constant parameter assumption over genes/cells to obtain gene-cell-specific transcription/splicing rates and gene-specific degradation rates, thus revealing time-dependent and cell-state-specific transcriptional regulations. Storm will facilitate the study of the statistical properties of tscRNA-seq data, eventually advancing our understanding of dynamic transcription regulation during development and disease.
Affiliation(s)
- Qiangwei Peng
- LMAM and School of Mathematical Sciences, Peking University, Beijing, China
| | - Xiaojie Qiu
- Stanford Cardiovascular Institute, Stanford University, Stanford, California, United States of America
- Center for Machine Learning Research, Peking University, Beijing, China
- Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, California, United States of America
| | - Tiejun Li
- LMAM and School of Mathematical Sciences, Peking University, Beijing, China
- Department of Computer Science, Stanford University, Stanford, California, United States of America
| |
|
40
|
Nanduri S, Black A, Bedford T, Huddleston J. Dimensionality reduction distills complex evolutionary relationships in seasonal influenza and SARS-CoV-2. Virus Evol 2024; 10:veae087. [PMID: 39610652 PMCID: PMC11604119 DOI: 10.1093/ve/veae087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 09/30/2024] [Accepted: 10/11/2024] [Indexed: 11/30/2024] Open
Abstract
Public health researchers and practitioners commonly infer phylogenies from viral genome sequences to understand transmission dynamics and identify clusters of genetically-related samples. However, viruses that reassort or recombine violate phylogenetic assumptions and require more sophisticated methods. Even when phylogenies are appropriate, they can be unnecessary or difficult to interpret without specialty knowledge. For example, pairwise distances between sequences can be enough to identify clusters of related samples or assign new samples to existing phylogenetic clusters. In this work, we tested whether dimensionality reduction methods could capture known genetic groups within two human pathogenic viruses that cause substantial human morbidity and mortality and frequently reassort or recombine, respectively: seasonal influenza A/H3N2 and SARS-CoV-2. We applied principal component analysis, multidimensional scaling (MDS), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection to sequences with well-defined phylogenetic clades and either reassortment (H3N2) or recombination (SARS-CoV-2). For each low-dimensional embedding of sequences, we calculated the correlation between pairwise genetic and Euclidean distances in the embedding and applied a hierarchical clustering method to identify clusters in the embedding. We measured the accuracy of clusters compared to previously defined phylogenetic clades, reassortment clusters, or recombinant lineages. We found that MDS embeddings accurately represented pairwise genetic distances including the intermediate placement of recombinant SARS-CoV-2 lineages between parental lineages. Clusters from t-SNE embeddings accurately recapitulated known phylogenetic clades, H3N2 reassortment groups, and SARS-CoV-2 recombinant lineages. 
We show that simple statistical methods without a biological model can accurately represent known genetic relationships for relevant human pathogenic viruses. Our open source implementation of these methods for analysis of viral genome sequences can be easily applied when phylogenetic methods are either unnecessary or inappropriate.
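The evaluation step described in this abstract, correlating pairwise genetic distances with Euclidean distances in an embedding, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' open source implementation: it uses Hamming distance on pre-aligned toy sequences as the genetic distance, a hand-written 2D embedding in place of an actual MDS or t-SNE result, and Pearson correlation as the summary statistic. All names and data are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def distance_correlation(sequences, embedding):
    """Correlate pairwise genetic distances with pairwise embedding distances."""
    pairs = list(combinations(range(len(sequences)), 2))
    genetic = [hamming(sequences[i], sequences[j]) for i, j in pairs]
    euclidean = [dist(embedding[i], embedding[j]) for i, j in pairs]
    return pearson(genetic, euclidean)


# Hypothetical aligned sequences and a made-up 2D embedding of the same samples.
toy_sequences = ["AAAA", "AAAT", "TTTT", "TTTA"]
toy_embedding = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (3.0, 0.0)]
r = distance_correlation(toy_sequences, toy_embedding)
print(f"correlation between genetic and embedding distances: {r:.3f}")
```

A correlation near 1 would indicate that the embedding preserves pairwise genetic distances well. In practice the embedding coordinates would come from a dimensionality reduction method applied to the distance matrix rather than being written by hand.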
Affiliation(s)
- Sravani Nanduri
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, United States
| | - Allison Black
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
| | - Trevor Bedford
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
- Howard Hughes Medical Institute, Seattle, WA, United States
| | - John Huddleston
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
| |
|
41
|
Li J, Aoi MC, Miller CT. Representing the dynamics of natural marmoset vocal behaviors in frontal cortex. Neuron 2024; 112:3542-3550.e3. [PMID: 39317185 PMCID: PMC11560606 DOI: 10.1016/j.neuron.2024.08.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 07/26/2024] [Accepted: 08/28/2024] [Indexed: 09/26/2024]
Abstract
Here, we tested the respective contributions of primate premotor and prefrontal cortex to support vocal behavior. We applied a model-based generalized linear model (GLM) analysis that better accounts for the inherent variance in natural, continuous behaviors to characterize the activity of neurons throughout the frontal cortex as freely moving marmosets engaged in conversational exchanges. While analyses revealed functional clusters of neural activity related to the different processes involved in the vocal behavior, these clusters did not map to subfields of prefrontal or premotor cortex, as has been observed in more conventional task-based paradigms. Our results suggest a distributed functional organization for the myriad neural mechanisms underlying natural social interactions and have implications for our concepts of the role that frontal cortex plays in governing ethological behaviors in primates.
Affiliation(s)
- Jingwen Li
- Cortical Systems & Behavior Lab, University of California, San Diego, La Jolla, CA 92093, USA
| | - Mikio C Aoi
- Department of Neurobiology, University of California, San Diego, La Jolla, CA 92093, USA
- Halıcıoğlu Data Science Institute, University of California, San Diego, La Jolla, CA 92093, USA
- Neurosciences Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
| | - Cory T Miller
- Cortical Systems & Behavior Lab, University of California, San Diego, La Jolla, CA 92093, USA
- Neurosciences Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
| |
|
42
|
Nazaret A, Fan JL, Lavallée VP, Burdziak C, Cornish AE, Kiseliovas V, Bowman RL, Masilionis I, Chun J, Eisman SE, Wang J, Hong J, Shi L, Levine RL, Mazutis L, Blei D, Pe’er D, Azizi E. Joint representation and visualization of derailed cell states with Decipher. bioRxiv: The Preprint Server for Biology 2024:2023.11.11.566719. [PMID: 38014231 PMCID: PMC10680623 DOI: 10.1101/2023.11.11.566719] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/29/2023]
Abstract
Biological insights often depend on comparing conditions such as disease and health, yet we lack effective computational tools for integrating single-cell genomics data across conditions or characterizing transitions from normal to deviant cell states. Here, we present Decipher, a deep generative model that characterizes derailed cell-state trajectories. Decipher jointly models and visualizes gene expression and cell state from normal and perturbed single-cell RNA-seq data, revealing shared and disrupted dynamics. We demonstrate its superior performance across diverse contexts, including in pancreatitis with oncogene mutation, acute myeloid leukemia, and gastric cancer.
Affiliation(s)
- Achille Nazaret
- Department of Computer Science, Columbia University, New York, NY 10027, USA
- Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10027, USA
| | - Joy Linyue Fan
- Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10027, USA
- Department of Biomedical Engineering, Columbia University, New York, NY 10027, USA
| | - Vincent-Philippe Lavallée
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Centre Hospitalier Universitaire Sainte-Justine Research Center, Montréal, QC, Canada
- Department of Pediatrics, Université de Montréal, Montréal, QC, Canada
| | - Cassandra Burdziak
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Andrew E. Cornish
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Vaidotas Kiseliovas
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Alan and Sandra Gerry Metastasis and Tumor Ecosystems Center, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Robert L. Bowman
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Department of Cancer Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Ignas Masilionis
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Alan and Sandra Gerry Metastasis and Tumor Ecosystems Center, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Jaeyoung Chun
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Alan and Sandra Gerry Metastasis and Tumor Ecosystems Center, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Shira E. Eisman
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - James Wang
- Department of Computer Science, Columbia University, New York, NY 10027, USA
| | - Justin Hong
- Department of Computer Science, Columbia University, New York, NY 10027, USA
- Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10027, USA
| | - Lingting Shi
- Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10027, USA
| | - Ross L. Levine
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Linas Mazutis
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Alan and Sandra Gerry Metastasis and Tumor Ecosystems Center, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Institute of Biotechnology Vilnius University, Life Sciences Centre, Vilnius 02158, Lithuania
| | - David Blei
- Department of Computer Science, Columbia University, New York, NY 10027, USA
- Department of Statistics, Columbia University, New York, NY 10027, USA
- Data Science Institute, Columbia University, New York, NY 10027, USA
| | - Dana Pe’er
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Howard Hughes Medical Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Elham Azizi
- Department of Computer Science, Columbia University, New York, NY 10027, USA
- Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10027, USA
- Department of Biomedical Engineering, Columbia University, New York, NY 10027, USA
- Data Science Institute, Columbia University, New York, NY 10027, USA
- Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY 10032, USA
| |
|
43
|
Gao Y, Patro R, Jiang P. Collapsible tree: interactive web app to present collapsible hierarchies. Bioinformatics 2024; 40:btae645. [PMID: 39460943 PMCID: PMC11543613 DOI: 10.1093/bioinformatics/btae645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 10/16/2024] [Accepted: 10/24/2024] [Indexed: 10/28/2024] [Open Access]
Abstract
MOTIVATION A crucial component of intuitive data visualization is presenting a hierarchical tree structure with interactive functions. For example, single-cell transcriptomics studies may generate gene expression values with developmental trajectories or cell lineage structures. Two common visualization methods, t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP), require two separate figures to depict the distribution of cell types and gene expression data, with low-dimension projections that may not capture the hierarchical structures among cells. RESULTS Here, we present a JavaScript framework and an interactive web app named Collapsible Tree, which presents values jointly with interactive, expandable, and collapsible lineage structures. For example, the Collapsible Tree presents cellular states and gene expression from single-cell transcriptomics within a single hierarchical plot, enabling comparisons of gene expression across lineages and subtle patterns between sub-lineages. Our framework can facilitate the exploration of complicated value patterns that are not evident in UMAP or t-SNE plots. AVAILABILITY AND IMPLEMENTATION The Collapsible Tree web interface is available at https://collapsibletree.data2in.net. The JavaScript library source code is available at https://github.com/data2intelligence/collapsible_tree.
Affiliation(s)
- Yuan Gao
- Cancer Data Science Lab, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, United States
| | - Rob Patro
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, United States
| | - Peng Jiang
- Cancer Data Science Lab, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States
| |
|
44
|
Muir DF, Asper GPR, Notin P, Posner JA, Marks DS, Keiser MJ, Pinney MM. Evolutionary-Scale Enzymology Enables Biochemical Constant Prediction Across a Multi-Peaked Catalytic Landscape. bioRxiv 2024:2024.10.23.619915. [PMID: 39484523 PMCID: PMC11526920 DOI: 10.1101/2024.10.23.619915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
Quantitatively mapping enzyme sequence-catalysis landscapes remains a critical challenge in understanding enzyme function, evolution, and design. Here, we expand an emerging microfluidic platform to measure catalytic constants (kcat and KM) for hundreds of diverse naturally occurring sequences and mutants of the model enzyme Adenylate Kinase (ADK). This enables us to dissect the sequence-catalysis landscape's topology, navigability, and mechanistic underpinnings, revealing distinct catalytic peaks organized by structural motifs. These results challenge long-standing hypotheses in enzyme adaptation, demonstrating that thermophilic enzymes are not slower than their mesophilic counterparts. Combining the rich representations of protein sequences provided by deep-learning models with our custom high-throughput kinetic data yields semi-supervised models that significantly outperform existing models at predicting catalytic parameters of naturally occurring ADK sequences. Our work demonstrates a promising strategy for dissecting sequence-catalysis landscapes across enzymatic evolution and building family-specific models capable of accurately predicting catalytic constants, opening new avenues for enzyme engineering and functional prediction.
Affiliation(s)
- Duncan F Muir
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA
- Program in Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Garrison P R Asper
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA
| | - Pascal Notin
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Department of Computer Science, University of Oxford, Oxford, UK
| | - Jacob A Posner
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA
- Department of Biology, San Francisco State University, San Francisco, CA, USA
| | - Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Michael J Keiser
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA
- Institute for Neurodegenerative Diseases, University of California, San Francisco, San Francisco, CA, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Margaux M Pinney
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA
- Valhalla Fellow, University of California San Francisco, San Francisco, CA, USA
| |
|
45
|
Srivastava P, Steuer A, Ferri F, Nicoli A, Schultz K, Bej S, Di Pizio A, Wolkenhauer O. Bitter peptide prediction using graph neural networks. J Cheminform 2024; 16:111. [PMID: 39375808 PMCID: PMC11459932 DOI: 10.1186/s13321-024-00909-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 09/22/2024] [Indexed: 10/09/2024] [Open Access]
Abstract
Bitter taste is an unpleasant taste modality that affects food consumption. Bitter peptides are generated during enzymatic processes that produce functional, bioactive protein hydrolysates or during the aging process of fermented products such as cheese, soybean protein, and wine. Understanding the underlying peptide sequences responsible for bitter taste can pave the way for more efficient identification of these peptides. This paper presents BitterPep-GCN, a feature-agnostic graph convolution network for bitter peptide prediction. The graph-based model learns the embedding of amino acids in the bitter peptide sequences and uses mixed pooling for bitter classification. BitterPep-GCN was benchmarked using BTP640, a publicly available bitter peptide dataset. The latent peptide embeddings generated by the trained model were used to analyze the activity of sequence motifs responsible for the bitter taste of the peptides. Particularly, we calculated the activity for individual amino acids and dipeptide, tripeptide, and tetrapeptide sequence motifs present in the peptides. Our analyses pinpoint specific amino acids, such as F, G, P, and R, as well as sequence motifs, notably tripeptide and tetrapeptide motifs containing FF, as key bitter signatures in peptides. This work not only provides a new predictor of bitter taste for a more efficient identification of bitter peptides in various food products but also gives a hint into the molecular basis of bitterness. SCIENTIFIC CONTRIBUTION Our work provides the first application of Graph Neural Networks for the prediction of peptide bitter taste. The best-developed model, BitterPep-GCN, learns the embedding of amino acids in the bitter peptide sequences and uses mixed pooling for bitter classification. The embeddings were used to analyze the sequence motifs responsible for the bitter taste.
Affiliation(s)
- Prashant Srivastava
- Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
| | - Alexandra Steuer
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
- Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
| | - Francesco Ferri
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
- Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
| | - Alessandro Nicoli
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
- Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
| | - Kristian Schultz
- Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
| | - Saptarshi Bej
- Indian Institute of Science Education and Research Thiruvananthapuram, Maruthamala P. O, Vithura, 695551, Kerala, India
| | - Antonella Di Pizio
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany.
- Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany.
| | - Olaf Wolkenhauer
- Institute of Computer Science, University of Rostock, 18051, Rostock, Germany.
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany.
| |
|
46
|
Lause J, Berens P, Kobak D. The art of seeing the elephant in the room: 2D embeddings of single-cell data do make sense. PLoS Comput Biol 2024; 20:e1012403. [PMID: 39356722 PMCID: PMC11446450 DOI: 10.1371/journal.pcbi.1012403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 08/09/2024] [Indexed: 10/04/2024] [Open Access]
Abstract
A recent paper claimed that t-SNE and UMAP embeddings of single-cell datasets are "specious" and fail to capture true biological structure. The authors argued that such embeddings are as arbitrary and as misleading as forcing the data into an elephant shape. Here we show that this conclusion was based on inadequate and limited metrics of embedding quality. More appropriate metrics quantifying neighborhood and class preservation reveal the elephant in the room: while t-SNE and UMAP embeddings of single-cell data do not preserve high-dimensional distances, they can nevertheless provide biologically relevant information.
Affiliation(s)
- Jan Lause
- Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
| | - Philipp Berens
- Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
| | - Dmitry Kobak
- Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
- IWR, Heidelberg University, Heidelberg, Germany
| |
|
47
|
Kitanovski S, Cao Y, Ttoouli D, Farahpour F, Wang J, Hoffmann D. scBubbletree: computational approach for visualization of single cell RNA-seq data. BMC Bioinformatics 2024; 25:302. [PMID: 39271980 PMCID: PMC11401305 DOI: 10.1186/s12859-024-05927-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Accepted: 09/09/2024] [Indexed: 09/15/2024] [Open Access]
Abstract
BACKGROUND Visualization approaches transform high-dimensional data from single cell RNA sequencing (scRNA-seq) experiments into two-dimensional plots that are used for analysis of cell relationships, and as a means of reporting biological insights. Yet, many standard approaches generate visuals that suffer from overplotting, lack of quantitative information, and distort global and local properties of biological patterns relative to the original high-dimensional space. RESULTS We present scBubbletree, a new, scalable method for visualization of scRNA-seq data. The method identifies clusters of cells of similar transcriptomes and visualizes such clusters as "bubbles" at the tips of dendrograms (bubble trees), corresponding to quantitative summaries of cluster properties and relationships. scBubbletree stacks bubble trees with further cluster-associated information in a visually easily accessible way, thus facilitating quantitative assessment and biological interpretation of scRNA-seq data. We demonstrate this with large scRNA-seq data sets, including one with over 1.2 million cells. CONCLUSIONS To facilitate coherent quantification and visualization of scRNA-seq data we developed the R-package scBubbletree, which is freely available as part of the Bioconductor repository at: https://bioconductor.org/packages/scBubbletree/.
Affiliation(s)
- Simo Kitanovski
- Bioinformatics and Computational Biophysics, Faculty of Biology and Centre for Medical Biotechnology (ZMB), University of Duisburg-Essen, 45141, Essen, Germany.
| | - Yingying Cao
- Bioinformatics and Computational Biophysics, Faculty of Biology and Centre for Medical Biotechnology (ZMB), University of Duisburg-Essen, 45141, Essen, Germany
| | - Dimitris Ttoouli
- Bioinformatics and Computational Biophysics, Faculty of Biology and Centre for Medical Biotechnology (ZMB), University of Duisburg-Essen, 45141, Essen, Germany
| | - Farnoush Farahpour
- Bioinformatics and Computational Biophysics, Faculty of Biology and Centre for Medical Biotechnology (ZMB), University of Duisburg-Essen, 45141, Essen, Germany
- Institute of Cell Biology (Cancer Research), University Hospital Essen, University of Duisburg-Essen, 45147, Essen, Germany
| | - Jun Wang
- Bioinformatics and Computational Biophysics, Faculty of Biology and Centre for Medical Biotechnology (ZMB), University of Duisburg-Essen, 45141, Essen, Germany
- National Clinical Research Centre for Infectious Diseases, The Third People's Hospital of Shenzhen and The Second Affiliated Hospital of Southern University of Science and Technology, Shenzhen, 518112, Guangdong Province, China
| | - Daniel Hoffmann
- Bioinformatics and Computational Biophysics, Faculty of Biology and Centre for Medical Biotechnology (ZMB), University of Duisburg-Essen, 45141, Essen, Germany.
| |
|
48
|
Protti G, Spreafico R. A primer on single-cell RNA-seq analysis using dendritic cells as a case study. FEBS Lett 2024. [PMID: 39245787 DOI: 10.1002/1873-3468.15009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 07/18/2024] [Accepted: 08/12/2024] [Indexed: 09/10/2024]
Abstract
Recent advances in single-cell (sc) transcriptomics have revolutionized our understanding of dendritic cells (DCs), pivotal players of the immune system. ScRNA-sequencing (scRNA-seq) has unraveled a previously unrecognized complexity and heterogeneity of DC subsets, shedding light on their ontogeny and specialized roles. However, navigating the rapid technological progress and computational methods can be daunting for researchers unfamiliar with the field. This review aims to provide immunologists with a comprehensive introduction to sc transcriptomic analysis, offering insights into recent developments in DC biology. Addressing common analytical queries, we guide readers through popular tools and methodologies, supplemented with references to benchmarks and tutorials for in-depth understanding. By examining findings from pioneering studies, we illustrate how computational techniques have expanded our knowledge of DC biology. Through this synthesis, we aim to equip researchers with the necessary tools and knowledge to navigate and leverage scRNA-seq for unraveling the intricacies of DC biology and advancing immunological research.
Affiliation(s)
- Giulia Protti
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, Italy
| | - Roberto Spreafico
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
| |
|
49
|
Xu Y, Zhu W, Su Y, Ma T, Zhang Y, Pan X, Huang R, Li Y, Zuo K, Ong SB, Xu D. Characterization of a novel mitophagy-related 5-genes signature for diagnosis of acute myocardial infarction. Vascul Pharmacol 2024; 156:107417. [PMID: 39159737 DOI: 10.1016/j.vph.2024.107417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 08/05/2024] [Accepted: 08/16/2024] [Indexed: 08/21/2024]
Abstract
Myocardial infarction (MI) and the ensuing heart failure (HF) remain the main causes of morbidity and mortality worldwide. One of the strategies to combat MI and HF lies in the ability to accurately predict the onset of these disorders. Alterations in mitochondrial homeostasis have been reported to be involved in the pathogenesis of various cardiovascular diseases (CVDs). In this regard, perturbations to mitochondrial dynamics leading to impaired clearance of dysfunctional mitochondria have been previously established to be a crucial trigger for MI/HF. In this study, we found that MI patients could be classified into three clusters based on the expression levels of mitophagy-related genes and consensus clustering. We identified a mitophagy-related diagnostic 5-gene signature for MI using support vector machine recursive feature elimination (SVM-RFE) and random forest, with the area under the ROC curve (AUC) value of the predictive model at 0.813. Additionally, the single-cell transcriptome and pseudo-time analyses showed that the mitoscore was significantly upregulated in macrophages, endothelial cells, pericytes, fibroblasts and monocytes in patients with ischemic cardiomyopathy, while sequestosome 1 (SQSTM1) exhibited a remarkable increase in the infarcted (ICM) and non-infarcted (ICMN) myocardium samples dissected from the left ventricle compared with control samples. Lastly, through analysis of peripheral blood from MI patients, we found that the expression of SQSTM1 is positively correlated with troponin-T (P < 0.0001, R = 0.4195, R² = 0.1759). Therefore, this study provides the rationale for a cell-specific mitophagy-related gene signature as an additional supporting diagnostic for CVDs.
Affiliation(s)
- Yanhua Xu
- Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; Department of Cardiology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai 200072, China
| | - Wenqing Zhu
- Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; Tongji University School of Medicine, Shanghai, China
| | - Yang Su
- Department of Cardiology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai 200072, China
| | - Teng Ma
- Department of Cardiology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai 200072, China
| | - Yaqi Zhang
- Department of Cardiology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai 200072, China
| | - Xin Pan
- Department of Cardiology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai 200072, China
| | - Rongrong Huang
- Department of Cardiology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai 200072, China
| | - Yuhao Li
- Department of Medicine and Therapeutics, Faculty of Medicine, Chinese University of Hong Kong (CUHK), Hong Kong, China; Centre for Cardiovascular Genomics and Medicine (CCGM), Lui Che Woo Institute of Innovative Medicine, Chinese University of Hong Kong (CUHK), Hong Kong, China
| | - Keqiang Zuo
- Department of Interventional and Vascular Surgery, Shanghai Tenth People's Hospital, Tongji University, No. 301 Middle Yan Chang Road, Shanghai 200072, China.
| | - Sang-Bing Ong
- Department of Medicine and Therapeutics, Faculty of Medicine, Chinese University of Hong Kong (CUHK), Hong Kong, China; Centre for Cardiovascular Genomics and Medicine (CCGM), Lui Che Woo Institute of Innovative Medicine, Chinese University of Hong Kong (CUHK), Hong Kong, China; Neural, Vascular, and Metabolic Biology Thematic Research Program, School of Biomedical Sciences (SBS), Chinese University of Hong Kong (CUHK), Hong Kong, China; Hong Kong Hub of Paediatric Excellence (HK HOPE), Hong Kong Children's Hospital (HKCH), Kowloon Bay, Hong Kong, China; Kunming Institute of Zoology - The Chinese University of Hong Kong (KIZ-CUHK) Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Hong Kong, China.
| | - Dachun Xu
- Department of Cardiology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai 200072, China.
| |
|
50
|
Adema K, Schon MA, Nodine MD, Kohlen W. Lost in space: what single-cell RNA sequencing cannot tell you. Trends Plant Sci 2024; 29:1018-1028. [PMID: 38570278 DOI: 10.1016/j.tplants.2024.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 02/21/2024] [Accepted: 03/11/2024] [Indexed: 04/05/2024]
Abstract
Plant scientists are rapidly integrating single-cell RNA sequencing (scRNA-seq) into their workflows. Maximizing the potential of scRNA-seq requires a proper understanding of the spatiotemporal context of cells. However, positional information is inherently lost during scRNA-seq, limiting its potential to characterize complex biological systems. In this review we highlight how current single-cell analysis pipelines cannot completely recover spatial information, which confounds biological interpretation. Various strategies exist to identify the location of RNA, from classical RNA in situ hybridization to spatial transcriptomics. Herein we discuss the possibility of utilizing this spatial information to supervise single-cell analyses. An integrative approach will maximize the potential of each technology, and lead to insights which go beyond the capability of each individual technology.
Affiliation(s)
- Kelvin Adema
- Laboratory of Cell and Developmental Biology, Cluster of Plant Developmental Biology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
| | - Michael A Schon
- Laboratory of Cell and Developmental Biology, Cluster of Plant Developmental Biology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands; Laboratory of Molecular Biology, Cluster of Plant Developmental Biology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
| | - Michael D Nodine
- Laboratory of Molecular Biology, Cluster of Plant Developmental Biology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
| | - Wouter Kohlen
- Laboratory of Cell and Developmental Biology, Cluster of Plant Developmental Biology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands; Laboratory of Molecular Biology, Cluster of Plant Developmental Biology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands.
| |
|