1
Mocking TR, van de Loosdrecht AA, Cloos J, Bachas C. Applications of machine learning for immunophenotypic measurable residual disease assessment in acute myeloid leukemia. Hemasphere 2025; 9:e70138. [PMID: 40400510 PMCID: PMC12093103 DOI: 10.1002/hem3.70138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2025] [Revised: 03/16/2025] [Accepted: 04/08/2025] [Indexed: 05/23/2025]
Abstract
Immunophenotypic detection and quantification of residual leukemic cells by multiparameter flow cytometry is increasingly adopted in the clinical practice of acute myeloid leukemia (AML) to assess measurable residual disease (MRD). However, MRD levels quantified by manual gating analysis can differ based on differences in gating strategy between trained operators and clinical centers. Manual gating requires extensive training, is time-consuming in daily practice, and faces a significant hurdle in analyzing data from next-generation cytometry platforms. To address these challenges, several computational approaches involving machine learning and artificial intelligence algorithms have been proposed to automate or aid the assessment of MRD. However, the immunophenotypic variability between patients and the relatively low proportions of residual leukemic cells in AML challenge most algorithms and require innovative approaches. This review provides an overview of recent efforts in using computational methods for immunophenotypic AML-MRD assessment. We first explain the technical and conceptual background of the different algorithms that have been explored. Next, we discuss their strengths and limitations in the disease-specific context of AML. Finally, we highlight how computational approaches offer a unique opportunity to standardize or even outperform current manual gating analyses, and ultimately, improve the treatment of AML patients.
Affiliation(s)
- Tim R. Mocking
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Arjan A. van de Loosdrecht
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Jacqueline Cloos
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Costa Bachas
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
2
Wei J, Zhang B, Wang Q, Zhou T, Tian T, Chen L. Diffusive topology preserving manifold distances for single-cell data analysis. Proc Natl Acad Sci U S A 2025; 122:e2404860121. [PMID: 39854240 PMCID: PMC11789025 DOI: 10.1073/pnas.2404860121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 11/25/2024] [Indexed: 01/26/2025]
Abstract
Manifold learning techniques have emerged as crucial tools for uncovering latent patterns in high-dimensional single-cell data. However, most existing dimensionality reduction methods primarily rely on 2D visualization, which can distort true data relationships and fail to extract reliable biological information. Here, we present DTNE (diffusive topology neighbor embedding), a dimensionality reduction framework that faithfully approximates manifold distance to enhance cellular relationships and dynamics. DTNE constructs a manifold distance matrix using a modified personalized PageRank algorithm, thereby preserving topological structure while enabling diverse single-cell analyses. This approach facilitates distribution-based cellular relationship analysis, pseudotime inference, and clustering within a unified framework. Extensive benchmarking against mainstream algorithms on diverse datasets demonstrates DTNE's superior performance in maintaining geodesic distances and revealing significant biological patterns. Our results establish DTNE as a powerful tool for high-dimensional data analysis in uncovering meaningful biological insights.
Affiliation(s)
- Jiangyong Wei
- Guangdong Institute of Intelligence Science and Technology, Hengqin, Zhuhai, Guangdong 519031, China
- Bin Zhang
- Guangdong Institute of Intelligence Science and Technology, Hengqin, Zhuhai, Guangdong 519031, China
- Qiu Wang
- Guangdong Institute of Intelligence Science and Technology, Hengqin, Zhuhai, Guangdong 519031, China
- Tianshou Zhou
- School of Mathematics and Statistics, Sun Yat-sen University, Guangzhou 510275, China
- Tianhai Tian
- School of Mathematics, Monash University, Melbourne, VIC 3800, Australia
- Luonan Chen
- Guangdong Institute of Intelligence Science and Technology, Hengqin, Zhuhai, Guangdong 519031, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
3
Mocking TR, Kelder A, Reuvekamp T, Ngai LL, Rutten P, Gradowska P, van de Loosdrecht AA, Cloos J, Bachas C. Computational assessment of measurable residual disease in acute myeloid leukemia using mixture models. Communications Medicine 2024; 4:271. [PMID: 39702555 DOI: 10.1038/s43856-024-00700-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2024] [Accepted: 12/05/2024] [Indexed: 12/21/2024]
Abstract
BACKGROUND The proportion of residual leukemic blasts after chemotherapy, assessed by multiparameter flow cytometry, is an important prognostic factor for the risk of relapse and overall survival in acute myeloid leukemia (AML). This measurable residual disease (MRD) is used in clinical trials to stratify patients for more or less intensive consolidation therapy. However, an objective and reproducible analysis method to assess MRD status from flow cytometry data is lacking, yet is highly anticipated for broader implementation of MRD testing. METHODS We propose a computational pipeline based on Gaussian mixture modeling that allows a fully automated assessment of MRD status while remaining completely interpretable for clinical diagnostic experts. Our pipeline requires limited training data, which makes it easily transferable to other medical centers and cytometry platforms. RESULTS We identify all healthy and leukemic immature myeloid cells with high concordance (Spearman's rho = 0.974) and classification performance (median F-score = 0.861) compared to manual analysis. Using control samples (n = 18), we calculate a computational MRD percentage with high concordance to expert gating (Spearman's rho = 0.823) and predict MRD status in a cohort of 35 AML follow-up measurements with high accuracy (97%). CONCLUSIONS We demonstrate that our pipeline provides a powerful tool for fast (~3 s) and objective automated MRD assessment in AML.
Affiliation(s)
- Tim R Mocking
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Angèle Kelder
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Tom Reuvekamp
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Department of Hematology, Amsterdam UMC, Universiteit van Amsterdam, Amsterdam, The Netherlands
- Lok Lam Ngai
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Philip Rutten
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Department of Epidemiology and Data Science, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Patrycja Gradowska
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- HOVON Foundation, Rotterdam, The Netherlands
- Arjan A van de Loosdrecht
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Jacqueline Cloos
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
- Costa Bachas
- Department of Hematology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Cancer Center Amsterdam, Imaging and Biomarkers, Amsterdam, The Netherlands
4
Chrysinas P, Venkatesan S, Ang I, Ghosh V, Chen C, Neelamegham S, Gunawan R. Cell- and tissue-specific glycosylation pathways informed by single-cell transcriptomics. NAR Genom Bioinform 2024; 6:lqae169. [PMID: 39703423 PMCID: PMC11655298 DOI: 10.1093/nargab/lqae169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Revised: 11/06/2024] [Accepted: 11/21/2024] [Indexed: 12/21/2024]
Abstract
While single-cell studies have made significant impacts in various subfields of biology, they lag in the Glycosciences. To address this gap, we analyzed single-cell glycogene expressions in the Tabula Sapiens dataset of human tissues and cell types using a recent glycosylation-specific gene ontology (GlycoEnzOnto). At the median sequencing (count) depth, ∼40-50 out of 400 glycogenes were detected in individual cells. Upon increasing the sequencing depth, the number of detectable glycogenes saturates at ∼200 glycogenes, suggesting that the average human cell expresses about half of the glycogene repertoire. Hierarchies in glycogene and glycopathway expressions emerged from our analysis: nucleotide-sugar synthesis and transport exhibited the highest gene expressions, followed by genes for core enzymes, glycan modification and extensions, and finally terminal modifications. Interestingly, the same cell types showed variable glycopathway expressions based on their organ or tissue origin, suggesting nuanced cell- and tissue-specific glycosylation patterns. Probing deeper into the transcription factors (TFs) of glycogenes, we identified distinct groupings of TFs controlling different aspects of glycosylation: core biosynthesis, terminal modifications, etc. We present webtools to explore the interconnections across glycogenes, glycopathways and TFs regulating glycosylation in human cell/tissue types. Overall, the study presents an overview of glycosylation across multiple human organ systems.
Affiliation(s)
- Panagiotis Chrysinas
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
- Shriramprasad Venkatesan
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
- Isaac Ang
- Department of Computer Science, University of Illinois Urbana-Champaign, 201 North Goodwin Avenue, Urbana, IL 61801, USA
- Vishnu Ghosh
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
- Changyou Chen
- Department of Computer Science and Engineering, University at Buffalo-SUNY, 338 Davis Hall, Buffalo, NY 14260, USA
- Sriram Neelamegham
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
- Rudiyanto Gunawan
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, 308 Furnas Hall, Buffalo, NY 14260, USA
5
Lause J, Berens P, Kobak D. The art of seeing the elephant in the room: 2D embeddings of single-cell data do make sense. PLoS Comput Biol 2024; 20:e1012403. [PMID: 39356722 PMCID: PMC11446450 DOI: 10.1371/journal.pcbi.1012403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 08/09/2024] [Indexed: 10/04/2024]
Abstract
A recent paper claimed that t-SNE and UMAP embeddings of single-cell datasets are "specious" and fail to capture true biological structure. The authors argued that such embeddings are as arbitrary and as misleading as forcing the data into an elephant shape. Here we show that this conclusion was based on inadequate and limited metrics of embedding quality. More appropriate metrics quantifying neighborhood and class preservation reveal the elephant in the room: while t-SNE and UMAP embeddings of single-cell data do not preserve high-dimensional distances, they can nevertheless provide biologically relevant information.
Affiliation(s)
- Jan Lause
- Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
- Philipp Berens
- Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
- Dmitry Kobak
- Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
- IWR, Heidelberg University, Heidelberg, Germany
6
Lause J, Kobak D, Berens P. The art of seeing the elephant in the room: 2D embeddings of single-cell data do make sense. bioRxiv 2024:2024.03.26.586728. [PMID: 38585748 PMCID: PMC10996625 DOI: 10.1101/2024.03.26.586728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
A recent paper in PLOS Computational Biology (Chari and Pachter, 2023) claimed that t-SNE and UMAP embeddings of single-cell datasets fail to capture true biological structure. The authors argued that such embeddings are as arbitrary and as misleading as forcing the data into an elephant shape. Here we show that this conclusion was based on inadequate and limited metrics of embedding quality. More appropriate metrics quantifying neighborhood and class preservation reveal the elephant in the room: while t-SNE and UMAP embeddings of single-cell data do not preserve high-dimensional distances, they can nevertheless provide biologically relevant information.
7
Chrysinas P, Venkatesan S, Ang I, Ghosh V, Chen C, Neelamegham S, Gunawan R. Cell and tissue-specific glycosylation pathways informed by single-cell transcriptomics. bioRxiv 2024:2023.09.26.559616. [PMID: 38260527 PMCID: PMC10802235 DOI: 10.1101/2023.09.26.559616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
While single-cell studies have made significant impacts in various subfields of biology, they lag in the Glycosciences. To address this gap, we analyzed single-cell glycogene expressions in the Tabula Sapiens dataset of human tissues and cell types using a recent glycosylation-specific gene ontology (GlycoEnzOnto). At the median sequencing (count) depth, ~40-50 out of 400 glycogenes were detected in individual cells. Upon increasing the sequencing depth, the number of detectable glycogenes saturates at ~200 glycogenes, suggesting that the average human cell expresses about half of the glycogene repertoire. Hierarchies in glycogene and glycopathway expressions emerged from our analysis: nucleotide-sugar synthesis and transport exhibited the highest gene expressions, followed by genes for core enzymes, glycan modification and extensions, and finally terminal modifications. Interestingly, the same cell types showed variable glycopathway expressions based on their organ or tissue origin, suggesting nuanced cell- and tissue-specific glycosylation patterns. Probing deeper into the transcription factors (TFs) of glycogenes, we identified distinct groupings of TFs controlling different aspects of glycosylation: core biosynthesis, terminal modifications, etc. We present webtools to explore the interconnections across glycogenes, glycopathways, and TFs regulating glycosylation in human cell/tissue types. Overall, the study presents an overview of glycosylation across multiple human organ systems.
Affiliation(s)
- Panagiotis Chrysinas
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, Buffalo, NY, 14260, USA
- Shriramprasad Venkatesan
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, Buffalo, NY, 14260, USA
- Isaac Ang
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Vishnu Ghosh
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, Buffalo, NY, 14260, USA
- Changyou Chen
- Department of Computer Science and Engineering, University at Buffalo-SUNY, Buffalo, NY, 14260, USA
- Sriram Neelamegham
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, Buffalo, NY, 14260, USA
- Rudiyanto Gunawan
- Department of Chemical and Biological Engineering, University at Buffalo-SUNY, Buffalo, NY, 14260, USA