1
Sun Y, Kong L, Huang J, Deng H, Bian X, Li X, Cui F, Dou L, Cao C, Zou Q, Zhang Z. A comprehensive survey of dimensionality reduction and clustering methods for single-cell and spatial transcriptomics data. Brief Funct Genomics 2024:elae023. [PMID: 38860675 DOI: 10.1093/bfgp/elae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 02/29/2024] [Accepted: 05/27/2024] [Indexed: 06/12/2024]
Abstract
In recent years, the application of single-cell transcriptomics and spatial transcriptomics analysis techniques has become increasingly widespread. Whether dealing with single-cell transcriptomic or spatial transcriptomic data, dimensionality reduction and clustering are indispensable. Both single-cell and spatial transcriptomic data are often high-dimensional, making the analysis and visualization of such data challenging. Through dimensionality reduction, it becomes possible to visualize the data in a lower-dimensional space, allowing for the observation of relationships and differences between cell subpopulations. Clustering enables the grouping of similar cells into the same cluster, aiding in the identification of distinct cell subpopulations and revealing cellular diversity, providing guidance for downstream analyses. In this review, we systematically summarized the most widely recognized algorithms employed for the dimensionality reduction and clustering analysis of single-cell transcriptomic and spatial transcriptomic data. This endeavor provides valuable insights and ideas that can contribute to the development of novel tools in this rapidly evolving field.
Affiliation(s)
- Yidi Sun
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Lingling Kong
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Jiayi Huang
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Hongyan Deng
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Xinling Bian
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Xingfeng Li
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Feifei Cui
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Lijun Dou
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland, OH 44106, United States
- Chen Cao
  - School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 210029, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Zilong Zhang
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
2
Mori T, Takase T, Lan KC, Yamane J, Alev C, Kimura A, Osafune K, Yamashita JK, Akutsu T, Kitano H, Fujibuchi W. eSPRESSO: topological clustering of single-cell transcriptomics data to reveal informative genes for spatio-temporal architectures of cells. BMC Bioinformatics 2023; 24:252. [PMID: 37322439 DOI: 10.1186/s12859-023-05355-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 05/25/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND: Bioinformatics capability to analyze spatio-temporal dynamics of gene expression is essential in understanding animal development. Animal cells are spatially organized as functional tissues where cellular gene expression data contain information that governs morphogenesis during the developmental process. Although several computational tissue reconstruction methods using transcriptomics data have been proposed, those methods have been ineffective in arranging cells in their correct positions in tissues or organs unless spatial information is explicitly provided. RESULTS: This study demonstrates that stochastic self-organizing map clustering, with Markov chain Monte Carlo calculations for optimizing informative genes, effectively reconstructs any spatio-temporal topology of cells from their transcriptome profiles with only a coarse topological guideline. The method, eSPRESSO (enhanced SPatial REconstruction by Stochastic Self-Organizing Map), provides a powerful in silico spatio-temporal tissue reconstruction capability, as confirmed by using human embryonic heart and mouse embryo, brain, embryonic heart, and liver lobule with generally high reproducibility (average max. accuracy = 92.0%), while revealing topologically informative genes, or spatial discriminator genes. Furthermore, eSPRESSO was used for temporal analysis of human pancreatic organoids to infer rational developmental trajectories with several candidate 'temporal' discriminator genes responsible for various cell type differentiations. CONCLUSIONS: eSPRESSO provides a novel strategy for analyzing mechanisms underlying the spatio-temporal formation of cellular organizations.
Affiliation(s)
- Tomoya Mori
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, 611-0011, Japan
- Toshiro Takase
  - Life Sciences, IBM Consulting, IBM Japan Ltd., 19-21 Nihonbashi Hakozaki-cho, Chuo-ku, Tokyo, 103-8510, Japan
- Kuan-Chun Lan
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Sho-goin, Sakyo-ku, Kyoto, 606-8507, Japan
- Junko Yamane
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Sho-goin, Sakyo-ku, Kyoto, 606-8507, Japan
- Cantas Alev
  - Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto, 606-8501, Japan
- Azuma Kimura
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Sho-goin, Sakyo-ku, Kyoto, 606-8507, Japan
- Kenji Osafune
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Sho-goin, Sakyo-ku, Kyoto, 606-8507, Japan
- Jun K Yamashita
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Sho-goin, Sakyo-ku, Kyoto, 606-8507, Japan
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, 611-0011, Japan
- Hiroaki Kitano
  - The Systems Biology Institute, Tokyo, Japan
  - Okinawa Institute of Science and Technology Graduate School, Okinawa, Japan
  - Sony Computer Science Laboratories, Inc., Tokyo, Japan
  - Sony AI, Inc., Tokyo, Japan
  - The Alan Turing Institute, London, UK
- Wataru Fujibuchi
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Sho-goin, Sakyo-ku, Kyoto, 606-8507, Japan.
3
Mao Y, Pichaud F. For Special Issue: Tissue size and shape. Semin Cell Dev Biol 2022; 130:1-2. [PMID: 35659474 DOI: 10.1016/j.semcdb.2022.05.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Yanlan Mao
  - Laboratory for Molecular Cell Biology, University College London, Gower Street, London WC1E 6BT, UK; Institute for the Physics of Living Systems, University College London, Gower Street, London WC1E 6BT, UK
- Franck Pichaud
  - Laboratory for Molecular Cell Biology, University College London, Gower Street, London WC1E 6BT, UK; Institute for the Physics of Living Systems, University College London, Gower Street, London WC1E 6BT, UK
4
Gharaei N, Ismail W, Grosan C, Hendradi R. Optimizing the setting of medical interactive rehabilitation assistant platform to improve the performance of the patients: A case study. Artif Intell Med 2021; 120:102151. [PMID: 34629147 DOI: 10.1016/j.artmed.2021.102151] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2020] [Revised: 07/19/2021] [Accepted: 08/16/2021] [Indexed: 11/24/2022]
Abstract
Tele-rehabilitation is an alternative to the conventional rehabilitation service that helps patients in remote areas to access a service that is practical in terms of logistics and cost, in a controlled environment. It includes the usage of mobile phones or other wireless devices that are applied to rehabilitation exercises. Such applications or software include exercises in the form of virtual games, treatment monitoring based on the rehabilitation progress and data analysis. However, nowadays, physiotherapists use a default profiling setting for patients carrying out rehabilitation, due to lack of information. Medical Interactive Rehabilitation Assistant (MIRA) is a computer-based (virtual reality) rehabilitation platform. The profile setting includes: a level of difficulty, percentage of tolerance and maximum range. To the best of our knowledge, there is a lack of optimization in the parameter values setting of MIRA exergames that could enhance patients' performance. Generally, non-optimal profile setting leads to reduced effectiveness. Therefore, this study aims to develop a method that optimizes the profile setting of each patient according to the estimated (desired) optimal results. The proposed method is developed using unsupervised and supervised machine learning techniques. We use Self-Organizing Map (SOM) to cluster patient records into several distinct clusters. K-fold cross validation is applied to construct the prediction models. Classification And Regression Tree (CART) is utilized to predict the patient's optimal input setting for playing the MIRA games. The combination of these techniques seems to improve the efficiency of the standard (default) way in predicting the optimal settings for exergames. To evaluate the proposed method, we conduct an experiment with data collected from a rehabilitation center. We use three metrics to quantify the quality of the results: R-squared (R2), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). 
The results of experimental analysis demonstrate that the proposed method is effective in predicting the adequate parameter setting in MIRA platform. The method has potential to be implemented as an intelligent system for MIRA prediction in healthcare. Moreover, the method could be extended to similar platforms for which data is available to train our method on.
Affiliation(s)
- Niayesh Gharaei
  - Faculty of Science and Technology, Universiti Sains Islam Malaysia, Nilai, Negeri Sembilan, Malaysia.
- Waidah Ismail
  - Faculty of Science and Technology, Universiti Sains Islam Malaysia, Nilai, Negeri Sembilan, Malaysia; Information System Study Program, Faculty Science and Technology, Universitas Airlangga, Indonesia Kampus C, Surabaya, Indonesia.
- Crina Grosan
  - Department of Computer Science, Brunel University London, United Kingdom.
- Rimuljo Hendradi
  - Information System Study Program, Faculty Science and Technology, Universitas Airlangga, Indonesia Kampus C, Surabaya, Indonesia.
5
Panina Y, Karagiannis P, Kurtz A, Stacey GN, Fujibuchi W. Human Cell Atlas and cell-type authentication for regenerative medicine. Exp Mol Med 2020; 52:1443-1451. [PMID: 32929224 PMCID: PMC8080834 DOI: 10.1038/s12276-020-0421-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/15/2019] [Revised: 03/05/2020] [Accepted: 03/09/2020] [Indexed: 12/22/2022]
Abstract
In modern biology, the correct identification of cell types is required for the developmental study of tissues and organs and the production of functional cells for cell therapies and disease modeling. For decades, cell types have been defined on the basis of morphological and physiological markers and, more recently, immunological markers and molecular properties. Recent advances in single-cell RNA sequencing have opened new doors for the characterization of cells at the individual and spatiotemporal levels on the basis of their RNA profiles, vastly transforming our understanding of cell types. The objective of this review is to survey the current progress in the field of cell-type identification, starting with the Human Cell Atlas project, which aims to sequence every cell in the human body, to molecular marker databases for individual cell types and other sources that address cell-type identification for regenerative medicine based on cell data guidelines.
Affiliation(s)
- Yulia Panina
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto, 606-8507, Japan
- Peter Karagiannis
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto, 606-8507, Japan
- Andreas Kurtz
  - BIH Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353, Berlin, Germany
- Glyn N Stacey
  - International Stem Cell Banking Initiative, 2 High Street, Barley, Herts, SG88HZ, UK
  - National Stem Cell Resource Centre, Institute of Zoology, Chinese Academy of Sciences, 100190, Beijing, China
  - Innovation Academy for Stem Cell and Regeneration, Chinese Academy of Sciences, 100101, Beijing, China
- Wataru Fujibuchi
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto, 606-8507, Japan.
6
Affiliation(s)
- Patrick P L Tam
  - Embryology Unit, Children's Medical Research Institute, University of Sydney and School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Westmead, NSW 2145, Australia.