1
CZI Cell Science Program, Abdulla S, Aevermann B, Assis P, Badajoz S, Bell SM, Bezzi E, Cakir B, Chaffer J, Chambers S, Cherry J, Chi T, Chien J, Dorman L, Garcia-Nieto P, Gloria N, Hastie M, Hegeman D, Hilton J, Huang T, Infeld A, Istrate AM, Jelic I, Katsuya K, Kim YJ, Liang K, Lin M, Lombardo M, Marshall B, Martin B, McDade F, Megill C, Patel N, Predeus A, Raymor B, Robatmili B, Rogers D, Rutherford E, Sadgat D, Shin A, Small C, Smith T, Sridharan P, Tarashansky A, Tavares N, Thomas H, Tolopko A, Urisko M, Yan J, Yeretssian G, Zamanian J, Mani A, Cool J, Carr A. CZ CELLxGENE Discover: a single-cell data platform for scalable exploration, analysis and modeling of aggregated data. Nucleic Acids Res 2025; 53:D886-D900. [PMID: 39607691 PMCID: PMC11701654 DOI: 10.1093/nar/gkae1142] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 10/05/2024] [Revised: 10/28/2024] [Accepted: 11/01/2024] [Indexed: 11/29/2024] Open
Abstract
Hundreds of millions of single cells have been analyzed using high-throughput transcriptomic methods. The cumulative knowledge within these datasets provides an exciting opportunity for unlocking insights into health and disease at the level of single cells. Meta-analyses that span diverse datasets building on recent advances in large language models and other machine-learning approaches pose exciting new directions to model and extract insight from single-cell data. Despite the promise of these and emerging analytical tools for analyzing large amounts of data, the sheer number of datasets, data models and accessibility remains a challenge. Here, we present CZ CELLxGENE Discover (cellxgene.cziscience.com), a data platform that provides curated and interoperable single-cell data. Available via a free-to-use online data portal, CZ CELLxGENE hosts a growing corpus of community-contributed data of over 93 million unique cells. Curated, standardized and associated with consistent cell-level metadata, this collection of single-cell transcriptomic data is the largest of its kind and growing rapidly via community contributions. A suite of tools and features enables accessibility and reusability of the data via both computational and visual interfaces to allow researchers to explore individual datasets, perform cross-corpus analysis, and run meta-analyses of tens of millions of cells across studies and tissues at the resolution of single cells.
Affiliation(s)
- Shibla Abdulla
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Brian Aevermann
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Pedro Assis
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Seve Badajoz
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Sidney M Bell
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Emanuele Bezzi
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Batuhan Cakir
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Jim Chaffer
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Signe Chambers
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- J Michael Cherry
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Tiffany Chi
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Jennifer Chien
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Leah Dorman
  - Chan Zuckerberg, Biohub, SF, 499 Illinois St, San Francisco, CA 94158, USA
- Pablo Garcia-Nieto
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Nayib Gloria
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Mim Hastie
  - Clever Canary, 850 Front St. #1491, Santa Cruz, CA, USA
- Daniel Hegeman
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Jason Hilton
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Timmy Huang
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Amanda Infeld
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Ana-Maria Istrate
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Ivana Jelic
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Kuni Katsuya
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Yang Joon Kim
  - Chan Zuckerberg, Biohub, SF, 499 Illinois St, San Francisco, CA 94158, USA
- Karen Liang
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Mike Lin
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Bailey Marshall
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Bruce Martin
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Fran McDade
  - Clever Canary, 850 Front St. #1491, Santa Cruz, CA, USA
- Colin Megill
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Nikhil Patel
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Alexander Predeus
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Brian Raymor
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Behnam Robatmili
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Dave Rogers
  - Clever Canary, 850 Front St. #1491, Santa Cruz, CA, USA
- Erica Rutherford
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Dana Sadgat
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Andrew Shin
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Corinn Small
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Trent Smith
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Prathap Sridharan
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Norbert Tavares
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Harley Thomas
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Andrew Tolopko
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Meghan Urisko
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Joyce Yan
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Garabet Yeretssian
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Jennifer Zamanian
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Arathi Mani
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Jonah Cool
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Ambrose Carr
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
2
Sarfraz I, Wang Y, Shastry A, Teh WK, Sokolov A, Herb BR, Creasy HH, Virshup I, Dries R, Degatano K, Mahurkar A, Schnell DJ, Madrigal P, Hilton J, Gehlenborg N, Tickle T, Campbell JD. MAMS: matrix and analysis metadata standards to facilitate harmonization and reproducibility of single-cell data. Genome Biol 2024; 25:205. [PMID: 39090672 PMCID: PMC11292877 DOI: 10.1186/s13059-024-03349-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 07/24/2024] [Indexed: 08/04/2024] Open
Abstract
Many datasets are being produced by consortia that seek to characterize healthy and disease tissues at single-cell resolution. While biospecimen and experimental information is often captured, detailed metadata standards related to data matrices and analysis workflows are currently lacking. To address this, we develop the matrix and analysis metadata standards (MAMS) to serve as a resource for data centers, repositories, and tool developers. We define metadata fields for matrices and parameters commonly utilized in analytical workflows and developed the rmams package to extract MAMS from single-cell objects. Overall, MAMS promotes the harmonization, integration, and reproducibility of single-cell data across platforms.
Affiliation(s)
- Irzam Sarfraz
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Yichen Wang
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Amulya Shastry
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Wei Kheng Teh
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK
- Artem Sokolov
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- Brian R Herb
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Heather H Creasy
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Isaac Virshup
  - Department of Computational Health, Helmholtz Munich, Oberschleißheim, Germany
- Ruben Dries
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Kylee Degatano
  - Data Sciences Platform, Broad Institute, Cambridge, MA, USA
- Anup Mahurkar
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Daniel J Schnell
  - Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Pedro Madrigal
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK
- Jason Hilton
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Nils Gehlenborg
  - Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Timothy Tickle
  - Data Sciences Platform, Broad Institute, Cambridge, MA, USA
- Joshua D Campbell
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
3
Zhang Y, Sun H, Zhang W, Fu T, Huang S, Mou M, Zhang J, Gao J, Ge Y, Yang Q, Zhu F. CellSTAR: a comprehensive resource for single-cell transcriptomic annotation. Nucleic Acids Res 2024; 52:D859-D870. [PMID: 37855686 PMCID: PMC10767908 DOI: 10.1093/nar/gkad874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/12/2023] [Accepted: 09/27/2023] [Indexed: 10/20/2023] Open
Abstract
Large-scale studies of single-cell sequencing and biological experiments have successfully revealed expression patterns that distinguish different cell types in tissues, emphasizing the importance of studying cellular heterogeneity and accurately annotating cell types. Analysis of gene expression profiles in these experiments provides two essential types of data for cell type annotation: annotated references and canonical markers. In this study, the first comprehensive database of single-cell transcriptomic annotation resource (CellSTAR) was thus developed. It is unique in (a) offering the comprehensive expertly annotated reference data for annotating hundreds of cell types for the first time and (b) enabling the collective consideration of reference data and marker genes by incorporating tens of thousands of markers. Given its unique features, CellSTAR is expected to attract broad research interests from the technological innovations in single-cell transcriptomics, the studies of cellular heterogeneity & dynamics, and so on. It is now publicly accessible without any login requirement at: https://idrblab.org/cellstar.
Affiliation(s)
- Ying Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Huaicheng Sun
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Wei Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Tingting Fu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Shijie Huang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Minjie Mou
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jinsong Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jianqing Gao
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yichao Ge
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Qingxia Yang
  - Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
  - Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
- Feng Zhu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
4
Ma WF, Turner AW, Gancayco C, Wong D, Song Y, Mosquera JV, Auguste G, Hodonsky CJ, Prabhakar A, Ekiz HA, van der Laan SW, Miller CL. PlaqView 2.0: A comprehensive web portal for cardiovascular single-cell genomics. Front Cardiovasc Med 2022; 9:969421. [PMID: 36003902 PMCID: PMC9393487 DOI: 10.3389/fcvm.2022.969421] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/15/2022] [Accepted: 07/21/2022] [Indexed: 11/13/2022] Open
Abstract
Single-cell RNA-seq (scRNA-seq) is a powerful genomics technology to interrogate the cellular composition and behaviors of complex systems. While the number of scRNA-seq datasets and available computational analysis tools have grown exponentially, there are limited systematic data sharing strategies to allow rapid exploration and re-analysis of single-cell datasets, particularly in the cardiovascular field. We previously introduced PlaqView, an open-source web portal for the exploration and analysis of published atherosclerosis single-cell datasets. Now, we introduce PlaqView 2.0 (www.plaqview.com), which provides expanded features and functionalities as well as additional cardiovascular single-cell datasets. We showcase improved PlaqView functionality, backend data processing, user-interface, and capacity. PlaqView brings new or improved tools to explore scRNA-seq data, including gene query, metadata browser, cell identity prediction, ad hoc RNA-trajectory analysis, and drug-gene interaction prediction. PlaqView serves as one of the largest central repositories for cardiovascular single-cell datasets, which now includes data from human aortic aneurysm, gene-specific mouse knockouts, and healthy references. PlaqView 2.0 brings advanced tools and high-performance computing directly to users without the need for any programming knowledge. Lastly, we outline steps to generalize and repurpose PlaqView's framework for single-cell datasets from other fields.
Affiliation(s)
- Wei Feng Ma
  - Medical Scientist Training Program, University of Virginia, Charlottesville, VA, United States
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
- Adam W. Turner
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
- Christina Gancayco
  - Research Computing, University of Virginia, Charlottesville, VA, United States
- Doris Wong
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, United States
- Yipei Song
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
  - Department of Computer Engineering, University of Virginia, Charlottesville, VA, United States
- Jose Verdezoto Mosquera
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
  - Research Computing, University of Virginia, Charlottesville, VA, United States
- Gaëlle Auguste
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
- Chani J. Hodonsky
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
- Ajay Prabhakar
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
- H. Atakan Ekiz
  - Department of Molecular Biology and Genetics, Izmir Institute of Technology, Gülbahçe, Turkey
- Sander W. van der Laan
  - Central Diagnostics Laboratory, Division Laboratories, Pharmacy, and Biomedical Genetics, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Clint L. Miller
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, United States
  - Department of Public Health Sciences, University of Virginia, Charlottesville, VA, United States
5
Ascensión AM, Araúzo-Bravo MJ, Izeta A. The need to reassess single-cell RNA sequencing datasets: the importance of biological sample processing. F1000Res 2021; 10:767. [PMID: 35399227 PMCID: PMC8984215 DOI: 10.12688/f1000research.54864.2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Accepted: 03/04/2022] [Indexed: 12/15/2022] Open
Abstract
Background: The advent of single-cell RNA sequencing (scRNAseq) and additional single-cell omics technologies have provided scientists with unprecedented tools to explore biology at cellular resolution. However, reaching an appropriate number of good quality reads per cell and reasonable numbers of cells within each of the populations of interest are key to infer relevant conclusions about the underlying biology of the dataset. For these reasons, scRNAseq studies are constantly increasing the number of cells analysed and the granularity of the resultant transcriptomics analyses. Methods: We aimed to identify previously described fibroblast subpopulations in healthy adult human skin by using the largest dataset published to date (528,253 sequenced cells) and an unsupervised population-matching algorithm. Results: Our reanalysis of this landmark resource demonstrates that a substantial proportion of cell transcriptomic signatures may be biased by cellular stress and response to hypoxic conditions. Conclusions: We postulate that careful design of experimental conditions is needed to avoid long processing times of biological samples. Additionally, computation of large datasets might undermine the extent of the analysis, possibly due to long processing times.
Affiliation(s)
- Alex M. Ascensión
  - Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
  - Tissue Engineering Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
- Marcos J. Araúzo-Bravo
  - Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
  - Computational Biomedicine Data Analysis Platform, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
  - IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
  - CIBER of Frailty and Healthy Aging (CIBERfes), Madrid, Spain
  - Computational Biology and Bioinformatics Group, Max Planck Institute for Molecular Biomedicine, Münster, Germany
- Ander Izeta
  - Tissue Engineering Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
  - Department of Biomedical Engineering and Science, Tecnun-University of Navarra, School of Engineering, San Sebastian, Gipuzkoa, 20009, Spain
6
Ascensión AM, Araúzo-Bravo MJ, Izeta A. The need to reassess single-cell RNA sequencing datasets: more is not always better. F1000Res 2021; 10:767. [PMID: 35399227 PMCID: PMC8984215 DOI: 10.12688/f1000research.54864.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Accepted: 07/21/2021] [Indexed: 08/27/2023] Open
Abstract
Background: The advent of single-cell RNA sequencing (scRNAseq) and additional single-cell omics technologies have provided scientists with unprecedented tools to explore biology at cellular resolution. However, reaching an appropriate number of good quality reads per cell and reasonable numbers of cells within each of the populations of interest are key to infer conclusions from otherwise limited analyses. For these reasons, scRNAseq studies are constantly increasing the number of cells analysed and the granularity of the resultant transcriptomics analyses. Methods: We aimed to identify previously described fibroblast subpopulations in healthy adult human skin by using the largest dataset published to date (528,253 sequenced cells) and an unsupervised population-matching algorithm. Results: Our reanalysis of this landmark resource demonstrates that a substantial proportion of cell transcriptomic signatures may be biased by cellular stress and response to hypoxic conditions. Conclusions: We postulate that the "more is better" approach, currently prevalent in the scientific community, might undermine the extent of the analysis, possibly due to long computational processing times inherent to large datasets.
Affiliation(s)
- Alex M. Ascensión
  - Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
  - Tissue Engineering Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
- Marcos J. Araúzo-Bravo
  - Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
  - Computational Biomedicine Data Analysis Platform, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
  - IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
  - CIBER of Frailty and Healthy Aging (CIBERfes), Madrid, Spain
  - Computational Biology and Bioinformatics Group, Max Planck Institute for Molecular Biomedicine, Münster, Germany
- Ander Izeta
  - Tissue Engineering Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain
  - Department of Biomedical Engineering and Science, Tecnun-University of Navarra, School of Engineering, San Sebastian, Gipuzkoa, 20009, Spain