1
Jimenez L, Campos Codo A, Sampaio VDS, Oliveira AER, Ferreira LKK, Davanzo GG, de Brito Monteiro L, Victor Virgilio-da-Silva J, Borba MGS, Fabiano de Souza G, Zini N, de Andrade Gandolfi F, Muraro SP, Luiz Proença-Modena J, Val FA, Cardoso Melo G, Monteiro WM, Nogueira ML, Lacerda MVG, Moraes-Vieira PM, Nakaya HI. Acid pH Increases SARS-CoV-2 Infection and the Risk of Death by COVID-19. Front Med (Lausanne) 2021; 8:637885. [PMID: 34490283 PMCID: PMC8417536 DOI: 10.3389/fmed.2021.637885] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 12/04/2020] [Accepted: 07/26/2021] [Indexed: 01/14/2023]
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) can infect a broad range of human tissues by using the host receptor angiotensin-converting enzyme 2 (ACE2). Individuals with comorbidities associated with severe COVID-19 display higher levels of ACE2 in the lungs compared to those without comorbidities, and conditions such as cell stress, elevated glucose levels and hypoxia may also increase the expression of ACE2. Here, we showed that patients with Barrett's esophagus (BE) have a higher expression of ACE2 in BE tissues compared to normal squamous esophagus, and that the lower pH associated with BE may drive this increase in expression. Human primary monocytes cultured in reduced pH displayed increased ACE2 expression and higher viral load upon SARS-CoV-2 infection. We also showed in two independent cohorts of 1,357 COVID-19 patients that previous use of proton pump inhibitors is associated with 2- to 3-fold higher risk of death compared to those not using the drugs. Our work suggests that pH strongly influences SARS-CoV-2 infection and COVID-19 severity.
Affiliation(s)
- Leandro Jimenez
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
  - Scientific Platform Pasteur-University of São Paulo, São Paulo, Brazil
- Ana Campos Codo
  - Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, São Paulo, Brazil
- Vanderson de Souza Sampaio
  - Fundação de Medicina Tropical Dr. Heitor Vieira Dourado, Manaus, Brazil
  - Universidade do Estado do Amazonas, Manaus, Brazil
  - Fundação de Vigilância em Saúde do Amazonas, Manaus, Brazil
  - Faculdade de Medicina da Universidade Federal do Amazonas, Manaus, Brazil
- Antonio E R Oliveira
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
- Lucas Kaoru Kobo Ferreira
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
- Gustavo Gastão Davanzo
  - Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, São Paulo, Brazil
- Lauar de Brito Monteiro
  - Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, São Paulo, Brazil
- João Victor Virgilio-da-Silva
  - Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, São Paulo, Brazil
- Gabriela Fabiano de Souza
  - Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, São Paulo, Brazil
- Nathalia Zini
  - Faculdade de Medicina de São José do Rio Preto, São Paulo, Brazil
- Stéfanie Primon Muraro
  - Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, São Paulo, Brazil
- José Luiz Proença-Modena
  - Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, São Paulo, Brazil
- Fernando Almeida Val
  - Fundação de Medicina Tropical Dr. Heitor Vieira Dourado, Manaus, Brazil
  - Universidade do Estado do Amazonas, Manaus, Brazil
  - Faculdade de Medicina da Universidade Federal do Amazonas, Manaus, Brazil
- Gisely Cardoso Melo
  - Fundação de Medicina Tropical Dr. Heitor Vieira Dourado, Manaus, Brazil
  - Universidade do Estado do Amazonas, Manaus, Brazil
- Wuelton Marcelo Monteiro
  - Fundação de Medicina Tropical Dr. Heitor Vieira Dourado, Manaus, Brazil
  - Universidade do Estado do Amazonas, Manaus, Brazil
- Marcus Vinícius Guimarães Lacerda
  - Fundação de Medicina Tropical Dr. Heitor Vieira Dourado, Manaus, Brazil
  - Universidade do Estado do Amazonas, Manaus, Brazil
  - Faculdade de Medicina da Universidade Federal do Amazonas, Manaus, Brazil
- Pedro M Moraes-Vieira
  - Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, University of Campinas, São Paulo, Brazil
  - Obesity and Comorbidities Research Center, University of Campinas, São Paulo, Brazil
  - Experimental Medicine Research Cluster, University of Campinas, São Paulo, Brazil
- Helder I Nakaya
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
  - Scientific Platform Pasteur-University of São Paulo, São Paulo, Brazil
  - Hospital Israelita Albert Einstein, São Paulo, Brazil
2
Zhao Y, Fang ZY, Lin CX, Deng C, Xu YP, Li HD. RFCell: A Gene Selection Approach for scRNA-seq Clustering Based on Permutation and Random Forest. Front Genet 2021; 12:665843. [PMID: 34386033 PMCID: PMC8354212 DOI: 10.3389/fgene.2021.665843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/09/2021] [Accepted: 04/01/2021] [Indexed: 11/13/2022]
Abstract
In recent years, single-cell RNA-seq (scRNA-seq) has become increasingly popular in fields such as biology and medical research. Analyzing scRNA-seq data can reveal complex cell populations and infer single-cell trajectories in cell development. Clustering is one of the most important methods for analyzing scRNA-seq data. In this paper, we focus on improving scRNA-seq clustering through gene selection. Studies have shown that gene selection for scRNA-seq data can improve clustering accuracy, so it is important to select genes with cell type specificity. Gene selection not only reduces the dimensionality of scRNA-seq data but also, in combination with clustering methods, improves cell type identification. Here, we propose RFCell, a supervised gene selection method based on permutation and random forest classification. We first use RFCell and three existing gene selection methods to select gene sets on 10 scRNA-seq data sets. Then, three classical clustering algorithms are used to cluster the cells using the genes selected by each method. We found that the gene selection performance of RFCell was better than that of the other gene selection methods.
Affiliation(s)
- Yuan Zhao
  - Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Zhao-Yu Fang
  - School of Mathematics and Statistics, Central South University, Changsha, China
- Cui-Xiang Lin
  - Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Chao Deng
  - Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Yun-Pei Xu
  - Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Hong-Dong Li
  - Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
3
4
Cornish A, Roychoudhury S, Sarma K, Pramanik S, Bhakat K, Dudley A, Mishra NK, Guda C. Red panda: a novel method for detecting variants in single-cell RNA sequencing. BMC Genomics 2020; 21:830. [PMID: 33372593 PMCID: PMC7771073 DOI: 10.1186/s12864-020-07224-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2020] [Accepted: 11/10/2020] [Indexed: 11/30/2022]
Abstract
BACKGROUND: Single-cell sequencing enables us to better understand genetic diseases, such as cancer or autoimmune disorders, which are often affected by changes in rare cells. Currently, no existing software is aimed at identifying single nucleotide variations or micro (1-50 bp) insertions and deletions in single-cell RNA sequencing (scRNA-seq) data. Generating high-quality variant data is vital to the study of the aforementioned diseases, among others. RESULTS: In this study, we report the design and implementation of Red Panda, a novel method to accurately identify variants in scRNA-seq data. Variants were called on scRNA-seq data from human articular chondrocytes, mouse embryonic fibroblasts (MEFs), and simulated data stemming from the MEF alignments. Red Panda had the highest Positive Predictive Value at 45.0%, while other tools (FreeBayes, GATK HaplotypeCaller, GATK UnifiedGenotyper, Monovar, and Platypus) ranged from 5.8% to 41.53%. From the simulated data, Red Panda had the highest sensitivity at 72.44%. CONCLUSIONS: We show that our method provides a novel and improved mechanism to identify variants in scRNA-seq as compared to currently existing software. However, methods for identification of genomic variants using scRNA-seq data can still be improved.
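As a rough illustration of the two metrics quoted in this abstract, the sketch below computes positive predictive value and sensitivity from a set of called variants and a truth set. The function name `score_calls` and the variant tuples are hypothetical and are not taken from Red Panda or its data.

```python
from typing import Dict, Set, Tuple

# A variant is represented here as (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def score_calls(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare called variants against a truth set.

    PPV         = TP / (TP + FP): fraction of calls that are real.
    Sensitivity = TP / (TP + FN): fraction of real variants that were called.
    """
    if not called or not truth:
        raise ValueError("both the call set and the truth set must be non-empty")
    true_positives = len(called & truth)
    false_positives = len(called - truth)
    false_negatives = len(truth - called)
    return {
        "ppv": true_positives / (true_positives + false_positives),
        "sensitivity": true_positives / (true_positives + false_negatives),
    }


# Made-up example: 4 calls, 3 of them correct, against 5 true variants.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "A"),
    ("chr2", 900, "T", "C"),
    ("chr3", 15, "A", "AT"),
}
called = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "A"),
    ("chr5", 77, "G", "C"),  # false positive
}
metrics = score_calls(called, truth)
print(f"PPV={metrics['ppv']:.2f} sensitivity={metrics['sensitivity']:.2f}")
```

A tool can score well on one metric and poorly on the other, which is presumably why the abstract reports both.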
Affiliation(s)
- Adam Cornish
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Shrabasti Roychoudhury
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Krishna Sarma
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Suravi Pramanik
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Kishor Bhakat
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Andrew Dudley
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Nitish K Mishra
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Chittibabu Guda
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, 68198, USA
5
The Role of Single-Cell Technology in the Study and Control of Infectious Diseases. Cells 2020; 9:1440. [PMID: 32531928 PMCID: PMC7348906 DOI: 10.3390/cells9061440] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/07/2020] [Revised: 06/03/2020] [Accepted: 06/05/2020] [Indexed: 02/07/2023]
Abstract
The advent of single-cell research in the recent decade has allowed biological studies at an unprecedented resolution and scale. In particular, single-cell analysis techniques such as Next-Generation Sequencing (NGS) and Fluorescence-Activated Cell Sorting (FACS) have helped show substantial links between cellular heterogeneity and infectious disease progression. The extensive characterization of genomic and phenotypic biomarkers, in addition to host-pathogen interactions at the single-cell level, has resulted in the discovery of previously unknown infection mechanisms as well as potential treatment options. In this article, we review the various single-cell technologies and their applications in the ongoing fight against infectious diseases, as well as discuss the potential opportunities for future development.
6
Huang Y, Chang X, Zhang Y, Chen L, Liu X. Disease characterization using a partial correlation-based sample-specific network. Brief Bioinform 2020; 22:5838457. [PMID: 32422654 DOI: 10.1093/bib/bbaa062] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/16/2019] [Revised: 03/25/2020] [Accepted: 03/26/2020] [Indexed: 12/23/2022]
Abstract
A single-sample network (SSN) is a biological molecular network constructed from single-sample data given a reference dataset and can provide insights into the mechanisms of individual diseases and aid in the development of personalized medicine. In this study, we proposed a computational method, a partial correlation-based single-sample network (P-SSN), which not only infers a network from each single sample given a reference dataset but also retains the direct interactions by excluding indirect interactions (https://github.com/hyhRise/P-SSN). By applying P-SSN to tumor data from the Cancer Genome Atlas and to single-cell data, we validated the effectiveness of P-SSN in predicting driver mutation genes (DMGs), producing network distance, identifying subtypes and further classifying single cells. In particular, P-SSN is highly effective in predicting DMGs based on single-sample data. P-SSN is also efficient for subtyping complex diseases and for clustering single cells by introducing network distance between any two samples.
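To make the idea of "excluding indirect interactions" concrete, here is a minimal sketch of a first-order partial correlation, which measures the association between two variables after removing the part explained by a third. It is not the P-SSN algorithm itself (the abstract does not give its formulas); the gene names and expression values below are invented for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def partial_correlation(x: Sequence[float], y: Sequence[float],
                        z: Sequence[float]) -> float:
    """Correlation of x and y after controlling for z (first-order formula)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical expression values: a regulator drives both gene_x and gene_y,
# each with its own independent noise, so x and y are linked only through it.
regulator = [1, 2, 3, 4, 5, 6, 7, 8]
noise_x = [1, -1, -1, 1, 1, -1, -1, 1]
noise_y = [1, -1, 1, -1, -1, 1, -1, 1]
gene_x = [r + e for r, e in zip(regulator, noise_x)]
gene_y = [r + e for r, e in zip(regulator, noise_y)]

raw = pearson(gene_x, gene_y)                              # about 0.84
direct = partial_correlation(gene_x, gene_y, regulator)    # about 0
print(f"raw correlation={raw:.2f}, partial correlation={direct:.2f}")
```

In this toy case the raw correlation suggests a strong edge between the two genes, while the partial correlation shows that nothing remains once the shared regulator is accounted for.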
Affiliation(s)
- Yanhong Huang
  - Institute of Statistics and Applied Mathematics, Anhui University of Finance & Economics, Bengbu 233030, China
  - School of Mathematics and Statistics, Shandong University at Weihai, Weihai 264209, China
- Xiao Chang
  - Institute of Statistics and Applied Mathematics, Anhui University of Finance & Economics, Bengbu 233030, China
- Yu Zhang
  - School of Mathematics and Statistics, Shandong University at Weihai, Weihai 264209, China
- Luonan Chen
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai 200031, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
  - Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai 201210, China
  - Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- Xiaoping Liu
  - School of Mathematics and Statistics, Shandong University at Weihai, Weihai 264209, China
7
Lähnemann D, Köster J, Szczurek E, McCarthy DJ, Hicks SC, Robinson MD, Vallejos CA, Campbell KR, Beerenwinkel N, Mahfouz A, Pinello L, Skums P, Stamatakis A, Attolini CSO, Aparicio S, Baaijens J, Balvert M, Barbanson BD, Cappuccio A, Corleone G, Dutilh BE, Florescu M, Guryev V, Holmer R, Jahn K, Lobo TJ, Keizer EM, Khatri I, Kielbasa SM, Korbel JO, Kozlov AM, Kuo TH, Lelieveldt BP, Mandoiu II, Marioni JC, Marschall T, Mölder F, Niknejad A, Rączkowska A, Reinders M, Ridder JD, Saliba AE, Somarakis A, Stegle O, Theis FJ, Yang H, Zelikovsky A, McHardy AC, Raphael BJ, Shah SP, Schönhuth A. Eleven grand challenges in single-cell data science. Genome Biol 2020; 21:31. [PMID: 32033589 PMCID: PMC7007675 DOI: 10.1186/s13059-020-1926-6] [Citation(s) in RCA: 534] [Impact Index Per Article: 133.5] [Received: 08/02/2019] [Accepted: 01/02/2020] [Indexed: 02/08/2023]
Abstract
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
Affiliation(s)
- David Lähnemann
  - Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
  - Department of Paediatric Oncology, Haematology and Immunology, Medical Faculty, Heinrich Heine University, University Hospital, Düsseldorf, Germany
  - Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Johannes Köster
  - Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
  - Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
- Ewa Szczurek
  - Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
- Davis J. McCarthy
  - Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, Fitzroy, Australia
  - Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne, Melbourne, Australia
- Stephanie C. Hicks
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Mark D. Robinson
  - Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
- Catalina A. Vallejos
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, UK
  - The Alan Turing Institute, British Library, London, UK
- Kieran R. Campbell
  - Department of Statistics, University of British Columbia, Vancouver, Canada
  - Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
  - Data Science Institute, University of British Columbia, Vancouver, Canada
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Ahmed Mahfouz
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
  - Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
- Luca Pinello
  - Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital Research Institute, Charlestown, USA
  - Department of Pathology, Harvard Medical School, Boston, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Pavel Skums
  - Department of Computer Science, Georgia State University, Atlanta, USA
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Samuel Aparicio
  - Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
- Jasmijn Baaijens
  - Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Marleen Balvert
  - Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
  - Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
- Buys de Barbanson
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
  - Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
- Antonio Cappuccio
  - Institute for Advanced Study, University of Amsterdam, Amsterdam, The Netherlands
- Giacomo Corleone
  - Department of Surgery and Cancer, The Imperial Centre for Translational and Experimental Medicine, Imperial College London, London, UK
- Bas E. Dutilh
  - Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
  - Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Nijmegen, The Netherlands
- Maria Florescu
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
  - Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
- Victor Guryev
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Rens Holmer
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Katharina Jahn
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Thamar Jessurun Lobo
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Emma M. Keizer
  - Biometris, Wageningen University & Research, Wageningen, The Netherlands
- Indu Khatri
  - Department of Immunohematology and Blood Transfusion, Leiden University Medical Center, Leiden, The Netherlands
- Szymon M. Kielbasa
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Jan O. Korbel
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Alexey M. Kozlov
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Tzu-Hao Kuo
  - Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Boudewijn P.F. Lelieveldt
  - PRB lab, Delft University of Technology, Delft, The Netherlands
  - Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Ion I. Mandoiu
  - Computer Science & Engineering Department, University of Connecticut, Storrs, USA
- John C. Marioni
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Tobias Marschall
  - Center for Bioinformatics, Saarland University, Saarbrücken, Germany
  - Max Planck Institute for Informatics, Saarbrücken, Germany
- Felix Mölder
  - Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
  - Institute of Pathology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Amir Niknejad
  - Computation molecular design, Zuse Institute Berlin, Berlin, Germany
  - Mathematics Department, Mount Saint Vincent, New York, USA
- Alicja Rączkowska
  - Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
- Marcel Reinders
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
  - Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
- Jeroen de Ridder
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Antoine-Emmanuel Saliba
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz-Center for Infection Research, Würzburg, Germany
- Antonios Somarakis
  - Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Oliver Stegle
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center–DKFZ, Heidelberg, Germany
- Fabian J. Theis
  - Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
- Huan Yang
  - Division of Drug Discovery and Safety, Leiden Academic Center for Drug Research–LACDR–Leiden University, Leiden, The Netherlands
- Alex Zelikovsky
  - Department of Computer Science, Georgia State University, Atlanta, USA
  - The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Alice C. McHardy
  - Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Sohrab P. Shah
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA
- Alexander Schönhuth
  - Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
  - Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
8
Owen RP, White MJ, Severson DT, Braden B, Bailey A, Goldin R, Wang LM, Ruiz-Puig C, Maynard ND, Green A, Piazza P, Buck D, Middleton MR, Ponting CP, Schuster-Böckler B, Lu X. Single cell RNA-seq reveals profound transcriptional similarity between Barrett's oesophagus and oesophageal submucosal glands. Nat Commun 2018; 9:4261. [PMID: 30323168 PMCID: PMC6189174 DOI: 10.1038/s41467-018-06796-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 06/19/2017] [Accepted: 09/19/2018] [Indexed: 02/07/2023]
Abstract
Barrett's oesophagus is a precursor of oesophageal adenocarcinoma. In this common condition, squamous epithelium in the oesophagus is replaced by columnar epithelium in response to acid reflux. Barrett's oesophagus is highly heterogeneous and its relationships to normal tissues are unclear. Here we investigate the cellular complexity of Barrett's oesophagus and the upper gastrointestinal tract using RNA-sequencing of single cells from multiple biopsies from six patients with Barrett's oesophagus and two patients without oesophageal pathology. We find that cell populations in Barrett's oesophagus, marked by LEFTY1 and OLFM4, exhibit a profound transcriptional overlap with oesophageal submucosal gland cells, but not with gastric or duodenal cells. Additionally, SPINK4 and ITLN1 mark cells that precede morphologically identifiable goblet cells in colon and Barrett's oesophagus, potentially aiding the identification of metaplasia. Our findings reveal striking transcriptional relationships between normal tissue populations and cells in a premalignant condition, with implications for clinical practice.
Affiliation(s)
- Richard Peter Owen
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7DQ, UK
- Michael Joseph White
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7DQ, UK
- David Tyler Severson
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7DQ, UK
- Barbara Braden
  - Translational Gastroenterology Unit, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 9DU, UK
- Adam Bailey
  - Translational Gastroenterology Unit, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 9DU, UK
- Robert Goldin
  - Centre for Pathology, St Mary's Hospital, Imperial College, London, W2 1NY, UK
- Lai Mun Wang
  - Department of Pathology, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Carlos Ruiz-Puig
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7DQ, UK
- Angie Green
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Paolo Piazza
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
  - Department of Medicine, Faculty of Medicine, Imperial College London, London, W12 0NN, UK
- David Buck
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Mark Ross Middleton
  - Department of Oncology, Old Road Campus Research Building, Roosevelt Drive, Oxford, OX3 7DQ, UK
- Chris Paul Ponting
  - MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XU, UK
- Benjamin Schuster-Böckler
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7DQ, UK
- Xin Lu
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7DQ, UK
9
Duò A, Robinson MD, Soneson C. A systematic performance evaluation of clustering methods for single-cell RNA-seq data. F1000Res 2018; 7:1141. [PMID: 30271584 DOI: 10.12688/f1000research.15666.1] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Accepted: 07/20/2018] [Indexed: 12/21/2022]
Abstract
Subpopulation identification, usually via some form of unsupervised clustering, is a fundamental step in the analysis of many single-cell RNA-seq data sets. This has motivated the development and application of a broad range of clustering methods, based on various underlying algorithms. Here, we provide a systematic and extensible performance evaluation of 14 clustering algorithms implemented in R, including both methods developed explicitly for scRNA-seq data and more general-purpose methods. The methods were evaluated using nine publicly available scRNA-seq data sets as well as three simulations with varying degrees of cluster separability. The same feature selection approaches were used for all methods, allowing us to focus on the performance of the clustering algorithms themselves. We evaluated the methods' ability to recover known subpopulations, their stability, and their run time and scalability. Additionally, we investigated whether the performance could be improved by generating consensus partitions from multiple individual clustering methods. We found substantial differences in the performance, run time and stability between the methods, with SC3 and Seurat showing the most favorable results. Additionally, we found that consensus clustering typically did not improve the performance compared to the best of the combined methods, but that several of the top-performing methods already perform some type of consensus clustering. All the code used for the evaluation is available on GitHub (https://github.com/markrobinsonuzh/scRNAseq_clustering_comparison). In addition, an R package providing access to data and clustering results, thereby facilitating inclusion of new methods and data sets, is available from Bioconductor (https://bioconductor.org/packages/DuoClustering2018).
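The abstract does not say how the "ability to recover known subpopulations" was scored. One widely used measure for comparing a clustering against known labels is the adjusted Rand index (ARI), and the sketch below is a small pure-Python version of it, offered only as an assumed example of such a score. The cell labels in the usage lines are made up.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(truth: Sequence[Hashable],
                        predicted: Sequence[Hashable]) -> float:
    """Adjusted Rand index between two labelings of the same cells.

    1.0 means identical partitions (label names do not matter), values
    near 0 mean agreement no better than chance, negative values mean
    worse-than-chance agreement.
    """
    if len(truth) != len(predicted):
        raise ValueError("labelings must have the same length")
    n = len(truth)
    if n < 2:
        raise ValueError("need at least two cells")

    # Pairs of cells that share a cluster in both, in truth, and in predicted.
    same_in_both = sum(comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    same_in_truth = sum(comb(c, 2) for c in Counter(truth).values())
    same_in_pred = sum(comb(c, 2) for c in Counter(predicted).values())

    expected = same_in_truth * same_in_pred / comb(n, 2)
    maximum = (same_in_truth + same_in_pred) / 2
    if maximum == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (same_in_both - expected) / (maximum - expected)


# Hypothetical cell-type labels versus cluster ids from some method.
known = ["T", "T", "B", "B"]
print(adjusted_rand_index(known, [1, 1, 0, 0]))  # same partition, renamed
print(adjusted_rand_index(known, [0, 1, 0, 1]))  # every known pair split
```

Because the index is computed over pairs of cells, it is unaffected by how the clusters are numbered, which is convenient when comparing many methods on the same data set.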
Affiliation(s)
- Angelo Duò
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
- Mark D Robinson
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
- Charlotte Soneson
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
|
10
|
Duò A, Robinson MD, Soneson C. A systematic performance evaluation of clustering methods for single-cell RNA-seq data. F1000Res 2018; 7:1141. [PMID: 30271584 PMCID: PMC6134335 DOI: 10.12688/f1000research.15666.3] [Citation(s) in RCA: 120] [Impact Index Per Article: 20.0] [Accepted: 11/04/2020] [Indexed: 02/05/2023]
Abstract
Subpopulation identification, usually via some form of unsupervised clustering, is a fundamental step in the analysis of many single-cell RNA-seq data sets. This has motivated the development and application of a broad range of clustering methods, based on various underlying algorithms. Here, we provide a systematic and extensible performance evaluation of 14 clustering algorithms implemented in R, including both methods developed explicitly for scRNA-seq data and more general-purpose methods. The methods were evaluated using nine publicly available scRNA-seq data sets as well as three simulations with varying degree of cluster separability. The same feature selection approaches were used for all methods, allowing us to focus on the investigation of the performance of the clustering algorithms themselves. We evaluated the ability of recovering known subpopulations, the stability and the run time and scalability of the methods. Additionally, we investigated whether the performance could be improved by generating consensus partitions from multiple individual clustering methods. We found substantial differences in the performance, run time and stability between the methods, with SC3 and Seurat showing the most favorable results. Additionally, we found that consensus clustering typically did not improve the performance compared to the best of the combined methods, but that several of the top-performing methods already perform some type of consensus clustering. All the code used for the evaluation is available on GitHub (https://github.com/markrobinsonuzh/scRNAseq_clustering_comparison). In addition, an R package providing access to data and clustering results, thereby facilitating inclusion of new methods and data sets, is available from Bioconductor (https://bioconductor.org/packages/DuoClustering2018).
Affiliation(s)
- Angelo Duò
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
- Mark D Robinson
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
- Charlotte Soneson
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
|
11
|
Duò A, Robinson MD, Soneson C. A systematic performance evaluation of clustering methods for single-cell RNA-seq data. F1000Res 2018; 7:1141. [PMID: 30271584 DOI: 10.12688/f1000research.15666.2] [Citation(s) in RCA: 122] [Impact Index Per Article: 20.3] [Accepted: 08/31/2018] [Indexed: 12/31/2022]
Abstract
Subpopulation identification, usually via some form of unsupervised clustering, is a fundamental step in the analysis of many single-cell RNA-seq data sets. This has motivated the development and application of a broad range of clustering methods, based on various underlying algorithms. Here, we provide a systematic and extensible performance evaluation of 14 clustering algorithms implemented in R, including both methods developed explicitly for scRNA-seq data and more general-purpose methods. The methods were evaluated using nine publicly available scRNA-seq data sets as well as three simulations with varying degree of cluster separability. The same feature selection approaches were used for all methods, allowing us to focus on the investigation of the performance of the clustering algorithms themselves. We evaluated the ability of recovering known subpopulations, the stability and the run time and scalability of the methods. Additionally, we investigated whether the performance could be improved by generating consensus partitions from multiple individual clustering methods. We found substantial differences in the performance, run time and stability between the methods, with SC3 and Seurat showing the most favorable results. Additionally, we found that consensus clustering typically did not improve the performance compared to the best of the combined methods, but that several of the top-performing methods already perform some type of consensus clustering. All the code used for the evaluation is available on GitHub (https://github.com/markrobinsonuzh/scRNAseq_clustering_comparison). In addition, an R package providing access to data and clustering results, thereby facilitating inclusion of new methods and data sets, is available from Bioconductor (https://bioconductor.org/packages/DuoClustering2018).
Affiliation(s)
- Angelo Duò
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
- Mark D Robinson
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
- Charlotte Soneson
- Institute of Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, 8057, Switzerland
|
12
|
Hon CC, Shin JW, Carninci P, Stubbington MJT. The Human Cell Atlas: Technical approaches and challenges. Brief Funct Genomics 2018; 17:283-294. [PMID: 29092000 PMCID: PMC6063304 DOI: 10.1093/bfgp/elx029] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 12/23/2022]
Abstract
The Human Cell Atlas is a large, international consortium that aims to identify and describe every cell type in the human body. The comprehensive cellular maps that arise from this ambitious effort have the potential to transform many aspects of fundamental biology and clinical practice. Here, we discuss the technical approaches that could be used today to generate such a resource and also the technical challenges that will be encountered.
Affiliation(s)
- Chung-Chau Hon
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
- Jay W Shin
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
- Piero Carninci
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
|