1
|
Jovičić SM. Analysis of total RNA as a potential biomarker of developmental neurotoxicity in silico. Health Informatics J 2024; 30:14604582241285832. [PMID: 39384248 DOI: 10.1177/14604582241285832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/11/2024]
Abstract
A vast number of neurodegenerative disorders arise from neurotoxicity. In neurotoxicity, more than 250 RNA molecules are up- or downregulated. This manuscript investigates in silico the effect of exposure to the organophosphate pesticide chlorpyrifos (COP) on total RNA in murine brain tissue across four genotypes, as a model of neurodegeneration development. The GSE58103 dataset from the Gene Expression Omnibus (GEO) database underwent data preprocessing, normalization, and quality control. Differential expression analysis used the limma package in R. The study compared expression profiles from murine fetal brain tissues across four genotypes: PON-1 knockout (KO), tgHuPON1Q192 (Q-tg), tgHuPON1R192 (R-tg), and wild-type (WT). We analyzed 60 samples, 15 per genotype, to identify differentially expressed genes (DEGs). The significance criteria were an adjusted p-value < .05 and |log2 fold change| > 1. The study identifies microRNA485 as a potential biomarker of COP toxicity in the GSE58103 dataset: differential expression analysis shows significant differences in microRNA485 between the KO and WT groups. Moreover, graphical analysis shows sample relationships among genotype groups. MicroRNA485 is therefore a promising biomarker for developmental COP neurotoxicity, identified through in silico analysis.
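The significance criteria in this abstract (adjusted p-value < .05 and |log2 fold change| > 1) can be illustrated with a short sketch. The study itself used limma in R; the Python below is only a stand-in. It assumes Benjamini-Hochberg adjustment, which the abstract does not specify, and the function names and toy values are hypothetical, not taken from GSE58103.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here only because it is a common choice;
    the abstract does not name the adjustment method.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(results, alpha=0.05, lfc_cutoff=1.0):
    """Keep features passing both thresholds.

    results: list of (feature, log2_fold_change, raw_p_value) tuples.
    """
    adj = benjamini_hochberg([p for _, _, p in results])
    return [
        (name, lfc, q)
        for (name, lfc, _), q in zip(results, adj)
        if q < alpha and abs(lfc) > lfc_cutoff
    ]


# Hypothetical values for illustration only.
toy = [
    ("feature_A", 1.8, 0.0004),
    ("feature_B", -0.4, 0.0100),
    ("feature_C", -1.3, 0.0200),
    ("feature_D", 2.1, 0.6000),
]
hits = select_degs(toy)
print([name for name, _, _ in hits])
```

In this toy input, feature_B is dropped for a small fold change despite a low adjusted p-value, and feature_D is dropped for a high adjusted p-value despite a large fold change. Both conditions have to hold at once.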
Affiliation(s)
- Snežana M Jovičić
- Department of Genetics, Faculty of Biology, University of Belgrade, Belgrade, Serbia
2
|
Rajderkar SS, Paraiso K, Amaral ML, Kosicki M, Cook LE, Darbellay F, Spurrell CH, Osterwalder M, Zhu Y, Wu H, Afzal SY, Blow MJ, Kelman G, Barozzi I, Fukuda-Yuzawa Y, Akiyama JA, Afzal V, Tran S, Plajzer-Frick I, Novak CS, Kato M, Hunter RD, von Maydell K, Wang A, Lin L, Preissl S, Lisgo S, Ren B, Dickel DE, Pennacchio LA, Visel A. Dynamic enhancer landscapes in human craniofacial development. Nat Commun 2024; 15:2030. [PMID: 38448444 PMCID: PMC10917818 DOI: 10.1038/s41467-024-46396-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 02/25/2024] [Indexed: 03/08/2024] Open
Abstract
The genetic basis of human facial variation and craniofacial birth defects remains poorly understood. Distant-acting transcriptional enhancers control the fine-tuned spatiotemporal expression of genes during critical stages of craniofacial development. However, a lack of accurate maps of the genomic locations and cell type-resolved activities of craniofacial enhancers prevents their systematic exploration in human genetics studies. Here, we combine histone modification, chromatin accessibility, and gene expression profiling of human craniofacial development with single-cell analyses of the developing mouse face to define the regulatory landscape of facial development at tissue- and single cell-resolution. We provide temporal activity profiles for 14,000 human developmental craniofacial enhancers. We find that 56% of human craniofacial enhancers share chromatin accessibility in the mouse and we provide cell population- and embryonic stage-resolved predictions of their in vivo activity. Taken together, our data provide an expansive resource for genetic and developmental studies of human craniofacial development.
Affiliation(s)
- Sudha Sunil Rajderkar
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Kitt Paraiso
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Maria Luisa Amaral
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
| | - Michael Kosicki
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Laura E Cook
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Fabrice Darbellay
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Department of Genetic Medicine and Development, Faculty of Medicine, University of Geneva, 1211, Geneva, Switzerland
| | - Cailyn H Spurrell
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Marco Osterwalder
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Department for BioMedical Research (DBMR), University of Bern, 3008, Bern, Switzerland
- Department of Cardiology, Bern University Hospital, Bern, 3010, Switzerland
| | - Yiwen Zhu
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Han Wu
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Sarah Yasmeen Afzal
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Lucile Packard Children's Hospital, Stanford University, Stanford, CA, 94304, USA
| | - Matthew J Blow
- U.S. Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Guy Kelman
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- The Jerusalem Center for Personalized Computational Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Iros Barozzi
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Center for Cancer Research, Medical University of Vienna, Borschkegasse 8a 1090, Vienna, Austria
- Department of Surgery and Cancer, Imperial College London, London, UK
| | - Yoko Fukuda-Yuzawa
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- University Research Management Center, Tohoku University, Sendai, Miyagi, 980-8577, Japan
| | - Jennifer A Akiyama
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Veena Afzal
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Stella Tran
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Ingrid Plajzer-Frick
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Catherine S Novak
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Momoe Kato
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Riana D Hunter
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- UC San Francisco, Division of Experimental Medicine, 1001 Potrero Ave, San Francisco, CA, 94110, USA
| | - Kianna von Maydell
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Allen Wang
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA
| | - Lin Lin
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA
| | - Sebastian Preissl
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Institute of Experimental and Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Steven Lisgo
- Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, NE1 3BZ, UK
| | - Bing Ren
- Institute of Genome Medicine, Moores Cancer Center, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Diane E Dickel
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Octant Inc., Emeryville, CA, 94608, USA
| | - Len A Pennacchio
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- U.S. Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Comparative Biochemistry Program, University of California, Berkeley, CA, 94720, USA
| | - Axel Visel
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA.
- U.S. Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA.
- School of Natural Sciences, University of California, Merced, CA, USA.
3
|
Kai N, Qingsong C, Kejia M, Weiwei L, Xing W, Xuejie C, Lixia C, Minzi D, Yuanyuan Y, Xiaoyan W. An Inflammatory Bowel Diseases Integrated Resources Portal (IBDIRP). Database (Oxford) 2024; 2024:baad097. [PMID: 38227799 PMCID: PMC10791110 DOI: 10.1093/database/baad097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 12/02/2023] [Accepted: 12/22/2023] [Indexed: 01/18/2024]
Abstract
IBD, including ulcerative colitis and Crohn's disease, is a chronic and debilitating gastrointestinal disorder that affects millions of people worldwide. Research on IBD has generated massive amounts of data, including literature, metagenomics, metabolomics, bioresources and databases. We aim to create an IBD Integrated Resources Portal (IBDIRP) that provides the most comprehensive resources for IBD. An integrated platform was developed that provides information on different aspects of IBD research resources, such as single-nucleotide polymorphisms (SNPs), genes, transcriptome, microbiota, metabolomics, single cells and other resources. Valuable and comprehensive IBD-related data were collected from PubMed, Google, GMrepo, gutMega, gutMDisorder, Single Cell Portal and other sources. Then, the data were systematically sorted, and these resources were manually curated. We systematically sorted and cataloged more than 320 unique risk SNPs associated with IBD in the SNP section. We presented over 289 IBD-related genes based on the database collection in the gene section. We also obtained 153 manually curated IBD transcriptomics data, including 12 388 samples, on the Gene Expression Omnibus database. The sorted IBD-related microbiota data from three primary microbiome databases (GMrepo, gutMega and gutMDisorder) were available for download. We selected 23 149 IBD-related taxonomic records from these databases. Additionally, we collected 24 IBD metabolomics studies with 2896 participants in the metabolomics section. We introduced two interactive single-cell data plug-in units that provided data visualization based on cells and genes. Finally, we listed 18 significant IBD web resources, such as the official European Crohn's and Colitis Organisation and International Organization for the Study of IBD websites, IBD scoring tools, IBD genetic and multi-omics resources, IBD biobanks and other useful research resources. 
The IBDIRP website is the first integrated resource for global IBD researchers. This portal will help researchers by providing comprehensive knowledge and enabling them to reinforce the multidimensional impression of IBD. The IBDIRP website is accessible via www.ibdirp.com. Database URL: www.ibdirp.com
Affiliation(s)
- Nie Kai
- Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
| | | | - Ma Kejia
- Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
| | - Luo Weiwei
- Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
| | - Wu Xing
- Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
| | - Chen Xuejie
- Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
| | - Cai Lixia
- Changsha Hospital for Maternal and Child Health Care Affiliated to Hunan Normal University Changsha Hunan 410000, China
| | - Deng Minzi
- Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
| | - Yang Yuanyuan
- Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
| | - Wang Xiaoyan
- Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
- The College of Computer Science in Sichuan University, Chengdu Sichuan 610000, China
4
|
Knight CH, Khan F, Patel A, Gill US, Okosun J, Wang J. IBRAP: integrated benchmarking single-cell RNA-sequencing analytical pipeline. Brief Bioinform 2023; 24:bbad061. [PMID: 36847692 PMCID: PMC10025434 DOI: 10.1093/bib/bbad061] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/28/2022] [Revised: 12/19/2022] [Accepted: 02/02/2023] [Indexed: 03/01/2023] Open
Abstract
Single-cell ribonucleic acid (RNA)-sequencing (scRNA-seq) is a powerful tool to study cellular heterogeneity. The high dimensional data generated from this technology are complex and require specialized expertise for analysis and interpretation. The core of scRNA-seq data analysis contains several key analytical steps, which include pre-processing, quality control, normalization, dimensionality reduction, integration and clustering. Each step often has many algorithms developed with varied underlying assumptions and implications. With such a diverse choice of tools available, benchmarking analyses have compared their performances and demonstrated that tools operate differentially according to the data types and complexity. Here, we present Integrated Benchmarking scRNA-seq Analytical Pipeline (IBRAP), which contains a suite of analytical components that can be interchanged throughout the pipeline alongside multiple benchmarking metrics that enable users to compare results and determine the optimal pipeline combinations for their data. We apply IBRAP to single- and multi-sample integration analysis using primary pancreatic tissue, cancer cell line and simulated data accompanied with ground truth cell labels, demonstrating the interchangeable and benchmarking functionality of IBRAP. Our results confirm that the optimal pipelines are dependent on individual samples and studies, further supporting the rationale and necessity of our tool. We then compare reference-based cell annotation with unsupervised analysis, both included in IBRAP, and demonstrate the superiority of the reference-based method in identifying robust major and minor cell types. Thus, IBRAP presents a valuable tool to integrate multiple samples and studies to create reference maps of normal and diseased tissues, facilitating novel biological discovery using the vast volume of scRNA-seq data available.
Affiliation(s)
- Connor H Knight
- Centre for Cancer Genomics and Computational Biology, Barts Cancer Institute, Queen Mary University of London, London EC1M 6BQ
| | - Faraz Khan
- Centre for Cancer Genomics and Computational Biology, Barts Cancer Institute, Queen Mary University of London, London EC1M 6BQ
| | - Ankit Patel
- Centre for Cancer Genomics and Computational Biology, Barts Cancer Institute, Queen Mary University of London, London EC1M 6BQ
| | - Upkar S Gill
- Centre for Immunobiology, Blizard Institute, Faculty of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, United Kingdom
| | - Jessica Okosun
- Centre for Haemato-Oncology, Barts Cancer Institute, Queen Mary University of London, London EC1M 6BQ
| | - Jun Wang
- Centre for Cancer Genomics and Computational Biology, Barts Cancer Institute, Queen Mary University of London, London EC1M 6BQ
5
|
McCool JL, Hum NR, Sebastian A, Loots GG. Isolation of Murine Articular Chondrocytes for Single-Cell RNA or Bulk RNA Sequencing Analysis. Methods Mol Biol 2023; 2598:187-196. [PMID: 36355293 DOI: 10.1007/978-1-0716-2839-3_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) is highly dependent on the cellular composition of a tissue of interest. For soft tissues, isolation of individual cells from the extracellular matrix (ECM) while retaining viability and minimizing degradation within subpopulations is well established. In contrast, articular cartilage is composed of sparsely positioned chondrocytes embedded within a dense ECM high in glycosaminoglycans, proteoglycans, and many fibrous proteins such as collagens, elastin, fibronectin, and laminins. This densely packed ECM makes it difficult to isolate viable chondrocytes for further single-cell analysis. This protocol highlights a successful technique optimized for isolating chondrocytes from the articulated joints of rodent animal models using a series of enzymatic digestions and chondrocyte enrichment using a double negative selection process through fluorescence-activated cell sorting (FACS).
Affiliation(s)
- Jillian L McCool
- Physical and Life Science Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
- School of Natural Sciences, University of California Merced, Merced, CA, USA
| | - Nicholas R Hum
- Physical and Life Science Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
| | - Aimy Sebastian
- Physical and Life Science Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
| | - Gabriela G Loots
- Physical and Life Science Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA.
- School of Natural Sciences, University of California Merced, Merced, CA, USA.
- Department of Orthopaedic Surgery, University of California Davis Health, Sacramento, USA.
6
|
Analysis of Tumor-Infiltrating T-Cell Transcriptomes Reveal a Unique Genetic Signature across Different Types of Cancer. Int J Mol Sci 2022; 23:ijms231911065. [PMID: 36232369 PMCID: PMC9569723 DOI: 10.3390/ijms231911065] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/30/2022] [Revised: 08/09/2022] [Accepted: 08/16/2022] [Indexed: 11/16/2022] Open
Abstract
CD8+ and CD4+ T-cells play a key role in cellular immune responses against cancer by cytotoxic responses and effector lineages differentiation, respectively. These subsets have been found in different types of cancer; however, it is unclear whether tumor-infiltrating T-cell subsets exhibit similar transcriptome profiling across different types of cancer in comparison with healthy tissue-resident T-cells. Thus, we analyzed the single cell transcriptome of five tumor-infiltrating CD4-T, CD8-T and Treg cells obtained from different types of cancer to identify specific pathways for each subset in malignant environments. An in silico analysis was performed from single-cell RNA-sequencing data available in public repositories (Gene Expression Omnibus) including breast cancer, melanoma, colorectal cancer, lung cancer and head and neck cancer. After dimensionality reduction, clustering and selection of the different subpopulations from malignant and nonmalignant datasets, common genes across different types of cancer were identified and compared to nonmalignant genes for each T-cell subset to identify specific pathways. Exclusive pathways in CD4+ cells, CD8+ cells and Tregs, and common pathways for the tumor-infiltrating T-cell subsets were identified. Finally, the identified pathways were compared with RNAseq and proteomic data obtained from T-cell subsets cultured under malignant environments and we observed that cytokine signaling, especially Th2-type cytokine, was the top overrepresented pathway in Tregs from malignant samples.
7
|
Uddin MDM, Nguyen NQH, Yu B, Brody JA, Pampana A, Nakao T, Fornage M, Bressler J, Sotoodehnia N, Weinstock JS, Honigberg MC, Nachun D, Bhattacharya R, Griffin GK, Chander V, Gibbs RA, Rotter JI, Liu C, Baccarelli AA, Chasman DI, Whitsel EA, Kiel DP, Murabito JM, Boerwinkle E, Ebert BL, Jaiswal S, Floyd JS, Bick AG, Ballantyne CM, Psaty BM, Natarajan P, Conneely KN. Clonal hematopoiesis of indeterminate potential, DNA methylation, and risk for coronary artery disease. Nat Commun 2022; 13:5350. [PMID: 36097025 PMCID: PMC9468335 DOI: 10.1038/s41467-022-33093-3] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 03/17/2022] [Accepted: 09/01/2022] [Indexed: 12/15/2022] Open
Abstract
Age-related changes to the genome-wide DNA methylation (DNAm) pattern observed in blood are well-documented. Clonal hematopoiesis of indeterminate potential (CHIP), characterized by the age-related acquisition and expansion of leukemogenic mutations in hematopoietic stem cells (HSCs), is associated with blood cancer and coronary artery disease (CAD). Epigenetic regulators DNMT3A and TET2 are the two most frequently mutated CHIP genes. Here, we present results from an epigenome-wide association study for CHIP in 582 Cardiovascular Health Study (CHS) participants, with replication in 2655 Atherosclerosis Risk in Communities (ARIC) Study participants. We show that DNMT3A and TET2 CHIP have distinct and directionally opposing genome-wide DNAm association patterns consistent with their regulatory roles, albeit both promoting self-renewal of HSCs. Mendelian randomization analyses indicate that a subset of DNAm alterations associated with these two leading CHIP genes may promote the risk for CAD.
Affiliation(s)
- M D Mesbah Uddin
- Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Ngoc Quynh H Nguyen
- Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Bing Yu
- Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Jennifer A Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, 98101, USA
| | - Akhil Pampana
- Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
| | - Tetsushi Nakao
- Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Myriam Fornage
- Brown Foundation Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Jan Bressler
- Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Nona Sotoodehnia
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, 98101, USA
| | - Joshua S Weinstock
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, 48109, USA
| | - Michael C Honigberg
- Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Daniel Nachun
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Romit Bhattacharya
- Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
- Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
| | - Gabriel K Griffin
- Department of Pathology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Epigenomics Program, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Varuna Chander
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Genetics and Genomics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Genetics and Genomics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, 90502, USA
| | - Chunyu Liu
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA, 02118, USA
- Framingham Heart Study, Boston University and NHLBI/NIH, Framingham, MA, 01702, USA
| | - Andrea A Baccarelli
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, 10032, USA
| | - Daniel I Chasman
- Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, 02215, USA
| | - Eric A Whitsel
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27516, USA
- Department of Medicine, School of Medicine, University of North Carolina, Chapel Hill, NC, 27516, USA
| | - Douglas P Kiel
- Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
- Hinda and Arthur Marcus Institute for Aging Research, Hebrew SeniorLife, Boston, MA, 02131, USA
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, 02215, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
| | - Joanne M Murabito
- Framingham Heart Study, Boston University and NHLBI/NIH, Framingham, MA, 01702, USA
- Department of Medicine, Section of General Internal Medicine, Boston University School of Medicine and Boston Medical Center, Boston, MA, 02118, USA
| | - Eric Boerwinkle
- Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Benjamin L Ebert
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Howard Hughes Medical Institute, Boston, MA, 20815, USA
| | - Siddhartha Jaiswal
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - James S Floyd
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, 98101, USA
- Department of Epidemiology, University of Washington, Seattle, WA, 98101, USA
| | - Alexander G Bick
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, 98101, USA
- Department of Epidemiology, University of Washington, Seattle, WA, 98101, USA
- Department of Health Systems and Population Health, University of Washington, Seattle, WA, 98101, USA
| | - Pradeep Natarajan
- Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA.
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA.
- Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA.
| | - Karen N Conneely
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
8
|
Wang X, Zhang C, Zhang Y, Meng X, Zhang Z, Shi X, Song T. IMGG: Integrating Multiple Single-Cell Datasets through Connected Graphs and Generative Adversarial Networks. Int J Mol Sci 2022; 23:ijms23042082. [PMID: 35216199 PMCID: PMC8876681 DOI: 10.3390/ijms23042082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/29/2022] [Revised: 02/10/2022] [Accepted: 02/11/2022] [Indexed: 12/15/2022] Open
Abstract
There is a strong need to eliminate batch-specific differences when integrating single-cell RNA-sequencing (scRNA-seq) datasets generated under different experimental conditions for downstream task analysis. Existing batch correction methods usually transform different batches of cells into one preselected “anchor” batch or a low-dimensional embedding space, and cannot take full advantage of useful information from multiple sources. We present a novel framework, called IMGG, i.e., integrating multiple single-cell datasets through connected graphs and generative adversarial networks (GAN) to eliminate nonbiological differences between different batches. Compared with current methods, IMGG shows excellent performance on a variety of evaluation metrics, and the IMGG-corrected gene expression data incorporate features from multiple batches, allowing for downstream tasks such as differential gene expression analysis.
Affiliation(s)
- Xun Wang
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China
| | - Chaogang Zhang
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China
| | - Ying Zhang
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China
| | - Xiangyu Meng
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China
| | - Zhiyuan Zhang
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China
| | - Xin Shi
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China
| | - Tao Song
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China
- Department of Artificial Intelligence, Faculty of Computer Science, Campus de Montegancedo, Polytechnical University of Madrid, Boadilla del Monte, 28660 Madrid, Spain
9
Martínez-García M, Hernández-Lemus E. Data Integration Challenges for Machine Learning in Precision Medicine. Front Med (Lausanne) 2022; 8:784455. [PMID: 35145977 PMCID: PMC8821900 DOI: 10.3389/fmed.2021.784455] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 09/27/2021] [Accepted: 12/28/2021] [Indexed: 12/19/2022] Open
Abstract
A main goal of Precision Medicine is to incorporate and integrate the vast corpora held in different databases about the molecular and environmental origins of disease into analytic frameworks, allowing the development of individualized, context-dependent diagnostic and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at prediction of personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to be able to efficiently manage, visualize, and integrate large datasets combining structured and unstructured formats. This needs to be done while constrained by different levels of confidentiality, ideally within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in the design of successful approaches to medical data analytics under the currently demanding performance conditions of personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we review some of these constraints and discuss possible avenues to overcome current challenges.
Affiliation(s)
- Mireya Martínez-García
- Clinical Research Division, National Institute of Cardiology ‘Ignacio Chávez’, Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
10
Lê Cao KA, Abadi AJ, Davis-Marcisak EF, Hsu L, Arora A, Coullomb A, Deshpande A, Feng Y, Jeganathan P, Loth M, Meng C, Mu W, Pancaldi V, Sankaran K, Righelli D, Singh A, Sodicoff JS, Stein-O’Brien GL, Subramanian A, Welch JD, You Y, Argelaguet R, Carey VJ, Dries R, Greene CS, Holmes S, Love MI, Ritchie ME, Yuan GC, Culhane AC, Fertig E. Community-wide hackathons to identify central themes in single-cell multi-omics. Genome Biol 2021; 22:220. [PMID: 34353350 PMCID: PMC8340473 DOI: 10.1186/s13059-021-02433-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/16/2022] Open
Affiliation(s)
- Kim-Anh Lê Cao
- Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
| | - Al J. Abadi
- Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
| | - Emily F. Davis-Marcisak
- McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD USA
| | - Lauren Hsu
- Data Science, Dana-Farber Cancer Institute, Boston, MA USA
- Department of Genetics, UNC, Chapel Hill, NC USA
| | - Arshi Arora
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY USA
| | - Alexis Coullomb
- Centre de Recherches en Cancérologie de Toulouse (INSERM), Université Paul Sabatier III, Toulouse, France
| | - Atul Deshpande
- Cancer Convergence Institute and Division of Quantitative Sciences, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD USA
| | - Yuzhou Feng
- Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
| | | | - Melanie Loth
- Cancer Convergence Institute and Division of Quantitative Sciences, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD USA
| | - Chen Meng
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), School of Life Sciences, Technical University of Munich, Munich, Germany
| | - Wancen Mu
- Department of Biostatistics, UNC, Chapel Hill, NC USA
| | - Vera Pancaldi
- Centre de Recherches en Cancérologie de Toulouse (INSERM), Université Paul Sabatier III, Toulouse, France
- Barcelona Supercomputing Center, Barcelona, Spain
| | - Kris Sankaran
- Department of Statistics, University of Wisconsin, Madison, WI USA
| | - Dario Righelli
- Department of Statistical Sciences, University of Padova, Padova, PD Italy
| | - Amrit Singh
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC Canada
- PROOF Centre of Excellence, Vancouver, BC Canada
| | - Joshua S. Sodicoff
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI USA
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI USA
| | - Genevieve L. Stein-O’Brien
- McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD USA
- Cancer Convergence Institute and Division of Quantitative Sciences, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD USA
- Department of Neuroscience, Johns Hopkins University, Baltimore, MD USA
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD USA
| | | | - Joshua D. Welch
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI USA
- Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI USA
| | - Yue You
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, University of Melbourne, Melbourne, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, Australia
| | | | - Vincent J. Carey
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA USA
| | - Ruben Dries
- Department of Hematology and Oncology, Boston Medical Center, Boston, MA USA
- Department of Computational Biomedicine, Boston University School of Medicine, Boston, MA USA
- Center for Regenerative Medicine (CReM), Boston University, Boston, MA USA
| | - Casey S. Greene
- Center for Health AI and Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO USA
| | - Susan Holmes
- Department of Statistics, Stanford University, Stanford, CA USA
| | - Michael I. Love
- Department of Biostatistics, UNC, Chapel Hill, NC USA
- Department of Genetics, UNC, Chapel Hill, NC USA
| | - Matthew E. Ritchie
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, University of Melbourne, Melbourne, Australia
- School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, Australia
| | - Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY USA
| | - Aedin C. Culhane
- Data Science, Dana-Farber Cancer Institute, Boston, MA USA
- Biostatistics, Harvard TH Chan School of Public Health, Boston, MA USA
| | - Elana Fertig
- Cancer Convergence Institute and Division of Quantitative Sciences, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD USA
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD USA
- Department of Applied Mathematics and Statistics, Johns Hopkins University Whiting School of Engineering, Baltimore, MD USA
11
Liu B, Wu FX, Zou X. scASK: A Novel Ensemble Framework for Classifying Cell Types Based on Single-cell RNA-seq Data. IEEE J Biomed Health Inform 2021; 25:3230-3239. [PMID: 33434139 DOI: 10.1109/jbhi.2021.3050963] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Abstract
The Human Cell Atlas (HCA) is a large project that aims to identify all cell types in the human body. Dimension reduction and clustering for identification of cell types from single-cell RNA-sequencing (scRNA-seq) data have become foundational approaches to HCA. The major challenges of current computational analyses are poor performance on large-scale data and sensitivity to initial data. We present a new ensemble framework called Adaptive Slice KNNs (scASK) to address the challenges of analyzing scRNA-seq data with high dimensionality. scASK consists of three innovative modules, called DAS (Data Adaptive Slicing), MCS (Meta Classifiers Selecting) and EMS (Ensemble Mode Switching), respectively, which help scASK approximate a bias-variance tradeoff beyond classification. Thirteen real scRNA-seq datasets are used to evaluate the performance of scASK. Compared with five popular classification algorithms, our experimental results indicate that scASK achieves the best accuracy and robustness among all competing methods. In conclusion, adaptive slicing is an effective structural reduction procedure, and scASK provides a novel and robust ensemble framework, especially for classifying cell types based on scRNA-seq data. scASK is now publicly available at https://github.com/liubo2358/scASKcmd.
12
Ando Y, Kwon ATJ, Shin JW. An era of single-cell genomics consortia. Exp Mol Med 2020; 52:1409-1418. [PMID: 32929222 PMCID: PMC8080593 DOI: 10.1038/s12276-020-0409-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/13/2019] [Revised: 01/24/2020] [Accepted: 02/10/2020] [Indexed: 12/24/2022] Open
Abstract
The human body consists of 37 trillion single cells represented by over 50 organs that are stitched together to make us who we are, yet we still have very little understanding about the basic units of our body: what cell types and states make up our organs both compositionally and spatially. Previous efforts to profile a wide range of human cell types have been attempted by the FANTOM and GTEx consortia. Now, with the advancement in genomic technologies, profiling the human body at single-cell resolution is possible and will generate an unprecedented wealth of data that will accelerate basic and clinical research with tangible applications to future medicine. To date, several major organs have been profiled, but the challenges lie in ways to integrate single-cell genomics data in a meaningful way. In recent years, several consortia have begun to introduce harmonization and equity in data collection and analysis. Herein, we introduce existing and nascent single-cell genomics consortia, and present benefits to necessitate single-cell genomic consortia in a regional environment to achieve the universal human cell reference dataset.
Affiliation(s)
- Yoshinari Ando
- RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045, Japan
| | - Andrew Tae-Jun Kwon
- RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045, Japan
| | - Jay W Shin
- RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045, Japan.
13
Panina Y, Karagiannis P, Kurtz A, Stacey GN, Fujibuchi W. Human Cell Atlas and cell-type authentication for regenerative medicine. Exp Mol Med 2020; 52:1443-1451. [PMID: 32929224 PMCID: PMC8080834 DOI: 10.1038/s12276-020-0421-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/15/2019] [Revised: 03/05/2020] [Accepted: 03/09/2020] [Indexed: 12/22/2022] Open
Abstract
In modern biology, the correct identification of cell types is required for the developmental study of tissues and organs and the production of functional cells for cell therapies and disease modeling. For decades, cell types have been defined on the basis of morphological and physiological markers and, more recently, immunological markers and molecular properties. Recent advances in single-cell RNA sequencing have opened new doors for the characterization of cells at the individual and spatiotemporal levels on the basis of their RNA profiles, vastly transforming our understanding of cell types. The objective of this review is to survey the current progress in the field of cell-type identification, starting with the Human Cell Atlas project, which aims to sequence every cell in the human body, to molecular marker databases for individual cell types and other sources that address cell-type identification for regenerative medicine based on cell data guidelines.
Affiliation(s)
- Yulia Panina
- Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto, 606-8507, Japan
| | - Peter Karagiannis
- Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto, 606-8507, Japan
| | - Andreas Kurtz
- BIH Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353, Berlin, Germany
| | - Glyn N Stacey
- International Stem Cell Banking Initiative, 2 High Street, Barley, Herts, SG88HZ, UK
- National Stem Cell Resource Centre, Institute of Zoology, Chinese Academy of Sciences, 100190, Beijing, China
- Innovation Academy for Stem Cell and Regeneration, Chinese Academy of Sciences, 100101, Beijing, China
| | - Wataru Fujibuchi
- Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto, 606-8507, Japan.
14
Macedo A, Gontijo AM. The intersectional genetics landscape for humans. Gigascience 2020; 9:giaa083. [PMID: 32761099 PMCID: PMC7407247 DOI: 10.1093/gigascience/giaa083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/27/2019] [Revised: 04/05/2020] [Accepted: 07/08/2020] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND: The human body is made up of hundreds, perhaps thousands, of cell types and states, most of which are currently inaccessible genetically. Intersectional genetic approaches can increase the number of genetically accessible cells, but the scope and safety of these approaches have not been systematically assessed. A typical intersectional method acts like an "AND" logic gate by converting the input of 2 or more active, yet unspecific, regulatory elements (REs) into a single cell-type-specific synthetic output. RESULTS: Here, we systematically assessed the intersectional genetics landscape of the human genome using a subset of cells from a large RE usage atlas (Functional ANnoTation Of the Mammalian genome 5 consortium, FANTOM5) obtained by cap analysis of gene expression sequencing (CAGE-seq). We developed the heuristics and algorithms to retrieve and quality-rank "AND" gate intersections. Of the 154 primary cell types surveyed, >90% can be distinguished from each other with as few as 3 to 4 active REs, with quantifiable safety and robustness. We call these minimal intersections of active REs with cell-type diagnostic potential "versatile entry codes" (VEnCodes). Each of the 158 cancer cell types surveyed could also be distinguished from the healthy primary cell types with small VEnCodes, most of which were robust to intra- and interindividual variation. Methods for the cross-validation of CAGE-seq-derived VEnCodes and for the extraction of VEnCodes from pooled single-cell sequencing data are also presented. CONCLUSIONS: Our work provides a systematic view of the intersectional genetics landscape in humans and demonstrates the potential of these approaches for future gene delivery technologies.
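The "AND" gate idea in this abstract can be illustrated with a small sketch. It is not the authors' heuristic or ranking algorithm: it assumes RE activity has already been binarized into a set of active REs per cell type, uses a plain brute-force search, and the function name `find_and_gate`, the cell types, and the RE labels are all hypothetical toy values rather than FANTOM5 data.

```python
from itertools import combinations


def find_and_gate(activity, target, max_size=4):
    """Return the smallest tuple of REs that are all active in `target`
    but never all active together in any other cell type, or None."""
    target_res = sorted(activity[target])
    others = [res for cell, res in activity.items() if cell != target]
    for k in range(1, max_size + 1):
        for combo in combinations(target_res, k):
            combo_set = set(combo)
            # AND gate: the output fires only where every RE in the combination is active.
            if not any(combo_set <= res for res in others):
                return combo
    return None


# Hypothetical toy data: cell type -> set of active regulatory elements.
activity = {
    "hepatocyte": {"RE1", "RE2", "RE3"},
    "neuron": {"RE1", "RE4"},
    "keratinocyte": {"RE2", "RE3", "RE4"},
    "fibroblast": {"RE1", "RE3"},
}

print(find_and_gate(activity, "hepatocyte"))  # → ('RE1', 'RE2')
```

In this toy example no single RE is specific to hepatocytes, but the pair RE1 and RE2 is active together only there. A cell type whose active REs are a subset of another's (here, fibroblast) has no such intersection, so the function returns `None` for it.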
Affiliation(s)
- Andre Macedo
- Chronic Diseases Research Center, NOVA Medical School, Faculdade de Ciências Médicas, Universidade Nova de Lisboa, Rua do Instituto Bacteriológico 5, 1150–190, Lisbon, Portugal
| | - Alisson M Gontijo
- Chronic Diseases Research Center, NOVA Medical School, Faculdade de Ciências Médicas, Universidade Nova de Lisboa, Rua do Instituto Bacteriológico 5, 1150–190, Lisbon, Portugal
15
Wilbrey-Clark A, Roberts K, Teichmann SA. Cell Atlas technologies and insights into tissue architecture. Biochem J 2020; 477:1427-1442. [PMID: 32339226 PMCID: PMC7200628 DOI: 10.1042/bcj20190341] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 01/27/2020] [Revised: 03/21/2020] [Accepted: 03/24/2020] [Indexed: 12/17/2022]
Abstract
Since Robert Hooke first described the existence of 'cells' in 1665, scientists have sought to identify and further characterise these fundamental units of life. While our understanding of cell location, morphology and function has expanded greatly, our understanding of cell types and states at the molecular level, and how these function within tissue architecture, is still limited. A greater understanding of our cells could revolutionise basic biology and medicine. Atlasing initiatives like the Human Cell Atlas aim to identify all cell types at the molecular level, including their physical locations, and to make this reference data openly available to the scientific community. This is made possible by a recent technology revolution, both in single-cell molecular profiling, particularly single-cell RNA sequencing, and in spatially resolved methods for assessing gene and protein expression. Here, we review available and upcoming atlasing technologies, the biological insights gained to date and the promise of this field for the future.
16
Lähnemann D, Köster J, Szczurek E, McCarthy DJ, Hicks SC, Robinson MD, Vallejos CA, Campbell KR, Beerenwinkel N, Mahfouz A, Pinello L, Skums P, Stamatakis A, Attolini CSO, Aparicio S, Baaijens J, Balvert M, Barbanson BD, Cappuccio A, Corleone G, Dutilh BE, Florescu M, Guryev V, Holmer R, Jahn K, Lobo TJ, Keizer EM, Khatri I, Kielbasa SM, Korbel JO, Kozlov AM, Kuo TH, Lelieveldt BP, Mandoiu II, Marioni JC, Marschall T, Mölder F, Niknejad A, Rączkowska A, Reinders M, Ridder JD, Saliba AE, Somarakis A, Stegle O, Theis FJ, Yang H, Zelikovsky A, McHardy AC, Raphael BJ, Shah SP, Schönhuth A. Eleven grand challenges in single-cell data science. Genome Biol 2020; 21:31. [PMID: 32033589 PMCID: PMC7007675 DOI: 10.1186/s13059-020-1926-6] [Citation(s) in RCA: 671] [Impact Index Per Article: 134.2] [Received: 08/02/2019] [Accepted: 01/02/2020] [Indexed: 02/08/2023] Open
Abstract
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
Affiliation(s)
- David Lähnemann
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Department of Paediatric Oncology, Haematology and Immunology, Medical Faculty, Heinrich Heine University, University Hospital, Düsseldorf, Germany
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Johannes Köster
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
| | - Ewa Szczurek
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
| | - Davis J. McCarthy
- Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, Fitzroy, Australia
- Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne, Melbourne, Australia
| | - Stephanie C. Hicks
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD USA
| | - Mark D. Robinson
- Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
| | - Catalina A. Vallejos
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, UK
- The Alan Turing Institute, British Library, London, UK
| | - Kieran R. Campbell
- Department of Statistics, University of British Columbia, Vancouver, Canada
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Data Science Institute, University of British Columbia, Vancouver, Canada
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Ahmed Mahfouz
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Luca Pinello
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital Research Institute, Charlestown, USA
- Department of Pathology, Harvard Medical School, Boston, USA
- Broad Institute of Harvard and MIT, Cambridge, MA USA
| | - Pavel Skums
- Department of Computer Science, Georgia State University, Atlanta, USA
| | - Alexandros Stamatakis
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | | | - Samuel Aparicio
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
| | - Jasmijn Baaijens
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
| | - Marleen Balvert
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
| | - Buys de Barbanson
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
- Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
| | - Antonio Cappuccio
- Institute for Advanced Study, University of Amsterdam, Amsterdam, The Netherlands
| | - Giacomo Corleone
- Department of Surgery and Cancer, The Imperial Centre for Translational and Experimental Medicine, Imperial College London, London, UK
| | - Bas E. Dutilh
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
- Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Maria Florescu
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
- Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
| | - Victor Guryev
- European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Rens Holmer
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
| | - Katharina Jahn
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Thamar Jessurun Lobo
- European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Emma M. Keizer
- Biometris, Wageningen University & Research, Wageningen, The Netherlands
| | - Indu Khatri
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Center, Leiden, The Netherlands
| | - Szymon M. Kielbasa
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Jan O. Korbel
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Alexey M. Kozlov
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
| | - Tzu-Hao Kuo
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Boudewijn P.F. Lelieveldt
- PRB lab, Delft University of Technology, Delft, The Netherlands
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Ion I. Mandoiu
- Computer Science & Engineering Department, University of Connecticut, Storrs, USA
| | - John C. Marioni
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Tobias Marschall
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- Max Planck Institute for Informatics, Saarbrücken, Germany
| | - Felix Mölder
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Institute of Pathology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| | - Amir Niknejad
- Computation molecular design, Zuse Institute Berlin, Berlin, Germany
- Mathematics Department, Mount Saint Vincent, New York, USA
| | - Alicja Rączkowska
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
| | - Marcel Reinders
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Jeroen de Ridder
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
| | - Antoine-Emmanuel Saliba
- Helmholtz Institute for RNA-based Infection Research, Helmholtz-Center for Infection Research, Würzburg, Germany
| | - Antonios Somarakis
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Oliver Stegle
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center–DKFZ, Heidelberg, Germany
| | - Fabian J. Theis
- Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
| | - Huan Yang
- Division of Drug Discovery and Safety, Leiden Academic Center for Drug Research–LACDR–Leiden University, Leiden, The Netherlands
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, Atlanta, USA
- The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
| | - Alice C. McHardy
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | | | - Sohrab P. Shah
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA
| | - Alexander Schönhuth
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
17
Madissoon E, Wilbrey-Clark A, Miragaia RJ, Saeb-Parsy K, Mahbubani KT, Georgakopoulos N, Harding P, Polanski K, Huang N, Nowicki-Osuch K, Fitzgerald RC, Loudon KW, Ferdinand JR, Clatworthy MR, Tsingene A, van Dongen S, Dabrowska M, Patel M, Stubbington MJT, Teichmann SA, Stegle O, Meyer KB. scRNA-seq assessment of the human lung, spleen, and esophagus tissue stability after cold preservation. Genome Biol 2019; 21:1. [PMID: 31892341 PMCID: PMC6937944 DOI: 10.1186/s13059-019-1906-x] [Citation(s) in RCA: 293] [Impact Index Per Article: 48.8] [Received: 08/09/2019] [Accepted: 11/28/2019] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND: The Human Cell Atlas is a large international collaborative effort to map all cell types of the human body. Single-cell RNA sequencing can generate high-quality data for the delivery of such an atlas. However, delays between fresh sample collection and processing may lead to poor data and difficulties in experimental design. RESULTS: This study assesses the effect of cold storage on fresh healthy spleen, esophagus, and lung from ≥ 5 donors over 72 h. We collect 240,000 high-quality single-cell transcriptomes with detailed cell type annotations and whole genome sequences of donors, enabling future eQTL studies. Our data provide a valuable resource for the study of these 3 organs and will allow cross-organ comparison of cell types. We see little effect of cold ischemic time on cell yield, total number of reads per cell, and other quality control metrics in any of the tissues within the first 24 h. However, we observe a decrease in the proportions of lung T cells at 72 h, a higher percentage of mitochondrial reads, and increased contamination by background ambient RNA reads in the 72-h samples in the spleen, which is cell type specific. CONCLUSIONS: We present robust protocols for tissue preservation for up to 24 h prior to scRNA-seq analysis. This greatly facilitates the logistics of sample collection for Human Cell Atlas or clinical studies since it increases the time frames for sample processing.
Affiliation(s)
- E. Madissoon
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD UK
| | - A. Wilbrey-Clark
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - R. J. Miragaia
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - K. Saeb-Parsy
- Department of Surgery, University of Cambridge and NIHR Cambridge Biomedical Research Centre, Cambridge, CB2 0QQ UK
| | - K. T. Mahbubani
- Department of Surgery, University of Cambridge and NIHR Cambridge Biomedical Research Centre, Cambridge, CB2 0QQ UK
| | - N. Georgakopoulos
- Department of Surgery, University of Cambridge and NIHR Cambridge Biomedical Research Centre, Cambridge, CB2 0QQ UK
| | - P. Harding
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - K. Polanski
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - N. Huang
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - K. Nowicki-Osuch
- MRC Cancer Unit, Hutchison-MRC Research Centre, University of Cambridge, Cambridge, CB2 0XZ UK
| | - R. C. Fitzgerald
- MRC Cancer Unit, Hutchison-MRC Research Centre, University of Cambridge, Cambridge, CB2 0XZ UK
| | - K. W. Loudon
- Molecular Immunology Unit, Department of Medicine, Cambridge, CB2 0QQ UK
| | - J. R. Ferdinand
- Molecular Immunology Unit, Department of Medicine, Cambridge, CB2 0QQ UK
| | - M. R. Clatworthy
- Molecular Immunology Unit, Department of Medicine, Cambridge, CB2 0QQ UK
| | - A. Tsingene
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - S. van Dongen
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - M. Dabrowska
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - M. Patel
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - M. J. T. Stubbington
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
- 10x Genomics Inc., 6230 Stoneridge Mall Road, Pleasanton, CA 94588 USA
| | - S. A. Teichmann
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| | - O. Stegle
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD UK
| | - K. B. Meyer
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA UK
| |
|
18
|
|
19
|
Johansen N, Quon G. scAlign: a tool for alignment, integration, and rare cell identification from scRNA-seq data. Genome Biol 2019; 20:166. [PMID: 31412909 PMCID: PMC6693154 DOI: 10.1186/s13059-019-1766-4] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 05/08/2019] [Accepted: 07/19/2019] [Indexed: 02/07/2023] Open
Abstract
scRNA-seq dataset integration occurs in different contexts, such as the identification of cell type-specific differences in gene expression across conditions or species, or batch effect correction. We present scAlign, an unsupervised deep learning method for data integration that can incorporate partial, overlapping, or a complete set of cell labels, and estimate per-cell differences in gene expression across datasets. scAlign performance is state-of-the-art and robust to cross-dataset variation in cell type-specific expression and cell type composition. We demonstrate that scAlign reveals gene expression programs for rare populations of malaria parasites. Our framework is widely applicable to integration challenges in other domains.
Affiliation(s)
- Nelson Johansen
- Graduate Group in Computer Science, University of California, Davis, Davis, CA, USA.
- Genome Center, University of California, Davis, Davis, CA, USA.
| | - Gerald Quon
- Graduate Group in Computer Science, University of California, Davis, Davis, CA, USA.
- Genome Center, University of California, Davis, Davis, CA, USA.
- Department of Molecular and Cellular Biology, University of California, Davis, Davis, CA, USA.
| |
|
20
|
Chen W, Morabito SJ, Kessenbrock K, Enver T, Meyer KB, Teschendorff AE. Single-cell landscape in mammary epithelium reveals bipotent-like cells associated with breast cancer risk and outcome. Commun Biol 2019; 2:306. [PMID: 31428694 PMCID: PMC6689007 DOI: 10.1038/s42003-019-0554-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 02/22/2019] [Accepted: 07/16/2019] [Indexed: 02/08/2023] Open
Abstract
Adult stem-cells may serve as the cell-of-origin for cancer, yet their unbiased identification in single cell RNA sequencing data is challenging due to the high dropout rate. In the case of breast, the existence of a bipotent stem-like state is also controversial. Here we apply a marker-free algorithm to scRNA-Seq data from the human mammary epithelium, revealing a high-potency cell-state enriched for an independent mammary stem-cell expression module. We validate this stem-like state in independent scRNA-Seq data. Our algorithm further predicts that the stem-like state is bipotent, a prediction we are able to validate using FACS sorted bulk expression data. The bipotent stem-like state correlates with clinical outcome in basal breast cancer and is characterized by overexpression of YBX1 and ENO1, two modulators of basal breast cancer risk. This study illustrates the power of a marker-free computational framework to identify a novel bipotent stem-like state in the mammary epithelium.
Affiliation(s)
- Weiyan Chen
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai, 200031 China
| | - Samuel J. Morabito
- Chao Family Comprehensive Cancer Center, University of California, Irvine 839 Health Science Road, Sprague Hall 114 Irvine, Irvine, CA 92697-3905 USA
| | - Kai Kessenbrock
- Chao Family Comprehensive Cancer Center, University of California, Irvine 839 Health Science Road, Sprague Hall 114 Irvine, Irvine, CA 92697-3905 USA
| | - Tariq Enver
- UCL Cancer Institute, Paul O’Gorman Building, University College London, 72 Huntley Street, London, WC1E 6BT United Kingdom
| | | | - Andrew E. Teschendorff
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai, 200031 China
| |
|
21
|
Uzbas F, Opperer F, Sönmezer C, Shaposhnikov D, Sass S, Krendl C, Angerer P, Theis FJ, Mueller NS, Drukker M. BART-Seq: cost-effective massively parallelized targeted sequencing for genomics, transcriptomics, and single-cell analysis. Genome Biol 2019; 20:155. [PMID: 31387612 PMCID: PMC6683345 DOI: 10.1186/s13059-019-1748-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/01/2019] [Accepted: 06/25/2019] [Indexed: 01/22/2023] Open
Abstract
We describe a highly sensitive, quantitative, and inexpensive technique for targeted sequencing of transcript cohorts or genomic regions from thousands of bulk samples or single cells in parallel. Multiplexing is based on a simple method that produces extensive matrices of diverse DNA barcodes attached to invariant primer sets, which are all pre-selected and optimized in silico. By applying the matrices in a novel workflow named Barcode Assembly foR Targeted Sequencing (BART-Seq), we analyze developmental states of thousands of single human pluripotent stem cells, either in different maintenance media or upon Wnt/β-catenin pathway activation, which identifies the mechanisms of differentiation induction. Moreover, we apply BART-Seq to the genetic screening of breast cancer patients and identify BRCA mutations with very high precision. The processing of thousands of samples and dynamic range measurements that outperform global transcriptomics techniques make BART-Seq the first targeted sequencing technique suitable for numerous research applications.
Affiliation(s)
- Fatma Uzbas
- Institute of Stem Cell Research, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Florian Opperer
- Institute of Stem Cell Research, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Can Sönmezer
- Institute of Stem Cell Research, Helmholtz Center Munich, 85764 Neuherberg, Germany
- Genome Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Dmitry Shaposhnikov
- Institute of Stem Cell Research, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Steffen Sass
- Institute of Computational Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Christian Krendl
- Institute of Stem Cell Research, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Philipp Angerer
- Institute of Computational Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Fabian J. Theis
- Institute of Computational Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany
- Department of Mathematics, Technical University Munich, 85748 Garching, Germany
| | - Nikola S. Mueller
- Institute of Computational Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Micha Drukker
- Institute of Stem Cell Research, Helmholtz Center Munich, 85764 Neuherberg, Germany
| |
|
22
|
Sonawane AR, Weiss ST, Glass K, Sharma A. Network Medicine in the Age of Biomedical Big Data. Front Genet 2019; 10:294. [PMID: 31031797 PMCID: PMC6470635 DOI: 10.3389/fgene.2019.00294] [Citation(s) in RCA: 124] [Impact Index Per Article: 20.7] [Received: 12/26/2018] [Accepted: 03/19/2019] [Indexed: 12/13/2022] Open
Abstract
Network medicine is an emerging area of research dealing with molecular and genetic interactions, network biomarkers of disease, and therapeutic target discovery. Large-scale biomedical data generation offers a unique opportunity to assess the effect and impact of cellular heterogeneity and environmental perturbations on the observed phenotype. Marrying the two, network medicine with biomedical data provides a framework to build meaningful models and extract impactful results at a network level. In this review, we survey existing network types and biomedical data sources. More importantly, we delve into ways in which the network medicine approach, aided by phenotype-specific biomedical data, can be gainfully applied. We provide three paradigms, mainly dealing with three major biological network archetypes: protein-protein interaction, expression-based, and gene regulatory networks. For each of these paradigms, we give a broad overview of the philosophies under which various network methods work. We also provide a few examples in each paradigm as a test case of its successful application. Finally, we delineate several opportunities and challenges in the field of network medicine. We hope this review provides a lexicon that helps researchers from the biological sciences and network theory get on the same page when working on research areas that require interdisciplinary expertise. Taken together, the understanding gained from combining biomedical data with networks can be useful for characterizing disease etiologies and identifying therapeutic targets, which, in turn, will lead to better preventive medicine with translational impact on personalized healthcare.
Affiliation(s)
- Abhijeet R. Sonawane
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Scott T. Weiss
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Amitabh Sharma
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
- Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women’s Hospital, Boston, MA, United States
| |
|
23
|
Subramanian Parimalam S, Oguchi Y, Abdelmoez MN, Tsuchida A, Ozaki Y, Yokokawa R, Kotera H, Shintaku H. Electrical Lysis and RNA Extraction from Single Cells Fixed by Dithiobis(succinimidyl propionate). Anal Chem 2018; 90:12512-12518. [PMID: 30350601 DOI: 10.1021/acs.analchem.8b02338] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/31/2022]
Abstract
We present a microfluidic method for electrical lysis and RNA extraction from single fixed cells leveraging the reversible cross-linker dithiobis(succinimidyl propionate) (DSP). Our microfluidic system captures a single DSP-fixed cell at a hydrodynamic trap, reverse-cross-links the DSP molecules on a chip with dithiothreitol, lyses the plasma membrane via an electric field, and extracts cytoplasmic RNA with isotachophoresis-aided nucleic acid extraction. All of the on-chip processes complete in less than 5 min. We demonstrated the method using K562 leukemia cells and benchmarked the performance of RNA extraction with reverse transcription quantitative polymerase chain reaction. We also demonstrated the integration of our method with single-cell RNA sequencing.
Affiliation(s)
- Sangamithirai Subramanian Parimalam
- Microfluidics RIKEN Hakubi Research Team, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Department of Micro Engineering, Graduate School of Engineering, Kyoto University, Kyoto 615-8530, Japan
| | - Yusuke Oguchi
- Microfluidics RIKEN Hakubi Research Team, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 113-0033, Japan
| | - Mahmoud N Abdelmoez
- Microfluidics RIKEN Hakubi Research Team, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Department of Micro Engineering, Graduate School of Engineering, Kyoto University, Kyoto 615-8530, Japan
| | - Arata Tsuchida
- Microfluidics RIKEN Hakubi Research Team, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Department of Micro Engineering, Graduate School of Engineering, Kyoto University, Kyoto 615-8530, Japan
| | - Yuka Ozaki
- Microfluidics RIKEN Hakubi Research Team, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
| | - Ryuji Yokokawa
- Department of Micro Engineering, Graduate School of Engineering, Kyoto University, Kyoto 615-8530, Japan
| | - Hidetoshi Kotera
- Department of Micro Engineering, Graduate School of Engineering, Kyoto University, Kyoto 615-8530, Japan
| | - Hirofumi Shintaku
- Microfluidics RIKEN Hakubi Research Team, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
| |
|
24
|
Hay SB, Ferchen K, Chetal K, Grimes HL, Salomonis N. The Human Cell Atlas bone marrow single-cell interactive web portal. Exp Hematol 2018; 68:51-61. [PMID: 30243574 PMCID: PMC6296228 DOI: 10.1016/j.exphem.2018.09.004] [Citation(s) in RCA: 149] [Impact Index Per Article: 21.3] [Received: 07/27/2018] [Revised: 09/11/2018] [Accepted: 09/14/2018] [Indexed: 01/17/2023]
Abstract
The Human Cell Atlas (HCA) is expected to facilitate the creation of reference cell profiles, marker genes, and gene regulatory networks that will provide a deeper understanding of healthy and disease cell types from clinical biospecimens. The hematopoietic system includes dozens of distinct, transcriptionally coherent cell types, including intermediate transitional populations that have not been previously described at a molecular level. Using the first data release from the HCA bone marrow tissue project, we resolved common, rare, and potentially transitional cell populations from over 100,000 hematopoietic cells spanning 35 transcriptionally coherent groups across eight healthy donors using emerging new computational approaches. These data highlight novel mixed-lineage progenitor populations and putative trajectories governing granulocytic, monocytic, lymphoid, erythroid, megakaryocytic, and eosinophil specification. Our analyses suggest significant variation in cell-type composition and gene expression among donors, including biological processes affected by donor age. To enable broad exploration of these findings, we provide an interactive website to probe intra-cell and extra-cell population differences within and between donors and reference markers for cellular classification and cellular trajectories through associated progenitor states.
Affiliation(s)
- Stuart B Hay
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Kyle Ferchen
- Department of Pediatrics, University of Cincinnati School of Medicine, Cincinnati, Ohio, USA
| | - Kashish Chetal
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - H Leighton Grimes
- Department of Pediatrics, University of Cincinnati School of Medicine, Cincinnati, Ohio, USA; Division of Immunobiology and Center for Systems Immunology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Division of Experimental Hematology and Cancer Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Nathan Salomonis
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati School of Medicine, Cincinnati, Ohio, USA; Department of Biomedical Informatics, University of Cincinnati, Cincinnati, OH, USA.
| |
|
25
|
Abstract
Single-cell multiomics technologies typically measure multiple types of molecule from the same individual cell, enabling more profound biological insight than can be inferred by analyzing each molecular layer from separate cells. These single-cell multiomics technologies can reveal cellular heterogeneity at multiple molecular layers within a population of cells and reveal how this variation is coupled or uncoupled between the captured omic layers. The data sets generated by these techniques have the potential to enable a deeper understanding of the key biological processes and mechanisms driving cellular heterogeneity and how they are linked with normal development and aging as well as disease etiology. This review details both established and novel single-cell mono- and multiomics technologies and considers their limitations, applications, and likely future developments.
Affiliation(s)
- Lia Chappell
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | | | - Thierry Voet
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
- Department of Human Genetics, KU Leuven, B-3000 Leuven, Belgium
| |
|