1
Zhang K, Zemke NR, Armand EJ, Ren B. A fast, scalable and versatile tool for analysis of single-cell omics data. Nat Methods 2024; 21:217-227. [PMID: 38191932 PMCID: PMC10864184 DOI: 10.1038/s41592-023-02139-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 11/23/2023] [Indexed: 01/10/2024]
Abstract
Single-cell omics technologies have revolutionized the study of gene regulation in complex tissues. A major computational challenge in analyzing these datasets is to project the large-scale and high-dimensional data into low-dimensional space while retaining the relative relationships between cells. This low-dimensional embedding is necessary to decompose cellular heterogeneity and reconstruct cell-type-specific gene regulatory programs. Traditional dimensionality reduction techniques, however, face challenges in computational efficiency and in comprehensively addressing cellular diversity across varied molecular modalities. Here we introduce a nonlinear dimensionality reduction algorithm, embodied in the Python package SnapATAC2, which not only achieves a more precise capture of single-cell omics data heterogeneities but also ensures efficient runtime and memory usage, scaling linearly with the number of cells. Our algorithm demonstrates exceptional performance, scalability and versatility across diverse single-cell omics datasets, including single-cell assay for transposase-accessible chromatin using sequencing, single-cell RNA sequencing, single-cell Hi-C and single-cell multi-omics datasets, underscoring its utility in advancing single-cell analysis.
Affiliation(s)
- Kai Zhang
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Westlake Laboratory of Life Sciences and Biomedicine, School of Life Sciences, Westlake University, Hangzhou, China
- Nathan R Zemke
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Ethan J Armand
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Bing Ren
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Ludwig Institute for Cancer Research, La Jolla, CA, USA
  - Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
2
Zemke NR, Armand EJ, Wang W, Lee S, Zhou J, Li YE, Liu H, Tian W, Nery JR, Castanon RG, Bartlett A, Osteen JK, Li D, Zhuo X, Xu V, Chang L, Dong K, Indralingam HS, Rink JA, Xie Y, Miller M, Krienen FM, Zhang Q, Taskin N, Ting J, Feng G, McCarroll SA, Callaway EM, Wang T, Lein ES, Behrens MM, Ecker JR, Ren B. Author Correction: Conserved and divergent gene regulatory programs of the mammalian neocortex. Nature 2024; 625:E26. [PMID: 38200319 PMCID: PMC10808050 DOI: 10.1038/s41586-023-07013-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]
Affiliation(s)
- Nathan R Zemke
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Ethan J Armand
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Wenliang Wang
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Seoyeon Lee
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Jingtian Zhou
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Yang Eric Li
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Hanqing Liu
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
- Wei Tian
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Joseph R Nery
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Rosa G Castanon
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Anna Bartlett
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Julia K Osteen
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Daofeng Li
  - Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St Louis, MO, USA
- Xiaoyu Zhuo
  - Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St Louis, MO, USA
- Vincent Xu
  - Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St Louis, MO, USA
- Lei Chang
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Keyi Dong
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Hannah S Indralingam
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Jonathan A Rink
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Yang Xie
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Michael Miller
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Fenna M Krienen
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
  - Department of Genetics, Harvard Medical School, Boston, USA
- Qiangge Zhang
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Naz Taskin
  - Allen Institute for Brain Science, Seattle, WA, USA
- Guoping Feng
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Steven A McCarroll
  - Department of Genetics, Harvard Medical School, Boston, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Edward M Callaway
  - Systems Neurobiology Laboratories, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Ting Wang
  - Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St Louis, MO, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Ed S Lein
  - Allen Institute for Brain Science, Seattle, WA, USA
  - Department of Neurological Surgery, University of Washington, Seattle, WA, USA
- M Margarita Behrens
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Joseph R Ecker
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Bing Ren
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Institute of Genomic Medicine, Moores Cancer Center, School of Medicine, University of California San Diego, La Jolla, CA, USA
3
Zu S, Li YE, Wang K, Armand EJ, Mamde S, Amaral ML, Wang Y, Chu A, Xie Y, Miller M, Xu J, Wang Z, Zhang K, Jia B, Hou X, Lin L, Yang Q, Lee S, Li B, Kuan S, Liu H, Zhou J, Pinto-Duarte A, Lucero J, Osteen J, Nunn M, Smith KA, Tasic B, Yao Z, Zeng H, Wang Z, Shang J, Behrens MM, Ecker JR, Wang A, Preissl S, Ren B. Single-cell analysis of chromatin accessibility in the adult mouse brain. Nature 2023; 624:378-389. [PMID: 38092917 PMCID: PMC10719105 DOI: 10.1038/s41586-023-06824-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/31/2023] [Accepted: 11/01/2023] [Indexed: 12/17/2023]
Abstract
Recent advances in single-cell technologies have led to the discovery of thousands of brain cell types; however, our understanding of the gene regulatory programs in these cell types is far from complete [1-4]. Here we report a comprehensive atlas of candidate cis-regulatory DNA elements (cCREs) in the adult mouse brain, generated by analysing chromatin accessibility in 2.3 million individual brain cells from 117 anatomical dissections. The atlas includes approximately 1 million cCREs and their chromatin accessibility across 1,482 distinct brain cell populations, adding over 446,000 cCREs to the most recent such annotation in the mouse genome. The mouse brain cCREs are moderately conserved in the human brain. The mouse-specific cCREs (specifically, those identified from a subset of cortical excitatory neurons) are strongly enriched for transposable elements, suggesting a potential role for transposable elements in the emergence of new regulatory programs and neuronal diversity. Finally, we infer the gene regulatory networks in over 260 subclasses of mouse brain cells and develop deep-learning models to predict the activities of gene regulatory elements in different brain cell types from the DNA sequence alone. Our results provide a resource for the analysis of cell-type-specific gene regulation programs in both mouse and human brains.
Affiliation(s)
- Songpeng Zu
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Yang Eric Li
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
  - Department of Neurosurgery and Genetics, Washington University School of Medicine, St Louis, MO, USA
- Kangli Wang
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Ethan J Armand
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Sainath Mamde
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Maria Luisa Amaral
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Yuelai Wang
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Andre Chu
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Yang Xie
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Michael Miller
  - Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Jie Xu
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Zhaoning Wang
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Kai Zhang
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Bojing Jia
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Xiaomeng Hou
  - Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Lin Lin
  - Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Qian Yang
  - Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Seoyeon Lee
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Bin Li
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Samantha Kuan
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Hanqing Liu
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Jingtian Zhou
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Jacinta Lucero
  - The Salk Institute for Biological Studies, La Jolla, CA, USA
- Julia Osteen
  - The Salk Institute for Biological Studies, La Jolla, CA, USA
- Michael Nunn
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Zizhen Yao
  - Allen Institute for Brain Science, Seattle, WA, USA
- Hongkui Zeng
  - Allen Institute for Brain Science, Seattle, WA, USA
- Zihan Wang
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Jingbo Shang
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Joseph R Ecker
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Allen Wang
  - Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA, USA
- Sebastian Preissl
  - Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA, USA
  - Institute of Experimental and Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Bing Ren
  - Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA, USA
4
Zemke NR, Armand EJ, Wang W, Lee S, Zhou J, Li YE, Liu H, Tian W, Nery JR, Castanon RG, Bartlett A, Osteen JK, Li D, Zhuo X, Xu V, Chang L, Dong K, Indralingam HS, Rink JA, Xie Y, Miller M, Krienen FM, Zhang Q, Taskin N, Ting J, Feng G, McCarroll SA, Callaway EM, Wang T, Lein ES, Behrens MM, Ecker JR, Ren B. Conserved and divergent gene regulatory programs of the mammalian neocortex. Nature 2023; 624:390-402. [PMID: 38092918 PMCID: PMC10719095 DOI: 10.1038/s41586-023-06819-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/07/2023] [Accepted: 11/01/2023] [Indexed: 12/17/2023]
Abstract
Divergence of cis-regulatory elements drives species-specific traits [1], but how this manifests in the evolution of the neocortex at the molecular and cellular level remains unclear. Here we investigated the gene regulatory programs in the primary motor cortex of human, macaque, marmoset and mouse using single-cell multiomics assays, generating gene expression, chromatin accessibility, DNA methylome and chromosomal conformation profiles from a total of over 200,000 cells. From these data, we show evidence that divergence of transcription factor expression corresponds to species-specific epigenome landscapes. We find that conserved and divergent gene regulatory features are reflected in the evolution of the three-dimensional genome. Transposable elements contribute to nearly 80% of the human-specific candidate cis-regulatory elements in cortical cells. Through machine learning, we develop sequence-based predictors of candidate cis-regulatory elements in different species and demonstrate that the genomic regulatory syntax is highly preserved from rodents to primates. Finally, we show that epigenetic conservation combined with sequence similarity helps to uncover functional cis-regulatory elements and enhances our ability to interpret genetic variants contributing to neurological disease and traits.
Affiliation(s)
- Nathan R Zemke
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Ethan J Armand
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Wenliang Wang
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Seoyeon Lee
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Jingtian Zhou
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Yang Eric Li
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Hanqing Liu
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
- Wei Tian
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Joseph R Nery
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Rosa G Castanon
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Anna Bartlett
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Julia K Osteen
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Daofeng Li
  - Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St Louis, MO, USA
- Xiaoyu Zhuo
  - Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St Louis, MO, USA
- Vincent Xu
  - Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St Louis, MO, USA
- Lei Chang
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Keyi Dong
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Hannah S Indralingam
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Jonathan A Rink
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Yang Xie
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Michael Miller
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Fenna M Krienen
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
  - Department of Genetics, Harvard Medical School, Boston, USA
- Qiangge Zhang
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Naz Taskin
  - Allen Institute for Brain Science, Seattle, WA, USA
- Guoping Feng
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Steven A McCarroll
  - Department of Genetics, Harvard Medical School, Boston, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Edward M Callaway
  - Systems Neurobiology Laboratories, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Ting Wang
  - Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St Louis, MO, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Ed S Lein
  - Allen Institute for Brain Science, Seattle, WA, USA
  - Department of Neurological Surgery, University of Washington, Seattle, WA, USA
- M Margarita Behrens
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Joseph R Ecker
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Bing Ren
  - Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Institute of Genomic Medicine, Moores Cancer Center, School of Medicine, University of California San Diego, La Jolla, CA, USA
5
Zhang K, Zemke NR, Armand EJ, Ren B. SnapATAC2: a fast, scalable and versatile tool for analysis of single-cell omics data. bioRxiv 2023:2023.09.11.557221. [PMID: 37745443 PMCID: PMC10515871 DOI: 10.1101/2023.09.11.557221] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 09/26/2023]
Abstract
Single-cell omics technologies have ushered in a new era for the study of dynamic gene regulation in complex tissues during development and disease pathogenesis. A major computational challenge in analyzing these datasets is to project the large-scale and high-dimensional data into low-dimensional space while retaining the relative relationships between cells in order to decompose the cellular heterogeneity and reconstruct cell-type-specific gene regulatory programs. Conventional dimensionality reduction methods suffer from computational inefficiency, difficulty capturing the full spectrum of cellular heterogeneity, or an inability to apply across diverse molecular modalities. Here, we report a fast and nonlinear dimensionality reduction algorithm that not only more accurately captures the heterogeneities of single-cell omics data, but also features runtime and memory usage that are computationally efficient and linearly proportional to cell numbers. We implement this algorithm in a Python package named SnapATAC2, and demonstrate its superior performance, remarkable scalability and general adaptability using an array of single-cell omics data types, including single-cell ATAC-seq, single-cell RNA-seq, single-cell Hi-C, and single-cell multiomics datasets.
6
Xie F, Armand EJ, Yao Z, Liu H, Bartlett A, Behrens MM, Li YE, Lucero JD, Luo C, Nery JR, Pinto-Duarte A, Poirion OB, Preissl S, Rivkin AC, Tasic B, Zeng H, Ren B, Ecker JR, Mukamel EA. Robust enhancer-gene regulation identified by single-cell transcriptomes and epigenomes. Cell Genom 2023; 3:100342. [PMID: 37492103 PMCID: PMC10363915 DOI: 10.1016/j.xgen.2023.100342] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/11/2022] [Revised: 03/09/2023] [Accepted: 05/17/2023] [Indexed: 07/27/2023]
Abstract
Single-cell sequencing could help to solve the fundamental challenge of linking millions of cell-type-specific enhancers with their target genes. However, this task is confounded by patterns of gene co-expression in much the same way that genetic correlation due to linkage disequilibrium confounds fine-mapping in genome-wide association studies (GWAS). We developed a non-parametric permutation-based procedure to establish stringent statistical criteria to control the risk of false-positive associations in enhancer-gene association studies (EGAS). We applied our procedure to large-scale transcriptome and epigenome data from multiple tissues and species, including the mouse and human brain, to predict enhancer-gene associations genome wide. We tested the functional validity of our predictions by comparing them with chromatin conformation data and causal enhancer perturbation experiments. Our study shows how controlling for gene co-expression enables robust enhancer-gene linkage using single-cell sequencing data.
Affiliation(s)
- Fangming Xie
  - Department of Physics, University of California San Diego, La Jolla, CA 92037, USA
  - Department of Biological Chemistry, University of California Los Angeles, Los Angeles, CA 90095, USA
- Ethan J. Armand
  - Department of Cognitive Science, University of California San Diego, La Jolla, CA 92037, USA
- Zizhen Yao
  - Allen Institute for Brain Science, Seattle, WA 98109, USA
- Hanqing Liu
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Anna Bartlett
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- M. Margarita Behrens
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Yang Eric Li
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA 92037, USA
- Jacinta D. Lucero
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Chongyuan Luo
  - Department of Human Genetics, University of California Los Angeles, Los Angeles, CA 90095, USA
- Joseph R. Nery
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Antonio Pinto-Duarte
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Olivier B. Poirion
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA 92037, USA
  - The Jackson Laboratory, Farmington, CT, USA
- Sebastian Preissl
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA 92037, USA
  - Institute of Experimental and Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Angeline C. Rivkin
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Bosiljka Tasic
  - Allen Institute for Brain Science, Seattle, WA 98109, USA
- Hongkui Zeng
  - Allen Institute for Brain Science, Seattle, WA 98109, USA
- Bing Ren
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA 92037, USA
- Joseph R. Ecker
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Eran A. Mukamel
  - Department of Cognitive Science, University of California San Diego, La Jolla, CA 92037, USA
7
Zemke NR, Armand EJ, Wang W, Lee S, Zhou J, Li YE, Liu H, Tian W, Nery JR, Castanon RG, Bartlett A, Osteen JK, Li D, Zhuo X, Xu V, Miller M, Krienen FM, Zhang Q, Taskin N, Ting J, Feng G, McCarroll SA, Callaway EM, Wang T, Behrens MM, Lein ES, Ecker JR, Ren B. Comparative single cell epigenomic analysis of gene regulatory programs in the rodent and primate neocortex. bioRxiv 2023:2023.04.08.536119. [PMID: 37066152 PMCID: PMC10104177 DOI: 10.1101/2023.04.08.536119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Sequence divergence of cis-regulatory elements drives species-specific traits, but how this manifests in the evolution of the neocortex at the molecular and cellular level remains to be elucidated. We investigated the gene regulatory programs in the primary motor cortex of human, macaque, marmoset, and mouse with single-cell multiomics assays, generating gene expression, chromatin accessibility, DNA methylome, and chromosomal conformation profiles from a total of over 180,000 cells. For each modality, we determined species-specific, divergent, and conserved gene expression and epigenetic features at multiple levels. We find that cell type-specific gene expression evolves more rapidly than broadly expressed genes and that epigenetic status at distal candidate cis-regulatory elements (cCREs) evolves faster than promoters. Strikingly, transposable elements (TEs) contribute to nearly 80% of the human-specific cCREs in cortical cells. Through machine learning, we develop sequence-based predictors of cCREs in different species and demonstrate that the genomic regulatory syntax is highly preserved from rodents to primates. Lastly, we show that epigenetic conservation combined with sequence similarity helps uncover functional cis-regulatory elements and enhances our ability to interpret genetic variants contributing to neurological disease and traits.
8
Armand EJ, Li J, Xie F, Luo C, Mukamel EA. Single-Cell Sequencing of Brain Cell Transcriptomes and Epigenomes. Neuron 2021; 109:11-26. [PMID: 33412093 PMCID: PMC7808568 DOI: 10.1016/j.neuron.2020.12.010] [Citation(s) in RCA: 111] [Impact Index Per Article: 37.0] [Received: 08/12/2020] [Revised: 11/17/2020] [Accepted: 12/08/2020] [Indexed: 12/21/2022]
Abstract
Single-cell sequencing technologies, including transcriptomic and epigenomic assays, are transforming our understanding of the cellular building blocks of neural circuits. By directly measuring multiple molecular signatures in thousands to millions of individual cells, single-cell sequencing methods can comprehensively characterize the diversity of brain cell types. These measurements uncover gene regulatory mechanisms that shape cellular identity and provide insight into developmental and evolutionary relationships between brain cell populations. Single-cell sequencing data can aid the design of tools for targeted functional studies of brain circuit components, linking molecular signatures with anatomy, connectivity, morphology, and physiology. Here, we discuss the fundamental principles of single-cell transcriptome and epigenome sequencing, integrative computational analysis of the data, and key applications in neuroscience.
Affiliation(s)
- Ethan J Armand
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA 92037, USA
- Junhao Li
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA 92037, USA
- Fangming Xie
  - Department of Physics, University of California, San Diego, La Jolla, CA 92037, USA
- Chongyuan Luo
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eran A Mukamel
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA 92037, USA