1
Schmitz EG, Griffith M, Griffith OL, Cooper MA. Identifying genetic errors of immunity due to mosaicism. J Exp Med 2025; 222:e20241045. [PMID: 40232243 PMCID: PMC11998702 DOI: 10.1084/jem.20241045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2025] [Revised: 02/24/2025] [Accepted: 03/24/2025] [Indexed: 04/16/2025] Open
Abstract
Inborn errors of immunity are monogenic disorders of the immune system that lead to immune deficiency and/or dysregulation in patients. Identification of precise genetic causes of disease aids diagnosis and advances our understanding of the human immune system; however, a significant portion of patients lack a molecular diagnosis. Somatic mosaicism, genetic changes in a subset of cells, is emerging as an important mechanism of immune disease in both young and older patients. Here, we review the current landscape of somatic genetic errors of immunity and methods for the detection and validation of somatic variants.
Affiliation(s)
- Elizabeth G. Schmitz
- Division of Rheumatology/Immunology, Department of Pediatrics, Washington University in St. Louis, St. Louis, MO, USA
- Malachi Griffith
- Division of Oncology, Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Obi L. Griffith
- Division of Oncology, Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Megan A. Cooper
- Division of Rheumatology/Immunology, Department of Pediatrics, Washington University in St. Louis, St. Louis, MO, USA
2
Kock KH, Tan LM, Han KY, Ando Y, Jevapatarakul D, Chatterjee A, Lin QXX, Buyamin EV, Sonthalia R, Rajagopalan D, Tomofuji Y, Sankaran S, Park MS, Abe M, Chantaraamporn J, Furukawa S, Ghosh S, Inoue G, Kojima M, Kouno T, Lim J, Myouzen K, Nguantad S, Oh JM, Rayan NA, Sarkar S, Suzuki A, Thungsatianpun N, Venkatesh PN, Moody J, Nakano M, Chen Z, Tian C, Zhang Y, Tong Y, Tan CTY, Tizazu AM, Loh M, Hwang YY, Ho RC, Larbi A, Ng TP, Won HH, Wright FA, Villani AC, Park JE, Choi M, Liu B, Maitra A, Pithukpakorn M, Suktitipat B, Ishigaki K, Okada Y, Yamamoto K, Carninci P, Chambers JC, Hon CC, Matangkasombut P, Charoensawan V, Majumder PP, Shin JW, Park WY, Prabhakar S. Asian diversity in human immune cells. Cell 2025; 188:2288-2306.e24. [PMID: 40112801 DOI: 10.1016/j.cell.2025.02.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Revised: 06/03/2024] [Accepted: 02/20/2025] [Indexed: 03/22/2025]
Abstract
The relationships of human diversity with biomedical phenotypes are pervasive yet remain understudied, particularly in a single-cell genomics context. Here, we present the Asian Immune Diversity Atlas (AIDA), a multi-national single-cell RNA sequencing (scRNA-seq) healthy reference atlas of human immune cells. AIDA comprises 1,265,624 circulating immune cells from 619 donors, spanning 7 population groups across 5 Asian countries, and 6 controls. Though population groups are frequently compared at the continental level, we found that sub-continental diversity, age, and sex pervasively impacted cellular and molecular properties of immune cells. These included differential abundance of cell neighborhoods as well as cell populations and genes relevant to disease risk, pathogenesis, and diagnostics. We discovered functional genetic variants influencing cell-type-specific gene expression, which were under-represented in non-Asian populations, and helped contextualize disease-associated variants. AIDA enables analyses of multi-ancestry disease datasets and facilitates the development of precision medicine efforts in Asia and beyond.
Affiliation(s)
- Kian Hong Kock
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Le Min Tan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Kyung Yeon Han
- Samsung Genome Institute, Samsung Medical Center, Seoul 06351, Republic of Korea
- Yoshinari Ando
- Laboratory for Advanced Genomics Circuit, RIKEN Center for Integrative Medical Sciences (IMS), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Laboratory for Transcriptome Technology, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Damita Jevapatarakul
- Single-cell omics and Systems Biology of Diseases (scSyBiD) Research Unit, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Department of Microbiology, Faculty of Science, Mahidol University, Bangkok 10400, Thailand
- Ankita Chatterjee
- John C. Martin Centre for Liver Research and Innovations, Sonarpur, Kolkata 700150, India
- Quy Xiao Xuan Lin
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Eliora Violain Buyamin
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Radhika Sonthalia
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Deepa Rajagopalan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Yoshihiko Tomofuji
- Laboratory for Systems Genetics, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Department of Statistical Genetics, Graduate School of Medicine, Osaka University, 2-2 Yamadaoka, Suita, Osaka 565-0871, Japan; Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, 2-2 Yamadaoka, Suita, Osaka 565-0871, Japan
- Shvetha Sankaran
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Mi-So Park
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore; Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul 06351, Republic of Korea
- Mai Abe
- Laboratory for Autoimmune Diseases, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Juthamard Chantaraamporn
- Single-cell omics and Systems Biology of Diseases (scSyBiD) Research Unit, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Department of Biochemistry, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Integrative Computational BioScience Center, Mahidol University, Nakhon Pathom 73170, Thailand
- Seiko Furukawa
- Laboratory for Autoimmune Diseases, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Supratim Ghosh
- Biotechnology Research and Innovation Council - National Institute of Biomedical Genomics, Kalyani, West Bengal 741251, India
- Gyo Inoue
- Laboratory for Autoimmune Diseases, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Miki Kojima
- Laboratory for Transcriptome Technology, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Tsukasa Kouno
- Laboratory for Advanced Genomics Circuit, RIKEN Center for Integrative Medical Sciences (IMS), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Jinyeong Lim
- Samsung Genome Institute, Samsung Medical Center, Seoul 06351, Republic of Korea
- Keiko Myouzen
- Laboratory for Autoimmune Diseases, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Sarintip Nguantad
- Single-cell omics and Systems Biology of Diseases (scSyBiD) Research Unit, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Department of Biochemistry, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Integrative Computational BioScience Center, Mahidol University, Nakhon Pathom 73170, Thailand
- Jin-Mi Oh
- Samsung Genome Institute, Samsung Medical Center, Seoul 06351, Republic of Korea
- Nirmala Arul Rayan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Sumanta Sarkar
- Biotechnology Research and Innovation Council - National Institute of Biomedical Genomics, Kalyani, West Bengal 741251, India
- Akari Suzuki
- Laboratory for Autoimmune Diseases, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Narita Thungsatianpun
- Single-cell omics and Systems Biology of Diseases (scSyBiD) Research Unit, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Department of Microbiology, Faculty of Science, Mahidol University, Bangkok 10400, Thailand
- Prasanna Nori Venkatesh
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Jonathan Moody
- Laboratory for Genome Information Analysis, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Masahiro Nakano
- Laboratory for Autoimmune Diseases, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8654, Japan
- Ziyue Chen
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore
- Chi Tian
- Department of Pharmacy, Faculty of Science, National University of Singapore (NUS), Singapore 117543, Singapore
- Yuntian Zhang
- Department of Biomedical Informatics, Yong Loo Lin School of Medicine (YLLSoM), NUS, Singapore 119228, Singapore
- Yihan Tong
- Department of Pharmacy, Faculty of Science, National University of Singapore (NUS), Singapore 117543, Singapore
- Crystal T Y Tan
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos, Singapore 138648, Singapore
- Anteneh Mehari Tizazu
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos, Singapore 138648, Singapore
- Marie Loh
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore; Nanyang Technological University (NTU), Lee Kong Chian School of Medicine (LKCMedicine), 11 Mandalay Road, Singapore 308232, Singapore
- You Yi Hwang
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos, Singapore 138648, Singapore
- Roger C Ho
- Department of Psychological Medicine, YLLSoM, NUS, 1E Kent Ridge Road, Singapore 119228, Singapore; Institute for Health Innovation & Technology, NUS, 14 Medical Drive, Singapore 117599, Singapore
- Anis Larbi
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos, Singapore 138648, Singapore
- Tze Pin Ng
- Department of Geriatric Medicine, Khoo Teck Puat Hospital, Singapore 768828, Singapore; St Luke's Hospital, Singapore 659674, Singapore; Geriatric Education and Research Institute, Singapore 768024, Singapore
- Hong-Hee Won
- Samsung Genome Institute, Samsung Medical Center, Seoul 06351, Republic of Korea; Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul 06351, Republic of Korea
- Fred A Wright
- Department of Biological Sciences, Bioinformatics Research Center, and Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Alexandra-Chloé Villani
- Center for Immunology and Inflammatory Diseases, Department of Medicine, and Mass General Cancer Center, Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Harvard Medical School, Boston, MA 02115, USA
- Jong-Eun Park
- Graduate School of Medical Science and Engineering, KAIST, Daejeon 34051, Republic of Korea
- Murim Choi
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
- Boxiang Liu
- Department of Pharmacy, Faculty of Science, National University of Singapore (NUS), Singapore 117543, Singapore; Department of Biomedical Informatics, Yong Loo Lin School of Medicine (YLLSoM), NUS, Singapore 119228, Singapore; Precision Medicine Translational Research Programme, NUS Centre for Cancer Research, and Cardiovascular-Metabolic Disease Translational Research Programme, YLLSoM, NUS, Singapore 119228, Singapore
- Arindam Maitra
- Biotechnology Research and Innovation Council - National Institute of Biomedical Genomics, Kalyani, West Bengal 741251, India
- Manop Pithukpakorn
- Siriraj Genomics, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand; Department of Medicine, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Bhoom Suktitipat
- Integrative Computational BioScience Center, Mahidol University, Nakhon Pathom 73170, Thailand; Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Kazuyoshi Ishigaki
- Laboratory for Human Immunogenetics, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Yukinori Okada
- Laboratory for Systems Genetics, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Department of Statistical Genetics, Graduate School of Medicine, Osaka University, 2-2 Yamadaoka, Suita, Osaka 565-0871, Japan; Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, 2-2 Yamadaoka, Suita, Osaka 565-0871, Japan; Department of Genome Informatics, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8654, Japan; Laboratory of Statistical Immunology, Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan; Premium Research Institute for Human Metaverse Medicine, Osaka University, Suita 565-0871, Japan
- Kazuhiko Yamamoto
- Laboratory for Autoimmune Diseases, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Piero Carninci
- Laboratory for Transcriptome Technology, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Genomics Research Center, Fondazione Human Technopole, Viale Rita Levi-Montalcini, 1 - Area MIND, Milano, Lombardy 20157, Italy
- John C Chambers
- Nanyang Technological University (NTU), Lee Kong Chian School of Medicine (LKCMedicine), 11 Mandalay Road, Singapore 308232, Singapore
- Chung-Chau Hon
- Laboratory for Genome Information Analysis, RIKEN Center for IMS, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Graduate School of Integrated Sciences for Life, Hiroshima University, 1-3-3-2 Kagamiyama, Higashihiroshima, Hiroshima 739-0046, Japan
- Ponpan Matangkasombut
- Single-cell omics and Systems Biology of Diseases (scSyBiD) Research Unit, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Department of Microbiology, Faculty of Science, Mahidol University, Bangkok 10400, Thailand
- Varodom Charoensawan
- Single-cell omics and Systems Biology of Diseases (scSyBiD) Research Unit, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Department of Biochemistry, Faculty of Science, Mahidol University, Bangkok 10400, Thailand; Integrative Computational BioScience Center, Mahidol University, Nakhon Pathom 73170, Thailand; Siriraj Genomics, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand; Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand; Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand; School of Chemistry, Institute of Science, Suranaree University of Technology, Nakhon Ratchasima 30000, Thailand
- Partha P Majumder
- John C. Martin Centre for Liver Research and Innovations, Sonarpur, Kolkata 700150, India; Indian Statistical Institute, 203 B.T. Road, Kolkata 700108, India
- Jay W Shin
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore; Laboratory for Advanced Genomics Circuit, RIKEN Center for Integrative Medical Sciences (IMS), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan.
- Woong-Yang Park
- Samsung Genome Institute, Samsung Medical Center, Seoul 06351, Republic of Korea.
- Shyam Prabhakar
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Singapore; Nanyang Technological University (NTU), Lee Kong Chian School of Medicine (LKCMedicine), 11 Mandalay Road, Singapore 308232, Singapore; Cancer Science Institute of Singapore, NUS, 14 Medical Drive, Singapore 117599, Singapore.
3
Dababneh SF, Babini H, Jiménez-Sábado V, Teves SS, Kim KH, Tibbits GF. Dissecting cardiovascular disease-associated noncoding genetic variants using human iPSC models. Stem Cell Reports 2025; 20:102467. [PMID: 40118058 DOI: 10.1016/j.stemcr.2025.102467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 02/21/2025] [Accepted: 02/22/2025] [Indexed: 03/23/2025] Open
Abstract
Advancements in genomics have revealed hundreds of loci associated with cardiovascular diseases, highlighting the role genetic variants play in disease pathogenesis. Notably, most variants lie within noncoding genomic regions that modulate transcription factor binding, chromatin accessibility, and thereby the expression levels and cell type specificity of gene transcripts. Human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) have emerged as a powerful tool to delineate the pathogenicity of such variants and elucidate the underlying transcriptional mechanisms. Our review discusses the basics of noncoding variant-mediated pathogenesis, the methodologies utilized, and how hiPSC-based heart models can be leveraged to dissect the mechanisms of noncoding variants.
Affiliation(s)
- Saif F Dababneh
- Department of Cellular and Physiological Sciences, University of British Columbia, Vancouver, BC V6T 1Z3, Canada; Cellular and Regenerative Medicine Centre, BC Children's Hospital Research Institute, 938 West 28th Avenue, Vancouver, BC V5Z 4H4, Canada
- Hosna Babini
- Cellular and Regenerative Medicine Centre, BC Children's Hospital Research Institute, 938 West 28th Avenue, Vancouver, BC V5Z 4H4, Canada; Departments of Molecular Biology and Biochemistry / Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Verónica Jiménez-Sábado
- Cellular and Regenerative Medicine Centre, BC Children's Hospital Research Institute, 938 West 28th Avenue, Vancouver, BC V5Z 4H4, Canada; Departments of Molecular Biology and Biochemistry / Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Sheila S Teves
- Department of Biochemistry and Molecular Biology, Life Sciences Institute, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Kyoung-Han Kim
- Department of Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada; University of Ottawa Heart Institute, Ottawa, ON K1Y 4W7, Canada
- Glen F Tibbits
- Cellular and Regenerative Medicine Centre, BC Children's Hospital Research Institute, 938 West 28th Avenue, Vancouver, BC V5Z 4H4, Canada; Departments of Molecular Biology and Biochemistry / Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC V5A 1S6, Canada; School of Biomedical Engineering, University of British Columbia, Vancouver, BC V6T 2B9, Canada.
4
Li H, Ma T, Zhao Z, Chen Y, Xi X, Zhao X, Zhou X, Gao Y, Wei L, Zhang X. scTML: a pan-cancer single-cell landscape of multiple mutation types. Nucleic Acids Res 2025; 53:D1547-D1556. [PMID: 39420637 PMCID: PMC11701564 DOI: 10.1093/nar/gkae898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Revised: 09/18/2024] [Accepted: 09/27/2024] [Indexed: 10/19/2024] Open
Abstract
Investigating mutations, including single nucleotide variations (SNVs), gene fusions, alternative splicing and copy number variations (CNVs), is fundamental to cancer study. Recent computational methods and biological research have demonstrated the reliability and biological significance of detecting mutations from single-cell transcriptomic data. However, there is a lack of a single-cell-level database containing comprehensive mutation information in all types of cancer. Establishing a single-cell mutation landscape from the huge emerging single-cell transcriptomic data can provide a critical resource for elucidating the mechanisms of tumorigenesis and evolution. Here, we developed scTML (http://sctml.xglab.tech/), the first database offering a pan-cancer single-cell landscape of multiple mutation types. It includes SNVs, insertions/deletions, gene fusions, alternative splicing and CNVs, along with gene expression, cell states and other phenotype information. The data are from 74 datasets with 2 582 633 cells, including 35 full-length (Smart-seq2) transcriptomic single-cell datasets (all publicly available data with raw sequencing files), 23 datasets from 10X technology and 16 spatial transcriptomic datasets. scTML enables users to interactively explore multiple mutation landscapes across tumors or cell types, analyze single-cell-level mutation-phenotype associations and detect cell subclusters of interest. scTML is an important resource that will significantly advance deciphering intra-tumor and inter-tumor heterogeneity, and how mutations shape cell phenotypes.
Affiliation(s)
- Haochen Li
- MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- School of Medicine, Tsinghua Medicine, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- Tianxing Ma
- MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- Zetong Zhao
- MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- Department of Biostatistics, School of Public Health, Yale University, 60 College St, New Haven, CT 06510, USA
- Yixin Chen
- MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- Xi Xi
- MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- Xiaofei Zhao
- MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- Xiaoxiang Zhou
- Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 17 Panjiayuan Nanli, Chaoyang District, Beijing 100021, China
- Yibo Gao
- Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 17 Panjiayuan Nanli, Chaoyang District, Beijing 100021, China
- Institute of Cancer Research, Henan Academy of Innovations in Medical Science, No. 2 Biotechnology Street, Hangkonggang District, Zhengzhou 450000, China
- Department of Gastroenterology, Shanxi Province Cancer Hospital/Shanxi Hospital Affiliated to Cancers Hospital, Chinese Academy of Medical Sciences/Cancer Hospital Affiliated to Shanxi Medical University, No. 3, ZhiGongXin Street, Xinghualing District, Taiyuan 030013, China
- Central Laboratory and Shenzhen Key Laboratory of Epigenetics and Precision Medicine for Cancers, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital and Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 113 Baohe Road, Longgang District, Shenzhen 518116, China
- Laboratory of Translational Medicine, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 17 Panjiayuan Nanli, Chaoyang District, Beijing 100021, China
- State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 17 Panjiayuan Nanli, Chaoyang District, Beijing 100021, China
- Lei Wei
- MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- Xuegong Zhang
- MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
- School of Medicine, Tsinghua Medicine, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing 100084, China
5
Qi G, Battle A. Computational methods for allele-specific expression in single cells. Trends Genet 2024; 40:939-949. [PMID: 39127549 PMCID: PMC11537817 DOI: 10.1016/j.tig.2024.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 07/16/2024] [Accepted: 07/17/2024] [Indexed: 08/12/2024]
Abstract
Allele-specific expression (ASE) is a powerful signal that can be used to investigate multiple molecular mechanisms, such as cis-regulatory effects and imprinting. Single-cell RNA-sequencing (scRNA-seq) enables ASE characterization at the resolution of individual cells. In this review, we highlight the computational methods for processing and analyzing single-cell ASE data. We first describe a bioinformatics pipeline, synthesized from previous literature, to obtain ASE counts from raw reads. We then discuss statistical methods for detecting allelic imbalance and its variability across conditions using scRNA-seq data. In addition, we describe other methods that use single-cell ASE to address specific biological questions. Finally, we discuss future directions and emphasize the need for an integrated, optimized bioinformatics pipeline, and further development of statistical methods for different technologies.
Affiliation(s)
- Guanghao Qi
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA.
- Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA; Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD 21205, USA.
6
Islam M, Yang Y, Simmons AJ, Shah VM, Musale KP, Xu Y, Tasneem N, Chen Z, Trinh LT, Molina P, Ramirez-Solano MA, Sadien ID, Dou J, Rolong A, Chen K, Magnuson MA, Rathmell JC, Macara IG, Winton DJ, Liu Q, Zafar H, Kalhor R, Church GM, Shrubsole MJ, Coffey RJ, Lau KS. Temporal recording of mammalian development and precancer. Nature 2024; 634:1187-1195. [PMID: 39478207 PMCID: PMC11525190 DOI: 10.1038/s41586-024-07954-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/30/2023] [Accepted: 08/15/2024] [Indexed: 11/02/2024]
Abstract
Temporal ordering of cellular events offers fundamental insights into biological phenomena. Although this is traditionally achieved through continuous direct observations [1,2], an alternative solution leverages irreversible genetic changes, such as naturally occurring mutations, to create indelible marks that enable retrospective temporal ordering [3-5]. Using a multipurpose, single-cell CRISPR platform, we developed a molecular clock approach to record the timing of cellular events and clonality in vivo, with incorporation of cell state and lineage information. Using this approach, we uncovered precise timing of tissue-specific cell expansion during mouse embryonic development, unconventional developmental relationships between cell types and new epithelial progenitor states by their unique genetic histories. Analysis of mouse adenomas, coupled to multiomic and single-cell profiling of human precancers, with clonal analysis of 418 human polyps, demonstrated the occurrence of polyclonal initiation in 15-30% of colonic precancers, showing their origins from multiple normal founders. Our study presents a multimodal framework that lays the foundation for in vivo recording, integrating synthetic or natural indelible genetic changes with single-cell analyses, to explore the origins and timing of development and tumorigenesis in mammalian systems.
Affiliation(s)
- Mirazul Islam
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Yilin Yang
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Alan J Simmons
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Vishal M Shah
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Krushna Pavan Musale
- Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, Kanpur, India
- Yanwen Xu
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Naila Tasneem
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
| | - Zhengyi Chen
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Chemical and Physical Biology Program, Vanderbilt University, Nashville, TN, USA
| | - Linh T Trinh
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Center for Stem Cell Biology, Vanderbilt University, Nashville, TN, USA
| | - Paola Molina
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
| | - Marisol A Ramirez-Solano
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Iannish D Sadien
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Jinzhuang Dou
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Andrea Rolong
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
| | - Ken Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Mark A Magnuson
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Center for Stem Cell Biology, Vanderbilt University, Nashville, TN, USA
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA
| | - Jeffrey C Rathmell
- Vanderbilt Center for Immunobiology, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Ian G Macara
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
| | - Douglas J Winton
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Qi Liu
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Hamim Zafar
- Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, Kanpur, India
- Department of Biological Sciences and Bioengineering, Indian Institute of Technology Kanpur, Kanpur, India
| | - Reza Kalhor
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - George M Church
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
| | - Martha J Shrubsole
- Department of Medicine, Division of Epidemiology, Vanderbilt University Medical Center, Nashville, TN, USA
- Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Robert J Coffey
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA.
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA.
- Vanderbilt Center for Stem Cell Biology, Vanderbilt University, Nashville, TN, USA.
- Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA.
- Department of Medicine, Division of Gastroenterology, Hepatology and Nutrition, Vanderbilt University Medical Center, Nashville, TN, USA.
| | - Ken S Lau
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA.
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA.
- Chemical and Physical Biology Program, Vanderbilt University, Nashville, TN, USA.
- Vanderbilt Center for Stem Cell Biology, Vanderbilt University, Nashville, TN, USA.
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA.
- Vanderbilt Center for Immunobiology, Vanderbilt University Medical Center, Nashville, TN, USA.
- Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA.
- Department of Surgery, Vanderbilt University Medical Center, Nashville, TN, USA.
| |
7
Wiens M, Farahani H, Scott RW, Underhill TM, Bashashati A. Benchmarking bulk and single-cell variant-calling approaches on Chromium scRNA-seq and scATAC-seq libraries. Genome Res 2024; 34:1196-1210. [PMID: 39147582 PMCID: PMC11444184 DOI: 10.1101/gr.277066.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Accepted: 08/12/2024] [Indexed: 08/17/2024]
Abstract
Single-cell sequencing methodologies such as scRNA-seq and scATAC-seq have become widespread and effective tools to interrogate tissue composition. Increasingly, variant callers are being applied to these methodologies to resolve the genetic heterogeneity of a sample, especially in the case of detecting the clonal architecture of a tumor. Typically, traditional bulk DNA variant callers are applied to the pooled reads of a single-cell library to detect candidate mutations. Recently, multiple studies have applied such callers on reads from individual cells, with some citing the ability to detect rare variants with higher sensitivity. Many studies apply these two approaches to the Chromium (10x Genomics) scRNA-seq and scATAC-seq methodologies. However, Chromium-based libraries may offer additional challenges to variant calling compared with existing single-cell methodologies, raising questions regarding the validity of variants obtained from such a workflow. To determine the merits and challenges of various variant-calling approaches on Chromium scRNA-seq and scATAC-seq libraries, we use sample libraries with matched bulk whole-genome sequencing to evaluate the performance of callers. We review caller performance, finding that bulk callers applied on pooled reads significantly outperform individual-cell approaches. We also evaluate variants unique to scRNA-seq and scATAC-seq methodologies, finding patterns of noise but also potential capture of RNA-editing events. Finally, we review the notion that variant calling at the single-cell level can detect rare somatic variants, providing empirical results that suggest resolving such variants is infeasible in single-cell Chromium libraries.
Affiliation(s)
- Matthew Wiens
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia V6T 2B9, Canada
- Hossein Farahani
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia V6T 2B9, Canada
- R Wilder Scott
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia V6T 2B9, Canada
- T Michael Underhill
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia V6T 2B9, Canada
  - Department of Cellular & Physiological Sciences, University of British Columbia, Vancouver, British Columbia V6T 2A1, Canada
- Ali Bashashati
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia V6T 2B9, Canada
  - Department of Pathology & Laboratory Medicine, University of British Columbia, Vancouver, British Columbia V6T 1Z7, Canada
8
Demirci I, Larsson AJM, Chen X, Hartman J, Sandberg R, Frisén J. Inferring clonal somatic mutations directed by X chromosome inactivation status in single cells. Genome Biol 2024; 25:214. [PMID: 39123248 PMCID: PMC11312698 DOI: 10.1186/s13059-024-03360-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Accepted: 07/30/2024] [Indexed: 08/12/2024] [Open Access]
Abstract
Analysis of clonal dynamics in human tissues is enabled by somatic genetic variation. Here, we show that analysis of mitochondrial mutations in single cells is dramatically improved in females when using X chromosome inactivation to select informative clonal mutations. Applying this strategy to human peripheral mononuclear blood cells reveals clonal structures within T cells that otherwise are blurred by non-informative mutations, including the separation of gamma-delta T cells, suggesting this approach can be used to decipher clonal dynamics of cells in human tissues.
Affiliation(s)
- Ilke Demirci
  - Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Anton J M Larsson
  - Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Xinsong Chen
  - Department of Oncology-Pathology, Karolinska Institutet, Stockholm, Sweden
- Johan Hartman
  - Department of Oncology-Pathology, Karolinska Institutet, Stockholm, Sweden
  - Department of Clinical Pathology and Cancer Diagnostics, Karolinska University Hospital, Stockholm, Sweden
- Rickard Sandberg
  - Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Jonas Frisén
  - Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
9
Hong SC, Muyas F, Cortés-Ciriano I, Hormoz S. scAI-SNP: a method for inferring ancestry from single-cell data. bioRxiv 2024:2024.05.14.594208. [PMID: 38798590 PMCID: PMC11118306 DOI: 10.1101/2024.05.14.594208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Collaborative efforts, such as the Human Cell Atlas, are rapidly accumulating large amounts of single-cell data. To ensure that single-cell atlases are representative of human genetic diversity, we need to determine the ancestry of the donors from whom single-cell data are generated. Self-reporting of race and ethnicity, although important, can be biased and is not always available for the datasets already collected. Here, we introduce scAI-SNP, a tool to infer ancestry directly from single-cell genomics data. To train scAI-SNP, we identified 4.5 million ancestry-informative single-nucleotide polymorphisms (SNPs) in the 1000 Genomes Project dataset across 3201 individuals from 26 population groups. For a query single-cell data set, scAI-SNP uses these ancestry-informative SNPs to compute the contribution of each of the 26 population groups to the ancestry of the donor from whom the cells were obtained. Using diverse single-cell data sets with matched whole-genome sequencing data, we show that scAI-SNP is robust to the sparsity of single-cell data, can accurately and consistently infer ancestry from samples derived from diverse types of tissues and cancer cells, and can be applied to different modalities of single-cell profiling assays, such as single-cell RNA-seq and single-cell ATAC-seq. Finally, we argue that ensuring that single-cell atlases represent diverse ancestry, ideally alongside race and ethnicity, is ultimately important for improved and equitable health outcomes by accounting for human diversity.
Affiliation(s)
- Sung Chul Hong
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Francesc Muyas
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
- Isidro Cortés-Ciriano
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
- Sahand Hormoz
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
10
Yuan CU, Quah FX, Hemberg M. Single-cell and spatial transcriptomics: Bridging current technologies with long-read sequencing. Mol Aspects Med 2024; 96:101255. [PMID: 38368637 DOI: 10.1016/j.mam.2024.101255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 01/30/2024] [Accepted: 02/07/2024] [Indexed: 02/20/2024]
Abstract
Single-cell technologies have transformed biomedical research over the last decade, opening up new possibilities for understanding cellular heterogeneity at both the genomic and transcriptomic levels. In addition, more recent developments in spatial transcriptomics technologies have made it possible to profile cells in their tissue context. In parallel, there have been substantial advances in sequencing technologies, and third-generation methods are able to produce reads that are tens of kilobases long, with error rates matching those of second-generation short reads. Long-read technologies make it possible to better map large genome rearrangements and quantify isoform-specific abundances. This further improves our ability to characterize functionally relevant heterogeneity. Here, we show how researchers have begun to combine single-cell, spatial transcriptomics, and long-read technologies, and how this is resulting in powerful new approaches to profiling both the genome and the transcriptome. We discuss the achievements so far, and we highlight remaining challenges and opportunities.
Affiliation(s)
- Chengwei Ulrika Yuan
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Fu Xiang Quah
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- Martin Hemberg
  - Gene Lay Institute, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
11
Lynch MP, Wang Y, Ho Sui S, Gatto L, Culhane AC. demuxSNP: supervised demultiplexing single-cell RNA sequencing using cell hashing and SNPs. Gigascience 2024; 13:giae090. [PMID: 39607981 PMCID: PMC11604057 DOI: 10.1093/gigascience/giae090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 09/27/2024] [Accepted: 10/23/2024] [Indexed: 11/30/2024] [Open Access]
Abstract
BACKGROUND: Multiplexing single-cell RNA sequencing experiments reduces sequencing cost and facilitates larger-scale studies. However, factors such as cell hashing quality and class size imbalance impact demultiplexing algorithm performance, reducing cost-effectiveness.
FINDINGS: We propose a supervised algorithm, demuxSNP, which leverages both cell hashing and genetic variation between individuals (single-nucleotide polymorphisms [SNPs]). demuxSNP addresses fundamental limitations in demultiplexing methods that use only one data modality. Some cells may be confidently demultiplexed using probabilistic hashing methods. demuxSNP uses these data to infer the genotype of singlet and doublet clusters and to predict on cells assigned as negative, uncertain, or doublet using a nearest-neighbor approach adapted for missing data. We benchmarked demuxSNP against hashing, genotype-free SNP, and hybrid methods on simulated and real data from renal cell cancer. demuxSNP outperformed standalone hashing methods on a low-quality hashing data benchmark, improved overall classification accuracy, and allowed more cells with high RNA quality to be recovered. By varying simulated doublet rates, we showed that genotype-free SNP methods and the hybrid methods that leverage them were impacted by class size imbalance and doublet rate. demuxSNP's supervised approach was more robust to doublet rate in experiments with class size imbalance.
CONCLUSIONS: demuxSNP uses hashing and SNP data to demultiplex datasets with low hashing quality where biological samples are genetically distinct. Unassigned or negative cells with high RNA quality are recovered, making more cells available for analysis. Data simulation and benchmarking pipelines as well as processed benchmarking data for 5-50% doublets are publicly available. demuxSNP is available as an R/Bioconductor package (https://doi.org/doi:10.18129/B9.bioc.demuxSNP).
Affiliation(s)
- Michael P Lynch
  - School of Medicine, Limerick Digital Cancer Research Centre, Health Research Institute (HRI), University of Limerick, Limerick V94 T9PX, Ireland
- Yufei Wang
  - Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Harvard Medical School, Boston, MA 02115, USA
- Shannan Ho Sui
  - Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA
- Laurent Gatto
  - Computational Biology and Bioinformatics Unit (CBIO), de Duve Institute, Université catholique de Louvain, Brussels 1200, Belgium
- Aedin C Culhane
  - School of Medicine, Limerick Digital Cancer Research Centre, Health Research Institute (HRI), University of Limerick, Limerick V94 T9PX, Ireland
12
Islam M, Yang Y, Simmons AJ, Shah VM, Pavan MK, Xu Y, Tasneem N, Chen Z, Trinh LT, Molina P, Ramirez-Solano MA, Sadien I, Dou J, Chen K, Magnuson MA, Rathmell JC, Macara IG, Winton D, Liu Q, Zafar H, Kalhor R, Church GM, Shrubsole MJ, Coffey RJ, Lau KS. Temporal recording of mammalian development and precancer. bioRxiv 2023:2023.12.18.572260. [PMID: 38187699 PMCID: PMC10769302 DOI: 10.1101/2023.12.18.572260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Key to understanding many biological phenomena is knowing the temporal ordering of cellular events, which often requires continuous direct observations [1, 2]. An alternative solution involves the utilization of irreversible genetic changes, such as naturally occurring mutations, to create indelible markers that enable retrospective temporal ordering [3-8]. Using NSC-seq, a newly designed and validated multi-purpose single-cell CRISPR platform, we developed a molecular clock approach to record the timing of cellular events and clonality in vivo, while incorporating assigned cell state and lineage information. Using this approach, we uncovered precise timing of tissue-specific cell expansion during murine embryonic development and identified new intestinal epithelial progenitor states by their unique genetic histories. NSC-seq analysis of murine adenomas and single-cell multi-omic profiling of human precancers as part of the Human Tumor Atlas Network (HTAN), including 116 scRNA-seq datasets and clonal analysis of 418 human polyps, demonstrated the occurrence of polyancestral initiation in 15-30% of colonic precancers, revealing their origins from multiple normal founders. Thus, our multimodal framework augments existing single-cell analyses and lays the foundation for in vivo multimodal recording, enabling the tracking of lineage and temporal events during development and tumorigenesis.