1
Shao DD, Kriz AJ, Snellings DA, Zhou Z, Zhao Y, Enyenihi L, Walsh C. Advances in single-cell DNA sequencing enable insights into human somatic mosaicism. Nat Rev Genet 2025:10.1038/s41576-025-00832-3. [PMID: 40281095 DOI: 10.1038/s41576-025-00832-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/05/2025] [Indexed: 04/29/2025]
Abstract
DNA sequencing from bulk or clonal human tissues has shown that genetic mosaicism is common and contributes to both cancer and non-cancerous disorders. However, single-cell resolution is required to understand the full genetic heterogeneity that exists within a tissue and the mechanisms that lead to somatic mosaicism. Single-cell DNA-sequencing technologies have traditionally trailed behind those of single-cell transcriptomics and epigenomics, largely because most applications require whole-genome amplification before costly whole-genome sequencing. Now, recent technological and computational advances are enabling the use of single-cell DNA sequencing to tackle previously intractable problems, such as delineating the genetic landscape of tissues with complex clonal patterns, of samples where cellular material is scarce and of non-cycling, postmitotic cells. Single-cell genomes are also revealing the mutational patterns that arise from biological processes or disease states, and have made it possible to track cell lineage in human tissues. These advances in our understanding of tissue biology and our ability to identify disease mechanisms will ultimately transform how disease is diagnosed and monitored.
Affiliation(s)
- Diane D Shao
  - Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - Division of Genetics and Genomics, Department of Paediatrics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Andrea J Kriz
  - Division of Genetics and Genomics, Department of Paediatrics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Daniel A Snellings
  - Division of Genetics and Genomics, Department of Paediatrics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Zinan Zhou
  - Division of Genetics and Genomics, Department of Paediatrics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Yifan Zhao
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Liz Enyenihi
  - Division of Genetics and Genomics, Department of Paediatrics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - Biological and Biomedical Sciences Graduate Program, Harvard Medical School, Boston, MA, USA
- Christopher Walsh
  - Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - Division of Genetics and Genomics, Department of Paediatrics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
2
Ivanovic S, El-Kebir M. CNRein: an evolution-aware deep reinforcement learning algorithm for single-cell DNA copy number calling. Genome Biol 2025; 26:87. [PMID: 40197547 PMCID: PMC11974095 DOI: 10.1186/s13059-025-03553-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 03/21/2025] [Indexed: 04/10/2025] Open
Abstract
Low-pass single-cell DNA sequencing technologies and algorithmic advancements have enabled haplotype-specific copy number calling on thousands of cells within tumors. However, measurement uncertainty may result in spurious CNAs inconsistent with realistic evolutionary constraints. We introduce evolution-aware copy number calling via deep reinforcement learning (CNRein). Our simulations demonstrate CNRein infers more accurate copy-number profiles and better recapitulates ground truth clonal structure than existing methods. On sequencing data of breast and ovarian cancer, CNRein produces more parsimonious solutions than existing methods while maintaining agreement with single-nucleotide variants. Additionally, CNRein shows consistency on a breast cancer patient sequenced with distinct low-pass technologies.
Affiliation(s)
- Stefan Ivanovic
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Mohammed El-Kebir
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
  - Cancer Center Illinois, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
3
Kuipers J, Tuncel MA, Ferreira PF, Jahn K, Beerenwinkel N. Single-cell copy number calling and event history reconstruction. Bioinformatics 2025; 41:btaf072. [PMID: 39946094 PMCID: PMC11897432 DOI: 10.1093/bioinformatics/btaf072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 01/06/2025] [Accepted: 02/11/2025] [Indexed: 03/14/2025] Open
Abstract
MOTIVATION: Copy number alterations are driving forces of tumour development and the emergence of intra-tumour heterogeneity. A comprehensive picture of these genomic aberrations is therefore essential for the development of personalised and precise cancer diagnostics and therapies. Single-cell sequencing offers the highest resolution for copy number profiling down to the level of individual cells. Recent high-throughput protocols allow for the processing of hundreds of cells through shallow whole-genome DNA sequencing. The resulting low read-depth data poses substantial statistical and computational challenges to the identification of copy number alterations.
RESULTS: We developed SCICoNE, a statistical model and MCMC algorithm tailored to single-cell copy number profiling from shallow whole-genome DNA sequencing data. SCICoNE reconstructs the history of copy number events in the tumour and uses these evolutionary relationships to identify the copy number profiles of the individual cells. We show the accuracy of this approach in evaluations on simulated data and demonstrate its practicability in applications to two breast cancer samples from different sequencing protocols.
AVAILABILITY AND IMPLEMENTATION: SCICoNE is available at https://github.com/cbg-ethz/SCICoNE.
Affiliation(s)
- Jack Kuipers
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Mustafa Anıl Tuncel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Pedro F Ferreira
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Katharina Jahn
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
4
Weiner S, Bansal MS. DICE: fast and accurate distance-based reconstruction of single-cell copy number phylogenies. Life Sci Alliance 2025; 8:e202402923. [PMID: 39667913 PMCID: PMC11638338 DOI: 10.26508/lsa.202402923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Revised: 11/29/2024] [Accepted: 12/02/2024] [Indexed: 12/14/2024] Open
Abstract
Somatic copy number alterations (sCNAs) are valuable phylogenetic markers for inferring evolutionary relationships among tumor cell subpopulations. Advances in single-cell DNA sequencing technologies are making it possible to obtain such sCNAs datasets at ever-larger scales. However, existing methods for reconstructing phylogenies from sCNAs are often too slow for large datasets. We propose two new distance-based methods, DICE-bar and DICE-star, for reconstructing single-cell tumor phylogenies from sCNA data. Using carefully simulated datasets, we find that DICE-bar matches or exceeds the accuracies of all other methods on noise-free datasets and that DICE-star shows exceptional robustness to noise and outperforms all other methods on noisy datasets. Both methods are also orders of magnitude faster than many existing methods. Our experimental analysis also reveals how noise/error in copy number inference, as expected for real datasets, can drastically impact the accuracies of most methods. We apply DICE-star, the most accurate method on error-prone datasets, to several real single-cell breast and ovarian cancer datasets and find that it rapidly produces phylogenies of equivalent or greater reliability compared with existing methods.
Affiliation(s)
- Samson Weiner
  - School of Computing, University of Connecticut, Storrs, CT, USA
- Mukul S Bansal
  - School of Computing, University of Connecticut, Storrs, CT, USA
  - The Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
5
Satas G, Myers MA, McPherson A, Shah SP. Inferring active mutational processes in cancer using single cell sequencing and evolutionary constraints. bioRxiv 2025:2025.02.24.639589. [PMID: 40060559 PMCID: PMC11888314 DOI: 10.1101/2025.02.24.639589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2025]
Abstract
Ongoing mutagenesis in cancer drives genetic diversity throughout the natural history of cancers. As the activities of mutational processes are dynamic throughout evolution, distinguishing the mutational signatures of 'active' and 'historical' processes has important implications for studying how tumors evolve. This can aid in understanding mutagenic states at the time of presentation, and in associating active mutational processes with therapeutic resistance. As bulk sequencing primarily captures historical mutational processes, we studied whether ultra-low-coverage single-cell whole-genome sequencing (scWGS), which measures the distribution of mutations across hundreds or thousands of individual cells, could enable the distinction between historical and active mutational processes. While technical challenges and data sparsity have limited mutation analysis in scWGS, we show that these data contain valuable information about dynamic mutational processes. To robustly interpret single nucleotide variants (SNVs) in scWGS, we introduce ArtiCull, a method to identify and remove SNV artifacts by leveraging evolutionary constraints, enabling reliable detection of mutations for signature analysis. Applying this approach to scWGS data from pancreatic ductal adenocarcinoma (PDAC), triple-negative breast cancer (TNBC), and high-grade serous ovarian cancer (HGSOC), we uncover temporal and spatial patterns in mutational processes. In PDAC, we observe a temporal increase in mismatch repair deficiency (MMRd). In cisplatin-treated TNBC patient-derived xenografts, we identify therapy-induced mutagenesis and inactivation of APOBEC3 activity. In HGSOC, we show distinct patterns of APOBEC3 mutagenesis, including late tumor-wide activation in one case and clade-specific enrichment in another. Additionally, we detect a clone-specific increase in SBS17 activity, in a clone previously linked to recurrence. Our findings establish ultra-low-coverage scWGS as a powerful approach for studying active mutational processes that may influence ongoing clonal evolution and therapeutic resistance.
Affiliation(s)
- Gryte Satas
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - The Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Matthew A Myers
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - The Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Andrew McPherson
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - The Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Sohrab P Shah
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - The Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
6
Ortega-Batista A, Jaén-Alvarado Y, Moreno-Labrador D, Gómez N, García G, Guerrero EN. Single-Cell Sequencing: Genomic and Transcriptomic Approaches in Cancer Cell Biology. Int J Mol Sci 2025; 26:2074. [PMID: 40076700 PMCID: PMC11901077 DOI: 10.3390/ijms26052074] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/30/2024] [Revised: 02/18/2025] [Accepted: 02/24/2025] [Indexed: 03/14/2025] Open
Abstract
This article reviews the impact of single-cell sequencing (SCS) on cancer biology research. SCS has revolutionized our understanding of cancer and tumor heterogeneity, clonal evolution, and the complex interplay between cancer cells and tumor microenvironment. SCS provides high-resolution profiling of individual cells in genomic, transcriptomic, and epigenomic landscapes, facilitating the detection of rare mutations, the characterization of cellular diversity, and the integration of molecular data with phenotypic traits. The integration of SCS with multi-omics has provided a multidimensional view of cellular states and regulatory mechanisms in cancer, uncovering novel regulatory mechanisms and therapeutic targets. Advances in computational tools, artificial intelligence (AI), and machine learning have been crucial in interpreting the vast amounts of data generated, leading to the identification of new biomarkers and the development of predictive models for patient stratification. Furthermore, there have been emerging technologies such as spatial transcriptomics and in situ sequencing, which promise to further enhance our understanding of tumor microenvironment organization and cellular interactions. As SCS and its related technologies continue to advance, they are expected to drive significant advances in personalized cancer diagnostics, prognosis, and therapy, ultimately improving patient outcomes in the era of precision oncology.
Affiliation(s)
- Ana Ortega-Batista
  - Faculty of Science and Technology, Technological University of Panama, Ave Justo Arosemena, Entre Calle 35 y 36, Corregimiento de Calidonia, Panama City, Panama
- Yanelys Jaén-Alvarado
  - Faculty of Science and Technology, Technological University of Panama, Ave Justo Arosemena, Entre Calle 35 y 36, Corregimiento de Calidonia, Panama City, Panama
  - Gorgas Memorial Institute for Health Studies, Ave Justo Arosemena, Entre Calle 35 y 36, Corregimiento de Calidonia, Panama City, Panama
- Dilan Moreno-Labrador
  - Faculty of Science and Technology, Technological University of Panama, Ave Justo Arosemena, Entre Calle 35 y 36, Corregimiento de Calidonia, Panama City, Panama
- Natasha Gómez
  - Faculty of Science and Technology, Technological University of Panama, Ave Justo Arosemena, Entre Calle 35 y 36, Corregimiento de Calidonia, Panama City, Panama
- Gabriela García
  - Faculty of Science and Technology, Technological University of Panama, Ave Justo Arosemena, Entre Calle 35 y 36, Corregimiento de Calidonia, Panama City, Panama
- Erika N. Guerrero
  - Gorgas Memorial Institute for Health Studies, Ave Justo Arosemena, Entre Calle 35 y 36, Corregimiento de Calidonia, Panama City, Panama
  - Sistema Nacional de Investigación, Secretaria Nacional de Ciencia y Tecnología, Edificio 205, Ciudad del Saber, Panama City, Panama
7
Wang X, Jin Z, Shi Y, Xi R. Detecting copy-number alterations from single-cell chromatin sequencing data by AtaCNA. Cell Rep Methods 2025; 5:100939. [PMID: 39814025 PMCID: PMC11840951 DOI: 10.1016/j.crmeth.2024.100939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 10/06/2024] [Accepted: 12/10/2024] [Indexed: 01/18/2025]
Abstract
Single-cell assay of transposase-accessible chromatin sequencing (scATAC-seq) unbiasedly profiles genome-wide chromatin accessibility in single cells. In single-cell tumor studies, identification of normal cells or tumor clonal structures often relies on copy-number alterations (CNAs). However, CNA detection from scATAC-seq is difficult due to the high noise, sparsity, and confounding factors. Here, we describe AtaCNA, a computational algorithm that accurately detects high-resolution CNAs from scATAC-seq data. We benchmark AtaCNA using simulation and real data and find that AtaCNA shows superior performance. Analyses of 10 scATAC-seq datasets show that AtaCNA could effectively distinguish malignant from non-malignant cells. In glioblastoma, endometrial, and ovarian cancer samples, AtaCNA identifies subclones at distinct cellular states, suggesting an important interplay between genetic and epigenetic plasticity. Some tumor subclones only differ in small-scale (10-20 Mb) CNAs, demonstrating the importance of high-resolution CNA detection. These data show that AtaCNA can aid in integrative analysis to understand the complex heterogeneity in cancer.
Affiliation(s)
- Xiaochen Wang
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
- Zijie Jin
  - Peking University International Cancer Institute, Health Science Center, Peking University, Beijing 100191, China
- Yang Shi
  - Beigene Co., Ltd., Beijing 102206, China
- Ruibin Xi
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Center for Statistical Science, Peking University, Beijing 100871, China
8
Lucas O, Ward S, Zaidi R, Bunkum A, Frankell AM, Moore DA, Hill MS, Liu WK, Marinelli D, Lim EL, Hessey S, Naceur-Lombardelli C, Rowan A, Purewal-Mann SK, Zhai H, Dietzen M, Ding B, Royle G, Aparicio S, McGranahan N, Jamal-Hanjani M, Kanu N, Swanton C, Zaccaria S. Characterizing the evolutionary dynamics of cancer proliferation in single-cell clones with SPRINTER. Nat Genet 2025; 57:103-114. [PMID: 39614124 PMCID: PMC11735394 DOI: 10.1038/s41588-024-01989-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 10/15/2024] [Indexed: 12/01/2024]
Abstract
Proliferation is a key hallmark of cancer, but whether it differs between evolutionarily distinct clones co-existing within a tumor is unknown. We introduce the Single-cell Proliferation Rate Inference in Non-homogeneous Tumors through Evolutionary Routes (SPRINTER) algorithm that uses single-cell whole-genome DNA sequencing data to enable accurate identification and clone assignment of S- and G2-phase cells, as assessed by generating accurate ground truth data. Applied to a newly generated longitudinal, primary-metastasis-matched dataset of 14,994 non-small cell lung cancer cells, SPRINTER revealed widespread clone proliferation heterogeneity, orthogonally supported by Ki-67 staining, nuclei imaging and clinical imaging. We further demonstrated that high-proliferation clones have increased metastatic seeding potential, increased circulating tumor DNA shedding and clone-specific altered replication timing in proliferation- or metastasis-related genes associated with expression changes. Applied to previously generated datasets of 61,914 breast and ovarian cancer cells, SPRINTER revealed increased single-cell rates of different genomic variants and enrichment of proliferation-related gene amplifications in high-proliferation clones.
Affiliation(s)
- Olivia Lucas
  - Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
  - University College London Hospitals, London, UK
- Sophia Ward
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
  - Genomics Science Technology Platform, The Francis Crick Institute, London, UK
- Rija Zaidi
  - Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Abigail Bunkum
  - Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK
- Alexander M Frankell
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
- David A Moore
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
  - Department of Cellular Pathology, University College London Hospitals, London, UK
- Mark S Hill
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
- Wing Kin Liu
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK
- Daniele Marinelli
  - Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK
  - Cancer Genome Evolution Research Group, University College London Cancer Institute, London, UK
  - Department of Experimental Medicine, Sapienza University of Rome, Rome, Italy
- Emilia L Lim
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
- Sonya Hessey
  - Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - University College London Hospitals, London, UK
  - Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK
- Andrew Rowan
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
- Haoran Zhai
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
- Michelle Dietzen
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
  - Cancer Genome Evolution Research Group, University College London Cancer Institute, London, UK
- Boyue Ding
  - Department of Medical Physics and Biomedical Engineering, University College London, London, UK
- Gary Royle
  - Department of Medical Physics and Biomedical Engineering, University College London, London, UK
- Samuel Aparicio
  - Department of Molecular Oncology, British Columbia Cancer Research Centre, Vancouver, British Columbia, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Nicholas McGranahan
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Genome Evolution Research Group, University College London Cancer Institute, London, UK
- Mariam Jamal-Hanjani
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - University College London Hospitals, London, UK
  - Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK
- Nnennaya Kanu
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Charles Swanton
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
  - University College London Hospitals, London, UK
- Simone Zaccaria
  - Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
9
Du Q. Single-cell genomics breaks new ground in cell cycle detection. Nat Genet 2025; 57:3-5. [PMID: 39614123 DOI: 10.1038/s41588-024-01987-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2024]
Affiliation(s)
- Qian Du
  - Novo Nordisk Foundation Center for Protein Research (CPR), University of Copenhagen, Copenhagen, Denmark
  - Garvan Institute of Medical Research, St Vincent's Clinical School, UNSW Sydney, Sydney, New South Wales, Australia
10
Lu B. Cancer phylogenetic inference using copy number alterations detected from DNA sequencing data. Cancer Pathog Ther 2025; 3:16-29. [PMID: 39872371 PMCID: PMC11764021 DOI: 10.1016/j.cpt.2024.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2024] [Revised: 04/05/2024] [Accepted: 04/15/2024] [Indexed: 01/30/2025]
Abstract
Cancer is an evolutionary process involving the accumulation of diverse somatic mutations and clonal evolution over time. Phylogenetic inference from samples obtained from an individual patient offers a powerful approach to unraveling the intricate evolutionary history of cancer and provides insights that can inform cancer treatment. Somatic copy number alterations (CNAs) are important in cancer evolution and are often used as markers, alone or with other somatic mutations, for phylogenetic inferences, particularly in low-coverage DNA sequencing data. Many phylogenetic inference methods using CNAs detected from bulk or single-cell DNA sequencing data have been developed over the years. However, there have been no systematic reviews on these methods. To summarize the state-of-the-art of the field and inform future development, this review presents a comprehensive survey on the major challenges in inference, different types of methods, and applications of these methods. The challenges are discussed from the aspects of input data, models of evolution, and inference algorithms. The different methods are grouped according to the markers used for inference and the types of the reconstructed trees. The applications include using phylogenetic inference to understand intra-tumor heterogeneity, metastasis, treatment resistance, and early cancer development. This review also sheds light on future directions of cancer phylogenetic inference using CNAs, including the improvement of scalability, the utilization of new types of data, and the development of more realistic models of evolution.
Affiliation(s)
- Bingxin Lu
  - School of Biosciences and Medicine, University of Surrey, Guildford GU2 7XH, UK
  - Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, UK
11
Zhou W, Mumm C, Gan Y, Switzenberg JA, Wang J, De Oliveira P, Kathuria K, Losh SJ, McDonald TL, Bessell B, Van Deynze K, McConnell MJ, Boyle AP, Mills RE. A personalized multi-platform assessment of somatic mosaicism in the human frontal cortex. bioRxiv 2024:2024.12.18.629274. [PMID: 39763954 PMCID: PMC11702624 DOI: 10.1101/2024.12.18.629274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2025]
Abstract
Somatic mutations in individual cells lead to genomic mosaicism, contributing to the intricate regulatory landscape of genetic disorders and cancers. To evaluate and refine the detection of somatic mosaicism across different technologies with personalized donor-specific assembly (DSA), we obtained tissue from the dorsolateral prefrontal cortex (DLPFC) of a post-mortem neurotypical 31-year-old individual. We sequenced bulk DLPFC tissue using Oxford Nanopore Technologies (~60X), NovaSeq (~30X), and linked-read sequencing (~28X). Additionally, we applied Cas9 capture methodology coupled with long-read sequencing (TEnCATS), targeting active transposable elements. We also isolated and amplified DNA from flow-sorted single DLPFC neurons using MALBAC, sequencing 115 of these MALBAC libraries on Nanopore and 94 on NovaSeq. We constructed a haplotype-resolved assembly with a total length of 5.77 Gb and a phase block length of 2.67 Mb (N50) to facilitate cross-platform analysis of somatic genetic variations. We observed an increase in the phasing rate from 11.6% to 38.0% between short-read and long-read technologies. By generating a catalog of phased germline SNVs, CNVs, and TEs from the assembled genome, we applied standard approaches to recall these variants across sequencing technologies. We achieved aggregated recall rates from 97.3% to 99.4% based on long-read bulk tissue data, setting an upper bound for detection limits. Moreover, utilizing haplotype-based analysis from DSA, we achieved a remarkable reduction in false positive somatic calls in bulk tissue, ranging from 14.9% to 72.4%. We developed pipelines leveraging DSA information to enhance somatic large genetic variant calling in long-read single cells. By examining somatic variation using long-reads in 115 individual neurons, we identified 468 candidate somatic heterozygous large deletions (1.5 Mb to 20 Mb), 137 of which intersected with short-read single-cell data. Additionally, we identified 61 putative somatic TEs (60 Alus, one LINE-1) in the single-cell data. Collectively, our analysis spans personalized assembly to single-cell somatic variant calling, providing a comprehensive ab initio ad finem approach and resource in real human tissue.
Affiliation(s)
- Weichen Zhou
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Camille Mumm
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Yanming Gan
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Jessica A. Switzenberg
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Jinhao Wang
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | | | - Kunal Kathuria
- Lieber Institute for Brain Development, Baltimore, MD, USA
| | - Steven J. Losh
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Torrin L. McDonald
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Brandt Bessell
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Kinsey Van Deynze
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | | | - Alan P. Boyle
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Ryan E. Mills
- Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| |
|
12
|
Ma C, Balaban M, Liu J, Chen S, Wilson MJ, Sun CH, Ding L, Raphael BJ. Inferring allele-specific copy number aberrations and tumor phylogeography from spatially resolved transcriptomics. Nat Methods 2024; 21:2239-2247. [PMID: 39478176 PMCID: PMC11621028 DOI: 10.1038/s41592-024-02438-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 09/04/2024] [Indexed: 11/16/2024]
Abstract
Analyzing somatic evolution within a tumor over time and across space is a key challenge in cancer research. Spatially resolved transcriptomics (SRT) measures gene expression at thousands of spatial locations in a tumor, but does not directly reveal genomic aberrations. We introduce CalicoST, an algorithm to simultaneously infer allele-specific copy number aberrations (CNAs) and reconstruct spatial tumor evolution, or phylogeography, from SRT data. CalicoST identifies important classes of CNAs, including copy-neutral loss of heterozygosity and mirrored subclonal CNAs, that are invisible to total copy number analysis. Using nine patients' data from the Human Tumor Atlas Network, CalicoST achieves an average accuracy of 86%, approximately 21% higher than existing methods. CalicoST reconstructs a tumor phylogeography in three-dimensional space for two patients with multiple adjacent slices. CalicoST analysis of multiple SRT slices from a cancerous prostate organ reveals mirrored subclonal CNAs on the two sides of the prostate, forming a bifurcating phylogeography in both genetic and physical space.
Affiliation(s)
- Cong Ma
- Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - Metin Balaban
- Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - Jingxian Liu
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, USA
| | - Siqi Chen
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, USA
| | - Michael J Wilson
- Department of Astrophysical Sciences, Princeton University, Princeton, NJ, USA
| | - Christopher H Sun
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
| | - Li Ding
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA.
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, USA.
- Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA.
- Department of Genetics, Washington University in St. Louis, St. Louis, MO, USA.
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ, USA.
| |
|
13
|
Safinianaini N, De Souza CPE, Roth A, Koptagel H, Toosi H, Lagergren J. CopyMix: Mixture model based single-cell clustering and copy number profiling using variational inference. Comput Biol Chem 2024; 113:108257. [PMID: 39500117 DOI: 10.1016/j.compbiolchem.2024.108257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 08/15/2024] [Accepted: 10/15/2024] [Indexed: 12/15/2024]
Abstract
Investigating tumor heterogeneity using single-cell sequencing technologies is imperative to understand how tumors evolve, since each cell subpopulation harbors a unique set of genomic features that yields a unique phenotype, which is bound to have clinical relevance. Clustering of cells based on copy number data obtained from single-cell DNA sequencing provides an opportunity to identify different tumor cell subpopulations. Accordingly, computational methods have emerged for single-cell copy number profiling and clustering; however, these two tasks have been handled sequentially by applying various ad-hoc pre- and post-processing steps, a procedure vulnerable to introducing clustering artifacts. Our method, CopyMix, avoids these artifacts by applying variational inference to a novel mixture model that jointly infers cell clusters and their underlying copy number profiles. Our probabilistic graphical model is an improved version of the mixture of hidden Markov models, designed specifically for single-cell copy number profiling and clustering. For the evaluation, we used the likelihood-ratio test, CH index, Silhouette score, V-measure, and total variation scores. CopyMix performs well on both biological and simulated data. Our favorable results indicate a considerable potential to obtain clinical impact by using CopyMix in studies of cancer tumor heterogeneity.
Affiliation(s)
- Negar Safinianaini
- Department of Computer Science, Aalto University, Konemiehentie 2, Espoo, 02150, Helsinki, Finland.
| | - Camila P E De Souza
- Department of Statistical and Actuarial Sciences, University of Western Ontario, 1151 Richmond Street, London, N6A 5B7, Ontario, Canada
| | - Andrew Roth
- British Columbia Cancer Agency, 675 West 10th Avenue, Vancouver, V5Z 1L3, BC, Canada; Faculty of Computer Science, University of British Columbia, Building 201-2366 Main Mall, Vancouver, V6T 1Z4, BC, Canada
| | - Hazal Koptagel
- Science for Life Laboratory, Tomtebodavägen 23, Solna, 171 65, Stockholm, Sweden
| | - Hosein Toosi
- Science for Life Laboratory, Tomtebodavägen 23, Solna, 171 65, Stockholm, Sweden
| | - Jens Lagergren
- Science for Life Laboratory, Tomtebodavägen 23, Solna, 171 65, Stockholm, Sweden; Department of Computer Science, KTH, Malvinas v 10, Stockholm, 114 28, Stockholm, Sweden
| |
|
14
|
Yu Z, Liu F, Li Y. scTCA: a hybrid Transformer-CNN architecture for imputation and denoising of scDNA-seq data. Brief Bioinform 2024; 25:bbae577. [PMID: 39523623 PMCID: PMC11551055 DOI: 10.1093/bib/bbae577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Revised: 10/05/2024] [Accepted: 10/29/2024] [Indexed: 11/16/2024] Open
Abstract
Single-cell DNA sequencing (scDNA-seq) has been widely used to unmask tumor copy number alterations (CNAs) at single-cell resolution. Although arm-level CNAs can be accurately detected from single-cell read counts, it is difficult to precisely identify focal CNAs because the read counts have high dimensionality, high sparsity and a low signal-to-noise ratio. This creates a pressing need for reconstructing high-quality scDNA-seq data. We develop a new method called scTCA for imputation and denoising of single-cell read counts, thus aiding in downstream analysis of both arm-level and focal CNAs. scTCA employs hybrid Transformer-CNN architectures to identify local and non-local correlations between genes for precise recovery of the read counts. Unlike conventional Transformers, the Transformer block in scTCA is a two-stage attention module containing a stepwise self-attention layer and a window Transformer, and can efficiently deal with high-dimensional read count data. We showcase the superior performance of scTCA through comparison with state-of-the-art methods on both synthetic and real datasets. The results indicate it is highly effective in imputation and denoising of scDNA-seq data.
Affiliation(s)
- Zhenhua Yu
- School of Information Engineering, Ningxia University, 750021 Ningxia, China
- Ningxia Key Laboratory of Artificial Intelligence and Information Security for Channeling Computing Resources from the East to the West, Ningxia University, 750021 Ningxia, China
| | - Furui Liu
- School of Information Engineering, Ningxia University, 750021 Ningxia, China
| | - Yang Li
- School of Information Engineering, Ningxia University, 750021 Ningxia, China
| |
|
15
|
Foltz SM, Li Y, Yao L, Terekhanova NV, Weerasinghe A, Gao Q, Dong G, Schindler M, Cao S, Sun H, Jayasinghe RG, Fulton RS, Fronick CC, King J, Kohnen DR, Fiala MA, Chen K, DiPersio JF, Vij R, Ding L. Somatic mutation phasing and haplotype extension using linked-reads in multiple myeloma. bioRxiv 2024:2024.08.09.607342. [PMID: 39149342 PMCID: PMC11326269 DOI: 10.1101/2024.08.09.607342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Somatic mutation phasing informs our understanding of cancer-related events, like driver mutations. We generated linked-read whole genome sequencing data for 23 samples across disease stages from 14 multiple myeloma (MM) patients and systematically assigned somatic mutations to haplotypes using linked-reads. Here, we report the reconstructed cancer haplotypes and phase blocks from several MM samples and show how phase block length can be extended by integrating samples from the same individual. We also uncover phasing information in genes frequently mutated in MM, including DIS3, HIST1H1E, KRAS, NRAS, and TP53, phasing 79.4% of 20,705 high-confidence somatic mutations. In some cases, this enabled us to interpret clonal evolution models at higher resolution using pairs of phased somatic mutations. For example, our analysis of one patient suggested that two NRAS hotspot mutations occurred on the same haplotype but were independent events in different subclones. Given sufficient tumor purity and data quality, our framework illustrates how haplotype-aware analysis of somatic mutations in cancer can be beneficial for some cancer cases.
Affiliation(s)
- Steven M. Foltz
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Yize Li
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Lijun Yao
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Nadezhda V. Terekhanova
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Amila Weerasinghe
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Qingsong Gao
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Guanlan Dong
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Moses Schindler
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Song Cao
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Hua Sun
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Reyka G. Jayasinghe
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Robert S. Fulton
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Catrina C. Fronick
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
| | - Justin King
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
| | - Daniel R. Kohnen
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
| | - Mark A. Fiala
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
| | - Ken Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
| | - John F. DiPersio
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, 63110, USA
| | - Ravi Vij
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, 63110, USA
| | - Li Ding
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO, 63108, USA
- Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Department of Genetics, Washington University in St. Louis, St. Louis, MO, 63110, USA
| |
|
16
|
Huang R, Huang X, Tong Y, Yan HYN, Leung SY, Stegle O, Huang Y. Robust analysis of allele-specific copy number alterations from scRNA-seq data with XClone. Nat Commun 2024; 15:6684. [PMID: 39107346 PMCID: PMC11303794 DOI: 10.1038/s41467-024-51026-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 07/27/2024] [Indexed: 08/10/2024] Open
Abstract
Somatic copy number alterations (CNAs) are major mutations that contribute to the development and progression of various cancers. Although a few computational methods have been proposed to detect CNAs from single-cell transcriptomic data, the technical sparsity of such data makes it challenging to identify allele-specific CNAs, particularly in complex clonal structures. In this study, we present a statistical method, XClone, that strengthens the signals of read depth and allelic imbalance by effective smoothing on cell neighborhood and gene coordinate graphs to detect haplotype-aware CNAs from scRNA-seq data. By applying XClone to multiple datasets with challenging compositions, we demonstrated its ability to robustly detect different types of allele-specific CNAs and potentially indicate whole genome duplication, therefore enabling the discovery of corresponding subclones and the dissection of their phenotypic impacts.
Affiliation(s)
- Rongting Huang
- School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China
| | - Xianjie Huang
- School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China
- Center for Translational Stem Cell Biology, Hong Kong Science and Technology Park, Hong Kong SAR, China
| | - Yin Tong
- Department of Pathology, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Queen Mary Hospital, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Helen Y N Yan
- Department of Pathology, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Queen Mary Hospital, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Suet Yi Leung
- Department of Pathology, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Queen Mary Hospital, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- The Jockey Club Centre for Clinical Innovation and Discovery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Centre for PanorOmic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Oliver Stegle
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Yuanhua Huang
- School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China.
- Center for Translational Stem Cell Biology, Hong Kong Science and Technology Park, Hong Kong SAR, China.
- Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong SAR, China.
| |
|
17
|
Weiner S, Li B, Nabavi S. Improved allele-specific single-cell copy number estimation in low-coverage DNA-sequencing. Bioinformatics 2024; 40:btae506. [PMID: 39133157 PMCID: PMC11346770 DOI: 10.1093/bioinformatics/btae506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 07/12/2024] [Accepted: 08/09/2024] [Indexed: 08/13/2024] Open
Abstract
MOTIVATION Advances in whole-genome single-cell DNA sequencing (scDNA-seq) have led to the development of numerous methods for detecting copy number aberrations (CNAs), a key driver of genetic heterogeneity in cancer. While most of these methods are limited to the inference of total copy number, some recent approaches now infer allele-specific CNAs using innovative techniques for estimating allele-frequencies in low coverage scDNA-seq data. However, these existing allele-specific methods are limited in their segmentation strategies, a crucial step in the CNA detection pipeline. RESULTS We present SEACON (Single-cell Estimation of Allele-specific COpy Numbers), an allele-specific copy number profiler for scDNA-seq data. SEACON uses a Gaussian Mixture Model to identify latent copy number states and breakpoints between contiguous segments across cells, filters the segments for high-quality breakpoints using an ensemble technique, and adopts several strategies for tolerating noisy read-depth and allele frequency measurements. Using a wide array of both real and simulated datasets, we show that SEACON derives accurate copy numbers and surpasses existing approaches under numerous experimental conditions, and identify its strengths and weaknesses. AVAILABILITY AND IMPLEMENTATION SEACON is implemented in Python and is freely available open-source from https://github.com/NabaviLab/SEACON and https://doi.org/10.5281/zenodo.12727008.
Affiliation(s)
- Samson Weiner
- School of Computing, University of Connecticut, Storrs, CT 06082, United States
| | - Bingjun Li
- School of Computing, University of Connecticut, Storrs, CT 06082, United States
| | - Sheida Nabavi
- School of Computing, University of Connecticut, Storrs, CT 06082, United States
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06082, United States
| |
|
18
|
Panda A, Suvakov M, Thorvaldsdottir H, Mesirov JP, Robinson JT, Abyzov A. Genome-wide analysis and visualization of copy number with CNVpytor in igv.js. Bioinformatics 2024; 40:btae453. [PMID: 39018173 PMCID: PMC11303504 DOI: 10.1093/bioinformatics/btae453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 05/27/2024] [Accepted: 07/15/2024] [Indexed: 07/19/2024] Open
Abstract
SUMMARY Copy number variation (CNV) and alteration (CNA) analysis is a crucial component in many genomic studies, and its applications span from basic research to clinical diagnostics and personalized medicine. CNVpytor is a tool featuring a read depth-based caller and a combined read depth and B-allele frequency (BAF) based 2D caller to find CNVs and CNAs. The tool stores processed intermediate data and CNV/CNA calls in a compact HDF5 file, the pytor file. Here, we describe a new track in igv.js that utilizes pytor and whole genome variant files as input for on-the-fly read depth and BAF visualization, CNV/CNA calling and analysis. Embedding into HTML pages and Jupyter Notebooks enables convenient remote data access and visualization, simplifying interpretation and analysis of omics data. AVAILABILITY AND IMPLEMENTATION The CNVpytor track is integrated with igv.js and available at https://github.com/igvteam/igv.js. The documentation is available at https://github.com/igvteam/igv.js/wiki/cnvpytor. Usage can be tested in the IGV-Web app at https://igv.org/app and also on https://github.com/abyzovlab/CNVpytor.
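For readers unfamiliar with the B-allele frequency mentioned above, the following is a minimal sketch of the standard definition at a single heterozygous site: the fraction of reads supporting the alternate (B) allele. It is not CNVpytor's implementation; the function name `b_allele_frequency` and the read counts are hypothetical.

```python
def b_allele_frequency(ref_count, alt_count):
    """Fraction of reads at a site that support the alternate (B) allele.

    Returns None when the site has no coverage, since the ratio is undefined.
    """
    depth = ref_count + alt_count
    if depth == 0:
        return None
    return alt_count / depth


# Hypothetical read counts at heterozygous SNPs, for illustration only.
print(b_allele_frequency(15, 15))  # balanced two-copy site
print(b_allele_frequency(20, 10))  # imbalance consistent with a 2:1 allelic ratio
print(b_allele_frequency(30, 0))   # all reads on one allele, as in loss of heterozygosity
```

A balanced diploid site sits near 0.5, so shifts away from 0.5 that are not visible in read depth alone (for example, copy-neutral loss of heterozygosity) are the reason a combined read depth and BAF caller is useful.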
Affiliation(s)
- Arijit Panda
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, United States
| | - Milovan Suvakov
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, United States
| | | | - Jill P Mesirov
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, United States
- Moores Cancer Center, University of California San Diego, La Jolla, CA 92037, United States
| | - James T Robinson
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, United States
| | - Alexej Abyzov
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, United States
| |
|
19
|
Zhang L, Zhou XM, Mallory X. SCCNAInfer: a robust and accurate tool to infer the absolute copy number on scDNA-seq data. Bioinformatics 2024; 40:btae454. [PMID: 39067018 DOI: 10.1093/bioinformatics/btae454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 06/13/2024] [Accepted: 07/26/2024] [Indexed: 07/30/2024]
Abstract
MOTIVATION Copy number alterations (CNAs) play an important role in disease progression, especially in cancer. Single-cell DNA sequencing (scDNA-seq) facilitates the detection of CNAs of each cell that is sequenced at a shallow and uneven coverage. However, the state-of-the-art CNA detection tools based on scDNA-seq are still subject to genome-wide errors due to the wrong estimation of the ploidy. RESULTS We developed SCCNAInfer, a computational tool that utilizes the subclonal signal inside the tumor cells to more accurately infer each cell's ploidy and CNAs. Given the segmentation result of an existing CNA detection method, SCCNAInfer clusters the cells, infers the ploidy of each subclone, refines the read count by bin clustering, and accurately infers the CNAs for each cell. Both simulated and real datasets show that SCCNAInfer consistently improves upon the state-of-the-art CNA detection tools such as Aneufinder, Ginkgo, SCOPE and SeCNV. AVAILABILITY AND IMPLEMENTATION SCCNAInfer is freely available at https://github.com/compbio-mallory/SCCNAInfer. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
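The point that a wrong ploidy estimate causes genome-wide errors can be illustrated with a small sketch. It assumes the common convention that a cell's mean normalized read count corresponds to its ploidy (average copy number); it is not SCCNAInfer's algorithm, and the function name `scale_to_copy_number` and the bin counts are hypothetical.

```python
def scale_to_copy_number(bin_counts, ploidy):
    """Convert per-bin read counts to integer copy numbers for a given ploidy.

    Assumes the mean read count across bins corresponds to `ploidy` copies.
    """
    mean_count = sum(bin_counts) / len(bin_counts)
    if mean_count == 0:
        raise ValueError("bin_counts must contain at least one non-zero count")
    return [round(count / mean_count * ploidy) for count in bin_counts]


# Hypothetical read counts for four genomic bins of one cell.
counts = [100, 100, 150, 50]
print(scale_to_copy_number(counts, 2))  # prints [2, 2, 3, 1]
print(scale_to_copy_number(counts, 4))  # prints [4, 4, 6, 2]
```

The same read counts yield a different absolute copy number in every bin when the assumed ploidy changes, which is why read depth alone cannot settle the ploidy and additional signal, such as the subclonal structure the abstract describes, is needed.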
Affiliation(s)
- Liting Zhang
- Department of Computer Science, Florida State University, Florida 32304, USA
| | - Xin Maizie Zhou
- Department of Biomedical Engineering, Vanderbilt University, Tennessee 37235, USA
| | - Xian Mallory
- Department of Computer Science, Florida State University, Florida 32304, USA
| |
|
20
|
Sashittal P, Chen V, Pasarkar A, Raphael BJ. Joint inference of cell lineage and mitochondrial evolution from single-cell sequencing data. Bioinformatics 2024; 40:i218-i227. [PMID: 38940122 PMCID: PMC11211840 DOI: 10.1093/bioinformatics/btae231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION Eukaryotic cells contain organelles called mitochondria that have their own genome. Most cells contain thousands of mitochondria which replicate, even in nondividing cells, by means of a relatively error-prone process resulting in somatic mutations in their genome. Because of the higher mutation rate compared to the nuclear genome, mitochondrial mutations have been used to track cellular lineage, particularly using single-cell sequencing that measures mitochondrial mutations in individual cells. However, existing methods to infer the cell lineage tree from mitochondrial mutations do not model "heteroplasmy," which is the presence of multiple mitochondrial clones with distinct sets of mutations in an individual cell. Single-cell sequencing data thus provide a mixture of the mitochondrial clones in individual cells, with the ancestral relationships between these clones described by a mitochondrial clone tree. While deconvolution of somatic mutations from a mixture of evolutionarily related genomes has been extensively studied in the context of bulk sequencing of cancer tumor samples, the problem of mitochondrial deconvolution has the additional constraint that the mitochondrial clone tree must be concordant with the cell lineage tree. RESULTS We formalize the problem of inferring a concordant pair of a mitochondrial clone tree and a cell lineage tree from single-cell sequencing data as the Nested Perfect Phylogeny Mixture (NPPM) problem. We derive a combinatorial characterization of the solutions to the NPPM problem, and formulate an algorithm, MERLIN, to solve this problem exactly using a mixed integer linear program. We show on simulated data that MERLIN outperforms existing methods that do not model mitochondrial heteroplasmy nor the concordance between the mitochondrial clone tree and the cell lineage tree. 
We use MERLIN to analyze single-cell whole-genome sequencing data of 5220 cells of a gastric cancer cell line and show that MERLIN infers a more biologically plausible cell lineage tree and mitochondrial clone tree compared to existing methods. AVAILABILITY AND IMPLEMENTATION https://github.com/raphael-group/MERLIN.
Affiliation(s)
- Palash Sashittal
- Department of Computer Science, Princeton University, Princeton, NJ 08540, United States
| | - Viola Chen
- Department of Computer Science, Princeton University, Princeton, NJ 08540, United States
| | - Amey Pasarkar
- Department of Computer Science, Princeton University, Princeton, NJ 08540, United States
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ 08540, United States
| |
|
21
|
Zhang S, Xiao X, Yi Y, Wang X, Zhu L, Shen Y, Lin D, Wu C. Tumor initiation and early tumorigenesis: molecular mechanisms and interventional targets. Signal Transduct Target Ther 2024; 9:149. [PMID: 38890350 PMCID: PMC11189549 DOI: 10.1038/s41392-024-01848-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 04/23/2024] [Accepted: 04/27/2024] [Indexed: 06/20/2024] Open
Abstract
Tumorigenesis is a multistep process, with oncogenic mutations in a normal cell conferring clonal advantage as the initial event. However, despite pervasive somatic mutations and clonal expansion in normal tissues, their transformation into cancer remains a rare event, indicating the presence of additional driver events for progression to an irreversible, highly heterogeneous, and invasive lesion. Recently, researchers have emphasized the mechanisms by which environmental tumor risk factors and epigenetic alterations profoundly influence early clonal expansion and malignant evolution, independently of inducing mutations. Additionally, clonal evolution in tumorigenesis reflects a multifaceted interplay between cell-intrinsic identities and various cell-extrinsic factors that exert selective pressures to either restrain uncontrolled proliferation or allow specific clones to progress into tumors. However, the mechanisms by which driver events both induce intrinsic cellular competency and remodel environmental stress to facilitate malignant transformation are not fully understood. In this review, we summarize the genetic, epigenetic, and external driver events, and their effects on the co-evolution of the transformed cells and their ecosystem during tumor initiation and early malignant evolution. A deeper understanding of the earliest molecular events holds promise for translational applications, such as predicting which individuals are at high risk of tumors and developing strategies to intercept malignant transformation.
Affiliation(s)
- Shaosen Zhang
- Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Key Laboratory of Cancer Genomic Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
| | - Xinyi Xiao
- Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Key Laboratory of Cancer Genomic Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
| | - Yonglin Yi
- Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Key Laboratory of Cancer Genomic Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
| | - Xinyu Wang
- Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Key Laboratory of Cancer Genomic Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Lingxuan Zhu
- Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Key Laboratory of Cancer Genomic Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Changping Laboratory, 100021, Beijing, China
- Yanrong Shen
- Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Key Laboratory of Cancer Genomic Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China
- Dongxin Lin
- Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China.
- Key Laboratory of Cancer Genomic Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China.
- Changping Laboratory, 100021, Beijing, China.
- Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, 211166, China.
- Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Guangzhou, 510060, China.
- Chen Wu
- Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China.
- Key Laboratory of Cancer Genomic Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021, Beijing, China.
- Changping Laboratory, 100021, Beijing, China.
- Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, 211166, China.
- CAMS Oxford Institute, Chinese Academy of Medical Sciences, 100006, Beijing, China.
22
Myers MA, Arnold BJ, Bansal V, Balaban M, Mullen KM, Zaccaria S, Raphael BJ. HATCHet2: clone- and haplotype-specific copy number inference from bulk tumor sequencing data. Genome Biol 2024; 25:130. [PMID: 38773520 PMCID: PMC11110434 DOI: 10.1186/s13059-024-03267-x] [Received: 07/13/2023] [Accepted: 05/03/2024] [Indexed: 05/24/2024] Open
Abstract
Bulk DNA sequencing of multiple samples from the same tumor is becoming common, yet most methods to infer copy-number aberrations (CNAs) from this data analyze individual samples independently. We introduce HATCHet2, an algorithm to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 extends the earlier HATCHet method by improving identification of focal CNAs and introducing a novel statistic, the minor haplotype B-allele frequency (mhBAF), that enables identification of mirrored-subclonal CNAs. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 10 prostate cancer patients reveals previously unreported mirrored-subclonal CNAs affecting cancer genes.
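A minimal sketch may help make the idea of a mirrored-subclonal CNA concrete. It is not HATCHet2's mhBAF estimator: the function names, the toy read counts, and the assumption that SNPs are already phased into haplotypes A and B are all illustrative. It only shows why a B-allele frequency folded independently in each sample hides the case where two samples carry imbalances on opposite haplotypes, whereas a BAF computed with the same phasing in every sample preserves it.

```python
def folded_baf(a_reads, b_reads):
    """Per-sample BAF folded into [0, 0.5]; discards which haplotype is minor."""
    frac = b_reads / (a_reads + b_reads)
    return min(frac, 1.0 - frac)


def haplotype_baf(a_reads, b_reads):
    """Fraction of reads from haplotype B, using one phasing for all samples."""
    return b_reads / (a_reads + b_reads)


def is_mirrored(baf_x, baf_y, tol=0.05):
    """True if two phased BAFs deviate from 0.5 in opposite directions."""
    dx, dy = baf_x - 0.5, baf_y - 0.5
    return dx * dy < 0 and abs(dx) > tol and abs(dy) > tol


# Hypothetical phased read counts (haplotype A, haplotype B) for one segment.
samples = {
    "sample_1": (200, 100),  # imbalance favouring haplotype A
    "sample_2": (100, 200),  # imbalance favouring haplotype B
}

folded = {name: folded_baf(a, b) for name, (a, b) in samples.items()}
phased = {name: haplotype_baf(a, b) for name, (a, b) in samples.items()}

# The folded values are indistinguishable; the phased values are not.
print(folded)
print(phased)
print(is_mirrored(phased["sample_1"], phased["sample_2"]))
```

In this toy case both samples have the same folded BAF, so a per-sample analysis would treat them as carrying the same event; only the phased values show that different haplotypes are affected.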
Affiliation(s)
- Matthew A Myers
- Department of Computer Science, Princeton University, Princeton, USA
- Brian J Arnold
- Center for Statistics and Machine Learning, Princeton University, Princeton, USA
- Vineet Bansal
- Princeton Research Computing, Princeton University, Princeton, NJ, USA
- Metin Balaban
- Department of Computer Science, Princeton University, Princeton, USA
- Katelyn M Mullen
- Human Oncology & Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Simone Zaccaria
- Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK.
23
Sun C, Kathuria K, Emery SB, Kim B, Burbulis IE, Shin JH, Weinberger DR, Moran JV, Kidd JM, Mills RE, McConnell MJ. Mapping recurrent mosaic copy number variation in human neurons. Nat Commun 2024; 15:4220. [PMID: 38760338 PMCID: PMC11101435 DOI: 10.1038/s41467-024-48392-0] [Received: 03/03/2023] [Accepted: 04/29/2024] [Indexed: 05/19/2024] Open
Abstract
When somatic cells acquire complex karyotypes, they often are removed by the immune system. Mutant somatic cells that evade immune surveillance can lead to cancer. Neurons with complex karyotypes arise during neurotypical brain development, but neurons are almost never the origin of brain cancers. Instead, somatic mutations in neurons can bring about neurodevelopmental disorders, and contribute to the polygenic landscape of neuropsychiatric and neurodegenerative disease. A subset of human neurons harbors idiosyncratic copy number variants (CNVs, "CNV neurons"), but previous analyses of CNV neurons are limited by relatively small sample sizes. Here, we develop an allele-based validation approach, SCOVAL, to corroborate or reject read-depth based CNV calls in single human neurons. We apply this approach to 2,125 frontal cortical neurons from a neurotypical human brain. SCOVAL identifies 226 CNV neurons, which include a subclass of 65 CNV neurons with highly aberrant karyotypes containing whole or substantial losses on multiple chromosomes. Moreover, we find that CNV location appears to be nonrandom. Recurrent regions of neuronal genome rearrangement contain fewer, but longer, genes.
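The general logic of checking a read-depth call against allelic evidence can be sketched as follows. This is an illustrative assumption, not SCOVAL's actual statistic: the function name, thresholds, and input encoding are hypothetical. The idea is that a true single-copy loss in one cell should leave reads from only one haplotype at heterozygous SNPs inside the region, so seeing both haplotypes at many sites argues against the call. Real single-cell data also suffer from allele dropout, which this toy check ignores.

```python
def supports_deletion(hap_calls, max_minor_fraction=0.1, min_sites=5):
    """Check a candidate single-copy loss against phased SNP observations.

    hap_calls: list of "A" or "B", the haplotype observed at each covered
    heterozygous SNP inside the candidate region (hypothetical encoding).
    Returns True to corroborate the call, False to reject it, and None
    when too few informative sites are covered to decide.
    """
    if len(hap_calls) < min_sites:
        return None
    n_a = hap_calls.count("A")
    n_b = len(hap_calls) - n_a
    minor_fraction = min(n_a, n_b) / len(hap_calls)
    # A one-copy region should show (almost) a single haplotype.
    return minor_fraction <= max_minor_fraction


print(supports_deletion(list("AAAAAAAAAA")))  # one haplotype only
print(supports_deletion(list("ABABABABAB")))  # both haplotypes present
print(supports_deletion(list("AB")))          # too few sites
```

Returning None for sparsely covered regions keeps "no evidence" separate from "evidence against", which matters when per-cell coverage is low.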
Affiliation(s)
- Chen Sun
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Kunal Kathuria
- Lieber Institute for Brain Development, 855 North Wolfe Street, Baltimore, MD, 21205, USA
- Sarah B Emery
- Department of Human Genetics, University of Michigan Medical School, 1241 East Catherine Street, Ann Arbor, MI, 48109, USA
- ByungJun Kim
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Ian E Burbulis
- Department of Biochemistry and Molecular Genetics, University of Virginia, School of Medicine, Charlottesville, VA, 22902, USA
- Facultad de Medicina y Ciencia, Universidad San Sebastián, Sede de la Patagonia, Puerto Montt, Chile
- Joo Heon Shin
- Lieber Institute for Brain Development, 855 North Wolfe Street, Baltimore, MD, 21205, USA
- Daniel R Weinberger
- Lieber Institute for Brain Development, 855 North Wolfe Street, Baltimore, MD, 21205, USA
- Department of Psychiatry and Behavioral Sciences and Neuroscience, Johns Hopkins School of Medicine, 600 North Wolfe Street, Baltimore, MD, 21287, USA
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, 733 North Broadway, Baltimore, MD, 21230, USA
- John V Moran
- Department of Human Genetics, University of Michigan Medical School, 1241 East Catherine Street, Ann Arbor, MI, 48109, USA
- Department of Internal Medicine, University of Michigan Medical School, 1500 East Medical Center Drive, Ann Arbor, MI, 48109, USA
- Jeffrey M Kidd
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Department of Human Genetics, University of Michigan Medical School, 1241 East Catherine Street, Ann Arbor, MI, 48109, USA
- Ryan E Mills
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA.
- Department of Human Genetics, University of Michigan Medical School, 1241 East Catherine Street, Ann Arbor, MI, 48109, USA.
- Michael J McConnell
- Lieber Institute for Brain Development, 855 North Wolfe Street, Baltimore, MD, 21205, USA.
24
Kurt S, Chen M, Toosi H, Chen X, Engblom C, Mold J, Hartman J, Lagergren J. CopyVAE: a variational autoencoder-based approach for copy number variation inference using single-cell transcriptomics. Bioinformatics 2024; 40:btae284. [PMID: 38676578 PMCID: PMC11087824 DOI: 10.1093/bioinformatics/btae284] [Received: 11/21/2023] [Revised: 03/06/2024] [Accepted: 04/25/2024] [Indexed: 04/29/2024] Open
Abstract
MOTIVATION: Copy number variations (CNVs) are common genetic alterations in tumour cells. The delineation of CNVs holds promise for enhancing our comprehension of cancer progression. Moreover, accurate inference of CNVs from single-cell sequencing data is essential for unravelling intratumoral heterogeneity. However, existing inference methods face limitations in resolution and sensitivity. RESULTS: To address these challenges, we present CopyVAE, a deep learning framework based on a variational autoencoder architecture. Through experiments, we demonstrated that CopyVAE can accurately and reliably detect CNVs from data obtained using single-cell RNA sequencing. CopyVAE surpasses existing methods in terms of sensitivity and specificity. We also discussed CopyVAE's potential to advance our understanding of genetic alterations and their impact on disease advancement. AVAILABILITY AND IMPLEMENTATION: CopyVAE is implemented and freely available under MIT license at https://github.com/kurtsemih/copyVAE.
Affiliation(s)
- Semih Kurt
- School of EECS and SciLifeLab, KTH Royal Institute of Technology, Stockholm, 100 44, Sweden
- Mandi Chen
- School of EECS and SciLifeLab, KTH Royal Institute of Technology, Stockholm, 100 44, Sweden
- Hosein Toosi
- School of EECS and SciLifeLab, KTH Royal Institute of Technology, Stockholm, 100 44, Sweden
- Xinsong Chen
- Department of Oncology and Pathology, Karolinska Institutet, Solna, 171 77, Sweden
- Camilla Engblom
- Department of Cell and Molecular Biology, Karolinska Institutet, Solna, 171 77, Sweden
- Jeff Mold
- Department of Cell and Molecular Biology, Karolinska Institutet, Solna, 171 77, Sweden
- Johan Hartman
- Department of Oncology and Pathology, Karolinska Institutet, Solna, 171 77, Sweden
- Department of Clinical Pathology and Cytology, Karolinska University Laboratory, Solna, 171 76, Sweden
- Jens Lagergren
- School of EECS and SciLifeLab, KTH Royal Institute of Technology, Stockholm, 100 44, Sweden
25
Kuzmin E, Baker TM, Lesluyes T, Monlong J, Abe KT, Coelho PP, Schwartz M, Del Corpo J, Zou D, Morin G, Pacis A, Yang Y, Martinez C, Barber J, Kuasne H, Li R, Bourgey M, Fortier AM, Davison PG, Omeroglu A, Guiot MC, Morris Q, Kleinman CL, Huang S, Gingras AC, Ragoussis J, Bourque G, Van Loo P, Park M. Evolution of chromosome-arm aberrations in breast cancer through genetic network rewiring. Cell Rep 2024; 43:113988. [PMID: 38517886 PMCID: PMC11063629 DOI: 10.1016/j.celrep.2024.113988] [Received: 09/01/2023] [Revised: 02/02/2024] [Accepted: 03/07/2024] [Indexed: 03/24/2024] Open
Abstract
The basal breast cancer subtype is enriched for triple-negative breast cancer (TNBC) and displays consistent large chromosomal deletions. Here, we characterize evolution and maintenance of chromosome 4p (chr4p) loss in basal breast cancer. Analysis of The Cancer Genome Atlas data shows recurrent deletion of chr4p in basal breast cancer. Phylogenetic analysis of a panel of 23 primary tumor/patient-derived xenograft basal breast cancers reveals early evolution of chr4p deletion. Mechanistically, we show that chr4p loss is associated with enhanced proliferation. Gene function studies identify an unknown gene within chr4p, C4orf19, which suppresses proliferation when overexpressed; it is a member of the PDCD10-GCKIII kinase module, and we name it PGCKA1. Genome-wide pooled overexpression screens using a barcoded library of human open reading frames identify chromosomal regions, including chr4p, that suppress proliferation when overexpressed in a context-dependent manner, implicating network interactions. Together, these results shed light on the early emergence of complex aneuploid karyotypes involving chr4p and adaptive landscapes shaping breast cancer genomes.
Affiliation(s)
- Elena Kuzmin
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada.
- Jean Monlong
- Department of Human Genetics, McGill University, Montreal, QC H3A 0C7, Canada; McGill Genome Centre, Montreal, QC H3A 0G1, Canada
- Kento T Abe
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Sinai Health, Toronto, ON M5G 1X5, Canada
- Paula P Coelho
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada
- Michael Schwartz
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada
- Joseph Del Corpo
- Department of Biology, Concordia University, Montreal, QC H4B 1R6, Canada
- Dongmei Zou
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada
- Genevieve Morin
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada
- Alain Pacis
- McGill Genome Centre, Montreal, QC H3A 0G1, Canada; Canadian Centre for Computational Genomics (C3G), McGill University, Montreal, QC H3A 0G1, Canada
- Yang Yang
- Department of Human Genetics, McGill University, Montreal, QC H3A 0C7, Canada
- Constanza Martinez
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada; Department of Pathology, McGill University, Montreal, QC H3A 2B4, Canada; Gerald Bronfman Department of Oncology, McGill University, Montreal, QC H4A 3T2, Canada
- Jarrett Barber
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Vector Institute, Toronto, ON M5G 1M1, Canada; Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada; Computational and Systems Biology, Sloan Kettering Institute, New York City, NY 10065, USA
- Hellen Kuasne
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada
- Rui Li
- Department of Human Genetics, McGill University, Montreal, QC H3A 0C7, Canada; McGill Genome Centre, Montreal, QC H3A 0G1, Canada
- Mathieu Bourgey
- McGill Genome Centre, Montreal, QC H3A 0G1, Canada; Canadian Centre for Computational Genomics (C3G), McGill University, Montreal, QC H3A 0G1, Canada
- Anne-Marie Fortier
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada
- Peter G Davison
- Department of Surgery, McGill University, Montreal, QC H3G 1A4, Canada; McGill University Health Centre, Montreal, QC H4A 3J1, Canada
- Atilla Omeroglu
- Department of Pathology, McGill University, Montreal, QC H3A 2B4, Canada
- Quaid Morris
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Vector Institute, Toronto, ON M5G 1M1, Canada; Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada; Computational and Systems Biology, Sloan Kettering Institute, New York City, NY 10065, USA; Gerstner Sloan Kettering Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Claudia L Kleinman
- Department of Human Genetics, McGill University, Montreal, QC H3A 0C7, Canada; Lady Davis Institute for Medical Research, Montreal, QC H3T 1E2, Canada
- Sidong Huang
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada; Department of Human Genetics, McGill University, Montreal, QC H3A 0C7, Canada
- Anne-Claude Gingras
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Sinai Health, Toronto, ON M5G 1X5, Canada
- Jiannis Ragoussis
- Department of Human Genetics, McGill University, Montreal, QC H3A 0C7, Canada; McGill Genome Centre, Montreal, QC H3A 0G1, Canada
- Guillaume Bourque
- Department of Human Genetics, McGill University, Montreal, QC H3A 0C7, Canada; McGill Genome Centre, Montreal, QC H3A 0G1, Canada; Canadian Centre for Computational Genomics (C3G), McGill University, Montreal, QC H3A 0G1, Canada
- Peter Van Loo
- The Francis Crick Institute, NW1 1AT London, UK; Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Morag Park
- Rosalind and Morris Goodman Cancer Institute, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada; Gerald Bronfman Department of Oncology, McGill University, Montreal, QC H4A 3T2, Canada.
26
Li R, Shi F, Song L, Yu Z. scGAL: unmask tumor clonal substructure by jointly analyzing independent single-cell copy number and scRNA-seq data. BMC Genomics 2024; 25:393. [PMID: 38649804 PMCID: PMC11034052 DOI: 10.1186/s12864-024-10319-w] [Received: 11/09/2023] [Accepted: 04/17/2024] [Indexed: 04/25/2024] Open
Abstract
BACKGROUND: Accurately deciphering clonal copy number substructure can provide insights into the evolutionary mechanism of cancer, and clustering single-cell copy number profiles has become an effective means to unmask intra-tumor heterogeneity (ITH). However, copy numbers inferred from single-cell DNA sequencing (scDNA-seq) data are error-prone due to technical confounding factors such as amplification bias and allele dropout, which makes it difficult to precisely identify the ITH. RESULTS: We introduce a hybrid model called scGAL to infer clonal copy number substructure. It combines an autoencoder with a generative adversarial network to jointly analyze independent single-cell copy number profiles and gene expression data from the same cell line. Under an adversarial learning framework, scGAL exploits complementary information from gene expression data to relieve the effects of noise in copy number data, and learns latent representations of scDNA-seq cells for accurate inference of the ITH. Evaluation results on three real cancer datasets suggest scGAL is able to accurately infer clonal architecture and surpasses other similar methods. In addition, assessment of scGAL on various simulated datasets demonstrates its high robustness against changes in data size and distribution. scGAL can be accessed at: https://github.com/zhyu-lab/scgal. CONCLUSIONS: Joint analysis of independent single-cell copy number and gene expression data from the same cell line can effectively exploit complementary information from individual omics, and thus gives a more refined indication of clonal copy number substructure.
Affiliation(s)
- Ruixiang Li
- School of Information Engineering, Ningxia University, Yinchuan, 750021, China
- Fangyuan Shi
- School of Information Engineering, Ningxia University, Yinchuan, 750021, China
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
- Lijuan Song
- School of Information Engineering, Ningxia University, Yinchuan, 750021, China
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
- Zhenhua Yu
- School of Information Engineering, Ningxia University, Yinchuan, 750021, China.
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China.
27
Liu F, Shi F, Du F, Cao X, Yu Z. CoT: a transformer-based method for inferring tumor clonal copy number substructure from scDNA-seq data. Brief Bioinform 2024; 25:bbae187. [PMID: 38670159 PMCID: PMC11052634 DOI: 10.1093/bib/bbae187] [Received: 12/06/2023] [Revised: 03/08/2024] [Accepted: 04/16/2024] [Indexed: 04/28/2024] Open
Abstract
Single-cell DNA sequencing (scDNA-seq) has been an effective means to unscramble intra-tumor heterogeneity, while joint inference of tumor clones and their respective copy number profiles remains a challenging task due to the noisy nature of scDNA-seq data. We introduce a new bioinformatics method called CoT for deciphering clonal copy number substructure. The backbone of CoT is a Copy number Transformer autoencoder that leverages a multi-head attention mechanism to explore correlations between different genomic regions, and thus captures global features to create latent embeddings for the cells. CoT makes it convenient to first infer cell subpopulations based on the learned embeddings, and then estimate single-cell copy numbers through joint analysis of read count data for the cells belonging to the same cluster. This exploitation of clonal substructure information in copy number analysis helps to alleviate the effect of read count non-uniformity and yields robust estimates of the tumor copy numbers. Performance evaluation on synthetic and real datasets shows that CoT outperforms state-of-the-art methods and is highly useful for deciphering clonal copy number substructure.
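The second step described above, estimating copy numbers jointly for cells in the same cluster, can be illustrated with a deliberately simple sketch. This is not CoT's estimator: the function name, the toy counts, the cluster labels, and the assumption that the median bin of each pooled profile sits at a baseline ploidy of 2 are all hypothetical. It only shows how pooling read counts across cells of one cluster smooths per-cell noise before copy numbers are called.

```python
from statistics import median


def pooled_copy_numbers(read_counts, labels, baseline_ploidy=2):
    """Pool bin counts per cluster and scale them to integer copy numbers.

    read_counts: one list of per-bin read counts for each cell.
    labels: cluster label of each cell (assumed already inferred).
    Assumes the median pooled bin of each cluster is at baseline ploidy.
    """
    pooled_by_cluster = {}
    for counts, label in zip(read_counts, labels):
        pooled = pooled_by_cluster.setdefault(label, [0] * len(counts))
        for i, c in enumerate(counts):
            pooled[i] += c

    result = {}
    for label, pooled in pooled_by_cluster.items():
        ref = median(pooled)
        result[label] = [round(baseline_ploidy * c / ref) for c in pooled]
    return result


# Toy data: four cells, five genomic bins, two clusters.
read_counts = [
    [10, 12, 9, 21, 11],
    [11, 9, 10, 19, 10],
    [10, 5, 11, 9, 10],
    [9, 4, 10, 12, 11],
]
labels = [0, 0, 1, 1]

print(pooled_copy_numbers(read_counts, labels))
```

In the toy data, cluster 0 shows a gain in the fourth bin and cluster 1 a loss in the second bin, even though individual cells fluctuate around those values.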
Affiliation(s)
- Furui Liu
- School of Information Engineering, Ningxia University, 750021, Ningxia, China
- Fangyuan Shi
- School of Information Engineering, Ningxia University, 750021, Ningxia, China
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, 750021, Ningxia, China
- Fang Du
- School of Information Engineering, Ningxia University, 750021, Ningxia, China
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, 750021, Ningxia, China
- Xiangmei Cao
- Basic Medical School, Ningxia Medical University, 750001, Ningxia, China
- Zhenhua Yu
- School of Information Engineering, Ningxia University, 750021, Ningxia, China
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, 750021, Ningxia, China
28
Shi H, Williams MJ, Satas G, Weiner AC, McPherson A, Shah SP. Allele-specific transcriptional effects of subclonal copy number alterations enable genotype-phenotype mapping in cancer cells. Nat Commun 2024; 15:2482. [PMID: 38509111 PMCID: PMC10954741 DOI: 10.1038/s41467-024-46710-0] [Received: 01/18/2023] [Accepted: 03/01/2024] [Indexed: 03/22/2024] Open
Abstract
Subclonal copy number alterations are a prevalent feature in tumors with high chromosomal instability and result in heterogeneous cancer cell populations with distinct phenotypes. However, the extent to which subclonal copy number alterations contribute to clone-specific phenotypes remains poorly understood. We develop TreeAlign, which computationally integrates independently sampled single-cell DNA and RNA sequencing data from the same cell population. TreeAlign accurately encodes dosage effects from subclonal copy number alterations and the impact of allelic imbalance on allele-specific transcription, and it obviates the need to define genotypic clones from a phylogeny a priori, leading to highly granular definitions of clones with distinct expression programs. These improvements enable clone-clone gene expression comparisons with higher resolution and identification of expression programs that are genomically independent. Our approach sets the stage for dissecting the relative contribution of fixed genomic alterations and dynamic epigenetic processes on gene expression programs in cancer.
Affiliation(s)
- Hongyu Shi
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Gerstner Sloan Kettering Graduate School of Biomedical Sciences, New York, NY, USA
- Marc J Williams
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Gryte Satas
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Adam C Weiner
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Andrew McPherson
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Sohrab P Shah
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
29
Carroll C, Manaprasertsak A, Boffelli Castro A, van den Bos H, Spierings DC, Wardenaar R, Bukkuri A, Engström N, Baratchart E, Yang M, Biloglav A, Cornwallis CK, Johansson B, Hagerling C, Arsenian-Henriksson M, Paulsson K, Amend SR, Mohlin S, Foijer F, McIntyre A, Pienta KJ, Hammarlund EU. Drug-resilient Cancer Cell Phenotype Is Acquired via Polyploidization Associated with Early Stress Response Coupled to HIF2α Transcriptional Regulation. Cancer Res Commun 2024; 4:691-705. [PMID: 38385626 PMCID: PMC10919208 DOI: 10.1158/2767-9764.crc-23-0396] [Received: 09/18/2023] [Revised: 12/27/2023] [Accepted: 02/16/2024] [Indexed: 02/23/2024]
Abstract
Therapeutic resistance and recurrence remain core challenges in cancer therapy. How therapy resistance arises is currently not fully understood, with tumors surviving via multiple alternative routes. Here, we demonstrate that a subset of cancer cells survives therapeutic stress by entering a transient state characterized by whole-genome doubling. At the onset of the polyploidization program, we identified an upregulation of key transcriptional regulators, including the early stress-response protein AP-1 and normoxic stabilization of HIF2α. We found altered chromatin accessibility, ablated expression of retinoblastoma protein (RB1), and enrichment of AP-1 motif accessibility. We demonstrate that AP-1 and HIF2α regulate a therapy-resilient and survivor phenotype in cancer cells. Consistent with this, genetic or pharmacologic targeting of AP-1 and HIF2α reduced the number of surviving cells following chemotherapy treatment. The role of AP-1 and HIF2α in stress response by polyploidy suggests a novel avenue for tackling chemotherapy-induced resistance in cancer. SIGNIFICANCE: In response to cisplatin treatment, some surviving cancer cells undergo whole-genome duplications without mitosis, which represents a mechanism of drug resistance. This study presents mechanistic data to implicate AP-1 and HIF2α signaling in the formation of this surviving cell phenotype. The results open a new avenue for targeting drug-resistant cells.
Affiliation(s)
- Christopher Carroll
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
- Auraya Manaprasertsak
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
- Arthur Boffelli Castro
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
- Hilda van den Bos
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, the Netherlands
- Diana C.J. Spierings
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, the Netherlands
- René Wardenaar
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, the Netherlands
- Anuraag Bukkuri
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
- Niklas Engström
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
- Etienne Baratchart
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
- Minjun Yang
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Andrea Biloglav
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Bertil Johansson
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Catharina Hagerling
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
- Marie Arsenian-Henriksson
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Department of Microbiology, Tumor and Cell Biology (MTC), Karolinska Institutet, Biomedicum, Stockholm, Sweden
- Kajsa Paulsson
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Sarah R. Amend
- Cancer Ecology Center, the Brady Urological Institute, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Sofie Mohlin
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
- Division of Pediatrics, Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Floris Foijer
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, the Netherlands
| | - Alan McIntyre
- Hypoxia and Acidosis Group, Nottingham Breast Cancer Research Centre, School of Medicine, Biodiscovery Institute, University of Nottingham, Nottingham, United Kingdom
| | - Kenneth J. Pienta
- Cancer Ecology Center, the Brady Urological Institute, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Emma U. Hammarlund
- Department of Experimental Medical Science, Lund University, Lund, Sweden
- Lund Stem Cell Center (SCC), Lund University, Lund, Sweden
- Lund University Cancer Center (LUCC), Lund University, Lund, Sweden
30
Schneider MP, Cullen AE, Pangonyte J, Skelton J, Major H, Van Oudenhove E, Garcia MJ, Chaves Urbano B, Piskorz AM, Brenton JD, Macintyre G, Markowetz F. scAbsolute: measuring single-cell ploidy and replication status. Genome Biol 2024; 25:62. [PMID: 38438920 PMCID: PMC10910719 DOI: 10.1186/s13059-024-03204-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 02/22/2024] [Indexed: 03/06/2024]
Abstract
Cancer cells often exhibit DNA copy number aberrations and can vary widely in their ploidy. Correct estimation of the ploidy of single-cell genomes is paramount for downstream analysis. Based only on single-cell DNA sequencing information, scAbsolute achieves accurate and unbiased measurement of single-cell ploidy and replication status, including whole-genome duplications. We demonstrate scAbsolute's capabilities using experimental cell multiplets, a FUCCI cell cycle expression system, and a benchmark against state-of-the-art methods. scAbsolute provides a robust foundation for single-cell DNA sequencing analysis across different technologies and has the potential to enable improvements in a number of downstream analyses.
Affiliation(s)
- Michael P Schneider
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Amy E Cullen
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Justina Pangonyte
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Jason Skelton
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Harvey Major
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Elke Van Oudenhove
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Maria J Garcia
  - Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Anna M Piskorz
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- James D Brenton
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Geoff Macintyre
  - Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Florian Markowetz
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
31
Bai X, Duren Z, Wan L, Xia LC. Joint inference of clonal structure using single-cell genome and transcriptome sequencing data. NAR Genom Bioinform 2024; 6:lqae017. [PMID: 38486887 PMCID: PMC10939367 DOI: 10.1093/nargab/lqae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 11/19/2023] [Accepted: 01/29/2024] [Indexed: 03/17/2024]
Abstract
Recent advances in high-throughput single-cell genome (scDNA) and transcriptome (scRNA) sequencing technologies have enabled cell-resolved investigation of tissue clones. However, it remains challenging to cluster and couple single cells for heterogeneous scRNA and scDNA data generated from the same specimen. In this study, we present a computational framework called CCNMF, which employs a novel Coupled-Clone Non-negative Matrix Factorization technique to jointly infer clonal structure for matched scDNA and scRNA data. CCNMF couples multi-omics single cells by linking copy number and gene expression profiles through their general concordance. It successfully resolved the underlying coexisting clones with high correlations between the clonal genome and transcriptome from the same specimen. We validated that CCNMF can achieve high accuracy and robustness using both simulated benchmarks and real-world applications, including a mixture of ovarian cancer cell lines, a gastric cancer cell line, and a primary gastric cancer. In summary, CCNMF provides a powerful tool for integrating multi-omics single-cell data, enabling simultaneous resolution of genomic and transcriptomic clonal architecture. This computational framework facilitates the understanding of how cellular gene expression changes in conjunction with clonal genome alterations, shedding light on the genomic differences among subclones that contribute to tumor evolution.
Affiliation(s)
- Xiangqi Bai
  - Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- Zhana Duren
  - Center for Human Genetics and Department of Genetics and Biochemistry, Clemson University, Greenwood, SC 29646, USA
- Lin Wan
  - NCMIS, LSC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Li C Xia
  - Department of Statistics and Financial Mathematics, School of Mathematics, South China University of Technology, Guangzhou, 510006, China
32
Qin F, Cai G, Amos CI, Xiao F. A statistical learning method for simultaneous copy number estimation and subclone clustering with single-cell sequencing data. Genome Res 2024; 34:85-93. [PMID: 38290978 PMCID: PMC10903939 DOI: 10.1101/gr.278098.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 01/08/2024] [Indexed: 02/01/2024]
Abstract
The availability of single-cell sequencing (SCS) enables us to assess intra-tumor heterogeneity and identify cellular subclones without the confounding effect of mixed cells. Copy number aberrations (CNAs) have been commonly used to identify subclones in SCS data using various clustering methods, as cells comprising a subpopulation are found to share a genetic profile. However, currently available methods may generate spurious results (e.g., falsely identified variants) in the procedure of CNA detection, thereby diminishing the accuracy of subclone identification within a large, complex cell population. In this study, we developed a subclone clustering method based on a fused lasso model, referred to as FLCNA, which can simultaneously detect CNAs in single-cell DNA sequencing (scDNA-seq) data. Spike-in simulations were conducted to evaluate the clustering and CNA detection performance of FLCNA, benchmarking it against existing copy number estimation methods (SCOPE, HMMcopy) in combination with commonly used clustering methods. Application of FLCNA to a scDNA-seq data set of breast cancer revealed different genomic variation patterns in neoadjuvant chemotherapy-treated samples and pretreated samples. We show that FLCNA is a practical and powerful method for subclone identification and CNA detection with scDNA-seq data.
Affiliation(s)
- Fei Qin
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina 29208, USA
- Guoshuai Cai
  - Department of Environmental Health Science, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina 29208, USA
- Christopher I Amos
  - Department of Quantitative Sciences, Baylor College of Medicine, Houston, Texas 77030, USA
- Feifei Xiao
  - Department of Biostatistics, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, Florida 32603, USA
33
Baker TM, Waise S, Tarabichi M, Van Loo P. Aneuploidy and complex genomic rearrangements in cancer evolution. Nat Cancer 2024; 5:228-239. [PMID: 38286829 PMCID: PMC7616040 DOI: 10.1038/s43018-023-00711-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 09/12/2022] [Accepted: 12/14/2023] [Indexed: 01/31/2024]
Abstract
Mutational processes that alter large genomic regions occur frequently in developing tumors. They range from simple copy number gains and losses to the shattering and reassembly of entire chromosomes. These catastrophic events, such as chromothripsis, chromoplexy and the formation of extrachromosomal DNA, affect the expression of many genes and therefore have a substantial effect on the fitness of the cells in which they arise. In this review, we cover large genomic alterations, the mechanisms that cause them and their effect on tumor development and evolution.
Affiliation(s)
- Toby M Baker
  - The Francis Crick Institute, London, UK
  - Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Sara Waise
  - The Francis Crick Institute, London, UK
  - Cancer Sciences Unit, University of Southampton, Southampton, UK
- Maxime Tarabichi
  - The Francis Crick Institute, London, UK
  - Institute for Interdisciplinary Research (IRIBHM), Université Libre de Bruxelles, Brussels, Belgium
- Peter Van Loo
  - The Francis Crick Institute, London, UK
  - Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
34
Antonello A, Bergamin R, Calonaci N, Househam J, Milite S, Williams MJ, Anselmi F, d'Onofrio A, Sundaram V, Sosinsky A, Cross WCH, Caravagna G. Computational validation of clonal and subclonal copy number alterations from bulk tumor sequencing using CNAqc. Genome Biol 2024; 25:38. [PMID: 38297376 PMCID: PMC10832148 DOI: 10.1186/s13059-024-03170-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 01/10/2024] [Indexed: 02/02/2024]
Abstract
Copy number alterations (CNAs) are among the most important genetic events in cancer, but their detection from sequencing data is challenging because of unknown sample purity, tumor ploidy, and general intra-tumor heterogeneity. Here, we present CNAqc, an evolution-inspired method to perform the computational validation of clonal and subclonal CNAs detected from bulk DNA sequencing. CNAqc is validated using single-cell data and simulations, is applied to over 4000 TCGA and PCAWG samples, and is incorporated into the validation process for the clinically accredited bioinformatics pipeline at Genomics England. CNAqc is designed to support automated quality control procedures for tumor somatic data validation.
Affiliation(s)
- Alice Antonello
  - Department of Mathematics, Informatics and Geosciences (MIGe), University of Trieste, Trieste, Italy
- Riccardo Bergamin
  - Department of Mathematics, Informatics and Geosciences (MIGe), University of Trieste, Trieste, Italy
- Nicola Calonaci
  - Department of Mathematics, Informatics and Geosciences (MIGe), University of Trieste, Trieste, Italy
- Jacob Househam
  - Evolution and Cancer Lab, Centre for Genomics and Computational Biology, Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Salvatore Milite
  - Department of Mathematics, Informatics and Geosciences (MIGe), University of Trieste, Trieste, Italy
  - Centre for Computational Biology, Human Technopole, Milan, Italy
- Marc J Williams
  - Department of Computational Oncology, Memorial Sloan Kettering, New York, USA
- Fabio Anselmi
  - Department of Mathematics, Informatics and Geosciences (MIGe), University of Trieste, Trieste, Italy
- Alberto d'Onofrio
  - Department of Mathematics, Informatics and Geosciences (MIGe), University of Trieste, Trieste, Italy
- William C H Cross
  - Department of Research Pathology, UCL Cancer Institute, University College London, London, UK
- Giulio Caravagna
  - Department of Mathematics, Informatics and Geosciences (MIGe), University of Trieste, Trieste, Italy
  - Evolutionary Genomics and Modelling Team, Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
35
Liu F, Shi F, Yu Z. Inferring single-cell copy number profiles through cross-cell segmentation of read counts. BMC Genomics 2024; 25:25. [PMID: 38166601 PMCID: PMC10762977 DOI: 10.1186/s12864-023-09901-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 12/12/2023] [Indexed: 01/05/2024]
Abstract
BACKGROUND Copy number alteration (CNA) is one of the major genomic variations that frequently occur in cancers, and accurate inference of CNAs is essential for unmasking intra-tumor heterogeneity (ITH) and tumor evolutionary history. Single-cell DNA sequencing (scDNA-seq) makes it convenient to profile CNAs at single-cell resolution, and thus aids in better characterization of ITH. Although several computational methods have been proposed to decipher single-cell CNAs, their performance is limited in either breakpoint detection or copy number estimation due to the high dimensionality and noisy nature of read count data.
RESULTS By treating breakpoint detection as a process to segment a high-dimensional read count sequence, we develop a novel method called DeepCNA for cross-cell segmentation of the read count sequence and per-cell inference of CNAs. To cope with the difficulty of segmentation, an autoencoder (AE) network is employed in DeepCNA to project the original data into a low-dimensional space, where the breakpoints can be efficiently detected along each latent dimension and further merged to obtain the final breakpoints. Unlike the existing methods that manually calculate certain statistics of read counts to find breakpoints, the AE model makes it convenient to automatically learn the representations. Based on the inferred breakpoints, we employ a mixture model to predict copy numbers of segments for each cell, and leverage an expectation-maximization algorithm to efficiently estimate cell ploidy by exploring the most abundant copy number state. Benchmarking results on simulated and real data demonstrate our method is able to accurately infer breakpoints as well as absolute copy numbers and surpasses the existing methods under different test conditions. DeepCNA can be accessed at: https://github.com/zhyu-lab/deepcna.
CONCLUSIONS Profiling single-cell CNAs based on deep learning is becoming a new paradigm of scDNA-seq data analysis, and DeepCNA is an enhancement to the current arsenal of computational methods for investigating cancer genomics.
Affiliation(s)
- Furui Liu
  - School of Information Engineering, Ningxia University, Yinchuan, 750021, China
- Fangyuan Shi
  - School of Information Engineering, Ningxia University, Yinchuan, 750021, China
  - Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-Founded By Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
- Zhenhua Yu
  - School of Information Engineering, Ningxia University, Yinchuan, 750021, China
  - Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-Founded By Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
36
Sashittal P, Zhang H, Iacobuzio-Donahue CA, Raphael BJ. ConDoR: tumor phylogeny inference with a copy-number constrained mutation loss model. Genome Biol 2023; 24:272. [PMID: 38037115 PMCID: PMC10688497 DOI: 10.1186/s13059-023-03106-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/27/2022] [Accepted: 11/07/2023] [Indexed: 12/02/2023]
Abstract
A tumor contains a diverse collection of somatic mutations that reflect its past evolutionary history and that range in scale from single nucleotide variants (SNVs) to large-scale copy-number aberrations (CNAs). However, no current single-cell DNA sequencing (scDNA-seq) technology produces accurate measurements of both SNVs and CNAs, complicating the inference of tumor phylogenies. We introduce a new evolutionary model, the constrained k-Dollo model, that uses SNVs as phylogenetic markers but constrains losses of SNVs according to clusters of cells. We derive an algorithm, ConDoR, that infers phylogenies from targeted scDNA-seq data using this model. We demonstrate the advantages of ConDoR on simulated and real scDNA-seq data.
Affiliation(s)
- Haochen Zhang
  - Gerstner Sloan Kettering Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, NY, USA
- Christine A Iacobuzio-Donahue
  - Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, NY, USA
  - David M. Rubenstein Center for Pancreatic Cancer Research, Memorial Sloan Kettering Cancer Center, NY, USA
  - Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, NY, USA
37
Nulsen J, Hussain N, Al-Deka A, Yap J, Uddin K, Yau C, Ahmed AA. Completing a genomic characterisation of microscopic tumour samples with copy number. BMC Bioinformatics 2023; 24:453. [PMID: 38036971 PMCID: PMC10688092 DOI: 10.1186/s12859-023-05576-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/11/2023] [Accepted: 11/21/2023] [Indexed: 12/02/2023]
Abstract
BACKGROUND Genomic insights in settings where tumour sample sizes are limited to just hundreds or even tens of cells hold great clinical potential, but also present significant technical challenges. We previously developed the DigiPico sequencing platform to accurately identify somatic mutations from such samples.
RESULTS Here, we complete this genomic characterisation with copy number. We present a novel protocol, PicoCNV, to call allele-specific somatic copy number alterations from picogram quantities of tumour DNA. We find that PicoCNV provides exactly accurate copy number in 84% of the genome for even the smallest samples, and demonstrate its clinical potential in maintenance therapy.
CONCLUSIONS PicoCNV complements our existing platform, allowing for accurate and comprehensive genomic characterisations of cancers in settings where only microscopic samples are available.
Affiliation(s)
- Joel Nulsen
  - Weatherall Institute for Molecular Medicine, University of Oxford, Oxford, UK
  - Nuffield Department for Women's and Reproductive Health, University of Oxford, Oxford, UK
  - Singula Bio Ltd., Oxford, UK
- Nosheen Hussain
  - Weatherall Institute for Molecular Medicine, University of Oxford, Oxford, UK
  - Nuffield Department for Women's and Reproductive Health, University of Oxford, Oxford, UK
  - Singula Bio Ltd., Oxford, UK
- Aws Al-Deka
  - Weatherall Institute for Molecular Medicine, University of Oxford, Oxford, UK
  - Nuffield Department for Women's and Reproductive Health, University of Oxford, Oxford, UK
  - Singula Bio Ltd., Oxford, UK
- Jason Yap
  - University of Birmingham, Birmingham, UK
- Christopher Yau
  - Nuffield Department for Women's and Reproductive Health, University of Oxford, Oxford, UK
  - Health Data Research UK, London, UK
- Ahmed Ashour Ahmed
  - Weatherall Institute for Molecular Medicine, University of Oxford, Oxford, UK
  - Nuffield Department for Women's and Reproductive Health, University of Oxford, Oxford, UK
  - Singula Bio Ltd., Oxford, UK
  - Oxford Biomedical Research Centre, National Institute of Health Research, Oxford, UK
38
Schmidt H, Sashittal P, Raphael BJ. A zero-agnostic model for copy number evolution in cancer. PLoS Comput Biol 2023; 19:e1011590. [PMID: 37943952 PMCID: PMC10662746 DOI: 10.1371/journal.pcbi.1011590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 11/21/2023] [Accepted: 10/11/2023] [Indexed: 11/12/2023]
Abstract
MOTIVATION New low-coverage single-cell DNA sequencing technologies enable the measurement of copy number profiles from thousands of individual cells within tumors. From this data, one can infer the evolutionary history of the tumor by modeling transformations of the genome via copy number aberrations. Copy number aberrations alter multiple adjacent genomic loci, violating the standard phylogenetic assumption that loci evolve independently. Thus, specialized models to infer copy number phylogenies have been introduced. A widely used model is the copy number transformation (CNT) model in which a genome is represented by an integer vector and a copy number aberration is an event that either increases or decreases the number of copies of a contiguous segment of the genome. The CNT distance between a pair of copy number profiles is the minimum number of events required to transform one profile to another. While this distance can be computed efficiently, no efficient algorithm has been developed to find the most parsimonious phylogeny under the CNT model. RESULTS We introduce the zero-agnostic copy number transformation (ZCNT) model, a simplification of the CNT model that allows the amplification or deletion of regions with zero copies. We derive a closed form expression for the ZCNT distance between two copy number profiles and show that, unlike the CNT distance, the ZCNT distance forms a metric. We leverage the closed-form expression for the ZCNT distance and an alternative characterization of copy number profiles to derive polynomial time algorithms for two natural relaxations of the small parsimony problem on copy number profiles. While the alteration of zero copy number regions allowed under the ZCNT model is not biologically realistic, we show on both simulated and real datasets that the ZCNT distance is a close approximation to the CNT distance. 
Extending our polynomial time algorithm for the ZCNT small parsimony problem, we develop an algorithm, Lazac, for solving the large parsimony problem on copy number profiles. We demonstrate that Lazac outperforms existing methods for inferring copy number phylogenies on both simulated and real data.
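The copy number transformation (CNT) model described in this abstract can be illustrated with a small sketch: a genome is an integer vector, and one event raises or lowers the copy number of a contiguous segment by one. The function name `apply_cnt_event`, the half-open interval convention, and the rule that loci already at zero copies stay at zero are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from typing import List


def apply_cnt_event(profile: List[int], start: int, end: int, delta: int) -> List[int]:
    """Apply one copy number event to the half-open segment [start, end).

    Illustrative assumption: loci with zero copies are left at zero,
    i.e. lost DNA is neither amplified nor deleted again.
    """
    if delta not in (1, -1):
        raise ValueError("delta must be +1 (gain) or -1 (loss)")
    if not 0 <= start < end <= len(profile):
        raise ValueError("segment must be a non-empty range within the profile")
    updated = list(profile)  # copy so the input profile is not modified
    for i in range(start, end):
        if updated[i] > 0:  # zero-copy loci are skipped
            updated[i] += delta
    return updated


profile = [2, 2, 1, 0, 3]
gained = apply_cnt_event(profile, 1, 4, +1)  # gain on loci 1..3
lost = apply_cnt_event(gained, 0, 5, -1)     # loss across the whole profile
print(gained)
print(lost)
```

Chaining such events gives one path between two profiles; the CNT distance mentioned in the abstract is the minimum number of events over all such paths.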
Affiliation(s)
- Henri Schmidt
  - Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Palash Sashittal
  - Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Benjamin J. Raphael
  - Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
39
Weber LL, Zhang C, Ochoa I, El-Kebir M. Phertilizer: Growing a clonal tree from ultra-low coverage single-cell DNA sequencing of tumors. PLoS Comput Biol 2023; 19:e1011544. [PMID: 37819942 PMCID: PMC10593221 DOI: 10.1371/journal.pcbi.1011544] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/29/2023] [Revised: 10/23/2023] [Accepted: 09/26/2023] [Indexed: 10/13/2023]
Abstract
Emerging ultra-low coverage single-cell DNA sequencing (scDNA-seq) technologies have enabled high resolution evolutionary studies of copy number aberrations (CNAs) within tumors. While these sequencing technologies are well suited for identifying CNAs due to the uniformity of sequencing coverage, the sparsity of coverage poses challenges for the study of single-nucleotide variants (SNVs). In order to maximize the utility of increasingly available ultra-low coverage scDNA-seq data and obtain a comprehensive understanding of tumor evolution, it is important to also analyze the evolution of SNVs from the same set of tumor cells. We present Phertilizer, a method to infer a clonal tree from ultra-low coverage scDNA-seq data of a tumor. Based on a probabilistic model, our method recursively partitions the data by identifying key evolutionary events in the history of the tumor. We demonstrate the performance of Phertilizer on simulated data as well as on two real datasets, finding that Phertilizer effectively utilizes the copy-number signal inherent in the data to more accurately uncover clonal structure and genotypes compared to previous methods.
Affiliation(s)
- Leah L. Weber
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
- Chuanyi Zhang
  - Department of Electrical & Computer Engineering, University of Illinois Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
- Idoia Ochoa
  - Department of Electrical & Computer Engineering, University of Illinois Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
  - Department of Electrical and Electronics Engineering, University of Navarre, Donostia, Spain
- Mohammed El-Kebir
  - Department of Electrical and Electronics Engineering, University of Navarre, Donostia, Spain
  - Cancer Center at Illinois, University of Illinois Urbana-Champaign, Urbana-Champaign, Illinois, United States of America
40
Watkins TBK, Colliver EC, Huska MR, Kaufmann TL, Lim EL, Duncan CB, Haase K, Van Loo P, Swanton C, McGranahan N, Schwarz RF. Refphase: Multi-sample phasing reveals haplotype-specific copy number heterogeneity. PLoS Comput Biol 2023; 19:e1011379. [PMID: 37871126 PMCID: PMC10621967 DOI: 10.1371/journal.pcbi.1011379] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/13/2022] [Revised: 11/02/2023] [Accepted: 07/22/2023] [Indexed: 10/25/2023]
Abstract
Most computational methods that infer somatic copy number alterations (SCNAs) from bulk sequencing of DNA analyse tumour samples individually. However, the sequencing of multiple tumour samples from a patient's disease is an increasingly common practice. We introduce Refphase, an algorithm that leverages this multi-sampling approach to infer haplotype-specific copy numbers through multi-sample phasing. We demonstrate Refphase's ability to infer haplotype-specific SCNAs and characterise their intra-tumour heterogeneity, to uncover previously undetected allelic imbalance in low purity samples, and to identify parallel evolution in the context of whole genome doubling in a pan-cancer cohort of 336 samples from 99 tumours.
Affiliation(s)
- Thomas B. K. Watkins
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, United Kingdom
  - The Francis Crick Institute, London, United Kingdom
- Matthew R. Huska
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC) Berlin, Germany
- Tom L. Kaufmann
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC) Berlin, Germany
  - Department of Electrical Engineering & Computer Science, Technische Universität Berlin, Berlin, Germany
  - BIFOLD—Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
  - Institute for Computational Cancer Biology (ICCB), Center for Integrated Oncology (CIO), Cancer Research Center Cologne Essen (CCCE), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Emilia L. Lim
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, United Kingdom
  - The Francis Crick Institute, London, United Kingdom
- Cody B. Duncan
  - Institute for Computational Cancer Biology (ICCB), Center for Integrated Oncology (CIO), Cancer Research Center Cologne Essen (CCCE), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Kerstin Haase
  - German Cancer Consortium (DKTK), partner site Berlin, and German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Experimental and Clinical Research Center (ECRC) of the MDC and Charité Berlin, Berlin, Germany
  - Department of Pediatric Oncology and Hematology, Charité–Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- Peter Van Loo
  - The Francis Crick Institute, London, United Kingdom
  - Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
- Charles Swanton
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, United Kingdom
  - The Francis Crick Institute, London, United Kingdom
  - Department of Medical Oncology, University College London Hospitals, London, United Kingdom
- Nicholas McGranahan
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, United Kingdom
  - Cancer Genome Evolution Research Group, University College London Cancer Institute, London, United Kingdom
- Roland F. Schwarz
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC) Berlin, Germany
  - BIFOLD—Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
  - Institute for Computational Cancer Biology (ICCB), Center for Integrated Oncology (CIO), Cancer Research Center Cologne Essen (CCCE), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
41
Weiner AC, Williams MJ, Shi H, Vázquez-García I, Salehi S, Rusk N, Aparicio S, Shah SP, McPherson A. Single-cell DNA replication dynamics in genomically unstable cancers. bioRxiv 2023:2023.04.10.536250. [PMID: 37090647 PMCID: PMC10120671 DOI: 10.1101/2023.04.10.536250] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 04/25/2023]
Abstract
Dysregulated DNA replication is both a cause and a consequence of aneuploidy, yet the dynamics of DNA replication in aneuploid cell populations remain understudied. We developed a new method, PERT, for inferring cell-specific DNA replication states from single-cell whole genome sequencing, and investigated clone-specific DNA replication dynamics in >50,000 cells obtained from a collection of aneuploid and clonally heterogeneous cell lines, xenografts and primary cancer tissues. Clone replication timing (RT) profiles correlated with future copy number changes in serially passaged cell lines. Cell type was the strongest determinant of RT heterogeneity, while whole genome doubling and mutational process were associated with accumulation of late S-phase cells and weaker RT associations. Copy number changes affecting chromosome X had a striking impact on RT, with loss of the inactive X allele shifting replication earlier, and loss of inactive Xq resulting in reactivation of Xp. Finally, analysis of time series xenografts illustrates how cell cycle distributions approximate clone proliferation, recapitulating expected relationships between proliferation and fitness in treatment-naive and chemotherapeutic contexts.
Affiliation(s)
- Adam C Weiner
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Marc J Williams
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Hongyu Shi
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Gerstner Sloan Kettering Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Ignacio Vázquez-García
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Sohrab Salehi
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Nicole Rusk
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Samuel Aparicio
- Department of Molecular Oncology, British Columbia Cancer, Vancouver, BC, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Sohrab P Shah
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Andrew McPherson
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
|
42
|
Yi D, Nam JW, Jeong H. Toward the functional interpretation of somatic structural variations: bulk- and single-cell approaches. Brief Bioinform 2023; 24:bbad297. [PMID: 37587831 PMCID: PMC10516374 DOI: 10.1093/bib/bbad297] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/10/2023] [Revised: 07/05/2023] [Accepted: 07/23/2023] [Indexed: 08/18/2023] Open
Abstract
Structural variants (SVs) are genomic rearrangements that can take many different forms such as copy number alterations, inversions and translocations. During cell development and aging, somatic SVs accumulate in the genome with potentially neutral, deleterious or pathological effects. Generation of somatic SVs is a key mutational process in cancer development and progression. Despite their importance, the detection of somatic SVs is challenging, making them less studied than somatic single-nucleotide variants. In this review, we summarize recent advances in whole-genome sequencing (WGS)-based approaches for detecting somatic SVs at the tissue and single-cell levels and discuss their advantages and limitations. First, we describe the state-of-the-art computational algorithms for somatic SV calling using bulk WGS data and compare the performance of somatic SV detectors in the presence or absence of a matched-normal control. We then discuss the unique features of cutting-edge single-cell-based techniques for analyzing somatic SVs. The advantages and disadvantages of bulk and single-cell approaches are highlighted, along with a discussion of their sensitivity to copy-neutral SVs, usefulness for functional inferences and experimental and computational costs. Finally, computational approaches for linking somatic SVs to their functional readouts, such as those obtained from single-cell transcriptome and epigenome analyses, are illustrated, with a discussion of the promise of these approaches in health and diseases.
Affiliation(s)
- Dohun Yi
- Department of Life Science, College of Natural Sciences, Hanyang University, Wangsimni-ro 222, Seongdong-gu, Seoul 04763, Republic of Korea
| | - Jin-Wu Nam
- Department of Life Science, College of Natural Sciences, Hanyang University, Wangsimni-ro 222, Seongdong-gu, Seoul 04763, Republic of Korea
- Research Institute for Convergence of Basic Sciences, Hanyang University, Wangsimni-ro 222, Seongdong-gu, Seoul 04763, Republic of Korea
- Bio-BigData Center, Hanyang Institute of Bioscience and Biotechnology, Hanyang University, Wangsimni-ro 222, Seongdong-gu, Seoul 04763, Republic of Korea
- Hanyang Institute of Advanced BioConvergence, Hanyang University, Wangsimni-ro 222, Seongdong-gu, Seoul 04763, Republic of Korea
| | - Hyobin Jeong
- Department of Life Science, College of Natural Sciences, Hanyang University, Wangsimni-ro 222, Seongdong-gu, Seoul 04763, Republic of Korea
- Bio-BigData Center, Hanyang Institute of Bioscience and Biotechnology, Hanyang University, Wangsimni-ro 222, Seongdong-gu, Seoul 04763, Republic of Korea
- Hanyang Institute of Advanced BioConvergence, Hanyang University, Wangsimni-ro 222, Seongdong-gu, Seoul 04763, Republic of Korea
|
43
|
Logotheti S, Papadaki E, Zolota V, Logothetis C, Vrahatis AG, Soundararajan R, Tzelepi V. Lineage Plasticity and Stemness Phenotypes in Prostate Cancer: Harnessing the Power of Integrated "Omics" Approaches to Explore Measurable Metrics. Cancers (Basel) 2023; 15:4357. [PMID: 37686633 PMCID: PMC10486655 DOI: 10.3390/cancers15174357] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/31/2023] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 09/10/2023] Open
Abstract
Prostate cancer (PCa), the most frequent and second most lethal cancer type in men in developed countries, is a highly heterogeneous disease. PCa heterogeneity, therapy resistance, stemness, and lethal progression have been attributed to lineage plasticity, which refers to the ability of neoplastic cells to undergo phenotypic changes under microenvironmental pressures by switching between developmental cell states. What remains to be elucidated is how to identify measurements of lineage plasticity, how to implement them to inform preclinical and clinical research, and, further, how to classify patients and inform therapeutic strategies in the clinic. Recent research has highlighted the crucial role of next-generation sequencing technologies in identifying potential biomarkers associated with lineage plasticity. Here, we review the genomic, transcriptomic, and epigenetic events that have been described in PCa and highlight those with significance for lineage plasticity. We further focus on their relevance in PCa research and their benefits in PCa patient classification. Finally, we explore ways in which bioinformatic analyses can be used to determine lineage plasticity based on large omics analyses and algorithms that can shed light on upstream and downstream events. Most importantly, an integrated multiomics approach may soon allow for the identification of a lineage plasticity signature, which would revolutionize the molecular classification of PCa patients.
Affiliation(s)
- Souzana Logotheti
- Department of Pathology, University of Patras, 26504 Patras, Greece
| | - Eugenia Papadaki
- Department of Pathology, University of Patras, 26504 Patras, Greece
- Department of Informatics, Ionian University, 49100 Corfu, Greece
| | - Vasiliki Zolota
- Department of Pathology, University of Patras, 26504 Patras, Greece
| | - Christopher Logothetis
- Department of Genitourinary Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | | | - Rama Soundararajan
- Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Vasiliki Tzelepi
- Department of Pathology, University of Patras, 26504 Patras, Greece
|
44
|
Sollier E, Kuipers J, Takahashi K, Beerenwinkel N, Jahn K. COMPASS: joint copy number and mutation phylogeny reconstruction from amplicon single-cell sequencing data. Nat Commun 2023; 14:4921. [PMID: 37582954 PMCID: PMC10427627 DOI: 10.1038/s41467-023-40378-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/25/2022] [Accepted: 07/19/2023] [Indexed: 08/17/2023] Open
Abstract
Reconstructing the history of somatic DNA alterations can help understand the evolution of a tumor and predict its resistance to treatment. Single-cell DNA sequencing (scDNAseq) can be used to investigate clonal heterogeneity and to inform phylogeny reconstruction. However, most existing phylogenetic methods for scDNAseq data are designed either for single nucleotide variants (SNVs) or for large copy number alterations (CNAs), or are not applicable to targeted sequencing. Here, we develop COMPASS, a computational method for inferring the joint phylogeny of SNVs and CNAs from targeted scDNAseq data. We evaluate COMPASS on simulated data and apply it to several datasets including a cohort of 123 patients with acute myeloid leukemia. COMPASS detected clonal CNAs that could be orthogonally validated with bulk data, in addition to subclonal ones that require single-cell resolution, some of which point toward convergent evolution.
Affiliation(s)
- Etienne Sollier
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Division of Cancer Epigenomics, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Jack Kuipers
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Koichi Takahashi
- Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Katharina Jahn
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland.
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany.
|
45
|
Myers MA, Arnold BJ, Bansal V, Mullen KM, Zaccaria S, Raphael BJ. HATCHet2: clone- and haplotype-specific copy number inference from bulk tumor sequencing data. bioRxiv 2023:2023.07.13.548855. [PMID: 37502835 PMCID: PMC10370020 DOI: 10.1101/2023.07.13.548855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Multi-region DNA sequencing of primary tumors and metastases from individual patients helps identify somatic aberrations driving cancer development. However, most methods to infer copy-number aberrations (CNAs) analyze individual samples. We introduce HATCHet2 to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 introduces a novel statistic, the mirrored haplotype B-allele frequency (mhBAF), to identify mirrored-subclonal CNAs having different numbers of copies of parental haplotypes in different tumor clones. HATCHet2 also has high accuracy in identifying focal CNAs and extends the earlier HATCHet method in several directions. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 50 prostate cancer samples from 10 patients reveals previously-unreported mirrored-subclonal CNAs affecting cancer genes.
Affiliation(s)
- Matthew A. Myers
- Department of Computer Science, Princeton University, Princeton, USA
| | - Brian J. Arnold
- Center for Statistics and Machine Learning, Princeton University, Princeton, USA
| | - Vineet Bansal
- Princeton Research Computing, Princeton University, Princeton, NJ, USA
| | - Katelyn M. Mullen
- Human Oncology & Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Simone Zaccaria
- Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
|
46
|
Lu B, Curtius K, Graham TA, Yang Z, Barnes CP. CNETML: maximum likelihood inference of phylogeny from copy number profiles of multiple samples. Genome Biol 2023; 24:144. [PMID: 37340508 PMCID: PMC10283241 DOI: 10.1186/s13059-023-02983-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Accepted: 06/08/2023] [Indexed: 06/22/2023] Open
Abstract
Phylogenetic trees based on copy number profiles from multiple samples of a patient are helpful to understand cancer evolution. Here, we develop a new maximum likelihood method, CNETML, to infer phylogenies from such data. CNETML is the first program to jointly infer the tree topology, node ages, and mutation rates from total copy numbers of longitudinal samples. Our extensive simulations suggest CNETML performs well on copy numbers relative to ploidy and under slight violation of model assumptions. The application of CNETML to real data generates results consistent with previous discoveries and provides novel early copy number events for further investigation.
Affiliation(s)
- Bingxin Lu
- Department of Cell and Developmental Biology, University College London, London, UK.
- UCL Genetics Institute, University College London, London, UK.
| | - Kit Curtius
- Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Trevor A Graham
- Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
| | - Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London, UK
| | - Chris P Barnes
- Department of Cell and Developmental Biology, University College London, London, UK.
- UCL Genetics Institute, University College London, London, UK.
|
47
|
Van de Sande B, Lee JS, Mutasa-Gottgens E, Naughton B, Bacon W, Manning J, Wang Y, Pollard J, Mendez M, Hill J, Kumar N, Cao X, Chen X, Khaladkar M, Wen J, Leach A, Ferran E. Applications of single-cell RNA sequencing in drug discovery and development. Nat Rev Drug Discov 2023; 22:496-520. [PMID: 37117846 PMCID: PMC10141847 DOI: 10.1038/s41573-023-00688-4] [Citation(s) in RCA: 138] [Impact Index Per Article: 69.0] [Accepted: 03/10/2023] [Indexed: 04/30/2023]
Abstract
Single-cell technologies, particularly single-cell RNA sequencing (scRNA-seq) methods, together with associated computational tools and the growing availability of public data resources, are transforming drug discovery and development. New opportunities are emerging in target identification owing to improved disease understanding through cell subtyping, and highly multiplexed functional genomics screens incorporating scRNA-seq are enhancing target credentialling and prioritization. ScRNA-seq is also aiding the selection of relevant preclinical disease models and providing new insights into drug mechanisms of action. In clinical development, scRNA-seq can inform decision-making via improved biomarker identification for patient stratification and more precise monitoring of drug response and disease progression. Here, we illustrate how scRNA-seq methods are being applied in key steps in drug discovery and development, and discuss ongoing challenges for their implementation in the pharmaceutical industry.
Affiliation(s)
| | | | | | - Bart Naughton
- Computational Neurobiology, Eisai, Cambridge, MA, USA
| | - Wendi Bacon
- EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- The Open University, Milton Keynes, UK
| | | | - Yong Wang
- Precision Bioinformatics, Prometheus Biosciences, San Diego, CA, USA
| | | | - Melissa Mendez
- Genomic Sciences, GlaxoSmithKline, Collegeville, PA, USA
| | - Jon Hill
- Global Computational Biology and Digital Sciences, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, USA
| | - Namit Kumar
- Informatics & Predictive Sciences, Bristol Myers Squibb, San Diego, CA, USA
| | - Xiaohong Cao
- Genomic Research Center, AbbVie Inc., Cambridge, MA, USA
| | - Xiao Chen
- Magnet Biomedicine, Cambridge, MA, USA
| | - Mugdha Khaladkar
- Human Genetics and Computational Biology, GlaxoSmithKline, Collegeville, PA, USA
| | - Ji Wen
- Oncology Research and Development Unit, Pfizer, La Jolla, CA, USA
|
48
|
Hessey S, Fessas P, Zaccaria S, Jamal-Hanjani M, Swanton C. Insights into the metastatic cascade through research autopsies. Trends Cancer 2023; 9:490-502. [PMID: 37059687 DOI: 10.1016/j.trecan.2023.03.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/02/2023] [Revised: 03/05/2023] [Accepted: 03/07/2023] [Indexed: 04/16/2023]
Abstract
Metastasis is a complex process and the leading cause of cancer-related death globally. Recent studies have demonstrated that genomic sequencing data from paired primary and metastatic tumours can be used to trace the evolutionary origins of cells responsible for metastasis. This approach has yielded new insights into the genomic alterations that engender metastatic potential, and the mechanisms by which cancer spreads. Given that the reliability of these approaches is contingent upon how representative the samples are of primary and metastatic tumour heterogeneity, we review insights from studies that have reconstructed the evolution of metastasis within the context of their cohorts and designs. We discuss the role of research autopsies in achieving the comprehensive sampling necessary to advance the current understanding of metastasis.
Affiliation(s)
- Sonya Hessey
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK; Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK; Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
| | - Petros Fessas
- Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK
| | - Simone Zaccaria
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK; Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
| | - Mariam Jamal-Hanjani
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK; Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK; Department of Oncology, University College London Hospitals, London, UK.
| | - Charles Swanton
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK; Department of Oncology, University College London Hospitals, London, UK; Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK.
|
49
|
Qin F, Cai G, Xiao F. A statistical learning method for simultaneous copy number estimation and subclone clustering with single cell sequencing data. bioRxiv 2023:2023.04.18.537346. [PMID: 37131674 PMCID: PMC10153109 DOI: 10.1101/2023.04.18.537346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
The availability of single cell sequencing (SCS) enables us to assess intra-tumor heterogeneity and identify cellular subclones without the confounding effect of mixed cells. Copy number aberrations (CNAs) have been commonly used to identify subclones in SCS data using various clustering methods, since cells comprising a subpopulation are found to share a genetic profile. However, currently available methods may generate spurious results (e.g., falsely identified CNAs) during CNA detection, hence diminishing the accuracy of subclone identification from a large complex cell population. In this study, we developed a CNA detection method based on a fused lasso model, referred to as FLCNA, which can simultaneously identify subclones in single cell DNA sequencing (scDNA-seq) data. Spike-in simulations were conducted to evaluate the clustering and CNA detection performance of FLCNA, benchmarked against existing copy number estimation methods (SCOPE, HMMcopy) in combination with existing, commonly used clustering methods. Interestingly, application of FLCNA to a real scDNA-seq dataset of breast cancer revealed remarkably different genomic variation patterns in neoadjuvant chemotherapy treated samples and pre-treated samples. We show that FLCNA is a practical and powerful method for subclone identification and CNA detection with scDNA-seq data.
|
50
|
Schmidt H, Sashittal P, Raphael BJ. A zero-agnostic model for copy number evolution in cancer. bioRxiv 2023:2023.04.10.536302. [PMID: 37090633 PMCID: PMC10120719 DOI: 10.1101/2023.04.10.536302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Motivation: New low-coverage single-cell DNA sequencing technologies enable the measurement of copy number profiles from thousands of individual cells within tumors. From this data, one can infer the evolutionary history of the tumor by modeling transformations of the genome via copy number aberrations. A widely used model to infer such copy number phylogenies is the copy number transformation (CNT) model in which a genome is represented by an integer vector and a copy number aberration is an event that either increases or decreases the number of copies of a contiguous segment of the genome. The CNT distance between a pair of copy number profiles is the minimum number of events required to transform one profile to another. While this distance can be computed efficiently, no efficient algorithm has been developed to find the most parsimonious phylogeny under the CNT model.
Results: We introduce the zero-agnostic copy number transformation (ZCNT) model, a simplification of the CNT model that allows the amplification or deletion of regions with zero copies. We derive a closed-form expression for the ZCNT distance between two copy number profiles and show that, unlike the CNT distance, the ZCNT distance forms a metric. We leverage the closed-form expression for the ZCNT distance and an alternative characterization of copy number profiles to derive polynomial time algorithms for two natural relaxations of the small parsimony problem on copy number profiles. While the alteration of zero copy number regions allowed under the ZCNT model is not biologically realistic, we show on both simulated and real datasets that the ZCNT distance is a close approximation to the CNT distance. Extending our polynomial time algorithm for the ZCNT small parsimony problem, we develop an algorithm, Lazac, for solving the large parsimony problem on copy number profiles. We demonstrate that Lazac outperforms existing methods for inferring copy number phylogenies on both simulated and real data.
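Editorial illustration (not from the paper): the abstract does not state the closed-form ZCNT expression, so the sketch below only shows the general idea under a simplifying assumption. It assumes a single chromosome and treats every event as an unconstrained +1 or -1 applied to a contiguous run of bins, with no special handling of zero-copy regions. Under that assumption, the minimum number of events equals half the total absolute change between adjacent entries of the zero-padded difference vector. The function name `interval_event_distance` is hypothetical, and the authors' definition may differ in detail (for example in how alleles or multiple chromosomes are handled).

```python
from typing import Sequence


def interval_event_distance(u: Sequence[int], v: Sequence[int]) -> int:
    """Minimum number of +1/-1 contiguous-interval events turning u into v.

    Assumption (not taken from the paper): events are unconstrained, so a
    bin with zero copies can be amplified or deleted like any other bin.
    """
    if len(u) != len(v):
        raise ValueError("profiles must have the same number of bins")
    # Per-bin change required, padded with a zero on each side so that
    # events touching the first or last bin are counted like any other.
    delta = [0] + [b - a for a, b in zip(u, v)] + [0]
    # Each interval event shifts exactly two adjacent differences by one
    # unit each, so the total absolute difference drops by at most two per
    # event, and a greedy choice of interval always achieves that drop.
    total = sum(abs(delta[i + 1] - delta[i]) for i in range(len(delta) - 1))
    return total // 2


if __name__ == "__main__":
    # Amplify bins 1-2 once, then delete bin 3 once: two events.
    print(interval_event_distance([2, 2, 2, 2], [2, 3, 3, 1]))
```

Because the count depends only on the absolute differences of the per-bin change, swapping the two profiles gives the same value, which is consistent with the abstract's statement that the ZCNT distance forms a metric.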
Affiliation(s)
- Henri Schmidt
- Department of Computer Science, Princeton University, NJ, USA
|