1. Kuipers J, Tuncel MA, Ferreira PF, Jahn K, Beerenwinkel N. Single-cell copy number calling and event history reconstruction. Bioinformatics 2025; 41:btaf072. [PMID: 39946094 PMCID: PMC11897432 DOI: 10.1093/bioinformatics/btaf072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 01/06/2025] [Accepted: 02/11/2025] [Indexed: 03/14/2025]
Abstract
MOTIVATION Copy number alterations are driving forces of tumour development and the emergence of intra-tumour heterogeneity. A comprehensive picture of these genomic aberrations is therefore essential for the development of personalised and precise cancer diagnostics and therapies. Single-cell sequencing offers the highest resolution for copy number profiling down to the level of individual cells. Recent high-throughput protocols allow for the processing of hundreds of cells through shallow whole-genome DNA sequencing. The resulting low read-depth data poses substantial statistical and computational challenges to the identification of copy number alterations. RESULTS We developed SCICoNE, a statistical model and MCMC algorithm tailored to single-cell copy number profiling from shallow whole-genome DNA sequencing data. SCICoNE reconstructs the history of copy number events in the tumour and uses these evolutionary relationships to identify the copy number profiles of the individual cells. We show the accuracy of this approach in evaluations on simulated data and demonstrate its practicability in applications to two breast cancer samples from different sequencing protocols. AVAILABILITY AND IMPLEMENTATION SCICoNE is available at https://github.com/cbg-ethz/SCICoNE.
Affiliation(s)
- Jack Kuipers
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Mustafa Anıl Tuncel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Pedro F Ferreira
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Katharina Jahn
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
2. Lu B. Cancer phylogenetic inference using copy number alterations detected from DNA sequencing data. Cancer Pathogenesis and Therapy 2025; 3:16-29. [PMID: 39872371 PMCID: PMC11764021 DOI: 10.1016/j.cpt.2024.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2024] [Revised: 04/05/2024] [Accepted: 04/15/2024] [Indexed: 01/30/2025]
Abstract
Cancer is an evolutionary process involving the accumulation of diverse somatic mutations and clonal evolution over time. Phylogenetic inference from samples obtained from an individual patient offers a powerful approach to unraveling the intricate evolutionary history of cancer and provides insights that can inform cancer treatment. Somatic copy number alterations (CNAs) are important in cancer evolution and are often used as markers, alone or with other somatic mutations, for phylogenetic inferences, particularly in low-coverage DNA sequencing data. Many phylogenetic inference methods using CNAs detected from bulk or single-cell DNA sequencing data have been developed over the years. However, there have been no systematic reviews on these methods. To summarize the state-of-the-art of the field and inform future development, this review presents a comprehensive survey on the major challenges in inference, different types of methods, and applications of these methods. The challenges are discussed from the aspects of input data, models of evolution, and inference algorithms. The different methods are grouped according to the markers used for inference and the types of the reconstructed trees. The applications include using phylogenetic inference to understand intra-tumor heterogeneity, metastasis, treatment resistance, and early cancer development. This review also sheds light on future directions of cancer phylogenetic inference using CNAs, including the improvement of scalability, the utilization of new types of data, and the development of more realistic models of evolution.
Affiliation(s)
- Bingxin Lu
  - School of Biosciences and Medicine, University of Surrey, Guildford GU2 7XH, UK
  - Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, UK
3. Schneider MP, Cullen AE, Pangonyte J, Skelton J, Major H, Van Oudenhove E, Garcia MJ, Chaves Urbano B, Piskorz AM, Brenton JD, Macintyre G, Markowetz F. scAbsolute: measuring single-cell ploidy and replication status. Genome Biol 2024; 25:62. [PMID: 38438920 PMCID: PMC10910719 DOI: 10.1186/s13059-024-03204-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 02/22/2024] [Indexed: 03/06/2024]
Abstract
Cancer cells often exhibit DNA copy number aberrations and can vary widely in their ploidy. Correct estimation of the ploidy of single-cell genomes is paramount for downstream analysis. Based only on single-cell DNA sequencing information, scAbsolute achieves accurate and unbiased measurement of single-cell ploidy and replication status, including whole-genome duplications. We demonstrate scAbsolute's capabilities using experimental cell multiplets, a FUCCI cell cycle expression system, and a benchmark against state-of-the-art methods. scAbsolute provides a robust foundation for single-cell DNA sequencing analysis across different technologies and has the potential to enable improvements in a number of downstream analyses.
Affiliation(s)
- Michael P Schneider
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Amy E Cullen
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Justina Pangonyte
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Jason Skelton
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Harvey Major
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Elke Van Oudenhove
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Maria J Garcia
  - Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Anna M Piskorz
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- James D Brenton
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
- Geoff Macintyre
  - Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Florian Markowetz
  - University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK
4. Lu B, Curtius K, Graham TA, Yang Z, Barnes CP. CNETML: maximum likelihood inference of phylogeny from copy number profiles of multiple samples. Genome Biol 2023; 24:144. [PMID: 37340508 PMCID: PMC10283241 DOI: 10.1186/s13059-023-02983-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Accepted: 06/08/2023] [Indexed: 06/22/2023]
Abstract
Phylogenetic trees based on copy number profiles from multiple samples of a patient are helpful to understand cancer evolution. Here, we develop a new maximum likelihood method, CNETML, to infer phylogenies from such data. CNETML is the first program to jointly infer the tree topology, node ages, and mutation rates from total copy numbers of longitudinal samples. Our extensive simulations suggest CNETML performs well on copy numbers relative to ploidy and under slight violation of model assumptions. The application of CNETML to real data generates results consistent with previous discoveries and provides novel early copy number events for further investigation.
Affiliation(s)
- Bingxin Lu
  - Department of Cell and Developmental Biology, University College London, London, UK
  - UCL Genetics Institute, University College London, London, UK
- Kit Curtius
  - Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
  - Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Trevor A Graham
  - Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
  - Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
- Ziheng Yang
  - Department of Genetics, Evolution and Environment, University College London, London, UK
- Chris P Barnes
  - Department of Cell and Developmental Biology, University College London, London, UK
  - UCL Genetics Institute, University College London, London, UK
5. Lei H, Gertz EM, Schäffer AA, Fu X, Tao Y, Heselmeyer-Haddad K, Torres I, Li G, Xu L, Hou Y, Wu K, Shi X, Dean M, Ried T, Schwartz R. Tumor heterogeneity assessed by sequencing and fluorescence in situ hybridization (FISH) data. Bioinformatics 2021; 37:4704-4711. [PMID: 34289030 PMCID: PMC8665747 DOI: 10.1093/bioinformatics/btab504] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/22/2020] [Revised: 05/19/2021] [Accepted: 07/05/2021] [Indexed: 12/12/2022]
Abstract
MOTIVATION Computational reconstruction of clonal evolution in cancers has become a crucial tool for understanding how tumors initiate and progress and how this process varies across patients. The field still struggles, however, with special challenges of applying phylogenetic methods to cancers, such as the prevalence and importance of copy number alteration (CNA) and structural variation (SV) events in tumor evolution, which are difficult to profile accurately by prevailing sequencing methods in such a way that subsequent reconstruction by phylogenetic inference algorithms is accurate. RESULTS In the present work, we develop computational methods to combine sequencing with multiplex interphase fluorescence in situ hybridization (miFISH) to exploit the complementary advantages of each technology in inferring accurate models of clonal CNA evolution accounting for both focal changes and aneuploidy at whole-genome scales. By integrating such information in an integer linear programming (ILP) framework, we demonstrate on simulated data that incorporation of FISH data substantially improves accurate inference of focal CNA and ploidy changes in clonal evolution from deconvolving bulk sequence data. Analysis of real glioblastoma data for which FISH, bulk sequence, and single cell sequence are all available confirms the power of FISH to enhance accurate reconstruction of clonal copy number evolution in conjunction with bulk and optionally single-cell sequence data. AVAILABILITY Source code is available on Github at https://github.com/CMUSchwartzLab/FISH_deconvolution.
Affiliation(s)
- Haoyun Lei
  - Computational Biology Dept, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- E Michael Gertz
  - Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Alejandro A Schäffer
  - Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Xuecong Fu
  - Shenzhen Luohu People's Hospital, Shenzhen, 518000, China
- Yifeng Tao
  - Computational Biology Dept, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Kerstin Heselmeyer-Haddad
  - Genetics Branch, Cancer Genomics Section, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Irianna Torres
  - Genetics Branch, Cancer Genomics Section, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Guibo Li
  - Department of Biology, University of Copenhagen, Copenhagen, 1599, Denmark
- Liqin Xu
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Soltofts Plads, 2800 Kongens Lyngby, Denmark
- Yong Hou
  - Department of Biology, University of Copenhagen, Copenhagen, 1599, Denmark
- Kui Wu
  - Department of Biology, University of Copenhagen, Copenhagen, 1599, Denmark
- Xulian Shi
  - Shenzhen Luohu People's Hospital, Shenzhen, 518000, China
- Michael Dean
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology & Genetics, National Cancer Institute, U.S. National Institutes of Health, Gaithersburg, MD, 20814, USA
- Thomas Ried
  - Genetics Branch, Cancer Genomics Section, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Russell Schwartz
  - Computational Biology Dept, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
  - Dept. of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
6. Tao Y, Rajaraman A, Cui X, Cui Z, Chen H, Zhao Y, Eaton J, Kim H, Ma J, Schwartz R. Assessing the contribution of tumor mutational phenotypes to cancer progression risk. PLoS Comput Biol 2021; 17:e1008777. [PMID: 33711014 PMCID: PMC7990181 DOI: 10.1371/journal.pcbi.1008777] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/18/2020] [Revised: 03/24/2021] [Accepted: 02/06/2021] [Indexed: 01/10/2023]
Abstract
Cancer occurs via an accumulation of somatic genomic alterations in a process of clonal evolution. There has been intensive study of potential causal mutations driving cancer development and progression. However, much recent evidence suggests that tumor evolution is normally driven by a variety of mechanisms of somatic hypermutability, which act in different combinations or degrees in different cancers. These variations in mutability phenotypes are predictive of progression outcomes independent of the specific mutations they have produced to date. Here we explore the question of how and to what degree these differences in mutational phenotypes act in a cancer to predict its future progression. We develop a computational paradigm using evolutionary tree inference (tumor phylogeny) algorithms to derive features quantifying single-tumor mutational phenotypes, followed by a machine learning framework to identify key features predictive of progression. Analyses of breast invasive carcinoma and lung carcinoma demonstrate that a large fraction of the risk of future clinical outcomes of cancer progression (overall survival and disease-free survival) can be explained solely from mutational phenotype features derived from the phylogenetic analysis. We further show that mutational phenotypes have additional predictive power even after accounting for traditional clinical and driver gene-centric genomic predictors of progression. These results confirm the importance of mutational phenotypes in contributing to cancer progression risk and suggest strategies for enhancing the predictive power of conventional clinical data or driver-centric biomarkers.
Affiliation(s)
- Yifeng Tao
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, Pennsylvania, United States of America
- Ashok Rajaraman
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Xiaoyue Cui
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, Pennsylvania, United States of America
- Ziyi Cui
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Haoran Chen
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, Pennsylvania, United States of America
- Yuanqi Zhao
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Jesse Eaton
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Hannah Kim
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Jian Ma
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Russell Schwartz
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
7. Zaccaria S, Raphael BJ. Characterizing allele- and haplotype-specific copy numbers in single cells with CHISEL. Nat Biotechnol 2021; 39:207-214. [PMID: 32879467 PMCID: PMC9876616 DOI: 10.1038/s41587-020-0661-6] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Received: 08/17/2019] [Accepted: 08/04/2020] [Indexed: 01/28/2023]
Abstract
Single-cell barcoding technologies enable genome sequencing of thousands of individual cells in parallel, but with extremely low sequencing coverage (<0.05×) per cell. While the total copy number of large multi-megabase segments can be derived from such data, important allele-specific mutations, such as copy-neutral loss of heterozygosity (LOH) in cancer, are missed. We introduce copy-number haplotype inference in single cells using evolutionary links (CHISEL), a method to infer allele- and haplotype-specific copy numbers in single cells and subpopulations of cells by aggregating sparse signal across hundreds or thousands of individual cells. We applied CHISEL to ten single-cell sequencing datasets of ~2,000 cells from two patients with breast cancer. We identified extensive allele-specific copy-number aberrations (CNAs) in these samples, including copy-neutral LOHs, whole-genome duplications (WGDs) and mirrored-subclonal CNAs. These allele-specific CNAs affect genomic regions containing well-known breast-cancer genes. We also refined the reconstruction of tumor evolution, timing allele-specific CNAs before and after WGDs, identifying low-frequency subpopulations distinguished by unique CNAs and uncovering evidence of convergent evolution.
Affiliation(s)
- Simone Zaccaria
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
  - Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
8.
Abstract
MOTIVATION Copy number aberrations (CNAs), which delete or amplify large contiguous segments of the genome, are a common type of somatic mutation in cancer. Copy number profiles, representing the number of copies of each region of a genome, are readily obtained from whole-genome sequencing or microarrays. However, modeling copy number evolution is a substantial challenge, because different CNAs may overlap with one another on the genome. A recent popular model for copy number evolution is the copy number distance (CND), defined as the length of a shortest sequence of deletions and amplifications of contiguous segments that transforms one profile into the other. In the CND, all events contribute equally; however, it is well known that rates of CNAs vary by length, genomic position and type (amplification versus deletion). RESULTS We introduce a weighted CND that allows events to have varying weights, or probabilities, based on their length, position and type. We derive an efficient algorithm to compute the weighted CND as well as the associated transformation. This algorithm is based on the observation that the constraint matrix of the underlying optimization problem is totally unimodular. We show that the weighted CND improves phylogenetic reconstruction on simulated data where CNAs occur with varying probabilities, aids in the derivation of phylogenies from ultra-low-coverage single-cell DNA sequencing data and helps estimate CNA rates in a large pan-cancer dataset. AVAILABILITY AND IMPLEMENTATION Code is available at https://github.com/raphael-group/WCND. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
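The event model behind the copy number distance can be illustrated with a small sketch. It is not the WCND implementation: the function names are hypothetical, the rule that a segment with zero copies stays at zero is an assumption taken from common formulations of the model rather than from this abstract, and the weights are made-up numbers. The sketch only applies a given sequence of events and sums their weights, which gives an upper bound on the (weighted) distance; it does not search for the shortest sequence.

```python
from typing import Callable, List, Tuple

Profile = List[int]
Event = Tuple[int, int, int]  # (start, end, delta); delta is +1 (amplification) or -1 (deletion)


def apply_event(profile: Profile, start: int, end: int, delta: int) -> Profile:
    """Apply one event to the contiguous positions start..end (inclusive).

    Assumption (not stated in the abstract): positions with zero copies
    cannot be changed, because lost material cannot be re-amplified.
    """
    if delta not in (1, -1):
        raise ValueError("delta must be +1 or -1")
    if not 0 <= start <= end < len(profile):
        raise ValueError("event interval is out of range")
    return [
        c + delta if start <= i <= end and c > 0 else c
        for i, c in enumerate(profile)
    ]


def apply_events(profile: Profile, events: List[Event]) -> Profile:
    """Apply events in order and return the resulting profile."""
    for start, end, delta in events:
        profile = apply_event(profile, start, end, delta)
    return profile


def sequence_cost(events: List[Event], weight: Callable[[Event], float]) -> float:
    """Total weight of a sequence; with weight 1 per event this is its length."""
    return sum(weight(e) for e in events)


source = [2, 2, 2, 2]
events = [(1, 2, -1), (0, 1, +1)]  # delete positions 1..2, then amplify 0..1
result = apply_events(source, events)
print(result)  # [3, 2, 1, 2]

# Unweighted: every event counts 1, so this sequence has length 2.
print(sequence_cost(events, lambda e: 1))
# Hypothetical weights: deletions cost twice as much as amplifications.
print(sequence_cost(events, lambda e: 2.0 if e[2] < 0 else 1.0))  # 3.0
```

A weight function could also depend on the event length (`end - start + 1`) or position, which is the kind of variation the weighted CND is described as allowing.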
Affiliation(s)
- Ron Zeira
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
9. Lei H, Lyu B, Gertz EM, Schäffer AA, Shi X, Wu K, Li G, Xu L, Hou Y, Dean M, Schwartz R. Tumor Copy Number Deconvolution Integrating Bulk and Single-Cell Sequencing Data. J Comput Biol 2020; 27:565-598. [PMID: 32181683 DOI: 10.1089/cmb.2019.0302] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/29/2022]
Abstract
Characterizing intratumor heterogeneity (ITH) is crucial to understanding cancer development, but it is hampered by limits of available data sources. Bulk DNA sequencing is the most common technology to assess ITH, but involves the analysis of a mixture of many genetically distinct cells in each sample, which must then be computationally deconvolved. Single-cell sequencing is a promising alternative, but its limitations (for example, high noise, difficulty scaling to large populations, technical artifacts, and large data sets) have so far made it impractical for studying cohorts of sufficient size to identify statistically robust features of tumor evolution. We have developed strategies for deconvolution and tumor phylogenetics combining limited amounts of bulk and single-cell data to gain some advantages of single-cell resolution with much lower cost, with specific focus on deconvolving genomic copy number data. We developed a mixed membership model for clonal deconvolution via non-negative matrix factorization balancing deconvolution quality with similarity to single-cell samples via an associated efficient coordinate descent algorithm. We then improve on that algorithm by integrating deconvolution with clonal phylogeny inference, using a mixed integer linear programming model to incorporate a minimum evolution phylogenetic tree cost in the problem objective. We demonstrate the effectiveness of these methods on semisimulated data of known ground truth, showing improved deconvolution accuracy relative to bulk data alone.
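To make the deconvolution setting concrete, the following sketch shows only the forward mixing step that a factorization would try to invert: each bulk sample is modelled as a fraction-weighted average of clonal copy number profiles. It is an illustration under assumed names (`mix`, `squared_error`) and made-up numbers, not the authors' model, and it does not perform the factorization or the coordinate descent described in the abstract.

```python
from typing import List

Matrix = List[List[float]]


def mix(fractions: Matrix, clones: Matrix) -> Matrix:
    """Return bulk[s][j] = sum_k fractions[s][k] * clones[k][j].

    fractions: one row per bulk sample, one column per clone.
    clones:    one row per clone, one column per genomic bin.
    """
    for row in fractions:
        if any(f < 0 for f in row) or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("clone fractions must be non-negative and sum to 1")
    n_bins = len(clones[0])
    return [
        [sum(f * clone[j] for f, clone in zip(row, clones)) for j in range(n_bins)]
        for row in fractions
    ]


def squared_error(a: Matrix, b: Matrix) -> float:
    """Sum of squared differences, a simple measure of deconvolution fit."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Three hypothetical clones over four genomic bins.
clones = [[2, 2, 2, 2], [2, 3, 3, 2], [1, 1, 2, 2]]
# Two hypothetical bulk samples with different clone fractions.
fractions = [[0.5, 0.5, 0.0], [0.25, 0.25, 0.5]]

bulk = mix(fractions, clones)
print(bulk)  # [[2.0, 2.5, 2.5, 2.0], [1.5, 1.75, 2.25, 2.0]]

# A candidate solution that misses the gain in clone 1 fits the bulk data worse.
guess = [[2, 2, 2, 2], [2, 2, 2, 2], [1, 1, 2, 2]]
print(squared_error(bulk, mix(fractions, guess)))  # 0.625
```

Deconvolution runs in the opposite direction: given only `bulk`, it searches for non-negative `fractions` and `clones` that keep this error small, and single-cell profiles can be used to constrain the candidate clones.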
Affiliation(s)
- Haoyun Lei
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Bochuan Lyu
  - Department of Mathematics, Rose-Hulman Institute of Technology, Terre Haute, Indiana
- E Michael Gertz
  - National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, Maryland
  - Cancer Data Science Laboratory, National Cancer Institute, U.S. National Institutes of Health, Bethesda, Maryland
- Alejandro A Schäffer
  - National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, Maryland
  - Cancer Data Science Laboratory, National Cancer Institute, U.S. National Institutes of Health, Bethesda, Maryland
- Kui Wu
  - BGI-Shenzhen, Shenzhen, China
- Michael Dean
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology & Genetics, National Cancer Institute, U.S. National Institutes of Health, Gaithersburg, Maryland
- Russell Schwartz
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania
10. Eaton J, Wang J, Schwartz R. Deconvolution and phylogeny inference of structural variations in tumor genomic samples. Bioinformatics 2018; 34:i357-i365. [PMID: 29950001 PMCID: PMC6022719 DOI: 10.1093/bioinformatics/bty270] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 12/27/2022]
Abstract
Motivation Phylogenetic reconstruction of tumor evolution has emerged as a crucial tool for making sense of the complexity of emerging cancer genomic datasets. Despite the growing use of phylogenetics in cancer studies, though, the field has only slowly adapted to many ways that tumor evolution differs from classic species evolution. One crucial question in that regard is how to handle inference of structural variations (SVs), which are a major mechanism of evolution in cancers but have been largely neglected in tumor phylogenetics to date, in part due to the challenges of reliably detecting and typing SVs and interpreting them phylogenetically. Results We present a novel method for reconstructing evolutionary trajectories of SVs from bulk whole-genome sequence data via joint deconvolution and phylogenetics, to infer clonal sub-populations and reconstruct their ancestry. We establish a novel likelihood model for joint deconvolution and phylogenetic inference on bulk SV data and formulate an associated optimization algorithm. We demonstrate the approach to be efficient and accurate for realistic scenarios of SV mutation on simulated data. Application to breast cancer genomic data from The Cancer Genome Atlas shows it to be practical and effective at reconstructing features of SV-driven evolution in single tumors. Availability and implementation Python source code and associated documentation are available at https://github.com/jaebird123/tusv.
Affiliation(s)
- Jesse Eaton
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Jingyi Wang
  - Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Russell Schwartz
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
11. Godini R, Fallahi H. A brief overview of the concepts, methods and computational tools used in phylogenetic tree construction and gene prediction. Meta Gene 2019. [DOI: 10.1016/j.mgene.2019.100586] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022]
12. Hodzic E, Shrestha R, Zhu K, Cheng K, Collins CC, Cenk Sahinalp S. Combinatorial Detection of Conserved Alteration Patterns for Identifying Cancer Subnetworks. Gigascience 2019; 8:giz024. [PMID: 30978274 PMCID: PMC6458499 DOI: 10.1093/gigascience/giz024] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 08/23/2018] [Revised: 12/12/2018] [Accepted: 02/21/2019] [Indexed: 01/24/2023]
Abstract
BACKGROUND Advances in large-scale tumor sequencing have led to an understanding that there are combinations of genomic and transcriptomic alterations specific to tumor types, shared across many patients. Unfortunately, computational identification of functionally meaningful and recurrent alteration patterns within gene/protein interaction networks has proven to be challenging. FINDINGS We introduce a novel combinatorial method, cd-CAP (combinatorial detection of conserved alteration patterns), for simultaneous detection of connected subnetworks of an interaction network where genes exhibit conserved alteration patterns across tumor samples. Our method differentiates distinct alteration types associated with each gene (rather than relying on binary information of a gene being altered or not) and simultaneously detects multiple alteration profile conserved subnetworks. CONCLUSIONS In a number of The Cancer Genome Atlas datasets, cd-CAP identified large biologically significant subnetworks with conserved alteration patterns, shared across many tumor samples.
Affiliation(s)
- Ermin Hodzic
  - Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
  - School of Computing Science, Simon Fraser University, 8888 University Dr, Burnaby, BC, V5A 1S6, Canada
- Raunak Shrestha
  - Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
  - Department of Urologic Sciences, University of British Columbia, 2775 Laurel St, Vancouver, BC, V5Z 1M9, Canada
- Kaiyuan Zhu
  - Department of Computer Science, Indiana University Bloomington, 700 N. Woodlawn Ave, Bloomington, IN, 47408, USA
- Kuoyuan Cheng
  - Center for Bioinformatics and Computational Biology, University of Maryland, 8125 Paint Branch Dr, College Park, MD, 20742, USA
- Colin C Collins
  - Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
  - Department of Urologic Sciences, University of British Columbia, 2775 Laurel St, Vancouver, BC, V5Z 1M9, Canada
- S Cenk Sahinalp
  - Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
  - Department of Computer Science, Indiana University Bloomington, 700 N. Woodlawn Ave, Bloomington, IN, 47408, USA
13. Xia R, Lin Y, Zhou J, Geng T, Feng B, Tang J. Phylogenetic Reconstruction for Copy-Number Evolution Problems. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:694-699. [PMID: 29993694 DOI: 10.1109/tcbb.2018.2829698] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
Cancer is known for its heterogeneity and is regarded as an evolutionary process driven by somatic mutations and clonal expansions. This evolutionary process can be modeled by a phylogenetic tree and phylogenetic analysis of multiple subclones of cancer cells can facilitate the study of the tumor variants progression. Copy-number aberration occurs frequently in many types of tumors in terms of segmental amplifications and deletions. In this paper, we developed a distance-based method for reconstructing phylogenies from copy-number profiles of cancer cells. We demonstrate the importance of distance correction from the edit (minimum) distance to the estimated actual number of events. Experimental results show that our approaches provide accurate and scalable results in estimating the actual number of evolutionary events between copy number profiles and in reconstructing phylogenies.
|
14
|
Zaccaria S, El-Kebir M, Klau GW, Raphael BJ. Phylogenetic Copy-Number Factorization of Multiple Tumor Samples. J Comput Biol 2018; 25:689-708. [PMID: 29658782 PMCID: PMC6067108 DOI: 10.1089/cmb.2017.0253] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 12/16/2022]
Abstract
Cancer is an evolutionary process driven by somatic mutations. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the many types of mutations in cancer and the fact that nearly all cancer sequencing is of a bulk tumor, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy-number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest number of CNAs that explain the copy-number data from multiple samples of a tumor. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that either perform deconvolution/factorization of mixed tumor samples or build phylogenetic trees assuming homogeneous tumor samples. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher resolution view of copy-number evolution of this cancer than published analyses.
Affiliation(s)
- Simone Zaccaria
- Department of Computer Science, Princeton University, Princeton, New Jersey
- Dipartimento di Informatica Sistemistica e Comunicazione (DISCo), Università degli Studi di Milano-Bicocca, Milan, Italy
| | - Mohammed El-Kebir
- Department of Computer Science, Princeton University, Princeton, New Jersey
| | - Gunnar W. Klau
- Algorithmic Bioinformatics, Heinrich Heine University, Düsseldorf, Germany
|
15
|
Zeira R, Shamir R. Sorting cancer karyotypes using double-cut-and-joins, duplications and deletions. Bioinformatics 2018; 37:1489-1496. [PMID: 29726899 DOI: 10.1093/bioinformatics/bty381] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/15/2018] [Revised: 03/15/2018] [Accepted: 05/02/2018] [Indexed: 01/30/2023]
Abstract
Motivation Problems of genome rearrangement are central in both evolution and cancer research. Most genome rearrangement models assume that the genome contains a single copy of each gene and the only changes in the genome are structural, i.e., reordering of segments. In contrast, tumor genomes also undergo numerical changes such as deletions and duplications, and thus the number of copies of genes varies. Dealing with unequal gene content is a very challenging task, addressed by few algorithms to date. More realistic models are needed to help trace genome evolution during tumorigenesis. Results Here we present a model for the evolution of genomes with multiple gene copies using the operation types double-cut-and-joins, duplications and deletions. The events supported by the model are reversals, translocations, tandem duplications, segmental deletions, and chromosomal amplifications and deletions, covering most types of structural and numerical changes observed in tumor samples. Our goal is to find a series of operations of minimum length that transform one karyotype into the other. We show that the problem is NP-hard and give an integer linear programming formulation that solves the problem exactly under some mild assumptions. We test our method on simulated genomes and on ovarian cancer genomes. Our study advances the state of the art in two ways: It allows a broader set of operations than extant models, thus being more realistic, and it is the first study attempting to reconstruct the full sequence of structural and numerical events during cancer evolution. Availability Code and data are available in https://github.com/Shamir-Lab/Sorting-Cancer-Karyotypes. Contact ronzeira@post.tau.ac.il, rshamir@tau.ac.il. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Ron Shamir
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 6997801, Israel
|
16
|
An Improved Binary Differential Evolution Algorithm to Infer Tumor Phylogenetic Trees. BIOMED RESEARCH INTERNATIONAL 2017; 2017:5482750. [PMID: 29279850 PMCID: PMC5723949 DOI: 10.1155/2017/5482750] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/02/2017] [Accepted: 10/18/2017] [Indexed: 12/14/2022]
Abstract
Tumourigenesis is a mutation accumulation process, which is likely to start with a mutated founder cell. The evolutionary nature of tumor development makes phylogenetic models suitable for inferring tumor evolution through genetic variation data. Copy number variation (CNV) is the major genetic marker of the genome with more genes, disease loci, and functional elements involved. Fluorescence in situ hybridization (FISH) accurately measures multiple gene copy number of hundreds of single cells. We propose an improved binary differential evolution algorithm, BDEP, to infer tumor phylogenetic tree based on FISH platform. The topology analysis of tumor progression tree shows that the pathway of tumor subcell expansion varies greatly during different stages of tumor formation. And the classification experiment shows that tree-based features are better than data-based features in distinguishing tumor. The constructed phylogenetic trees have great performance in characterizing tumor development process, which outperforms other similar algorithms.
|
17
|
Zeira R, Zehavi M, Shamir R. A Linear-Time Algorithm for the Copy Number Transformation Problem. J Comput Biol 2017; 24:1179-1194. [PMID: 28837352 DOI: 10.1089/cmb.2017.0060] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
Problems of genome rearrangement are central in both evolution and cancer. Most evolutionary scenarios have been studied under the assumption that the genome contains a single copy of each gene. In contrast, tumor genomes undergo deletions and duplications, and thus, the number of copies of genes varies. The number of copies of each segment along a chromosome is called its copy number profile (CNP). Understanding CNP changes can assist in predicting disease progression and treatment. To date, questions related to distances between CNPs gained little scientific attention. Here we focus on the following fundamental problem, introduced by Schwarz et al.: given two CNPs, u and v, compute the minimum number of operations transforming u into v, where the edit operations are segmental deletions and amplifications. We establish the computational complexity of this problem, showing that it is solvable in linear time and constant space.
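As a hedged illustration of the problem stated in this abstract, the sketch below computes the minimum number of segmental deletions and amplifications between two tiny profiles by exhaustive breadth-first search. It is not the linear-time algorithm of the paper. The function names, the rule that a position at copy number 0 stays at 0, and the `cap` bound on intermediate copy numbers are assumptions made for illustration.

```python
from collections import deque


def apply_event(profile, start, end, delta):
    """Apply one segmental event to positions start..end (inclusive).

    delta is +1 (amplification) or -1 (deletion). Assumption: a position
    already at copy number 0 stays at 0 (a lost segment cannot be regained).
    """
    return tuple(
        x + delta if start <= i <= end and x > 0 else x
        for i, x in enumerate(profile)
    )


def cnt_distance_bruteforce(u, v, cap=None):
    """Minimum number of events turning profile u into v, or None.

    Exhaustive BFS over profiles; only practical for very short profiles.
    `cap` (hypothetical parameter) bounds intermediate copy numbers so the
    search space is finite; it defaults to the largest value in u or v.
    """
    u, v = tuple(u), tuple(v)
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    if not u:
        return 0
    if cap is None:
        cap = max(max(u), max(v))
    seen = {u}
    queue = deque([(u, 0)])
    while queue:
        profile, dist = queue.popleft()
        if profile == v:
            return dist
        n = len(profile)
        for start in range(n):
            for end in range(start, n):
                for delta in (1, -1):
                    nxt = apply_event(profile, start, end, delta)
                    if nxt not in seen and max(nxt) <= cap:
                        seen.add(nxt)
                        queue.append((nxt, dist + 1))
    return None  # v is unreachable from u under the stated assumptions
```

For example, `cnt_distance_bruteforce((1, 1, 1), (2, 0, 2))` returns 2: delete the middle segment, then amplify the whole profile.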
Affiliation(s)
- Ron Zeira
- Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
| | - Meirav Zehavi
- Department of Informatics, University of Bergen, Bergen, Norway
| | - Ron Shamir
- Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
|
18
|
Vandin F. Computational Methods for Characterizing Cancer Mutational Heterogeneity. Front Genet 2017; 8:83. [PMID: 28659971 PMCID: PMC5469877 DOI: 10.3389/fgene.2017.00083] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 02/07/2017] [Accepted: 05/30/2017] [Indexed: 12/21/2022]
Abstract
Advances in DNA sequencing technologies have allowed the characterization of somatic mutations in a large number of cancer genomes at an unprecedented level of detail, revealing the extreme genetic heterogeneity of cancer at two different levels: inter-tumor, with different patients of the same cancer type presenting different collections of somatic mutations, and intra-tumor, with different clones coexisting within the same tumor. Both inter-tumor and intra-tumor heterogeneity have crucial implications for clinical practices. Here, we review computational methods that use somatic alterations measured through next-generation DNA sequencing technologies for characterizing tumor heterogeneity and its association with clinical variables. We first review computational methods for studying inter-tumor heterogeneity, focusing on methods that attempt to summarize cancer heterogeneity by discovering pathways that are commonly mutated across different patients of the same cancer type. We then review computational methods for characterizing intra-tumor heterogeneity using information from bulk sequencing data or from single cell sequencing data. Finally, we present some of the recent computational methodologies that have been proposed to identify and assess the association between inter- or intra-tumor heterogeneity with clinical variables.
Affiliation(s)
- Fabio Vandin
- Department of Information Engineering, University of Padova, Padova, Italy
|
19
|
El-Kebir M, Raphael BJ, Shamir R, Sharan R, Zaccaria S, Zehavi M, Zeira R. Complexity and algorithms for copy-number evolution problems. Algorithms Mol Biol 2017; 12:13. [PMID: 28515774 PMCID: PMC5433102 DOI: 10.1186/s13015-017-0103-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/23/2016] [Accepted: 04/11/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutations is copy number aberrations, which alter the number of copies of genomic regions. The number of copies of each position along a chromosome constitutes the chromosome's copy-number profile. Understanding how such profiles evolve in cancer can assist in both diagnosis and prognosis. RESULTS We model the evolution of a tumor by segmental deletions and amplifications, and gauge distance from profile [Formula: see text] to [Formula: see text] by the minimum number of events needed to transform [Formula: see text] into [Formula: see text]. Given two profiles, our first problem aims to find a parental profile that minimizes the sum of distances to its children. Given k profiles, the second, more general problem, seeks a phylogenetic tree, whose k leaves are labeled by the k given profiles and whose internal vertices are labeled by ancestral profiles such that the sum of edge distances is minimum. CONCLUSIONS For the former problem we give a pseudo-polynomial dynamic programming algorithm that is linear in the profile length, and an integer linear program formulation. For the latter problem we show it is NP-hard and give an integer linear program formulation that scales to practical problem instance sizes. We assess the efficiency and quality of our algorithms on simulated instances. AVAILABILITY https://github.com/raphael-group/CNT-ILP.
Affiliation(s)
- Mohammed El-Kebir
- Department of Computer Science, Princeton University, Princeton, NJ 08540 USA
- Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, RI 02912 USA
| | - Benjamin J. Raphael
- Department of Computer Science, Princeton University, Princeton, NJ 08540 USA
- Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, RI 02912 USA
| | - Ron Shamir
- School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| | - Roded Sharan
- School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| | - Simone Zaccaria
- Department of Computer Science, Princeton University, Princeton, NJ 08540 USA
- Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, RI 02912 USA
- Dipartimento di Informatica Sistemistica e Comunicazione (DISCo), Univ. degli Studi di Milano-Bicocca, Milan, Italy
| | - Meirav Zehavi
- School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| | - Ron Zeira
- School of Computer Science, Tel Aviv University, Tel Aviv, Israel
|
20
|
Abstract
Rapid advances in high-throughput sequencing and a growing realization of the importance of evolutionary theory to cancer genomics have led to a proliferation of phylogenetic studies of tumour progression. These studies have yielded not only new insights but also a plethora of experimental approaches, sometimes reaching conflicting or poorly supported conclusions. Here, we consider this body of work in light of the key computational principles underpinning phylogenetic inference, with the goal of providing practical guidance on the design and analysis of scientifically rigorous tumour phylogeny studies. We survey the range of methods and tools available to the researcher, their key applications, and the various unsolved problems, closing with a perspective on the prospects and broader implications of this field.
Affiliation(s)
- Russell Schwartz
- Department of Biological Sciences and Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15217, USA
| | - Alejandro A Schäffer
- Computational Biology Branch, National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20892, USA
|
21
|
Zhang M, Lee AV, Rosen JM. The Cellular Origin and Evolution of Breast Cancer. Cold Spring Harb Perspect Med 2017; 7:cshperspect.a027128. [PMID: 28062556 DOI: 10.1101/cshperspect.a027128] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Indexed: 12/12/2022]
Abstract
In this review, we will discuss how the cell of origin may modulate breast cancer intratumoral heterogeneity (ITH) as well as the role of ITH in the evolution of cancer. The clonal evolution and the cancer stem cell (CSC) models, as well as a model that integrates clonal evolution with a CSC hierarchy, have all been proposed to explain the development of ITH. The extent of ITH correlates with clinical outcome and reflects the cellular complexity and dynamics within a tumor. A unique subtype of breast cancer, the claudin-low subtype that is highly resistant to chemotherapy and most closely resembles mammary epithelial stem cells, will be discussed. Furthermore, we will review how the interactions among various tumor cells, some with distinct mutations, may impact breast cancer treatment. Finally, novel technologies that may help advance our understanding of ITH and lead to improvements in the design of new treatments also will be discussed.
Affiliation(s)
- Mei Zhang
- Department of Developmental Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
| | - Adrian V Lee
- Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
| | - Jeffrey M Rosen
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas 77030
|
22
|
Zhou J, Lin Y, Rajan V, Hoskins W, Feng B, Tang J. Analysis of gene copy number changes in tumor phylogenetics. Algorithms Mol Biol 2016; 11:26. [PMID: 27688796 PMCID: PMC5034472 DOI: 10.1186/s13015-016-0088-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/22/2016] [Accepted: 09/08/2016] [Indexed: 02/04/2023]
Abstract
BACKGROUND Evolution of cancer cells is characterized by large scale and rapid changes in the chromosomal landscape. The fluorescence in situ hybridization (FISH) technique provides a way to measure the copy numbers of preselected genes in a group of cells and has been found to be a reliable source of data to model the evolution of tumor cells. Chowdhury et al. (Bioinformatics 29(13):189-98, 23; PLoS Comput Biol 10(7):1003740, 24) recently developed a computational model for tumor progression driven by gains and losses in cell count patterns obtained by FISH probes. Their model aims to find the rectilinear Steiner minimum tree (RSMT) (Chowdhury et al. in Bioinformatics 29(13):189-98, 23) and the duplication Steiner minimum tree (DSMT) (Chowdhury et al. in PLoS Comput Biol 10(7):1003740, 24) that describe the progression of FISH cell count patterns over their branches in a parsimonious manner. Both the RSMT and DSMT problems are NP-hard and heuristics are required to solve the problems efficiently. METHODS In this paper we propose two approaches to solve the RSMT problem, one inspired by iterative methods to address the "small phylogeny" problem (Sankoff et al. in J Mol Evol 7(2):133-49, 27; Blanchette et al. in Genome Inform 8:25-34, 28), and the other based on maximum parsimony phylogeny inference. We further show how to extend these heuristics to obtain solutions to the DSMT problem, which models large scale duplication events. RESULTS Experimental results from both simulated and real tumor data show that our methods outperform previous heuristics (Chowdhury et al. in Bioinformatics 29(13):189-98, 23; Chowdhury et al. in PLoS Comput Biol 10(7):1003740, 24) in obtaining solutions to both RSMT and DSMT problems. CONCLUSION The methods introduced here yield more parsimonious phylogenies than earlier ones, and such phylogenies are considered better choices.
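To make the rectilinear objective in this abstract concrete, here is a minimal sketch that measures the distance between two FISH cell count patterns in the rectilinear (L1) metric and builds a minimum spanning tree over the observed patterns with Prim's algorithm. This is not one of the heuristics proposed in the paper: it adds no Steiner (unobserved) nodes, so its total weight is only an upper bound on the RSMT weight. The function names and the unit cost per single-gene gain or loss are assumptions for illustration.

```python
def rectilinear_distance(a, b):
    """L1 distance between two cell count patterns (tuples of copy numbers).

    Assumption: each single-gene gain or loss of one copy costs one unit.
    """
    return sum(abs(x - y) for x, y in zip(a, b))


def spanning_tree_upper_bound(patterns):
    """Prim's algorithm over the distinct observed patterns.

    Returns (edges, total_weight), where each edge is
    (parent_pattern, child_pattern, weight). No Steiner nodes are added,
    so total_weight is an upper bound on the RSMT weight, not the optimum.
    """
    # Remove duplicate patterns while preserving first-seen order.
    nodes = list(dict.fromkeys(tuple(p) for p in patterns))
    if len(nodes) < 2:
        return [], 0
    # best[j] = (cheapest known distance from the tree to node j, tree node)
    best = {
        j: (rectilinear_distance(nodes[0], nodes[j]), 0)
        for j in range(1, len(nodes))
    }
    edges, total = [], 0
    while best:
        j = min(best, key=lambda k: best[k][0])
        weight, parent = best.pop(j)
        edges.append((nodes[parent], nodes[j], weight))
        total += weight
        # Relax the remaining nodes against the newly added node.
        for k in best:
            d = rectilinear_distance(nodes[j], nodes[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges, total
```

For the four two-gene patterns `[(2, 2), (3, 2), (2, 3), (4, 2)]`, the spanning tree has three edges of weight 1 each, for a total weight of 3.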
Affiliation(s)
- Jun Zhou
- School of Computer Science and Technology, Tianjin University, Tianjin, 300072 China; Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208 USA
| | - Yu Lin
- Research School of Computer Science, Australian National University, Canberra, ACT 0200 Australia
| | - Vaibhav Rajan
- Xerox Research Centre India (XRCI), Bangalore, India
| | - William Hoskins
- Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208 USA
| | - Bing Feng
- Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208 USA
| | - Jijun Tang
- Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208 USA
|
23
|
Catanzaro D, Shackney SE, Schaffer AA, Schwartz R. Classifying the Progression of Ductal Carcinoma from Single-Cell Sampled Data via Integer Linear Programming: A Case Study. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016; 13:643-655. [PMID: 26353381 PMCID: PMC5217787 DOI: 10.1109/tcbb.2015.2476808] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/05/2023]
Abstract
Ductal Carcinoma In Situ (DCIS) is a precursor lesion of Invasive Ductal Carcinoma (IDC) of the breast. Investigating its temporal progression could provide fundamental new insights for the development of better diagnostic tools to predict which cases of DCIS will progress to IDC. We investigate the problem of reconstructing a plausible progression from single-cell sampled data of an individual with synchronous DCIS and IDC. Specifically, by using a number of assumptions derived from the observation of cellular atypia occurring in IDC, we design a possible predictive model using integer linear programming (ILP). Computational experiments carried out on a preexisting data set of 13 patients with simultaneous DCIS and IDC show that the corresponding predicted progression models are classifiable into categories having specific evolutionary characteristics. The approach provides new insights into mechanisms of clonal progression in breast cancers and helps illustrate the power of the ILP approach for similar problems in reconstructing tumor evolution scenarios under complex sets of constraints.
|
24
|
Gertz EM, Chowdhury SA, Lee WJ, Wangsa D, Heselmeyer-Haddad K, Ried T, Schwartz R, Schäffer AA. FISHtrees 3.0: Tumor Phylogenetics Using a Ploidy Probe. PLoS One 2016; 11:e0158569. [PMID: 27362268 PMCID: PMC4928784 DOI: 10.1371/journal.pone.0158569] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/29/2016] [Accepted: 06/19/2016] [Indexed: 01/03/2023]
Abstract
Advances in fluorescence in situ hybridization (FISH) make it feasible to detect multiple copy-number changes in hundreds of cells of solid tumors. Studies using FISH, sequencing, and other technologies have revealed substantial intra-tumor heterogeneity. The evolution of subclones in tumors may be modeled by phylogenies. Tumors often harbor aneuploid or polyploid cell populations. Using a FISH probe to estimate changes in ploidy can guide the creation of trees that model changes in ploidy and individual gene copy-number variations. We present FISHtrees 3.0, which implements a ploidy-based tree building method based on mixed integer linear programming (MILP). The ploidy-based modeling in FISHtrees includes a new formulation of the problem of merging trees for changes of a single gene into trees modeling changes in multiple genes and the ploidy. When multiple samples are collected from each patient, varying over time or tumor regions, it is useful to evaluate similarities in tumor progression among the samples. Therefore, we further implemented in FISHtrees 3.0 a new method to build consensus graphs for multiple samples. We validate FISHtrees 3.0 on simulated data and on FISH data from paired cases of cervical primary and metastatic tumors and on paired breast ductal carcinoma in situ (DCIS) and invasive ductal carcinoma (IDC). Tests on simulated data show improved accuracy of the ploidy-based approach relative to prior ploidyless methods. Tests on real data further demonstrate novel insights these methods offer into tumor progression processes. Trees for DCIS samples are significantly less complex than trees for paired IDC samples. Consensus graphs show substantial divergence among most paired samples from both sets. Low consensus between DCIS and IDC trees may help explain the difficulty in finding biomarkers that predict which DCIS cases are at most risk to progress to IDC. The FISHtrees software is available at ftp://ftp.ncbi.nih.gov/pub/FISHtrees.
MESH Headings
- Biomarkers, Tumor/genetics
- Breast Neoplasms/genetics
- Breast Neoplasms/pathology
- Carcinoma, Ductal, Breast/genetics
- Carcinoma, Ductal, Breast/pathology
- Carcinoma, Intraductal, Noninfiltrating/genetics
- Carcinoma, Intraductal, Noninfiltrating/pathology
- Databases, Genetic
- Female
- Humans
- In Situ Hybridization, Fluorescence/methods
- Ploidies
- Uterine Cervical Neoplasms/genetics
- Uterine Cervical Neoplasms/pathology
Affiliation(s)
- E. Michael Gertz
- Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, United States of America
| | - Salim Akhter Chowdhury
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States of America
- Carnegie Mellon/University of Pittsburgh Joint Ph.D. Program in Computational Biology, Pittsburgh, PA, United States of America
| | - Woei-Jyh Lee
- Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, United States of America
| | - Darawalee Wangsa
- Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, United States of America
| | - Kerstin Heselmeyer-Haddad
- Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, United States of America
| | - Thomas Ried
- Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, United States of America
| | - Russell Schwartz
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States of America
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, United States of America
| | - Alejandro A. Schäffer
- Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, United States of America
|
25
|
Ross EM, Markowetz F. OncoNEM: inferring tumor evolution from single-cell sequencing data. Genome Biol 2016; 17:69. [PMID: 27083415 PMCID: PMC4832472 DOI: 10.1186/s13059-016-0929-9] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Received: 03/29/2016] [Accepted: 03/30/2016] [Indexed: 11/17/2022]
Abstract
Single-cell sequencing promises a high-resolution view of genetic heterogeneity and clonal evolution in cancer. However, methods to infer tumor evolution from single-cell sequencing data lag behind methods developed for bulk-sequencing data. Here, we present OncoNEM, a probabilistic method for inferring intra-tumor evolutionary lineage trees from somatic single nucleotide variants of single cells. OncoNEM identifies homogeneous cellular subpopulations and infers their genotypes as well as a tree describing their evolutionary relationships. In simulation studies, we assess OncoNEM's robustness and benchmark its performance against competing methods. Finally, we show its applicability in case studies of muscle-invasive bladder cancer and essential thrombocythemia.
Affiliation(s)
- Edith M Ross
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
| | - Florian Markowetz
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
|
26
|
Beerenwinkel N, Greenman CD, Lagergren J. Computational Cancer Biology: An Evolutionary Perspective. PLoS Comput Biol 2016; 12:e1004717. [PMID: 26845763 PMCID: PMC4742235 DOI: 10.1371/journal.pcbi.1004717] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Indexed: 01/19/2023]
Affiliation(s)
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Chris D. Greenman
- School of Computing Sciences, University of East Anglia, Norwich, United Kingdom
| | - Jens Lagergren
- Science for Life Laboratory, School of Computer Science and Communication, Swedish E-Science Research Center, KTH Royal Institute of Technology, Solna, Sweden
|
27
|
Chowdhury SA, Gertz EM, Wangsa D, Heselmeyer-Haddad K, Ried T, Schäffer AA, Schwartz R. Inferring models of multiscale copy number evolution for single-tumor phylogenetics. Bioinformatics 2015; 31:i258-67. [PMID: 26072490 PMCID: PMC4481700 DOI: 10.1093/bioinformatics/btv233] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/31/2022]
Abstract
Motivation: Phylogenetic algorithms have begun to see widespread use in cancer research to reconstruct processes of evolution in tumor progression. Developing reliable phylogenies for tumor data requires quantitative models of cancer evolution that include the unusual genetic mechanisms by which tumors evolve, such as chromosome abnormalities, and allow for heterogeneity between tumor types and individual patients. Previous work on inferring phylogenies of single tumors by copy number evolution assumed models of uniform rates of genomic gain and loss across different genomic sites and scales, a substantial oversimplification necessitated by a lack of algorithms and quantitative parameters for fitting to more realistic tumor evolution models. Results: We propose a framework for inferring models of tumor progression from single-cell gene copy number data, including variable rates for different gain and loss events. We propose a new algorithm for identification of most parsimonious combinations of single gene and single chromosome events. We extend it via dynamic programming to include genome duplications. We implement an expectation maximization (EM)-like method to estimate mutation-specific and tumor-specific event rates concurrently with tree reconstruction. Application of our algorithms to real cervical cancer data identifies key genomic events in disease progression consistent with prior literature. Classification experiments on cervical and tongue cancer datasets lead to improved prediction accuracy for the metastasis of primary cervical cancers and for tongue cancer survival. Availability and implementation: Our software (FISHtrees) and two datasets are available at ftp://ftp.ncbi.nlm.nih.gov/pub/FISHtrees. Contact: russells@andrew.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Salim Akhter Chowdhury
- Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| | - E Michael Gertz
- Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Darawalee Wangsa
- Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Kerstin Heselmeyer-Haddad
- Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Thomas Ried
- Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Alejandro A Schäffer
- Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Russell Schwartz
- Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| |
|
28
|
Abstract
BACKGROUND Effective management and treatment of cancer continues to be complicated by the rapid evolution and resulting heterogeneity of tumors. Phylogenetic study of cell populations in single tumors provides a way to delineate intra-tumoral heterogeneity and identify robust features of evolutionary processes. The introduction of single-cell sequencing has shown great promise for advancing single-tumor phylogenetics; however, the volume and high noise in these data present challenges for inference, especially with regard to chromosome abnormalities that typically dominate tumor evolution. Here, we investigate a strategy to use such data to track differences in tumor cell genomic content during progression. RESULTS We propose a reference-free approach to mining single-cell genome sequence reads to allow predictive classification of tumors into heterogeneous cell types and reconstruct models of their evolution. The approach extracts k-mer counts from single-cell tumor genomic DNA sequences, and uses differences in normalized k-mer frequencies as a proxy for overall evolutionary distance between distinct cells. The approach computationally simplifies deriving phylogenetic markers, which normally relies on first aligning sequence reads to a reference genome and then processing the data to extract meaningful progression markers for constructing phylogenetic trees. The approach also provides a way to bypass some of the challenges that massive genome rearrangement typical of tumor genomes presents for reference-based methods. We illustrate the method on a publicly available breast tumor single-cell sequencing dataset. CONCLUSIONS We have demonstrated a computational approach for learning tumor progression from single cell sequencing data using k-mer counts. k-mer features classify tumor cells by stage of progression with high accuracy. 
Phylogenies built from these k-mer spectrum distance matrices yield splits that are statistically significant when tested for their ability to partition cells at different stages of cancer.
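The reference-free pipeline described in this abstract (count k-mers per cell, normalize them to frequencies, and compare spectra between cells) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name a distance metric, so the L1 distance used here is an assumed choice, and the function names `kmer_frequencies`, `spectrum_distance`, and `distance_matrix` are hypothetical. Reverse complements and sequencing-error filtering are ignored.

```python
from collections import Counter
from itertools import combinations


def kmer_frequencies(reads, k):
    """Count all k-mers over a cell's reads and normalize to frequencies."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def spectrum_distance(freq_a, freq_b):
    """L1 distance between two normalized k-mer spectra (assumed metric)."""
    keys = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(x, 0.0) - freq_b.get(x, 0.0)) for x in keys)


def distance_matrix(cells, k):
    """Pairwise spectrum distances for a dict mapping cell name -> list of reads."""
    spectra = {name: kmer_frequencies(reads, k) for name, reads in cells.items()}
    matrix = {name: {name: 0.0} for name in spectra}
    for a, b in combinations(spectra, 2):
        d = spectrum_distance(spectra[a], spectra[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


if __name__ == "__main__":
    # Toy reads; real input would be the sequencing reads of each single cell.
    cells = {
        "cell_1": ["ACGTACGT", "ACGTAC"],
        "cell_2": ["ACGTACGT", "TTTTAC"],
    }
    print(distance_matrix(cells, k=3))
```

The resulting matrix could then be passed to any distance-based tree-building method, such as neighbor joining, to obtain a phylogeny over cells.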
Affiliation(s)
- Ayshwarya Subramanian
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 655 Huntington Street, 02115 Boston, USA
| | - Russell Schwartz
- Department of Biological Sciences and the Computational Biology Department, Carnegie Mellon University, 5000 Forbes Avenue, 15213 Pittsburgh, USA
| |
|
29
|
Wangsa D, Chowdhury SA, Ryott M, Gertz EM, Elmberger G, Auer G, Åvall Lundqvist E, Küffer S, Ströbel P, Schäffer AA, Schwartz R, Munck-Wikland E, Ried T, Heselmeyer-Haddad K. Phylogenetic analysis of multiple FISH markers in oral tongue squamous cell carcinoma suggests that a diverse distribution of copy number changes is associated with poor prognosis. Int J Cancer 2015; 138:98-109. [PMID: 26175310 DOI: 10.1002/ijc.29691] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/30/2014] [Revised: 04/21/2015] [Accepted: 06/19/2015] [Indexed: 12/31/2022]
Abstract
Oral tongue squamous cell carcinoma (OTSCC) is associated with poor prognosis. To improve prognostication, we analyzed four gene probes (TERC, CCND1, EGFR and TP53) and the centromere probe CEP4 as a marker of chromosomal instability, using fluorescence in situ hybridization (FISH) in single cells from the tumors of sixty-five OTSCC patients (Stage I, n = 15; Stage II, n = 30; Stage III, n = 7; Stage IV, n = 13). Unsupervised hierarchical clustering of the FISH data distinguished three clusters related to smoking status. Copy number increases of all five markers were found to be correlated to non-smoking habits, while smokers in this cohort had low-level copy number gains. Using the phylogenetic modeling software FISHtrees, we constructed models of tumor progression for each patient based on the four gene probes. Then, we derived test statistics on the models that are significant predictors of disease-free and overall survival, independent of tumor stage and smoking status in multivariate analysis. The patients whose tumors were modeled as progressing by a more diverse distribution of copy number changes across the four genes have poorer prognosis. This is consistent with the view that multiple genetic pathways need to become deregulated in order for cancer to progress.
Affiliation(s)
- Darawalee Wangsa
- Genetics Branch, Center For Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD; Department of Oncology-Pathology, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
| | - Salim Akhter Chowdhury
- Joint Carnegie Mellon/University of Pittsburgh Ph.D. Program In Computational Biology, Carnegie Mellon University, Pittsburgh, PA; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA
| | - Michael Ryott
- Department of Otorhinolaryngology, Sophiahemmet Hospital, Stockholm, Sweden
| | - E Michael Gertz
- Computational Biology Branch, National Center For Biotechnology Information, National Institutes of Health, Bethesda, MD
| | - Göran Elmberger
- Department of Laboratory Medicine, Pathology, Örebro University Hospital, Örebro, Sweden
| | - Gert Auer
- Department of Oncology-Pathology, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
| | - Elisabeth Åvall Lundqvist
- Department of Oncology-Pathology, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden; Department of Oncology And Department Of Clinical And Experimental Medicine, Linköping University, Linköping, Sweden
| | - Stefan Küffer
- Institute of Pathology, University Medical Center Göttingen, Göttingen, Germany
| | - Philipp Ströbel
- Institute of Pathology, University Medical Center Göttingen, Göttingen, Germany
| | - Alejandro A Schäffer
- Computational Biology Branch, National Center For Biotechnology Information, National Institutes of Health, Bethesda, MD
| | - Russell Schwartz
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA; Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA
| | - Eva Munck-Wikland
- Department of Oto-Rhino-Laryngology, Head And Neck Surgery, Karolinska University Hospital and Department of Clinical Science, Intervention and Technology, Karolinska Institutet, Stockholm, Sweden
| | - Thomas Ried
- Genetics Branch, Center For Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD
| | - Kerstin Heselmeyer-Haddad
- Genetics Branch, Center For Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD
| |
|
30
|
Roman T, Nayyeri A, Fasy BT, Schwartz R. A simplicial complex-based approach to unmixing tumor progression data. BMC Bioinformatics 2015; 16:254. [PMID: 26264682 PMCID: PMC4534068 DOI: 10.1186/s12859-015-0694-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 12/22/2014] [Accepted: 08/03/2015] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Tumorigenesis is an evolutionary process by which tumor cells acquire mutations through successive diversification and differentiation. There is much interest in reconstructing this process of evolution due to its relevance to identifying drivers of mutation and predicting future prognosis and drug response. Efforts are challenged by high tumor heterogeneity, though, both within and among patients. In prior work, we showed that this heterogeneity could be turned into an advantage by computationally reconstructing models of cell populations mixed to different degrees in distinct tumors. Such mixed membership model approaches, however, are still limited in their ability to dissect more than a few well-conserved cell populations across a tumor data set. RESULTS We present a method to improve on current mixed membership model approaches by better accounting for conserved progression pathways between subsets of cancers, which imply a structure to the data that has not previously been exploited. We extend our prior methods, which use an interpretation of the mixture problem as that of reconstructing simple geometric objects called simplices, to instead search for structured unions of simplices called simplicial complexes that one would expect to emerge from mixture processes describing branches along an evolutionary tree. We further improve on the prior work with a novel objective function to better identify mixtures corresponding to parsimonious evolutionary tree models. We demonstrate that this approach improves on our ability to accurately resolve mixtures on simulated data sets and demonstrate its practical applicability on a large RNASeq tumor data set. CONCLUSIONS Better exploiting the expected geometric structure for mixed membership models produced from common evolutionary trees allows us to quickly and accurately reconstruct models of cell populations sampled from those trees. 
In the process, we hope to develop a better understanding of tumor evolution as well as other biological problems that involve interpreting genomic data gathered from heterogeneous populations of cells.
Affiliation(s)
- Theodore Roman
- Computational Biology Department, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, USA.
| | - Amir Nayyeri
- Computer Science Department, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, USA.
| | - Brittany Terese Fasy
- Department of Computer Science, Tulane University, 6834 St. Charles St., New Orleans, USA.
| | - Russell Schwartz
- Computational Biology Department, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, USA; Department of Biological Sciences, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, USA.
| |
|
31
|
Sun D, Jonasch E, Lara PN. Genetic Heterogeneity of Kidney Cancer. KIDNEY CANCER 2015. [DOI: 10.1007/978-3-319-17903-2_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|