3
Chowdhury SA, Gertz EM, Wangsa D, Heselmeyer-Haddad K, Ried T, Schäffer AA, Schwartz R. Inferring models of multiscale copy number evolution for single-tumor phylogenetics. Bioinformatics 2015; 31:i258-67. [PMID: 26072490 PMCID: PMC4481700 DOI: 10.1093/bioinformatics/btv233] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/31/2022]
Abstract
Motivation: Phylogenetic algorithms have begun to see widespread use in cancer research to reconstruct processes of evolution in tumor progression. Developing reliable phylogenies for tumor data requires quantitative models of cancer evolution that include the unusual genetic mechanisms by which tumors evolve, such as chromosome abnormalities, and allow for heterogeneity between tumor types and individual patients. Previous work on inferring phylogenies of single tumors by copy number evolution assumed models of uniform rates of genomic gain and loss across different genomic sites and scales, a substantial oversimplification necessitated by a lack of algorithms and quantitative parameters for fitting to more realistic tumor evolution models.
Results: We propose a framework for inferring models of tumor progression from single-cell gene copy number data, including variable rates for different gain and loss events. We propose a new algorithm for identification of most parsimonious combinations of single gene and single chromosome events. We extend it via dynamic programming to include genome duplications. We implement an expectation maximization (EM)-like method to estimate mutation-specific and tumor-specific event rates concurrently with tree reconstruction. Application of our algorithms to real cervical cancer data identifies key genomic events in disease progression consistent with prior literature. Classification experiments on cervical and tongue cancer datasets lead to improved prediction accuracy for the metastasis of primary cervical cancers and for tongue cancer survival.
Availability and implementation: Our software (FISHtrees) and two datasets are available at ftp://ftp.ncbi.nlm.nih.gov/pub/FISHtrees.
Contact: russells@andrew.cmu.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Salim Akhter Chowdhury
  - Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- E Michael Gertz
  - Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- Darawalee Wangsa
  - Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- Kerstin Heselmeyer-Haddad
  - Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- Thomas Ried
  - Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- Alejandro A Schäffer
  - Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- Russell Schwartz
  - Joint Carnegie Mellon/University of Pittsburgh PhD Program in Computational Biology, Pittsburgh, PA, USA, Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA, Computational Biology Branch, National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, MD, USA, Section of Cancer Genomics, Genetics Branch, Center for Cancer Research, National Cancer Institute, U.S. National Institutes of Health, Bethesda, MD, USA and Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
4
Chowdhury SA, Shackney SE, Heselmeyer-Haddad K, Ried T, Schäffer AA, Schwartz R. Algorithms to model single gene, single chromosome, and whole genome copy number changes jointly in tumor phylogenetics. PLoS Comput Biol 2014; 10:e1003740. [PMID: 25078894 PMCID: PMC4117424 DOI: 10.1371/journal.pcbi.1003740] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 02/02/2014] [Accepted: 06/04/2014] [Indexed: 02/07/2023]
Abstract
We present methods to construct phylogenetic models of tumor progression at the cellular level that include copy number changes at the scale of single genes, entire chromosomes, and the whole genome. The methods are designed for data collected by fluorescence in situ hybridization (FISH), an experimental technique especially well suited to characterizing intratumor heterogeneity using counts of probes to genetic regions frequently gained or lost in tumor development. Here, we develop new provably optimal methods for computing an edit distance between the copy number states of two cells given evolution by copy number changes of single probes, all probes on a chromosome, or all probes in the genome. We then apply this theory to develop a practical heuristic algorithm, implemented in publicly available software, for inferring tumor phylogenies on data from potentially hundreds of single cells by this evolutionary model. We demonstrate and validate the methods on simulated data and published FISH data from cervical cancers and breast cancers. Our computational experiments show that the new model and algorithm lead to more parsimonious trees than prior methods for single-tumor phylogenetics and to improved performance on various classification tasks, such as distinguishing primary tumors from metastases obtained from the same patient population.
Cancer is an evolutionary system whose growth and development are attributed to aberrations in well-known genes and to cancer-type specific genomic imbalances. Here, we present methods for reconstructing the evolution of individual tumors based on cell-to-cell variations between copy numbers of targeted regions of the genome. The methods are designed to work with fluorescence in situ hybridization (FISH), a technique that allows one to profile copy number changes in potentially thousands of single cells per study. Our work advances the prior art by developing theory and practical algorithms for building evolutionary trees of single tumors that can model gain or loss of genetic regions at the scale of single genes, whole chromosomes, or the entire genome, all common events in tumor evolution. We apply these methods to simulated and real tumor data to demonstrate substantial improvements in tree-building accuracy and in our ability to accurately classify tumors from their inferred evolutionary models. The newly developed algorithms have been released through our publicly available software, FISHtrees.
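To make the event model in this abstract concrete, the following is a minimal sketch of an edit distance between two copy number profiles under three event types: gain or loss of one probe, gain or loss of all probes on one chromosome, and duplication of the whole genome. It is a brute-force breadth-first search, not the provably optimal method the authors describe and not the FISHtrees implementation. The names `edit_distance`, `neighbors`, `chrom_of` and `MAX_COPIES`, the copy number cap, and the rule that a probe at zero copies cannot be regained are all assumptions made for illustration.

```python
from collections import deque

MAX_COPIES = 8  # hypothetical cap that keeps the search space finite


def neighbors(state, chrom_of):
    """Yield every profile reachable from `state` by exactly one event."""
    n = len(state)

    def bump(indices, delta):
        new = list(state)
        for i in indices:
            if new[i] > 0:  # assumption: a probe at 0 copies cannot be regained
                new[i] += delta
        return tuple(new)

    # Single-probe gain or loss.
    for i in range(n):
        for delta in (1, -1):
            yield bump([i], delta)
    # Gain or loss of every probe on one chromosome.
    for chrom in set(chrom_of):
        members = [i for i in range(n) if chrom_of[i] == chrom]
        for delta in (1, -1):
            yield bump(members, delta)
    # Whole-genome duplication doubles every probe count.
    yield tuple(2 * count for count in state)


def edit_distance(src, dst, chrom_of):
    """Fewest events turning `src` into `dst`, or None if unreachable."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        state, dist = queue.popleft()
        for nxt in neighbors(state, chrom_of):
            if nxt in seen or max(nxt) > MAX_COPIES:
                continue
            if nxt == dst:
                return dist + 1
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None


# Three probes; the first two share a chromosome (hypothetical layout).
layout = (0, 0, 1)
print(edit_distance((2, 2, 2), (4, 4, 4), layout))  # one genome duplication
print(edit_distance((2, 2, 2), (5, 5, 4), layout))  # duplication, then a chromosome gain
```

The search visits at most `(MAX_COPIES + 1) ** n` profiles for `n` probes, so it is only practical for a handful of probes; it is meant to show what the distance measures, not how to compute it at scale.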
Affiliation(s)
- Salim Akhter Chowdhury
  - Joint Carnegie Mellon/University of Pittsburgh Ph.D. Program in Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Stanley E. Shackney
  - Intelligent Oncotherapeutics, Pittsburgh, Pennsylvania, United States of America
- Thomas Ried
  - Genetics Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland, United States of America
- Alejandro A. Schäffer
  - Computational Biology Branch, NCBI, NIH, Bethesda, Maryland, United States of America
- Russell Schwartz
  - Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
  - * E-mail: