1
Zhu X, Zhao W, Zhou Z, Gu X. Unraveling the Drivers of Tumorigenesis in the Context of Evolution: Theoretical Models and Bioinformatics Tools. J Mol Evol 2023:10.1007/s00239-023-10117-0. [PMID: 37246992 DOI: 10.1007/s00239-023-10117-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 05/09/2023] [Indexed: 05/30/2023]
Abstract
Cancer originates from somatic cells that have accumulated mutations. These mutations alter the phenotype of the cells, allowing them to escape homeostatic regulation that maintains normal cell numbers. The emergence of malignancies is an evolutionary process in which the random accumulation of somatic mutations and sequential selection of dominant clones cause cancer cells to proliferate. The development of technologies such as high-throughput sequencing has provided a powerful means to measure subclonal evolutionary dynamics across space and time. Here, we review the patterns that may be observed in cancer evolution and the methods available for quantifying the evolutionary dynamics of cancer. An improved understanding of the evolutionary trajectories of cancer will enable us to explore the molecular mechanism of tumorigenesis and to design tailored treatment strategies.
Affiliation(s)
- Xunuo Zhu
  - Innovation Institute for Artificial Intelligence in Medicine, Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Wenyi Zhao
  - Innovation Institute for Artificial Intelligence in Medicine, Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Zhan Zhou
  - Innovation Institute for Artificial Intelligence in Medicine, Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
  - The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, 322000, China
  - Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 310058, China
- Xun Gu
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA, 50011, USA
2
Abstract
Our capacity to study individual cells has enabled a new level of resolution for understanding complex biological systems such as multicellular organisms or microbial communities. Not surprisingly, several methods have been developed in recent years with a formidable potential to investigate the somatic evolution of single cells in both healthy and pathological tissues. However, single-cell sequencing data can be quite noisy due to different technical biases, so inferences resulting from these new methods need to be carefully contrasted. Here, I introduce CellCoal, a software tool for the coalescent simulation of single-cell sequencing genotypes. CellCoal simulates the history of single-cell samples obtained from somatic cell populations with different demographic histories and produces single-nucleotide variants under a variety of mutation models, sequencing read counts, and genotype likelihoods, considering allelic imbalance, allelic dropout, amplification, and sequencing errors, typical of this type of data. CellCoal is a flexible tool that can be used to understand the implications of different somatic evolutionary processes at the single-cell level, and to benchmark dedicated bioinformatic tools for the analysis of single-cell sequencing data. CellCoal is available at https://github.com/dapogon/cellcoal.
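Two of the ingredients named in this abstract, coalescent waiting times and allelic dropout, can be illustrated with a minimal sketch. It assumes the standard constant-size Kingman coalescent and an independent per-allele dropout probability; the function names and the tuple genotype encoding are hypothetical and are not CellCoal's code, options, or file formats, which cover more demographic and error models than shown here.

```python
import random


def coalescent_waiting_times(n_cells, rng):
    """Waiting times between coalescent events for n_cells sampled lineages.

    Assumption: constant population size, time in coalescent units, so that
    with k lineages the next coalescence arrives at rate k * (k - 1) / 2.
    """
    times = []
    for k in range(n_cells, 1, -1):
        rate = k * (k - 1) / 2
        times.append(rng.expovariate(rate))
    return times


def apply_allelic_dropout(genotype, ado_rate, rng):
    """Drop each allele of a genotype independently with probability ado_rate.

    genotype is a tuple of alleles (0 = reference, 1 = alternate); the
    returned tuple holds only the alleles that survived amplification.
    """
    return tuple(allele for allele in genotype if rng.random() >= ado_rate)


rng = random.Random(1)
times = coalescent_waiting_times(5, rng)
tree_height = sum(times)  # time back to the most recent common ancestor
observed = apply_allelic_dropout((0, 1), 0.2, rng)
```

A heterozygous site that loses its alternate allele to dropout looks homozygous for the reference allele, which is one reason inferences from single-cell genotypes need the kind of benchmarking the abstract describes.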
Affiliation(s)
- David Posada
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
  - Biomedical Research Center (CINBIO), University of Vigo, Vigo, Spain
  - Galicia Sur Health Research Institute, Vigo, Spain
3
The Nubeam reference-free approach to analyze metagenomic sequencing reads. Genome Res 2020; 30:1364-1375. [PMID: 32883749 PMCID: PMC7545149 DOI: 10.1101/gr.261750.120] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/26/2019] [Accepted: 07/30/2020] [Indexed: 01/04/2023]
Abstract
We present Nubeam (nucleotide be a matrix) as a novel reference-free approach to analyze short sequencing reads. Nubeam represents nucleotides by matrices, transforms a read into a product of matrices, and assigns numbers to reads based on the product matrix. Nubeam capitalizes on the noncommutative property of matrix multiplication, such that different reads are assigned different numbers and similar reads similar numbers. A sample, which is a collection of reads, becomes a collection of numbers that form an empirical distribution. We demonstrate that the genetic difference between samples can be quantified by the distance between empirical distributions. Nubeam includes the k-mer method as a special case, but unlike the k-mer method, it is convenient for Nubeam to account for GC bias and nucleotide quality. As a reference-free approach, Nubeam avoids reference bias and mapping bias, and can work with organisms without reference genomes. Thus, Nubeam is ideal to analyze data sets from metagenomics whole genome shotgun (WGS) sequencing, where the amount of unmapped reads is substantial. When applied to a WGS sequencing data set to quantify distances between metagenomics samples from various human body habitats, Nubeam recapitulates findings made by mapping-based methods and sheds light on contributions of unmapped reads. Nubeam is also useful in analyzing 16S rRNA sequencing data, which is a more prevalent type of data set in metagenomics studies. In our analysis, Nubeam recapitulated the findings that natural microbiota in mouse gut are resilient under challenges, and Nubeam detected differences in vaginal microbiota between cases of polycystic ovary syndrome and healthy controls.
4
Lei H, Lyu B, Gertz EM, Schäffer AA, Shi X, Wu K, Li G, Xu L, Hou Y, Dean M, Schwartz R. Tumor Copy Number Deconvolution Integrating Bulk and Single-Cell Sequencing Data. J Comput Biol 2020; 27:565-598. [PMID: 32181683 DOI: 10.1089/cmb.2019.0302] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/29/2022]
Abstract
Characterizing intratumor heterogeneity (ITH) is crucial to understanding cancer development, but it is hampered by limits of available data sources. Bulk DNA sequencing is the most common technology to assess ITH, but involves the analysis of a mixture of many genetically distinct cells in each sample, which must then be computationally deconvolved. Single-cell sequencing is a promising alternative, but its limitations (for example, high noise, difficulty scaling to large populations, technical artifacts, and large data sets) have so far made it impractical for studying cohorts of sufficient size to identify statistically robust features of tumor evolution. We have developed strategies for deconvolution and tumor phylogenetics combining limited amounts of bulk and single-cell data to gain some advantages of single-cell resolution with much lower cost, with specific focus on deconvolving genomic copy number data. We developed a mixed membership model for clonal deconvolution via non-negative matrix factorization balancing deconvolution quality with similarity to single-cell samples via an associated efficient coordinate descent algorithm. We then improve on that algorithm by integrating deconvolution with clonal phylogeny inference, using a mixed integer linear programming model to incorporate a minimum evolution phylogenetic tree cost in the problem objective. We demonstrate the effectiveness of these methods on semisimulated data of known ground truth, showing improved deconvolution accuracy relative to bulk data alone.
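The deconvolution step can be pictured as factoring a bulk copy number matrix B (loci by samples) into clone copy numbers C (loci by clones) and mixture fractions F (clones by samples), with B approximately equal to C F. The sketch below is a plain non-negative matrix factorization using multiplicative updates, swapped in for simplicity; it is not the authors' coordinate descent algorithm, and it omits their single-cell similarity term and the phylogenetic cost. The function names and toy data are hypothetical.

```python
import random


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def frobenius_error(b, c, f):
    """Squared Frobenius norm of B - C F."""
    approx = matmul(c, f)
    return sum((bv - av) ** 2 for br, ar in zip(b, approx) for bv, av in zip(br, ar))


def nmf(b, n_clones, n_iter=200, seed=0, eps=1e-9):
    """Factor B into non-negative C and F by multiplicative updates."""
    rng = random.Random(seed)
    n_loci, n_samples = len(b), len(b[0])
    c = [[rng.uniform(0.5, 3.0) for _ in range(n_clones)] for _ in range(n_loci)]
    f = [[rng.uniform(0.1, 1.0) for _ in range(n_samples)] for _ in range(n_clones)]
    for _ in range(n_iter):
        # F <- F * (C^T B) / (C^T C F), element-wise
        ct = transpose(c)
        num, den = matmul(ct, b), matmul(matmul(ct, c), f)
        f = [[fv * n / (d + eps) for fv, n, d in zip(fr, nr, dr)]
             for fr, nr, dr in zip(f, num, den)]
        # C <- C * (B F^T) / (C F F^T), element-wise
        ft = transpose(f)
        num, den = matmul(b, ft), matmul(c, matmul(f, ft))
        c = [[cv * n / (d + eps) for cv, n, d in zip(cr, nr, dr)]
             for cr, nr, dr in zip(c, num, den)]
    return c, f


# Toy data: two clones mixed in three bulk samples (values are made up).
true_c = [[2, 2], [2, 4], [1, 2], [3, 2]]
true_f = [[0.7, 0.3, 0.5], [0.3, 0.7, 0.5]]
bulk = matmul(true_c, true_f)
clones, fractions = nmf(bulk, n_clones=2)
```

The factorization is not unique, since C and F can be rescaled against each other without changing their product. That ambiguity is one motivation for constraining the solution with extra information, as the abstract does with single-cell samples.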
Affiliation(s)
- Haoyun Lei
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Bochuan Lyu
  - Department of Mathematics, Rose-Hulman Institute of Technology, Terre Haute, Indiana
- E Michael Gertz
  - National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, Maryland
  - Cancer Data Science Laboratory, National Cancer Institute, U.S. National Institutes of Health, Bethesda, Maryland
- Alejandro A Schäffer
  - National Center for Biotechnology Information, U.S. National Institutes of Health, Bethesda, Maryland
  - Cancer Data Science Laboratory, National Cancer Institute, U.S. National Institutes of Health, Bethesda, Maryland
- Kui Wu
  - BGI-Shenzhen, Shenzhen, China
- Michael Dean
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology & Genetics, National Cancer Institute, U.S. National Institutes of Health, Gaithersburg, Maryland
- Russell Schwartz
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania
5
6
Abstract
The rapid development of immunomodulatory cancer therapies has led to a concurrent increase in the application of informatics techniques to the analysis of tumors, the tumor microenvironment, and measures of systemic immunity. In this review, the use of tumors to gather genetic and expression data will first be explored. Next, techniques to assess tumor immunity are reviewed, including HLA status, predicted neoantigens, immune microenvironment deconvolution, and T-cell receptor sequencing. Attempts to integrate these data are in early stages of development and are discussed in this review. Finally, we review the application of these informatics strategies to therapy development, with a focus on vaccines, adoptive cell transfer, and checkpoint blockade therapies.
Affiliation(s)
- J Hammerbacher
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York
  - Department of Microbiology and Immunology, Medical University of South Carolina, Charleston
- A Snyder
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York
  - Adaptive Biotechnologies, Seattle, USA
7
Subramanian A, Schwartz R. Erratum to: 'Reference-free inference of tumor phylogenies from single-cell sequencing data'. BMC Genomics 2016; 17:348. [PMID: 27164840 PMCID: PMC4863360 DOI: 10.1186/s12864-016-2609-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]
Affiliation(s)
- Ayshwarya Subramanian
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, 655 Huntington Street, 02115, Boston, USA
- Russell Schwartz
  - Department of Biological Sciences and the Computational Biology Department, Carnegie Mellon University, 5000 Forbes Avenue, 15213, Pittsburgh, USA