1
Richard PI, Baltosser WH, Williams PH, He Q. Phylogenetic analysis of microbial CP-lyase cluster genes for bioremediation of phosphonate. AMB Express 2025; 15:42. [PMID: 40064825 PMCID: PMC11893972 DOI: 10.1186/s13568-025-01856-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2024] [Accepted: 02/22/2025] [Indexed: 03/14/2025] Open
Abstract
The ever-increasing use of phosphonates and their derivatives has resulted in the discharge of large quantities of these materials into the ecosystem, causing pollution and harmful shifts in microbiome composition. We conducted an extensive phylogenetic analysis to address this mounting problem and to help determine suitable microbes for bioremediation in specific environments. The 84 microorganisms included in our study span the gamut of species and occupied habitats. They degrade phosphonates by expressing an enzyme complex, CP-lyase, transcribed from 14 cistrons. Of the organisms studied, 12, 39, and 25 are singularly suitable for mostly freshwater, marine, or terrestrial habitats, respectively. Others adapted to multiple habitats include Calothrix sp. PCC 7507 (both freshwater and marine habitats), Escherichia coli, Kaistia soli, Limoniibacter endophyticus, Marivita sp., and Virgibacillus dokdonensis (both marine and terrestrial habitats), and Acidithiobacillus ferrooxidans (both freshwater and terrestrial habitats), with Paenibacillus contaminans suitable for freshwater, marine, and terrestrial habitats. All organisms were statistically rooted to glutathione peroxidase for phylogenetic perspective, with tree topology dependent upon 50% or greater support. Clustered genes have been shown to have co-evolved based on striking nucleotide similarity and clade groupings within the tree topologies generated.
Affiliation(s)
- Precious I Richard
  - Department of Biology, University of Arkansas at Little Rock, Little Rock, AR, 72204, USA
- William H Baltosser
  - Department of Biology, University of Arkansas at Little Rock, Little Rock, AR, 72204, USA
- Philip H Williams
  - MidSouth Bioinformatics Center, University of Arkansas at Little Rock, Little Rock, AR, 72204, USA
- Qingfang He
  - Department of Biology, University of Arkansas at Little Rock, Little Rock, AR, 72204, USA
2
Askary A, Chen W, Choi J, Du LY, Elowitz MB, Gagnon JA, Schier AF, Seidel S, Shendure J, Stadler T, Tran M. The lives of cells, recorded. Nat Rev Genet 2025; 26:203-222. [PMID: 39587306 DOI: 10.1038/s41576-024-00788-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/26/2024] [Indexed: 11/27/2024]
Abstract
A paradigm for biology is emerging in which cells can be genetically programmed to write their histories into their own genomes. These records can subsequently be read, and the cellular histories reconstructed, which for each cell could include a record of its lineage relationships, extrinsic influences, internal states and physical locations, over time. DNA recording has the potential to transform the way that we study developmental and disease processes. Recent advances in genome engineering are driving the development of systems for DNA recording, and meanwhile single-cell and spatial omics technologies increasingly enable the recovery of the recorded information. Combined with advances in computational and phylogenetic inference algorithms, the DNA recording paradigm is beginning to bear fruit. In this Perspective, we explore the rationale and technical basis of DNA recording, what aspects of cellular biology might be recorded and how, and the types of discovery that we anticipate this paradigm will enable.
Affiliation(s)
- Amjad Askary
  - Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, CA, USA
- Wei Chen
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
- Junhong Choi
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Developmental Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Lucia Y Du
  - Biozentrum, University of Basel, Basel, Switzerland
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
- Michael B Elowitz
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA, USA
- James A Gagnon
  - School of Biological Sciences, University of Utah, Salt Lake City, UT, USA
- Alexander F Schier
  - Biozentrum, University of Basel, Basel, Switzerland
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
- Sophie Seidel
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
  - Howard Hughes Medical Institute, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, USA
  - Seattle Hub for Synthetic Biology, Seattle, WA, USA
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Martin Tran
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
3
Blohmer M, Cheek DM, Hung WT, Kessler M, Chatzidimitriou F, Wang J, Hung W, Lee IH, Gorelick AN, Wassenaar EC, Yang CY, Yeh YC, Ho HL, Speiser D, Karsten MM, Lanuti M, Pai SI, Kranenburg O, Lennerz JK, Chou TY, Kloor M, Naxerova K. Quantifying cell divisions along evolutionary lineages in cancer. Nat Genet 2025; 57:706-717. [PMID: 39905260 DOI: 10.1038/s41588-025-02078-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Accepted: 01/07/2025] [Indexed: 02/06/2025]
Abstract
Cell division drives somatic evolution but is challenging to quantify. We developed a framework to count cell divisions with DNA replication-related mutations in polyguanine homopolymers. Analyzing 505 samples from 37 patients, we studied the milestones of colorectal cancer evolution. Primary tumors diversify at ~250 divisions from the founder cell, while distant metastasis divergence occurs significantly later, at ~500 divisions. Notably, distant but not lymph node metastases originate from primary tumor regions that have undergone surplus divisions, tying subclonal expansion to metastatic capacity. Then, we analyzed a cohort of 73 multifocal lung cancers and showed that the cell division burden of the tumors' common ancestor distinguishes independent primary tumors from intrapulmonary metastases and correlates with patient survival. In lung cancer too, metastatic capacity is tied to more extensive proliferation. The cell division history of human cancers is easily accessible using our simple framework and contains valuable biological and clinical information.
Affiliation(s)
- Martin Blohmer
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Center for Systems Biology, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, USA
  - Department of Gynecology with Breast Center, Charité Universitätsmedizin Berlin, Berlin, Germany
- David M Cheek
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Center for Systems Biology, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, USA
- Wei-Ting Hung
  - Center for Systems Biology, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, USA
  - Graduate Institute of Medical Genomics and Proteomics, National Taiwan University, Taipei, Taiwan
- Maria Kessler
  - Department of Applied Tumor Biology, Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
  - Department of Medical Oncology, National Center for Tumor Diseases, Heidelberg University Hospital, Heidelberg, Germany
- Foivos Chatzidimitriou
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Center for Systems Biology, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, USA
- Jiahe Wang
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- William Hung
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- I-Hsiu Lee
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Center for Systems Biology, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, USA
- Alexander N Gorelick
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Center for Systems Biology, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, USA
- Emma Ce Wassenaar
  - Department of Surgery, St. Antonius Hospital, Nieuwegein, the Netherlands
  - Department of Surgical Oncology, Laboratory Translational Oncology, University Medical Center Utrecht, Utrecht, the Netherlands
- Ching-Yeuh Yang
  - Department of Pathology, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
- Yi-Chen Yeh
  - Department of Pathology and Laboratory Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
  - Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Hsiang-Ling Ho
  - Department of Pathology and Laboratory Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
  - Department of Biotechnology and Laboratory Science in Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Dorothee Speiser
  - Department of Gynecology with Breast Center, Charité Universitätsmedizin Berlin, Berlin, Germany
- Maria M Karsten
  - Department of Gynecology with Breast Center, Charité Universitätsmedizin Berlin, Berlin, Germany
- Michael Lanuti
  - Division of Thoracic Surgery, Massachusetts General Hospital, Boston, MA, USA
- Sara I Pai
  - Center for Systems Biology, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, USA
  - Division of Otolaryngology-Head and Neck Surgery, Department of Surgery, Yale University School of Medicine, New Haven, CT, USA
- Onno Kranenburg
  - Department of Surgical Oncology, Laboratory Translational Oncology, University Medical Center Utrecht, Utrecht, the Netherlands
- Jochen K Lennerz
  - Department of Pathology, Center for Integrated Diagnostics, Massachusetts General Hospital, Boston, MA, USA
- Teh-Ying Chou
  - Department of Pathology and Precision Medicine Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
  - Graduate Institute of Clinical Medicine, Taipei Medical University, Taipei, Taiwan
- Matthias Kloor
  - Department of Applied Tumor Biology, Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Kamila Naxerova
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Center for Systems Biology, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, USA
4
Fang W, Yang Y, Ji H, Kalhor R. Reconstructing Progenitor State Hierarchy and Dynamics Using Lineage Barcoding Data. Methods Mol Biol 2025; 2886:177-199. [PMID: 39745641 DOI: 10.1007/978-1-0716-4310-5_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025]
Abstract
Measurements of cell phylogeny based on natural or induced mutations, known as lineage barcodes, in conjunction with molecular phenotype have become increasingly feasible for a large number of single cells. In this chapter, we delve into Quantitative Fate Mapping (QFM) and its computational pipeline, which enables the interrogation of the dynamics of progenitor cells and their fate restriction during development. The methods described here include inferring cell phylogeny with the Phylotime model, and reconstructing progenitor state hierarchy, commitment time, population size, and commitment bias with the ICE-FASE algorithm. Evaluation of adequate sampling based on progenitor state coverage statistics is emphasized for interpreting the QFM results. Overall, this chapter describes a general framework for characterizing the dynamics of cell fate changes using lineage barcoding data.
Affiliation(s)
- Weixiang Fang
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Yi Yang
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Hongkai Ji
  - Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
- Reza Kalhor
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Departments of Molecular Biology & Genetics, Medicine, Neuroscience, and Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
5
Liu Z, Zeng H, Xiang H, Deng S, He X. Achieving single-cell-resolution lineage tracing in zebrafish by continuous barcoding mutations during embryogenesis. J Genet Genomics 2024; 51:947-956. [PMID: 38621643 DOI: 10.1016/j.jgg.2024.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 04/03/2024] [Accepted: 04/07/2024] [Indexed: 04/17/2024]
Abstract
Unraveling the lineage relationships of all descendants from a zygote is fundamental to advancing our understanding of developmental and stem cell biology. However, existing cell barcoding technologies in zebrafish lack the resolution to capture the majority of cell divisions during embryogenesis. A recently developed method, a substitution mutation-aided lineage-tracing system (SMALT), successfully reconstructed high-resolution cell phylogenetic trees for Drosophila melanogaster. Here, we implement the SMALT system in zebrafish, recording a median of 14 substitution mutations on a one-kilobase-pair barcoding sequence for one-day post-fertilization embryos. Leveraging this system, we reconstruct four cell lineage trees for zebrafish fin cells, encompassing both original and regenerated fin. Each tree consists of hundreds of internal nodes with a median bootstrap support of 99%. Analysis of the obtained cell lineage trees reveals that regenerated fin cells mainly originate from cells in the same part of the fins. By sampling germ cells from the same individual multiple times, we show the stability of the germ cell pool and the early separation of germ cell and somatic cell progenitors. Our system offers the potential for reconstructing high-quality cell phylogenies across diverse tissues, providing valuable insights into development and disease in zebrafish.
Affiliation(s)
- Zhan Liu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China
- Hui Zeng
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China
- Huimin Xiang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China
- Shanjun Deng
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China
- Xionglei He
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China
6
Csordas A, Sipos B, Kurucova T, Volfova A, Zamola F, Tichy B, Hicks DG. Cell Tree Rings: the structure of somatic evolution as a human aging timer. GeroScience 2024; 46:3005-3019. [PMID: 38172489 PMCID: PMC11009167 DOI: 10.1007/s11357-023-01053-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/19/2023] [Accepted: 12/22/2023] [Indexed: 01/05/2024] Open
Abstract
Biological age is typically estimated using biomarkers whose states have been observed to correlate with chronological age. A persistent limitation of such aging clocks is that it is difficult to establish how the biomarker states are related to the mechanisms of aging. Somatic mutations could potentially form the basis for a more fundamental aging clock since the mutations are both markers and drivers of aging and have a natural timescale. Cell lineage trees inferred from these mutations reflect the somatic evolutionary process, and thus, it has been conjectured, the aging status of the body. Such a timer has been impractical thus far, however, because detection of somatic variants in single cells presents a significant technological challenge. Here, we show that somatic mutations detected using single-cell RNA sequencing (scRNA-seq) from thousands of cells can be used to construct a cell lineage tree whose structure correlates with chronological age. De novo single-nucleotide variants (SNVs) are detected in human peripheral blood mononuclear cells using a modified protocol. A default model based on penalized multiple regression of chronological age on 31 metrics characterizing the phylogenetic tree gives a Pearson correlation of 0.81 and a median absolute error of ~4 years between predicted and chronological ages. Testing of the model on a public scRNA-seq dataset yields a Pearson correlation of 0.85. In addition, cell tree age predictions are found to be better predictors of certain clinical biomarkers than chronological age alone, for instance glucose, albumin levels, and leukocyte count. The geometry of the cell lineage tree records the structure of somatic evolution in the individual and represents a new modality of aging timer. In addition to providing a numerical estimate of "cell tree age," it unveils a temporal history of the aging process, revealing how clonal structure evolves over life span. 
Cell Tree Rings complements existing aging clocks and may help reduce the current uncertainty in the assessment of geroprotective trials.
Affiliation(s)
- Attila Csordas
  - AgeCurve Limited, Cambridge, CB2 1SD, UK
  - Doctoral School of Clinical Medicine, University of Szeged, Szeged, H-6720, Hungary
- Terezia Kurucova
  - CEITEC - Central European Institute of Technology, Masaryk University, 62500, Brno, Czechia
  - Department of Experimental Biology, Faculty of Science, Masaryk University, 62500, Brno, Czechia
- Andrea Volfova
  - HealthyLongevity.clinic Inc, 540 University Ave, Palo Alto, CA, 94301, USA
- Frantisek Zamola
  - HealthyLongevity.clinic Inc, 540 University Ave, Palo Alto, CA, 94301, USA
- Boris Tichy
  - CEITEC - Central European Institute of Technology, Masaryk University, 62500, Brno, Czechia
- Damien G Hicks
  - AgeCurve Limited, Cambridge, CB2 1SD, UK
  - Swinburne University of Technology, Hawthorn, VIC, 3122, Australia
7
Wang K, Hou L, Wang X, Zhai X, Lu Z, Zi Z, Zhai W, He X, Curtis C, Zhou D, Hu Z. PhyloVelo enhances transcriptomic velocity field mapping using monotonically expressed genes. Nat Biotechnol 2024; 42:778-789. [PMID: 37524958 DOI: 10.1038/s41587-023-01887-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 10/24/2022] [Accepted: 06/28/2023] [Indexed: 08/02/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) is a powerful approach for studying cellular differentiation, but accurately tracking cell fate transitions can be challenging, especially in disease conditions. Here we introduce PhyloVelo, a computational framework that estimates the velocity of transcriptomic dynamics by using monotonically expressed genes (MEGs) or genes with expression patterns that either increase or decrease, but do not cycle, through phylogenetic time. Through integration of scRNA-seq data with lineage information, PhyloVelo identifies MEGs and reconstructs a transcriptomic velocity field. We validate PhyloVelo using simulated data and Caenorhabditis elegans ground truth data, successfully recovering linear, bifurcated and convergent differentiations. Applying PhyloVelo to seven lineage-traced scRNA-seq datasets, generated using CRISPR-Cas9 editing, lentiviral barcoding or immune repertoire profiling, demonstrates its high accuracy and robustness in inferring complex lineage trajectories while outperforming RNA velocity. Additionally, we discovered that MEGs across tissues and organisms share similar functions in translation and ribosome biogenesis.
Affiliation(s)
- Kun Wang
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
  - School of Mathematical Sciences, Xiamen University, Xiamen, China
- Liangzhen Hou
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
  - Faculty of Health Sciences, University of Macau, Taipa, Macau, China
- Xin Wang
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Xiangwei Zhai
  - MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, China
- Zhaolian Lu
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Zhike Zi
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Weiwei Zhai
  - CAS Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
- Xionglei He
  - MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, China
- Christina Curtis
  - Department of Medicine, Division of Oncology, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
- Da Zhou
  - School of Mathematical Sciences, Xiamen University, Xiamen, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
- Zheng Hu
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
8
Manso BA, Rodriguez y Baena A, Forsberg EC. From Hematopoietic Stem Cells to Platelets: Unifying Differentiation Pathways Identified by Lineage Tracing Mouse Models. Cells 2024; 13:704. [PMID: 38667319 PMCID: PMC11048769 DOI: 10.3390/cells13080704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 04/17/2024] [Accepted: 04/17/2024] [Indexed: 04/28/2024] Open
Abstract
Platelets are the terminal progeny of megakaryocytes, primarily produced in the bone marrow, and play critical roles in blood homeostasis, clotting, and wound healing. Traditionally, megakaryocytes and platelets are thought to arise from multipotent hematopoietic stem cells (HSCs) via multiple discrete progenitor populations with successive, lineage-restricting differentiation steps. However, this view has recently been challenged by studies suggesting that (1) some HSC clones are biased and/or restricted to the platelet lineage, (2) not all platelet generation follows the "canonical" megakaryocytic differentiation path of hematopoiesis, and (3) platelet output is the default program of steady-state hematopoiesis. Here, we specifically investigate the evidence that in vivo lineage tracing studies provide for the route(s) of platelet generation and investigate the involvement of various intermediate progenitor cell populations. We further identify the challenges that must be overcome to determine the presence, role, and kinetics of these possible alternate pathways.
Affiliation(s)
- Bryce A. Manso
  - Institute for the Biology of Stem Cells, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
  - Department of Biomolecular Engineering, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Alessandra Rodriguez y Baena
  - Institute for the Biology of Stem Cells, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
  - Program in Biomedical Sciences and Engineering, Department of Molecular, Cell, and Developmental Biology, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- E. Camilla Forsberg
  - Institute for the Biology of Stem Cells, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
  - Department of Biomolecular Engineering, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
9
Li Z, Yang W, Wu P, Shan Y, Zhang X, Chen F, Yang J, Yang JR. Reconstructing cell lineage trees with genomic barcoding: approaches and applications. J Genet Genomics 2024; 51:35-47. [PMID: 37269980 DOI: 10.1016/j.jgg.2023.05.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/03/2023] [Revised: 05/19/2023] [Accepted: 05/20/2023] [Indexed: 06/05/2023]
Abstract
In multicellular organisms, developmental history of cell divisions and functional annotation of terminal cells can be organized into a cell lineage tree (CLT). The reconstruction of the CLT has long been a major goal in developmental biology and other related fields. Recent technological advancements, especially those in editable genomic barcodes and single-cell high-throughput sequencing, have sparked a new wave of experimental methods for reconstructing CLTs. Here we review the existing experimental approaches to the reconstruction of CLT, which are broadly categorized as either image-based or DNA barcode-based methods. In addition, we present a summary of the related literature based on the biological insight provided by the obtained CLTs. Moreover, we discuss the challenges that will arise as more and better CLT data become available in the near future. Genomic barcoding-based CLT reconstructions and analyses, due to their wide applicability and high scalability, offer the potential for novel biological discoveries, especially those related to general and systemic properties of the developmental process.
Affiliation(s)
- Zizhang Li
  - Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
  - Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
- Wenjing Yang
  - Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
- Peng Wu
  - Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
- Yuyan Shan
  - Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
- Xiaoyu Zhang
  - Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
- Feng Chen
  - Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
  - Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
- Junnan Yang
  - Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
- Jian-Rong Yang
  - Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
  - Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
  - Key Laboratory of Tropical Disease Control, Ministry of Education, Sun Yat-sen University, Guangzhou, Guangdong 510080, China
10
Brower AVZ. Hierarchies, classifications, cladograms and phylogeny. Cladistics 2023; 39:229-239. [PMID: 36786346 DOI: 10.1111/cla.12525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 12/21/2022] [Accepted: 12/22/2022] [Indexed: 02/15/2023] Open
Abstract
Figure 18 of Hennig's Phylogenetic Systematics (University of Illinois Press, Urbana, IL, 1966) shows a phylogenetic tree (a generative hierarchy) and what appear to be nested sets (an inclusive hierarchy) that he stated were two representations of the same pattern of relationships. This essay questions whether this is correct or not, explores the meanings of different hierarchical patterns, reviews various interpretations of Hennig's figure, and discusses the conceptual path from systematic evidence to phylogenetic explanation. The crux of the argument is that systematic hierarchies as we know them scientifically are nested sets that group theoretical entities based on patterns of synapomorphy. The notions of phylogeny and common ancestry reflect this hierarchical pattern.
Affiliation(s)
- Andrew V Z Brower
  - National Identification Services, Plant Protection and Quarantine, USDA-APHIS, 4700 River Road, Riverdale, MD, 20737, USA
  - Division of Invertebrates, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC, 20013-7012, USA
11
|
Espinosa-Medina I, Feliciano D, Belmonte-Mateos C, Linda Miyares R, Garcia-Marques J, Foster B, Lindo S, Pujades C, Koyama M, Lee T. TEMPO enables sequential genetic labeling and manipulation of vertebrate cell lineages. Neuron 2023; 111:345-361.e10. [PMID: 36417906 DOI: 10.1016/j.neuron.2022.10.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Revised: 08/15/2022] [Accepted: 10/26/2022] [Indexed: 11/24/2022]
Abstract
During development, regulatory factors appear in a precise order to determine cell fates over time. Consequently, to investigate complex tissue development, it is necessary to visualize and manipulate cell lineages with temporal control. Current strategies for tracing vertebrate cell lineages lack genetic access to sequentially produced cells. Here, we present TEMPO (Temporal Encoding and Manipulation in a Predefined Order), an imaging-readable genetic tool allowing differential labeling and manipulation of consecutive cell generations in vertebrates. TEMPO is based on CRISPR and powered by a cascade of gRNAs that drive orderly activation and inactivation of reporters and/or effectors. Using TEMPO to visualize zebrafish and mouse neurogenesis, we recapitulated birth-order-dependent neuronal fates. Temporally manipulating cell-cycle regulators in mouse cortex progenitors altered the proportion and distribution of neurons and glia, revealing the effects of temporal gene perturbation on serial cell fates. Thus, TEMPO enables sequential manipulation of molecular factors, crucial to study cell-type specification.
Affiliation(s)
- Daniel Feliciano
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Carla Belmonte-Mateos
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra, PRBB, Barcelona 08003, Spain
- Rosa Linda Miyares
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Jorge Garcia-Marques
- Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, Madrid 28049, Spain
- Benjamin Foster
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Sarah Lindo
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Cristina Pujades
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra, PRBB, Barcelona 08003, Spain
- Minoru Koyama
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA; Department of Biological Sciences, University of Toronto Scarborough, Toronto, ON M1C 1A4, Canada
- Tzumin Lee
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA.
|
12
|
Fang W, Bell CM, Sapirstein A, Asami S, Leeper K, Zack DJ, Ji H, Kalhor R. Quantitative fate mapping: A general framework for analyzing progenitor state dynamics via retrospective lineage barcoding. Cell 2022; 185:4604-4620.e32. [PMID: 36423582 PMCID: PMC9708097 DOI: 10.1016/j.cell.2022.10.028] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 02/17/2022] [Revised: 08/23/2022] [Accepted: 10/26/2022] [Indexed: 11/24/2022]
Abstract
Natural and induced somatic mutations that accumulate in the genome during development record the phylogenetic relationships of cells; whether these lineage barcodes capture the complex dynamics of progenitor states remains unclear. We introduce quantitative fate mapping, an approach to reconstruct the hierarchy, commitment times, population sizes, and commitment biases of intermediate progenitor states during development based on a time-scaled phylogeny of their descendants. To reconstruct time-scaled phylogenies from lineage barcodes, we introduce Phylotime, a scalable maximum likelihood clustering approach based on a general barcoding mutagenesis model. We validate these approaches using realistic in silico and in vitro barcoding experiments. We further establish criteria for the number of cells that must be analyzed for robust quantitative fate mapping and a progenitor state coverage statistic to assess the robustness. This work demonstrates how lineage barcodes, natural or synthetic, enable analyzing progenitor fate and dynamics long after embryonic development in any organism.
Affiliation(s)
- Weixiang Fang
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA; Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Claire M Bell
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Abel Sapirstein
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Soichiro Asami
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Kathleen Leeper
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Donald J Zack
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Hongkai Ji
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA.
- Reza Kalhor
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.
|
13
|
Anderson DJ, Pauler FM, McKenna A, Shendure J, Hippenmeyer S, Horwitz MS. Simultaneous brain cell type and lineage determined by scRNA-seq reveals stereotyped cortical development. Cell Syst 2022; 13:438-453.e5. [PMID: 35452605 DOI: 10.1016/j.cels.2022.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Revised: 01/21/2022] [Accepted: 03/30/2022] [Indexed: 11/30/2022]
Abstract
Mutations are acquired frequently, such that each cell's genome inscribes its history of cell divisions. Common genomic alterations involve loss of heterozygosity (LOH). LOH accumulates throughout the genome, offering large encoding capacity for inferring cell lineage. Using only single-cell RNA sequencing (scRNA-seq) of mouse brain cells, we found that LOH events spanning multiple genes are revealed as tracts of monoallelically expressed, constitutionally heterozygous single-nucleotide variants (SNVs). We simultaneously inferred cell lineage and marked developmental time points based on X chromosome inactivation and the total number of LOH events while identifying cell types from gene expression patterns. Our results are consistent with progenitor cells giving rise to multiple cortical cell types through stereotyped expansion and distinct waves of neurogenesis. This type of retrospective analysis could be incorporated into scRNA-seq pipelines and, compared with experimental approaches for determining lineage in model organisms, is applicable where genetic engineering is prohibited, such as humans.
Affiliation(s)
- Donovan J Anderson
- Allen Discovery Center for Lineage Tracing and Department of Laboratory Medicine & Pathology, University of Washington, Seattle, WA 98109, USA
- Florian M Pauler
- Institute of Science and Technology Austria, Am Campus 1, 3400 Klosterneuburg, Austria
- Jay Shendure
- Allen Discovery Center for Lineage Tracing, Department of Genome Sciences, and Howard Hughes Medical Institute, University of Washington, Seattle, WA 98109, USA
- Simon Hippenmeyer
- Institute of Science and Technology Austria, Am Campus 1, 3400 Klosterneuburg, Austria
- Marshall S Horwitz
- Allen Discovery Center for Lineage Tracing and Department of Laboratory Medicine & Pathology, University of Washington, Seattle, WA 98109, USA.
|
14
|
Elshikh MS, Ajmal Ali M, Al-Hemaid F, Yong Kim S, Elangbam M, Bahadur Gurung A, Mukherjee P, El-Zaidy M, Lee J. Insights into plastome of Fagonia indica Burm.f. (Zygophyllaceae): organization, annotation and phylogeny. Saudi J Biol Sci 2022; 29:1313-1321. [PMID: 35280582 PMCID: PMC8913386 DOI: 10.1016/j.sjbs.2021.11.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2021] [Revised: 11/03/2021] [Accepted: 11/04/2021] [Indexed: 11/15/2022]
Abstract
A better understanding of chloroplast genomics would facilitate various biotechnology applications; however, the chloroplast (cp) genome (plastome) characteristics of plants such as Fagonia indica Burm.f. (family Zygophyllaceae), which can grow in extremely hot sand deserts, remain poorly understood. De novo sequencing of F. indica with Illumina high-throughput technology yielded a 128,379 bp cp genome encoding 115 unique coding genes. The present study adds evidence for the loss of one copy of the inverted repeat (IR) in the cp genomes of taxa capable of growing in hot sand deserts. Maximum likelihood analysis revealed two distinct sub-clades, Krameriaceae and Zygophyllaceae, of the order Zygophyllales, nested within the fabids.
Affiliation(s)
- Mohamed S Elshikh
- Department of Botany and Microbiology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Mohammad Ajmal Ali
- Department of Botany and Microbiology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Fahad Al-Hemaid
- Department of Botany and Microbiology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Soo Yong Kim
- International Biological Material Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Meena Elangbam
- Genetics Laboratory, Centre of Advanced Studies in Life Sciences, Manipur University, Canchipur 795 003, India
- Arun Bahadur Gurung
- Department of Basic Sciences and Social Sciences, North-Eastern Hill University, Shillong-793022, Meghalaya, India
- Prasanjit Mukherjee
- Department of Botany, Kumar Kalidas Memorial College, Pakur-816107, Jharkhand, India
- Mohamed El-Zaidy
- Department of Botany and Microbiology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Joongku Lee
- Department of Environment and Forest Resources, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
|
15
|
Mapping single-cell-resolution cell phylogeny reveals cell population dynamics during organ development. Nat Methods 2021; 18:1506-1514. [PMID: 34857936 DOI: 10.1038/s41592-021-01325-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 03/04/2021] [Accepted: 10/18/2021] [Indexed: 12/20/2022]
Abstract
Mapping the cell phylogeny of a complex multicellular organism relies on somatic mutations accumulated from zygote to adult. Available cell barcoding methods can record about three mutations per barcode, enabling only low-resolution mapping of the cell phylogeny of complex organisms. Here we developed SMALT, a substitution mutation-aided lineage-tracing system that outperforms the available cell barcoding methods in mapping cell phylogeny. We applied SMALT to Drosophila melanogaster and obtained on average more than 20 mutations on a three-kilobase-pair barcoding sequence in early-adult cells. Using the barcoding mutations, we obtained high-quality cell phylogenetic trees, each comprising several thousand internal nodes with 84-93% median bootstrap support. The obtained cell phylogenies enabled a population genetic analysis that estimates the longitudinal dynamics of the number of actively dividing parental cells (Np) in each organ through development. The Np dynamics revealed the trajectory of cell births and provided insight into the balance of symmetric and asymmetric cell division.
|
16
|
Lyne AM, Perie L. Comparing Phylogenetic Approaches to Reconstructing Cell Lineage From Microsatellites With Missing Data. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:2291-2301. [PMID: 32386163 DOI: 10.1109/tcbb.2020.2992813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Due to the imperfect fidelity of DNA replication, somatic cells acquire DNA mutations at each division which record their lineage history. Microsatellites, tandem repeats of DNA nucleotide motifs, mutate more frequently than other genomic regions and by observing microsatellite lengths in single cells and implementing suitable inference procedures, the cell lineage tree of an organism can be reconstructed. Due to recent advances in single cell Next Generation Sequencing (NGS) and the phylogenetic methods used to infer lineage trees, this work investigates which computational approaches best exploit the lineage information found in single cell NGS data. We simulated trees representing cell division with mutating microsatellites, and tested a range of available phylogenetic algorithms to reconstruct cell lineage. We found that distance-based approaches are fast and accurate with fully observed data. However, Maximum Parsimony and the computationally intensive probabilistic methods are more robust to missing data and therefore better suited to reconstructing cell lineage from NGS datasets. We also investigated how robust reconstruction algorithms are to different tree topologies and mutation generation models. Our results show that the flexibility of Maximum Parsimony and the probabilistic approaches mean they can be adapted to allow good reconstruction across a range of biologically relevant scenarios.
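The simulation and distance step described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the stepwise mutation model (each locus gains or loses one repeat with probability `mu` per division), the starting repeat length, the dropout rate, and all function names are assumptions made for illustration. It only builds the pairwise distance matrix that a distance-based method would start from, skipping loci that are missing in either cell.

```python
import random

def simulate_leaves(depth, n_loci, mu, rng):
    """Grow a complete binary division tree and return the leaf genotypes."""
    cells = [[20] * n_loci]  # root cell: arbitrary starting repeat length (assumption)
    for _ in range(depth):
        next_gen = []
        for cell in cells:
            for _ in range(2):  # every division yields two daughters
                daughter = [
                    length + rng.choice((-1, 1)) if rng.random() < mu else length
                    for length in cell
                ]
                next_gen.append(daughter)
        cells = next_gen
    return cells

def mask_missing(leaves, drop_rate, rng):
    """Replace a random fraction of observations with None to mimic dropout."""
    return [[None if rng.random() < drop_rate else v for v in leaf] for leaf in leaves]

def distance(a, b):
    """Mean absolute repeat-length difference over loci observed in both cells."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")  # no jointly observed locus: distance undefined
    return sum(abs(x - y) for x, y in shared) / len(shared)

def distance_matrix(leaves):
    n = len(leaves)
    return [[distance(leaves[i], leaves[j]) for j in range(n)] for i in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
leaves = simulate_leaves(depth=4, n_loci=200, mu=0.1, rng=rng)
observed = mask_missing(leaves, drop_rate=0.3, rng=rng)
dm = distance_matrix(observed)
```

A matrix like `dm` could then be passed to a clustering method such as neighbor joining. Averaging over jointly observed loci is one simple way to handle missing data and is not necessarily the strategy evaluated in the paper.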
|
17
|
Abstract
Over the past decade, genomic analyses of single cells-the fundamental units of life-have become possible. Single-cell DNA sequencing has shed light on biological questions that were previously inaccessible across diverse fields of research, including somatic mutagenesis, organismal development, genome function, and microbiology. Single-cell DNA sequencing also promises significant future biomedical and clinical impact, spanning oncology, fertility, and beyond. While single-cell approaches that profile RNA and protein have greatly expanded our understanding of cellular diversity, many fundamental questions in biology and important biomedical applications require analysis of the DNA of single cells. Here, we review the applications and biological questions for which single-cell DNA sequencing is uniquely suited or required. We include a discussion of the fields that will be impacted by single-cell DNA sequencing as the technology continues to advance.
Affiliation(s)
- Gilad D Evrony
- Center for Human Genetics and Genomics, Grossman School of Medicine, New York University, New York, NY 10016, USA
- Anjali Gupta Hinch
- Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
- Chongyuan Luo
- Department of Human Genetics, University of California, Los Angeles, California 90095, USA
|
18
|
PolyG-DS: An ultrasensitive polyguanine tract-profiling method to detect clonal expansions and trace cell lineage. Proc Natl Acad Sci U S A 2021; 118:e2023373118. [PMID: 34330826 DOI: 10.1073/pnas.2023373118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Polyguanine tracts (PolyGs) are short guanine homopolymer repeats that are prone to accumulating mutations when cells divide. This feature makes them especially suitable for cell lineage tracing, which has been exploited to detect and characterize precancerous and cancerous somatic evolution. PolyG genotyping, however, is challenging because of the inherent biochemical difficulties in amplifying and sequencing repetitive regions. To overcome this limitation, we developed PolyG-DS, a next-generation sequencing (NGS) method that combines the error-correction capabilities of duplex sequencing (DS) with enrichment of PolyG loci using CRISPR-Cas9-targeted genomic fragmentation. PolyG-DS markedly reduces technical artifacts by comparing the sequences derived from the complementary strands of each original DNA molecule. We demonstrate that PolyG-DS genotyping is accurate, reproducible, and highly sensitive, enabling the detection of low-frequency alleles (<0.01) in spike-in samples using a panel of only 19 PolyG markers. PolyG-DS replicated prior results based on PolyG fragment length analysis by capillary electrophoresis, and exhibited higher sensitivity for identifying clonal expansions in the nondysplastic colon of patients with ulcerative colitis. We illustrate the utility of this method for resolving the phylogenetic relationship among precancerous lesions in ulcerative colitis and for tracing the metastatic dissemination of ovarian cancer. PolyG-DS enables the study of tumor evolution without prior knowledge of tumor driver mutations and provides a tool to perform cost-effective and easily scalable ultra-accurate NGS-based PolyG genotyping for multiple applications in biology, genetics, and cancer research.
|
19
|
Tao L, Raz O, Marx Z, Ghosh MS, Huber S, Greindl-Junghans J, Biezuner T, Amir S, Milo L, Adar R, Levy R, Onn A, Chapal-Ilani N, Berman V, Ben Arie A, Rom G, Oron B, Halaban R, Czyz ZT, Werner-Klein M, Klein CA, Shapiro E. Retrospective cell lineage reconstruction in humans by using short tandem repeats. Cell Rep Methods 2021; 1:100054. [PMID: 34341783 PMCID: PMC8313865 DOI: 10.1016/j.crmeth.2021.100054] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/29/2020] [Revised: 04/17/2021] [Accepted: 06/24/2021] [Indexed: 12/18/2022]
Abstract
Cell lineage analysis aims to uncover the developmental history of an organism back to its cell of origin. Recently, novel in vivo methods utilizing genome editing enabled important insights into the cell lineages of animals. In contrast, human cell lineage remains restricted to retrospective approaches, which still lack resolution and cost-efficient solutions. Here, we demonstrate a scalable platform based on short tandem repeats targeted by duplex molecular inversion probes. With this human cell lineage tracing method, we accurately reproduced a known lineage of DU145 cells and reconstructed lineages of healthy and metastatic single cells from a melanoma patient who matched the anatomical reference while adding further refinements. This platform allowed us to faithfully recapitulate lineages of developmental tissue formation in healthy cells. In summary, our lineage discovery platform can profile informative somatic mutations efficiently and provides solid lineage reconstructions even in challenging low-mutation-rate healthy single cells.
Affiliation(s)
- Liming Tao
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Ofir Raz
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Zipora Marx
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Manjusha S. Ghosh
- Experimental Medicine and Therapy Research, University of Regensburg, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany
- Sandra Huber
- Experimental Medicine and Therapy Research, University of Regensburg, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany
- Julia Greindl-Junghans
- Experimental Medicine and Therapy Research, University of Regensburg, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany
- Tamir Biezuner
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Shiran Amir
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Lilach Milo
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Rivka Adar
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Ron Levy
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Amos Onn
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Noa Chapal-Ilani
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Veronika Berman
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Asaf Ben Arie
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Guy Rom
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Barak Oron
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
- Ruth Halaban
- Department of Dermatology, Yale University School of Medicine, New Haven, CT 06520-8059, USA
- Zbigniew T. Czyz
- Experimental Medicine and Therapy Research, University of Regensburg, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany
- Melanie Werner-Klein
- Experimental Medicine and Therapy Research, University of Regensburg, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany
- Christoph A. Klein
- Experimental Medicine and Therapy Research, University of Regensburg, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany
- Division of Personalized Tumor Therapy, Fraunhofer Institute for Experimental Medicine and Toxicology Regensburg, Am Biopark 9, 93053 Regensburg, Germany
- Ehud Shapiro
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 761001, Israel
|
20
|
Zhou T, Chen L, Guo J, Zhang M, Zhang Y, Cao S, Lou F, Wang H. MSIFinder: a python package for detecting MSI status using random forest classifier. BMC Bioinformatics 2021; 22:185. [PMID: 33845765 PMCID: PMC8042960 DOI: 10.1186/s12859-021-03986-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/12/2020] [Accepted: 01/29/2021] [Indexed: 01/09/2023]
Abstract
BACKGROUND Microsatellite instability (MSI) is a common genomic alteration in colorectal cancer, endometrial carcinoma, and other solid tumors. MSI is characterized by a high degree of polymorphism in microsatellite lengths owing to the deficiency in the mismatch repair system. Based on the degree, MSI can be classified as microsatellite instability-high (MSI-H) and microsatellite stable (MSS). MSI is a predictive biomarker for immunotherapy efficacy in advanced/metastatic solid tumors, especially in colorectal cancer patients. Several computational approaches based on target panel sequencing data have been used to detect MSI; however, they are considerably affected by the sequencing depth and panel size. RESULTS We developed MSIFinder, a python package for automatic MSI classification, using random forest classifier (RFC)-based genome sequencing, which is a machine learning technology. We included 19 MSI-H and 25 MSS samples as training sets. First, we selected 54 feature markers from the training sets, built an RFC model, and validated the classifier using a test set comprising 21 MSI-H and 379 MSS samples. With this test set, MSIFinder achieved a sensitivity (recall) of 1.0, a specificity of 0.997, an accuracy of 0.998, a positive predictive value of 0.954, an F1 score of 0.977, and an area under the curve of 0.999. To further verify the robustness and effectiveness of the model, we used a prospective cohort consisting of 18 MSI-H samples and 122 MSS samples. MSIFinder achieved a sensitivity (recall) of 1.0 and a specificity of 1.0. We discovered that MSIFinder is less affected by a low sequencing depth and can achieve a concordance of 0.993 while exhibiting a sequencing depth of 100×. Furthermore, we realized that MSIFinder is less affected by the panel size and can achieve a concordance of 0.99 when the panel size is 0.5 M (million bases). 
CONCLUSION These results indicate that MSIFinder is a robust and effective MSI classification tool that can provide reliable MSI detection for scientific and clinical purposes.
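The test-set metrics quoted in this abstract can be checked with a short worked calculation. The sketch below is not part of MSIFinder. The confusion-matrix counts are an assumption inferred from the reported figures (21 MSI-H and 379 MSS samples, sensitivity 1.0, specificity 0.997), which are consistent with zero false negatives and a single false positive; the abstract does not state these counts directly, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall: true positives among all positives
    specificity = tn / (tn + fp)            # true negatives among all negatives
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    ppv = tp / (tp + fp)                    # positive predictive value (precision)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "ppv": ppv,
        "f1": f1,
    }

# Assumed counts: all 21 MSI-H samples called correctly, 1 of 379 MSS samples miscalled.
metrics = classification_metrics(tp=21, fp=1, tn=378, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts the computed values agree with the reported sensitivity, specificity, accuracy, positive predictive value, and F1 score to within rounding or truncation in the third decimal place.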
Affiliation(s)
- Tao Zhou
- AcornMed Biotechnology Co., Ltd., Floor 18, Block 5, Yard 18, Kechuang 13 RD, Beijing, 100176, China
- Libin Chen
- AcornMed Biotechnology Co., Ltd., Floor 18, Block 5, Yard 18, Kechuang 13 RD, Beijing, 100176, China
- Jing Guo
- AcornMed Biotechnology Co., Ltd., Floor 18, Block 5, Yard 18, Kechuang 13 RD, Beijing, 100176, China
- Mengmeng Zhang
- AcornMed Biotechnology Co., Ltd., Floor 18, Block 5, Yard 18, Kechuang 13 RD, Beijing, 100176, China
- Yanrui Zhang
- AcornMed Biotechnology Co., Ltd., Floor 18, Block 5, Yard 18, Kechuang 13 RD, Beijing, 100176, China
- Shanbo Cao
- AcornMed Biotechnology Co., Ltd., Floor 18, Block 5, Yard 18, Kechuang 13 RD, Beijing, 100176, China
- Feng Lou
- AcornMed Biotechnology Co., Ltd., Floor 18, Block 5, Yard 18, Kechuang 13 RD, Beijing, 100176, China.
- Haijun Wang
- Department of Pathology, The Second Affiliated Hospital of Zhejiang University School of Medicine, No. 88 Jiefang Road, Shangcheng District, Hangzhou, 310009, Zhejiang, China.
|
21
|
Chow KHK, Budde MW, Granados AA, Cabrera M, Yoon S, Cho S, Huang TH, Koulena N, Frieda KL, Cai L, Lois C, Elowitz MB. Imaging cell lineage with a synthetic digital recording system. Science 2021; 372:eabb3099. [PMID: 33833095 DOI: 10.1126/science.abb3099] [Citation(s) in RCA: 74] [Impact Index Per Article: 18.5] [Received: 02/20/2020] [Accepted: 02/25/2021] [Indexed: 12/13/2022]
Abstract
During multicellular development, spatial position and lineage history play powerful roles in controlling cell fate decisions. Using a serine integrase-based recording system, we engineered cells to record lineage information in a format that can be read out in situ. The system, termed integrase-editable memory by engineered mutagenesis with optical in situ readout (intMEMOIR), allowed in situ reconstruction of lineage relationships in cultured mouse cells and flies. intMEMOIR uses an array of independent three-state genetic memory elements that can recombine stochastically and irreversibly, allowing up to 59,049 distinct digital states. It reconstructed lineage trees in stem cells and enabled simultaneous analysis of single-cell clonal history, spatial position, and gene expression in Drosophila brain sections. These results establish a foundation for microscopy-readable lineage recording and analysis in diverse systems.
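The state count quoted in this abstract follows from simple combinatorics: an array of n independent elements with three states each can take 3^n joint configurations. The sketch below works backwards from the reported 59,049 states to the implied array size. The element count is an inference from that arithmetic, not a figure stated in the abstract, and the function names are hypothetical.

```python
def count_states(n_elements, states_per_element=3):
    """Number of joint configurations of independent multi-state elements."""
    return states_per_element ** n_elements

def elements_for(total_states, states_per_element=3):
    """Smallest array size whose joint state count equals total_states exactly."""
    n, count = 0, 1
    while count < total_states:
        count *= states_per_element
        n += 1
    if count != total_states:
        raise ValueError("total_states is not an exact power of states_per_element")
    return n

n = elements_for(59049)       # implied number of three-state memory elements
total = count_states(n)       # recovers the reported number of digital states
print(n, total)
```

Because 59,049 equals 3 to the 10th power, the reported capacity corresponds to ten independent three-state elements.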
Affiliation(s)
- Ke-Huan K Chow
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Mark W Budde
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Alejandro A Granados
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Maria Cabrera
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Shinae Yoon
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Soomin Cho
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Ting-Hao Huang
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Noushin Koulena
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Long Cai
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Carlos Lois
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.
- Michael B Elowitz
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.
- Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA
|
22
|
Figueres-Oñate M, Sánchez-González R, López-Mascaraque L. Deciphering neural heterogeneity through cell lineage tracing. Cell Mol Life Sci 2021; 78:1971-1982. [PMID: 33151389 PMCID: PMC7966193 DOI: 10.1007/s00018-020-03689-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/30/2020] [Revised: 10/10/2020] [Accepted: 10/20/2020] [Indexed: 12/21/2022]
Abstract
Understanding how an adult brain reaches an appropriate size and cell composition from a pool of progenitors that proliferates and differentiates is a key question in Developmental Neurobiology. Not only the control of final size but also, the proper arrangement of cells of different embryonic origins is fundamental in this process. Each neural progenitor has to produce a precise number of sibling cells that establish clones, and all these clones will come together to form the functional adult nervous system. Lineage cell tracing is a complex and challenging process that aims to reconstruct the offspring that arise from a single progenitor cell. This tracing can be achieved through strategies based on genetically modified organisms, using either genetic tracers, transfected viral vectors or DNA constructs, and even single-cell sequencing. Combining different reporter proteins and the use of transgenic mice revolutionized clonal analysis more than a decade ago and now, the availability of novel genome editing tools and single-cell sequencing techniques has vastly improved the capacity of lineage tracing to decipher progenitor potential. This review brings together the strategies used to study cell lineages in the brain and the role they have played in our understanding of the functional clonal relationships among neural cells. In addition, future perspectives regarding the study of cell heterogeneity and the ontogeny of different cell lineages will also be addressed.
Affiliation(s)
- María Figueres-Oñate
  - Department of Molecular, Cellular and Development Neurobiology, Instituto Cajal-CSIC, 28002, Madrid, Spain
  - Max Planck Research Unit for Neurogenetics, 60438, Frankfurt am Main, Germany
- Rebeca Sánchez-González
  - Department of Molecular, Cellular and Development Neurobiology, Instituto Cajal-CSIC, 28002, Madrid, Spain
- Laura López-Mascaraque
  - Department of Molecular, Cellular and Development Neurobiology, Instituto Cajal-CSIC, 28002, Madrid, Spain.
|
23
|
Stadler T, Pybus OG, Stumpf MPH. Phylodynamics for cell biologists. Science 2021; 371:eaah6266. [PMID: 33446527 DOI: 10.1126/science.aah6266] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 04/12/2019] [Accepted: 08/13/2020] [Indexed: 12/12/2022]
Abstract
Multicellular organisms are composed of cells connected by ancestry and descent from progenitor cells. The dynamics of cell birth, death, and inheritance within an organism give rise to the fundamental processes of development, differentiation, and cancer. Technical advances in molecular biology now allow us to study cellular composition, ancestry, and evolution at the resolution of individual cells within an organism or tissue. Here, we take a phylogenetic and phylodynamic approach to single-cell biology. We explain how "tree thinking" is important to the interpretation of the growing body of cell-level data and how ecological null models can benefit statistical hypothesis testing. Experimental progress in cell biology should be accompanied by theoretical developments if we are to exploit fully the dynamical information in single-cell data.
Affiliation(s)
- T Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- O G Pybus
  - Department of Zoology, University of Oxford, Oxford, UK.
- M P H Stumpf
  - Melbourne Integrative Genomics, School of BioSciences and School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia.
|
24
|
The art of lineage tracing: From worm to human. Prog Neurobiol 2020; 199:101966. [PMID: 33249090 DOI: 10.1016/j.pneurobio.2020.101966] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/01/2020] [Revised: 11/03/2020] [Accepted: 11/22/2020] [Indexed: 12/20/2022]
Abstract
Reconstructing the genealogy of every cell that makes up an organism remains a long-standing challenge in developmental biology. Besides its relevance for understanding the mechanisms underlying normal and pathological development, resolving the lineage origin of cell types will be crucial to create these types on-demand. Multiple strategies have been deployed towards the problem of lineage tracing, ranging from direct observation to sophisticated genetic approaches. Here we discuss the achievements and limitations of past and current technology. Finally, we speculate about the future of lineage tracing and how to reach the next milestones in the field.
|
25
|
Garcia-Marques J, Espinosa-Medina I, Ku KY, Yang CP, Koyama M, Yu HH, Lee T. A programmable sequence of reporters for lineage analysis. Nat Neurosci 2020; 23:1618-1628. [PMID: 32719561 DOI: 10.1038/s41593-020-0676-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 07/26/2019] [Accepted: 06/19/2020] [Indexed: 12/22/2022]
Abstract
We present CLADES (cell lineage access driven by an edition sequence), a technology for cell lineage studies based on CRISPR-Cas9 techniques. CLADES relies on a system of genetic switches to activate and inactivate reporter genes in a predetermined order. Targeting CLADES to progenitor cells allows the progeny to inherit a sequential cascade of reporters, thereby coupling birth order to reporter expression. This system, which can also be temporally induced by heat shock, enables the temporal resolution of lineage development and can therefore be used to deconstruct an extended cell lineage by tracking the reporters expressed in the progeny. When targeted to the germ line, the same cascade progresses across animal generations, predominantly marking each generation with the corresponding combination of reporters. CLADES therefore offers an innovative strategy for making programmable cascades of genes that can be used for genetic manipulation or to record serial biological events.
Affiliation(s)
- Kai-Yuan Ku
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei, Taiwan
- Ching-Po Yang
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Minoru Koyama
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Hung-Hsiang Yu
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei, Taiwan
- Tzumin Lee
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA.
|
26
|
Lymph node metastases develop through a wider evolutionary bottleneck than distant metastases. Nat Genet 2020; 52:692-700. [PMID: 32451459 PMCID: PMC7343611 DOI: 10.1038/s41588-020-0633-2] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Received: 01/03/2019] [Accepted: 04/24/2020] [Indexed: 12/12/2022]
Abstract
Genetic diversity among metastases is poorly understood but contains important information about disease evolution at secondary sites. Here we investigate inter- and intra-lesion heterogeneity for two types of metastases that associate with different clinical outcomes: lymph node and distant organ metastases in human colorectal cancer. We develop a rigorous mathematical framework for quantifying metastatic phylogenetic diversity. Distant metastases are typically monophyletic and genetically similar to each other. Lymph node metastases, in contrast, display high levels of inter-lesion diversity. We validate these findings by analyzing 317 multi-region biopsies from an independent cohort of 20 patients. We further demonstrate higher levels of intra-lesion heterogeneity in lymph node than in distant metastases. Our results show that fewer primary tumor lineages seed distant metastases than lymph node metastases, indicating that the two sites are subject to different levels of selection. Thus, lymph node and distant metastases develop through fundamentally different evolutionary mechanisms.
|
27
|
Cotterell J, Vila-Cejudo M, Batlle-Morera L, Sharpe J. Endogenous CRISPR/Cas9 arrays for scalable whole-organism lineage tracing. Development 2020; 147:dev184481. [DOI: 10.1242/dev.184481] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/02/2019] [Accepted: 03/06/2020] [Indexed: 12/26/2022]
Abstract
The past decade has seen a renewed appreciation of the central importance of cellular lineages to many questions in biology (especially organogenesis, stem cells and tumor biology). This has been driven in part by a renaissance in genetic clonal-labeling techniques. Recent approaches are based on accelerated mutation of DNA sequences, which can then be sequenced from individual cells to re-create a ‘phylogenetic’ tree of cell lineage. However, current approaches depend on making transgenic alterations to the genome in question, which limit their application. Here, we introduce a new method that completely avoids the need for prior genetic engineering, by identifying endogenous CRISPR/Cas9 target arrays suitable for lineage analysis. In both mouse and zebrafish, we identify the highest quality compact arrays as judged by equal base composition, 5′ G sequence, minimal likelihood of residing in the functional genome, minimal off targets and ease of amplification. We validate multiple high-quality endogenous CRISPR/Cas9 arrays, demonstrating their utility for lineage tracing. Our pragmatically scalable technique thus can produce deep and broad lineages in vivo, while removing the dependence on genetic engineering.
Affiliation(s)
- James Cotterell
  - European Molecular Biology Laboratory (EMBL) Barcelona, 08003 Barcelona, Spain
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
- Marta Vila-Cejudo
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
- Laura Batlle-Morera
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
- James Sharpe
  - European Molecular Biology Laboratory (EMBL) Barcelona, 08003 Barcelona, Spain
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Spain
|
28
|
Zou Z, Zhang H, Guan Y, Zhang J. Deep Residual Neural Networks Resolve Quartet Molecular Phylogenies. Mol Biol Evol 2020; 37:1495-1507. [PMID: 31868908 PMCID: PMC8453599 DOI: 10.1093/molbev/msz307] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/26/2022]
Abstract
Phylogenetic inference is of fundamental importance to evolutionary as well as other fields of biology, and molecular sequences have emerged as the primary data for this task. Although many phylogenetic methods have been developed to explicitly take into account substitution models of sequence evolution, such methods could fail due to model misspecification or insufficiency, especially in the face of heterogeneities in substitution processes across sites and among lineages. In this study, we propose to infer topologies of four-taxon trees using deep residual neural networks, a machine learning approach needing no explicit modeling of the subject system and having a record of success in solving complex nonlinear inference problems. We train residual networks on simulated protein sequence data with extensive amino acid substitution heterogeneities. We show that the well-trained residual network predictors can outperform existing state-of-the-art inference methods such as the maximum likelihood method on diverse simulated test data, especially under extensive substitution heterogeneities. Reassuringly, residual network predictors generally agree with existing methods in the trees inferred from real phylogenetic data with known or widely believed topologies. Furthermore, when combined with the quartet puzzling algorithm, residual network predictors can be used to reconstruct trees with more than four taxa. We conclude that deep learning represents a powerful new approach to phylogenetic reconstruction, especially when sequences evolve via heterogeneous substitution processes. We present our best trained predictor in a freely available program named Phylogenetics by Deep Learning (PhyDL, https://gitlab.com/ztzou/phydl; last accessed January 3, 2020).
Affiliation(s)
- Zhengting Zou
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
- Hongjiu Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
  - Department of Internal Medicine, University of Michigan, Ann Arbor, MI
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
|
29
|
Wei CJY, Zhang K. RETrace: simultaneous retrospective lineage tracing and methylation profiling of single cells. Genome Res 2020; 30:602-610. [PMID: 32127417 PMCID: PMC7197472 DOI: 10.1101/gr.255851.119] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 08/22/2019] [Accepted: 02/27/2020] [Indexed: 12/02/2022]
Abstract
Retrospective lineage tracing harnesses naturally occurring mutations in cells to elucidate single cell development. Common single-cell phylogenetic fate mapping methods have utilized highly mutable microsatellite loci found within the human genome. Such methods were limited by the introduction of in vitro noise through polymerase slippage inherent in DNA amplification, which we characterized to be approximately 10–100× higher than the in vivo replication mutation rate. Here, we present RETrace, a method for simultaneously capturing both microsatellites and methylation-informative cytosines to characterize both lineage and cell type, respectively, from the same single cell. An important unique feature of RETrace was the introduction of linear amplification of microsatellites in order to reduce in vitro amplification noise. We further coupled microsatellite capture with single-cell reduced representation bisulfite sequencing (scRRBS), to measure the CpG methylation status on the same cell for cell type inference. When compared to existing retrospective lineage tracing methods, RETrace achieved higher accuracy (88% triplet accuracy from an ex vivo HCT116 tree) at a higher cell division resolution (lowering the required number of cell division difference between single cells by approximately 100 divisions). Simultaneously, RETrace demonstrated the ability to capture on average 150,000 unique CpGs per single cell in order to accurately determine cell type. We further formulated additional developments that would allow high-resolution mapping on microsatellite-stable cells or tissues with RETrace. Overall, we present RETrace as a foundation for multi-omics lineage mapping and cell typing of single cells.
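The "triplet accuracy" figure quoted in this abstract can be illustrated with a minimal sketch. It is not taken from the RETrace code. It assumes rooted trees written as nested Python tuples with string leaf names, and it assumes a triplet counts as correct when the inferred tree pairs the same two of three cells as the reference tree. The function names and example trees are hypothetical.

```python
from itertools import combinations

def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree:
        found |= leaves(child)
    return found

def closest_pair(tree, triple):
    """Return the two cells of `triple` that share the most recent ancestor,
    or None when the tree does not resolve the triple."""
    node = tree
    while not isinstance(node, str):
        hits = [leaves(child) & triple for child in node]
        nonempty = [h for h in hits if h]
        if len(nonempty) == 1:
            # All three cells sit under one child: descend into it.
            node = node[hits.index(nonempty[0])]
            continue
        for h in nonempty:
            if len(h) == 2:
                return frozenset(h)
        return None  # the three cells split three ways (polytomy)
    return None

def triplet_accuracy(reference, inferred):
    """Fraction of resolved reference triplets that the inferred tree reproduces."""
    total = correct = 0
    for triple in combinations(sorted(leaves(reference)), 3):
        ref_pair = closest_pair(reference, set(triple))
        if ref_pair is None:
            continue
        total += 1
        if closest_pair(inferred, set(triple)) == ref_pair:
            correct += 1
    return correct / total if total else float("nan")

# Hypothetical four-cell example: the inferred tree misplaces cell "b".
reference = ((("a", "b"), "c"), "d")
inferred = ((("a", "c"), "b"), "d")
print(triplet_accuracy(reference, inferred))  # 0.75
```

Only the triplet (a, b, c) is resolved differently in the two example trees, so three of four triplets agree.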
Affiliation(s)
- Christopher Jen-Yue Wei
  - Department of Bioengineering, University of California San Diego, La Jolla, California 92093, USA
- Kun Zhang
  - Department of Bioengineering, University of California San Diego, La Jolla, California 92093, USA
|
30
|
Naser-Khdour S, Minh BQ, Zhang W, Stone EA, Lanfear R. The Prevalence and Impact of Model Violations in Phylogenetic Analysis. Genome Biol Evol 2019; 11:3341-3352. [PMID: 31536115 PMCID: PMC6893154 DOI: 10.1093/gbe/evz193] [Citation(s) in RCA: 84] [Impact Index Per Article: 14.0] [Accepted: 09/03/2019] [Indexed: 12/24/2022]
Abstract
In phylogenetic inference, we commonly use models of substitution which assume that sequence evolution is stationary, reversible, and homogeneous (SRH). Although the use of such models is often criticized, the extent of SRH violations and their effects on phylogenetic inference of tree topologies and edge lengths are not well understood. Here, we introduce and apply the maximal matched-pairs tests of homogeneity to assess the scale and impact of SRH model violations on 3,572 partitions from 35 published phylogenetic data sets. We show that roughly one-quarter of all the partitions we analyzed (23.5%) reject the SRH assumptions, and that for 25% of data sets, tree topologies inferred from all partitions differ significantly from topologies inferred using the subset of partitions that do not reject the SRH assumptions. This proportion increases when comparing trees inferred using the subset of partitions that rejects the SRH assumptions, to those inferred from partitions that do not reject the SRH assumptions. These results suggest that the extent and effects of model violation in phylogenetics may be substantial. They highlight the importance of testing for model violations and possibly excluding partitions that violate models prior to tree reconstruction. Our results also suggest that further effort in developing models that do not require SRH assumptions could lead to large improvements in the accuracy of phylogenomic inference. The scripts necessary to perform the analysis are available in https://github.com/roblanf/SRHtests, and the new tests we describe are available as a new option in IQ-TREE (http://www.iqtree.org).
Affiliation(s)
- Suha Naser-Khdour
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Bui Quang Minh
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
  - Research School of Computer Science, Australian National University, Canberra, Australian Capital Territory, Australia
- Wenqi Zhang
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Eric A Stone
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Robert Lanfear
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
|
31
|
Abstract
Every animal grows from a single fertilized egg into an intricate network of cell types and organ systems. This process is captured in a lineage tree: a diagram of every cell's ancestry back to the founding zygote. Biologists have long sought to trace this cell lineage tree in individual organisms and have developed a variety of technologies to map the progeny of specific cells. However, there are billions to trillions of cells in complex organisms, and conventional approaches can only map a limited number of clonal populations per experiment. A new generation of tools that use molecular recording methods integrated with single cell profiling technologies may provide a solution. Here, we summarize recent breakthroughs in these technologies, outline experimental and computational challenges, and discuss biological questions that can be addressed using single cell dynamic lineage tracing.
Affiliation(s)
- Aaron McKenna
  - Department of Molecular and Systems Biology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
- James A Gagnon
  - Center for Cell and Genome Science, University of Utah, Salt Lake City, UT 84112, USA
  - School of Biological Sciences, University of Utah, Salt Lake City, UT 84112, USA
|
32
|
Wu SH, Lee JH, Koo BK. Lineage Tracing: Computational Reconstruction Goes Beyond the Limit of Imaging. Mol Cells 2019; 42:104-112. [PMID: 30764600 PMCID: PMC6399003 DOI: 10.14348/molcells.2019.0006] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 01/15/2018] [Revised: 01/18/2018] [Accepted: 01/20/2019] [Indexed: 02/07/2023]
Abstract
Tracking the fate of individual cells and their progeny through lineage tracing has been widely used to investigate various biological processes including embryonic development, homeostatic tissue turnover, and stem cell function in regeneration and disease. Conventional lineage tracing involves the marking of cells either with dyes or nucleoside analogues or genetic marking with fluorescent and/or colorimetric protein reporters. Both are imaging-based approaches that have played a crucial role in the field of developmental biology as well as adult stem cell biology. However, imaging-based lineage tracing approaches are limited by their scalability and the lack of molecular information underlying fate transitions. Recently, computational biology approaches have been combined with diverse tracing methods to overcome these limitations and so provide high-order scalability and a wealth of molecular information. In this review, we will introduce such novel computational methods, starting from single-cell RNA sequencing-based lineage analysis to DNA barcoding or genetic scar analysis. These novel approaches are complementary to conventional imaging-based approaches and enable us to study the lineage relationships of numerous cell types during vertebrate, and in particular human, development and disease.
Affiliation(s)
- Szu-Hsien (Sam) Wu
  - Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), 1030 Vienna, Austria
- Ji-Hyun Lee
  - Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), 1030 Vienna, Austria
- Bon-Kyoung Koo
  - Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), 1030 Vienna, Austria
|
33
|
Hicks DG, Speed TP, Yassin M, Russell SM. Maps of variability in cell lineage trees. PLoS Comput Biol 2019; 15:e1006745. [PMID: 30753182 PMCID: PMC6388934 DOI: 10.1371/journal.pcbi.1006745] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/17/2018] [Revised: 02/25/2019] [Accepted: 01/02/2019] [Indexed: 11/19/2022]
Abstract
New approaches to lineage tracking have allowed the study of differentiation in multicellular organisms over many generations of cells. Understanding the phenotypic variability observed in these lineage trees requires new statistical methods. Whereas an invariant cell lineage, such as that for the nematode Caenorhabditis elegans, can be described by a lineage map, defined as the pattern of phenotypes overlaid onto the binary tree, a traditional lineage map is static and does not describe the variability inherent in the cell lineages of higher organisms. Here, we introduce lineage variability maps which describe the pattern of second-order variation in lineage trees. These maps can be undirected graphs of the partial correlations between every lineal position, or directed graphs showing the dynamics of bifurcated patterns in each subtree. We show how to infer these graphical models for lineages of any depth from sample sizes of only a few pedigrees. This required developing the generalized spectral analysis for a binary tree, the natural framework for describing tree-structured variation. When tested on pedigrees from C. elegans expressing a marker for pharyngeal differentiation potential, the variability maps recover essential features of the known lineage map. When applied to highly-variable pedigrees monitoring cell size in T lymphocytes, the maps show that most of the phenotype is set by the founder naive T cell. Lineage variability maps thus elevate the concept of the lineage map to the population level, addressing questions about the potency and dynamics of cell lineages and providing a way to quantify the progressive restriction of cell fate with increasing depth in the tree.
Affiliation(s)
- Damien G. Hicks
  - Centre for Micro-Photonics, Department of Physics and Astronomy, Swinburne University of Technology, Hawthorn, Victoria 3122, Australia
  - Bioinformatics Division, Walter & Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
- Terence P. Speed
  - Bioinformatics Division, Walter & Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
- Mohammed Yassin
  - Peter MacCallum Cancer Centre, Parkville, Victoria 3052, Australia
- Sarah M. Russell
  - Centre for Micro-Photonics, Department of Physics and Astronomy, Swinburne University of Technology, Hawthorn, Victoria 3122, Australia
  - Peter MacCallum Cancer Centre, Parkville, Victoria 3052, Australia
  - Department of Pathology and Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria 3050, Australia
|
34
|
Salvador-Martínez I, Grillo M, Averof M, Telford MJ. Is it possible to reconstruct an accurate cell lineage using CRISPR recorders? eLife 2019; 8:e40292. [PMID: 30688650 PMCID: PMC6349403 DOI: 10.7554/elife.40292] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 08/13/2018] [Accepted: 01/11/2019] [Indexed: 02/02/2023]
Abstract
Cell lineages provide the framework for understanding how cell fates are decided during development. Describing cell lineages in most organisms is challenging; even a fruit fly larva has ~50,000 cells and a small mammal has >1 billion cells. Recently, the idea of applying CRISPR to induce mutations during development, to be used as heritable markers for lineage reconstruction, has been proposed by several groups. While an attractive idea, its practical value depends on the accuracy of the cell lineages that can be generated. Here, we use computer simulations to estimate the performance of these approaches under different conditions. We incorporate empirical data on CRISPR-induced mutation frequencies in Drosophila. We show significant impacts from multiple biological and technical parameters: variable cell division rates, skewed mutational outcomes, target dropouts and different sequencing strategies. Our approach reveals the limitations of published CRISPR recorders, and indicates how future implementations can be optimised. Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).
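The kind of simulation this abstract describes can be sketched in a toy form. This is not the authors' simulator. It assumes synchronous binary divisions, equally likely mutational outcomes, and no target dropout, which are exactly the simplifications the abstract reports as mattering in practice. All names and parameter values are hypothetical.

```python
import random

def simulate_recorder(generations, n_targets, mutation_rate, n_outcomes, seed=0):
    """Grow one founder cell by synchronous binary division.

    Each cell carries a barcode of n_targets sites (0 = unedited). At every
    division each unedited site in each daughter is edited with probability
    mutation_rate to one of n_outcomes states; edits are irreversible.
    Sister cells end up adjacent in the returned list."""
    rng = random.Random(seed)
    cells = [(0,) * n_targets]
    for _ in range(generations):
        next_cells = []
        for barcode in cells:
            for _daughter in range(2):
                child = tuple(
                    rng.randint(1, n_outcomes)
                    if site == 0 and rng.random() < mutation_rate
                    else site
                    for site in barcode
                )
                next_cells.append(child)
        cells = next_cells
    return cells

def summarize(cells):
    """Return (fraction of distinct barcodes, fraction of edited sites)."""
    distinct = len(set(cells)) / len(cells)
    edited = sum(site != 0 for bc in cells for site in bc) / (len(cells) * len(cells[0]))
    return distinct, edited

cells = simulate_recorder(generations=8, n_targets=20, mutation_rate=0.05, n_outcomes=10)
distinct, edited = summarize(cells)
print(len(cells), round(distinct, 2), round(edited, 2))
```

A low fraction of distinct barcodes means many cells cannot be told apart, while a high fraction of edited sites means the recorder is close to saturation and can no longer record later divisions.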
Affiliation(s)
- Irepan Salvador-Martínez
  - Centre for Life’s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Marco Grillo
  - Institut de Génomique Fonctionnelle de Lyon (IGFL), École Normale Supérieure de Lyon, Lyon, France
  - Centre National de la Recherche Scientifique (CNRS), Paris, France
- Michalis Averof
  - Institut de Génomique Fonctionnelle de Lyon (IGFL), École Normale Supérieure de Lyon, Lyon, France
  - Centre National de la Recherche Scientifique (CNRS), Paris, France
- Maximilian J Telford
  - Centre for Life’s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
|
35
|
Abstract
Reconstructing lineage relationships between cells within a tissue or organism is a long-standing aim in biology. Traditionally, lineage tracing has been achieved through the (genetic) labeling of a cell followed by the tracking of its offspring. Currently, lineage trajectories can also be predicted using single-cell transcriptomics. Although single-cell transcriptomics provides detailed phenotypic information, the predicted lineage trajectories do not necessarily reflect genetic relationships. Recently, techniques have been developed that unite these strategies. In this Review, we discuss transcriptome-based lineage trajectory prediction algorithms, single-cell genetic lineage tracing, and the promising combination of these techniques for stem cell and cancer research.
Affiliation(s)
- Lennart Kester
  - Oncode Institute, Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW) and University Medical Center Utrecht, 3584 CT Utrecht, the Netherlands
- Alexander van Oudenaarden
  - Oncode Institute, Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW) and University Medical Center Utrecht, 3584 CT Utrecht, the Netherlands.
|
36
|
Naxerova K, Reiter JG, Brachtel E, Lennerz JK, van de Wetering M, Rowan A, Cai T, Clevers H, Swanton C, Nowak MA, Elledge SJ, Jain RK. Origins of lymphatic and distant metastases in human colorectal cancer. Science 2017; 357:55-60. [PMID: 28684519 PMCID: PMC5536201 DOI: 10.1126/science.aai8515] [Citation(s) in RCA: 366] [Impact Index Per Article: 45.8] [Received: 08/23/2016] [Revised: 02/16/2017] [Accepted: 05/08/2017] [Indexed: 12/15/2022]
Abstract
The spread of cancer cells from primary tumors to regional lymph nodes is often associated with reduced survival. One prevailing model to explain this association posits that fatal, distant metastases are seeded by lymph node metastases. This view provides a mechanistic basis for the TNM staging system and is the rationale for surgical resection of tumor-draining lymph nodes. Here we examine the evolutionary relationship between primary tumor, lymph node, and distant metastases in human colorectal cancer. Studying 213 archival biopsy samples from 17 patients, we used somatic variants in hypermutable DNA regions to reconstruct high-confidence phylogenetic trees. We found that in 65% of cases, lymphatic and distant metastases arose from independent subclones in the primary tumor, whereas in 35% of cases they shared common subclonal origin. Therefore, two different lineage relationships between lymphatic and distant metastases exist in colorectal cancer.
Affiliation(s)
- Kamila Naxerova
  - Edwin L. Steele Laboratories for Tumor Biology, Department of Radiation Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA.
  - Division of Genetics, Department of Genetics, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Johannes G Reiter
  - Program for Evolutionary Dynamics, Harvard University, Cambridge, MA 02138, USA
- Elena Brachtel
  - Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jochen K Lennerz
  - Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Marc van de Wetering
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW) and University Medical Center (UMC) Utrecht, 3584CT Utrecht, Netherlands
  - Cancer Genomics Netherlands, UMC Utrecht, 3584CG Utrecht, Netherlands
- Tianxi Cai
  - Department of Biostatistics, Harvard University, Boston, MA 02115, USA
- Hans Clevers
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW) and University Medical Center (UMC) Utrecht, 3584CT Utrecht, Netherlands
  - Cancer Genomics Netherlands, UMC Utrecht, 3584CG Utrecht, Netherlands
- Charles Swanton
  - The Francis Crick Institute, London NW1 1AT, UK
  - University College London Cancer Institute, London WC1E 6DD, UK
- Martin A Nowak
  - Program for Evolutionary Dynamics, Harvard University, Cambridge, MA 02138, USA
  - Department of Mathematics and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Stephen J Elledge
  - Division of Genetics, Department of Genetics, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02115, USA
- Rakesh K Jain
  - Edwin L. Steele Laboratories for Tumor Biology, Department of Radiation Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
|
37
|
Woodworth MB, Girskis KM, Walsh CA. Building a lineage from single cells: genetic techniques for cell lineage tracking. Nat Rev Genet 2017; 18:230-244. [PMID: 28111472 PMCID: PMC5459401 DOI: 10.1038/nrg.2016.159] [Citation(s) in RCA: 177] [Impact Index Per Article: 22.1] [Indexed: 12/15/2022]
Abstract
Resolving lineage relationships between cells in an organism is a fundamental interest of developmental biology. Furthermore, investigating lineage can drive understanding of pathological states, including cancer, as well as understanding of developmental pathways that are amenable to manipulation by directed differentiation. Although lineage tracking through the injection of retroviral libraries has long been the state of the art, a recent explosion of methodological advances in exogenous labelling and single-cell sequencing has enabled lineage tracking at larger scales, in more detail, and in a wider range of species than was previously considered possible. In this Review, we discuss these techniques for cell lineage tracking, with attention both to those that trace lineage forwards from experimental labelling, and those that trace backwards across the life history of an organism.
Affiliation(s)
- Mollie B Woodworth
  - Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, Massachusetts 02115, USA
  - Departments of Neurology and Pediatrics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02139, USA
- Kelly M Girskis
  - Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, Massachusetts 02115, USA
  - Departments of Neurology and Pediatrics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02139, USA
- Christopher A Walsh
  - Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, Massachusetts 02115, USA
  - Departments of Neurology and Pediatrics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02139, USA
|
38
|
Frieda KL, Linton JM, Hormoz S, Choi J, Chow KHK, Singer ZS, Budde MW, Elowitz MB, Cai L. Synthetic recording and in situ readout of lineage information in single cells. Nature 2017; 541:107-111. [PMID: 27869821 PMCID: PMC6487260 DOI: 10.1038/nature20777] [Citation(s) in RCA: 296] [Impact Index Per Article: 37.0] [Received: 06/30/2016] [Accepted: 11/11/2016] [Indexed: 12/13/2022]
Abstract
Reconstructing the lineage relationships and dynamic event histories of individual cells within their native spatial context is a long-standing challenge in biology. Many biological processes of interest occur in optically opaque or physically inaccessible contexts, necessitating approaches other than direct imaging. Here we describe a synthetic system that enables cells to record lineage information and event histories in the genome in a format that can be subsequently read out of single cells in situ. This system, termed memory by engineered mutagenesis with optical in situ readout (MEMOIR), is based on a set of barcoded recording elements termed scratchpads. The state of a given scratchpad can be irreversibly altered by CRISPR/Cas9-based targeted mutagenesis, and later read out in single cells through multiplexed single-molecule RNA fluorescence hybridization (smFISH). Using MEMOIR as a proof of principle, we engineered mouse embryonic stem cells to contain multiple scratchpads and other recording components. In these cells, scratchpads were altered in a progressive and stochastic fashion as the cells proliferated. Analysis of the final states of scratchpads in single cells in situ enabled reconstruction of lineage information from cell colonies. Combining analysis of endogenous gene expression with lineage reconstruction in the same cells further allowed inference of the dynamic rates at which embryonic stem cells switch between two gene expression states. Finally, using simulations, we show how parallel MEMOIR systems operating in the same cell could enable recording and readout of dynamic cellular event histories. MEMOIR thus provides a versatile platform for information recording and in situ, single-cell readout across diverse biological systems.
Affiliation(s)
- Kirsten L Frieda
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, USA
- James M Linton
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Sahand Hormoz
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Joonhyuk Choi
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Ke-Huan K Chow
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Zakary S Singer
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Mark W Budde
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Michael B Elowitz
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, USA
  - Howard Hughes Medical Institute, California Institute of Technology, Pasadena, California 91125, USA
- Long Cai
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
|
39
|
Figueres-Oñate M, García-Marqués J, López-Mascaraque L. UbC-StarTrack, a clonal method to target the entire progeny of individual progenitors. Sci Rep 2016; 6:33896. [PMID: 27654510 PMCID: PMC5031994 DOI: 10.1038/srep33896] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 05/31/2016] [Accepted: 09/05/2016] [Indexed: 01/02/2023] Open
Abstract
Clonal cell analysis defines the potential of single cells and the diversity they can produce. To achieve this, we have developed a novel adaptation of the genetic tracing strategy, UbC-StarTrack, which attributes a specific and unique color-code to single neural precursors, allowing all their progeny to be tracked. We used integrable fluorescent reporters driven by a ubiquitous promoter in PiggyBac-based vectors to achieve inheritable and stable clonal cell labeling. In addition, coupling this to an inducible Cre-LoxP system avoids the expression of non-integrated reporters. To assess the utility of this system, we first analyzed images of combinatorial expression of fluorescent reporters in transfected cells and their progeny. We also validated the efficiency of the UbC-StarTrack to trace cell lineages through in vivo, in vitro and ex vivo strategies. Finally, progenitors located in the lateral ventricles were targeted at embryonic or postnatal stages to determine the diversity of neurons and glia they produce, and their clonal relationships. In this way we demonstrate that UbC-StarTrack can be used to identify all the progeny of a single cell and that it can be employed in a wide range of contexts.
|
40
|
McKenna A, Findlay GM, Gagnon JA, Horwitz MS, Schier AF, Shendure J. Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science 2016; 353:aaf7907. [PMID: 27229144 DOI: 10.1126/science.aaf7907] [Citation(s) in RCA: 517] [Impact Index Per Article: 57.4] [Received: 03/30/2016] [Accepted: 05/20/2016] [Indexed: 12/27/2022]
Abstract
Multicellular systems develop from single cells through distinct lineages. However, current lineage-tracing approaches scale poorly to whole, complex organisms. Here, we use genome editing to progressively introduce and accumulate diverse mutations in a DNA barcode over multiple rounds of cell division. The barcode, an array of clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 target sites, marks cells and enables the elucidation of lineage relationships via the patterns of mutations shared between cells. In cell culture and zebrafish, we show that rates and patterns of editing are tunable and that thousands of lineage-informative barcode alleles can be generated. By sampling hundreds of thousands of cells from individual zebrafish, we find that most cells in adult organs derive from relatively few embryonic progenitors. In future analyses, genome editing of synthetic target arrays for lineage tracing (GESTALT) can be used to generate large-scale maps of cell lineage in multicellular systems for normal development and disease.
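As a rough illustration of how mutation patterns shared between cells can imply lineage relationships, the following Python sketch groups cells by the barcode edits they have in common. It is not the authors' reconstruction method: the function name `clades_from_barcodes`, the edit labels, and the input format are hypothetical, and the sketch assumes that each edit arises only once and is never lost.

```python
from collections import defaultdict


def clades_from_barcodes(barcodes):
    """Group cells into candidate clades, one per shared barcode edit.

    barcodes: dict mapping a cell id to the set of edit labels observed
    in that cell's barcode (hypothetical labels such as "site1:del3").
    Returns a list of (edit, frozenset_of_cells), largest clade first.
    """
    carriers = defaultdict(set)
    for cell, edits in barcodes.items():
        for edit in edits:
            carriers[edit].add(cell)
    # Edits seen in a single cell carry no information about relatedness.
    shared = {e: frozenset(c) for e, c in carriers.items() if len(c) > 1}
    # Sort by clade size (descending), then by label for a stable order.
    return sorted(shared.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Toy input: four cells with made-up edits.
example_barcodes = {
    "cell1": {"site1:del3", "site2:ins1"},
    "cell2": {"site1:del3", "site2:ins1"},
    "cell3": {"site1:del3", "site4:del7"},
    "cell4": {"site5:del2"},
}
example_clades = clades_from_barcodes(example_barcodes)
```

Under the stated assumptions, an edit carried by more cells is read as having arisen earlier, so the nested clades approximate the order of branching.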
Affiliation(s)
- Aaron McKenna
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Gregory M Findlay
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- James A Gagnon
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Marshall S Horwitz
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Department of Pathology, University of Washington, Seattle, WA, USA
- Alexander F Schier
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
  - Center for Brain Science, Harvard University, Cambridge, MA, USA
  - The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - FAS Center for Systems Biology, Harvard University, Cambridge, MA, USA
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Howard Hughes Medical Institute, Seattle, WA, USA
|
41
|
Luo T, He X, Xing K. Lineage analysis by microsatellite loci deep sequencing in mice. Mol Reprod Dev 2016; 83:387-91. [PMID: 26932355 DOI: 10.1002/mrd.22632] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/04/2015] [Accepted: 02/26/2016] [Indexed: 11/08/2022]
Abstract
Lineage analysis is the identification of all the progeny of a single progenitor cell, and has become particularly useful for studying developmental processes and cancer biology. Here, we propose a novel and effective method for lineage analysis that combines sequence capture and next-generation sequencing technology. Genome-wide mononucleotide and dinucleotide microsatellite loci in eight samples from two mice were identified and used to construct phylogenetic trees based on somatic indel mutations at these loci, which were unique enough to distinguish and parse samples from different mice into different groups along the lineage tree. For example, biopsies from the liver and stomach, which originate from the endoderm, were located in the same clade, while samples in kidney, which originate from the mesoderm, were located in another clade. Yet, tissue with a common developmental origin may still contain cells of a mixed ancestry. This genome-wide approach thus provides a non-invasive lineage analysis method based on mutations that accumulate in the genomes of opaque multicellular organism somatic cells.
Affiliation(s)
- Tao Luo
  - State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Xionglei He
  - State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
  - Collaborative Innovation Center of High Performance Computing, National University of Defense Technology, Changsha, China
- Ke Xing
  - State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
|
42
|
MSIplus for Integrated Colorectal Cancer Molecular Testing by Next-Generation Sequencing. J Mol Diagn 2015; 17:705-14. [PMID: 26322950 DOI: 10.1016/j.jmoldx.2015.05.008] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 03/30/2015] [Revised: 04/27/2015] [Accepted: 05/26/2015] [Indexed: 12/30/2022] Open
Abstract
Molecular analysis of colon cancers currently requires multiphasic testing that uses various assays with different performance characteristics, adding cost and time to patient care. We have developed a single, next-generation sequencing assay to simultaneously evaluate colorectal cancers for mutations in relevant cancer genes (KRAS, NRAS, and BRAF) and for tumor microsatellite instability (MSI). In a sample set of 61 cases, the assay demonstrated overall sensitivity of 100% and specificity of 100% for identifying cancer-associated mutations, with a practical limit of detection at 2% mutant allele fraction. MSIplus was 97% sensitive (34 of 35 MSI-positive cases) and 100% specific (42 of 42 MSI-negative cases) for ascertaining MSI phenotype in a cohort of 78 tumor specimens. These performance characteristics were slightly better than for conventional multiplex PCR MSI testing (97% sensitivity and 95% specificity), which is based on comparison of microsatellite loci amplified from tumor and matched normal material, applied to the same specimen cohort. Because the assay uses an amplicon sequencing approach, it is rapid and appropriate for specimens with limited available material or fragmented DNA. This integrated testing strategy offers several advantages over existing methods, including a lack of need for matched normal material, sensitive and unbiased detection of variants in target genes, and an automated analysis pipeline enabling principled and reproducible identification of cancer-associated mutations and MSI status simultaneously.
|
43
|
Koyanagi KO. Inferring cell differentiation processes based on phylogenetic analysis of genome-wide epigenetic information: hematopoiesis as a model case. Genome Biol Evol 2015; 7:699-705. [PMID: 25638259 PMCID: PMC5322552 DOI: 10.1093/gbe/evv024] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 01/31/2023] Open
Abstract
How cells divide and differentiate is a fundamental question in organismal development; however, the discovery of differentiation processes in various cell types is laborious and sometimes impossible. Phylogenetic analysis is typically used to reconstruct evolutionary processes based on inherent characters. It could also be used to reconstruct developmental processes based on the developmental changes that occur during cell proliferation and differentiation. In this study, DNA methylation information from differentiated hematopoietic cells was used to perform phylogenetic analyses. The results were assessed for their validity in inferring hierarchical differentiation processes of hematopoietic cells and DNA methylation processes of differentiating progenitor cells. Overall, phylogenetic analyses based on DNA methylation information facilitated inferences regarding hematopoiesis.
Affiliation(s)
- Kanako O Koyanagi
  - Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
|
44
|
|
45
|
Blundell JR, Levy SF. Beyond genome sequencing: Lineage tracking with barcodes to study the dynamics of evolution, infection, and cancer. Genomics 2014; 104:417-30. [DOI: 10.1016/j.ygeno.2014.09.005] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Received: 08/15/2014] [Revised: 09/03/2014] [Accepted: 09/16/2014] [Indexed: 12/19/2022]
|
46
|
Behjati S, Huch M, van Boxtel R, Karthaus W, Wedge DC, Tamuri AU, Martincorena I, Petljak M, Alexandrov LB, Gundem G, Tarpey PS, Roerink S, Blokker J, Maddison M, Mudie L, Robinson B, Nik-Zainal S, Campbell P, Goldman N, van de Wetering M, Cuppen E, Clevers H, Stratton MR. Genome sequencing of normal cells reveals developmental lineages and mutational processes. Nature 2014; 513:422-425. [PMID: 25043003 PMCID: PMC4227286 DOI: 10.1038/nature13448] [Citation(s) in RCA: 263] [Impact Index Per Article: 23.9] [Received: 12/11/2013] [Accepted: 05/07/2014] [Indexed: 02/08/2023]
Abstract
The somatic mutations present in the genome of a cell accumulate over the lifetime of a multicellular organism. These mutations can provide insights into the developmental lineage tree, the number of divisions that each cell has undergone and the mutational processes that have been operative. Here we describe whole genomes of clonal lines derived from multiple tissues of healthy mice. Using somatic base substitutions, we reconstructed the early cell divisions of each animal, demonstrating the contributions of embryonic cells to adult tissues. Differences were observed between tissues in the numbers and types of mutations accumulated by each cell, which likely reflect differences in the number of cell divisions they have undergone and varying contributions of different mutational processes. If somatic mutation rates are similar to those in mice, the results indicate that precise insights into development and mutagenesis of normal human cells will be possible.
Affiliation(s)
- Sam Behjati
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
  - Department of Paediatrics, University of Cambridge, Hills Road, Cambridge, CB2 2XY, UK
- Meritxell Huch
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, CancerGenomiCs.nl & University Medical Center Utrecht, 3584 CT, Utrecht, The Netherlands
  - Present address: Wellcome Trust / Cancer Research UK Gurdon Institute, Tennis Court Road, CB2 1QN, Cambridge, UK
- Ruben van Boxtel
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, CancerGenomiCs.nl & University Medical Center Utrecht, 3584 CT, Utrecht, The Netherlands
- Wouter Karthaus
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, CancerGenomiCs.nl & University Medical Center Utrecht, 3584 CT, Utrecht, The Netherlands
- David C Wedge
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Asif U Tamuri
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Inigo Martincorena
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Mia Petljak
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Ludmil B Alexandrov
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Gunes Gundem
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Patrick S Tarpey
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Sophie Roerink
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Joyce Blokker
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, CancerGenomiCs.nl & University Medical Center Utrecht, 3584 CT, Utrecht, The Netherlands
- Mark Maddison
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Laura Mudie
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Ben Robinson
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Serena Nik-Zainal
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
  - East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK
- Peter Campbell
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Nick Goldman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Marc van de Wetering
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, CancerGenomiCs.nl & University Medical Center Utrecht, 3584 CT, Utrecht, The Netherlands
- Edwin Cuppen
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, CancerGenomiCs.nl & University Medical Center Utrecht, 3584 CT, Utrecht, The Netherlands
- Hans Clevers
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, CancerGenomiCs.nl & University Medical Center Utrecht, 3584 CT, Utrecht, The Netherlands
- Michael R Stratton
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
|
47
|
Spiro A, Cardelli L, Shapiro E. Lineage grammars: describing, simulating and analyzing population dynamics. BMC Bioinformatics 2014; 15:249. [PMID: 25047682 PMCID: PMC4223406 DOI: 10.1186/1471-2105-15-249] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 01/10/2014] [Accepted: 07/07/2014] [Indexed: 11/17/2022] Open
Abstract
Background: Precise description of the dynamics of biological processes would enable the mathematical analysis and computational simulation of complex biological phenomena. Languages such as Chemical Reaction Networks and Process Algebras cater for the detailed description of interactions among individuals and for the simulation and analysis of ensuing behaviors of populations. However, often knowledge of such interactions is lacking or not available. Yet complete oblivion to the environment would make the description of any biological process vacuous. Here we present a language for describing population dynamics that abstracts away detailed interaction among individuals, yet captures in broad terms the effect of the changing environment, based on environment-dependent Stochastic Tree Grammars (eSTG). It is comprised of a set of stochastic tree grammar transition rules, which are context-free and as such abstract away specific interactions among individuals. Transition rule probabilities and rates, however, can depend on global parameters such as population size, generation count, and elapsed time.
Results: We show that eSTGs conveniently describe population dynamics at multiple levels including cellular dynamics, tissue development and niches of organisms. Notably, we show the utilization of eSTG for cases in which the dynamics is regulated by environmental factors, which affect the fate and rate of decisions of the different species. eSTGs are lineage grammars, in the sense that execution of an eSTG program generates the corresponding lineage trees, which can be used to analyze the evolutionary and developmental history of the biological system under investigation. These lineage trees contain a representation of the entire event history of the system, including the dynamics that led to the existing as well as to the extinct individuals.
Conclusions: We conclude that our suggested formalism can be used to easily specify, simulate and analyze complex biological systems, and supports modular description of local biological dynamics that can be later used as “black boxes” in a larger scope, thus enabling a gradual and hierarchical definition and simulation of complex biological systems. The simple, yet robust formalism makes it possible to target a broad class of stochastic dynamic behaviors, especially those that can be modeled using global environmental feedback regulation rather than direct interaction between individuals.
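The general idea of context-free rules whose probabilities depend only on global parameters can be sketched in Python as below. This is a toy grammar written for illustration, not the eSTG language itself: the two rules, the `capacity` parameter, the probability formula, and all function names are assumptions that do not come from the paper.

```python
import random


def simulate_lineage(generations, capacity, seed=0):
    """Grow a lineage tree from a single dividing cell of type "S".

    Rule 1: S -> (S, S)  with probability p_divide
    Rule 2: S -> D       otherwise (D cells are terminal)
    p_divide depends only on a global parameter, the current number of
    dividing cells, and never on which cells are neighbours.
    """
    rng = random.Random(seed)
    root = {"type": "S", "children": []}
    frontier = [root]
    for _ in range(generations):
        population = len(frontier)
        # Assumed feedback: division becomes rarer as capacity is approached.
        p_divide = max(0.0, 1.0 - population / capacity)
        next_frontier = []
        for node in frontier:
            if rng.random() < p_divide:
                kids = [{"type": "S", "children": []} for _ in range(2)]
            else:
                kids = [{"type": "D", "children": []}]
            node["children"] = kids
            next_frontier.extend(k for k in kids if k["type"] == "S")
        frontier = next_frontier
    return root


def count_leaves(node):
    """Count the cells at the tips of the lineage tree."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])


# With unlimited capacity every cell divides, giving a full binary tree.
unlimited = simulate_lineage(generations=3, capacity=float("inf"))
# With capacity 1 the founder differentiates at once and growth stops.
crowded = simulate_lineage(generations=3, capacity=1)
```

The returned tree records every rule application, so it plays the role of the lineage tree described in the abstract, including branches that ended in terminal cells.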
Affiliation(s)
- Ehud Shapiro
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
|
48
|
Yang JR, Ruan S, Zhang J. Determinative developmental cell lineages are robust to cell deaths. PLoS Genet 2014; 10:e1004501. [PMID: 25058586 PMCID: PMC4110091 DOI: 10.1371/journal.pgen.1004501] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/27/2013] [Accepted: 05/24/2014] [Indexed: 11/18/2022] Open
Abstract
All forms of life are confronted with environmental and genetic perturbations, making phenotypic robustness an important characteristic of life. Although development has long been viewed as a key component of phenotypic robustness, the underlying mechanism is unclear. Here we report that the determinative developmental cell lineages of two protostomes and one deuterostome are structured such that the resulting cellular compositions of the organisms are only modestly affected by cell deaths. Several features of the cell lineages, including their shallowness, topology, early ontogenic appearances of rare cells, and non-clonality of most cell types, underlie the robustness. Simple simulations of cell lineage evolution demonstrate the possibility that the observed robustness arose as an adaptation in the face of random cell deaths in development. These results reveal general organizing principles of determinative developmental cell lineages and a conceptually new mechanism of phenotypic robustness, both of which have important implications for development and evolution.
Affiliation(s)
- Jian-Rong Yang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
- Shuxiang Ruan
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
|
49
|
Gamermann D, Montagud A, Conejero JA, Urchueguía JF, de Córdoba PF. New approach for phylogenetic tree recovery based on genome-scale metabolic networks. J Comput Biol 2014; 21:508-19. [PMID: 24611553 PMCID: PMC4082356 DOI: 10.1089/cmb.2013.0150] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/19/2022] Open
Abstract
A wide range of applications and research has been done with genome-scale metabolic models. In this work, we describe an innovative methodology for comparing metabolic networks constructed from genome-scale metabolic models and how to apply this comparison in order to infer evolutionary distances between different organisms. Our methodology allows a quantification of the metabolic differences between different species from a broad range of families and even kingdoms. This quantification is then applied in order to reconstruct phylogenetic trees for sets of various organisms.
Affiliation(s)
- Daniel Gamermann
  - Cátedra Energesis de Tecnología Interdisciplinar, Universidad Católica de Valencia San Vicente Mártir, Valencia, Spain
  - Instituto Universitario de Matemática Pura y Aplicada, Universidad Politécnica de Valencia, Valencia, Spain
- Arnaud Montagud
  - Instituto Universitario de Matemática Pura y Aplicada, Universidad Politécnica de Valencia, Valencia, Spain
- J. Alberto Conejero
  - Instituto Universitario de Matemática Pura y Aplicada, Universidad Politécnica de Valencia, Valencia, Spain
- Javier F. Urchueguía
  - Instituto Universitario de Matemática Pura y Aplicada, Universidad Politécnica de Valencia, Valencia, Spain
- Pedro Fernández de Córdoba
  - Instituto Universitario de Matemática Pura y Aplicada, Universidad Politécnica de Valencia, Valencia, Spain
|
50
|
Salipante SJ, Scroggins SM, Hampel HL, Turner EH, Pritchard CC. Microsatellite instability detection by next generation sequencing. Clin Chem 2014; 60:1192-9. [PMID: 24987110 DOI: 10.1373/clinchem.2014.223677] [Citation(s) in RCA: 319] [Impact Index Per Article: 29.0] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Microsatellite instability (MSI) is a useful phenotype in cancer diagnosis and prognosis. Nevertheless, methods to detect MSI status from next generation DNA sequencing (NGS) data are underdeveloped.
METHODS: We developed an approach to detect the MSI phenotype using NGS (mSINGS). The method was used to evaluate mononucleotide microsatellite loci that were incidentally sequenced after targeted gene enrichment and could be applied to gene or exome capture panels designed for other purposes. For each microsatellite locus, the number of differently sized repeats in experimental samples was quantified and compared to a population of normal controls. Loci were considered unstable if the experimental number of repeats was statistically greater than in the control population. MSI status was determined by the fraction of unstable microsatellite loci.
RESULTS: We examined data from 324 samples generated using targeted gene capture assays of 3 different sizes, ranging from a 0.85-Mb to a 44-Mb exome design and incorporating from 15 to 2957 microsatellite markers. When we compared mSINGS results to MSI-PCR as a gold standard for 108 cases, we found the approach to be both diagnostically sensitive (range of 96.4% to 100% across 3 panels) and specific (range of 97.2% to 100%) for determining MSI status. The fraction of unstable microsatellite markers calculated from sequencing data correlated with the number of unstable loci detected by conventional MSI-PCR testing.
CONCLUSIONS: NGS data can enable highly accurate detection of MSI, even from limited capture designs. This novel approach offers several advantages over existing PCR-based methods.
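The locus-level rule described in the methods can be sketched roughly in Python as follows. This is an illustrative sketch, not the published mSINGS implementation: the abstract does not state the statistical test or the cutoffs, so the z-score rule, the default `z_cutoff` of 3, the `fraction_cutoff` of 0.2, the status labels, and all names are placeholder assumptions.

```python
from statistics import mean, stdev


def unstable_loci(sample_counts, control_counts, z_cutoff=3.0):
    """Return the loci where the sample shows more repeat lengths than controls.

    sample_counts:  {locus: number of differently sized repeats in the sample}
    control_counts: {locus: list of the same count in each normal control}
    A locus is called unstable when the sample count exceeds the control
    mean by more than z_cutoff standard deviations (assumed rule).
    """
    unstable = []
    for locus, observed in sample_counts.items():
        controls = control_counts[locus]  # needs at least two control values
        threshold = mean(controls) + z_cutoff * stdev(controls)
        if observed > threshold:
            unstable.append(locus)
    return unstable


def msi_status(sample_counts, control_counts, z_cutoff=3.0, fraction_cutoff=0.2):
    """Classify a sample from its fraction of unstable loci (assumed cutoff)."""
    unstable = unstable_loci(sample_counts, control_counts, z_cutoff)
    fraction = len(unstable) / len(sample_counts)
    label = "MSI-positive" if fraction > fraction_cutoff else "MSI-negative"
    return fraction, label


# Made-up counts for three loci and four normal controls per locus.
example_controls = {
    "locus_1": [3, 3, 4, 4],
    "locus_2": [2, 2, 3, 3],
    "locus_3": [5, 5, 6, 6],
}
example_sample = {"locus_1": 9, "locus_2": 3, "locus_3": 12}
example_fraction, example_label = msi_status(example_sample, example_controls)
```

In this toy input two of the three loci exceed their control-derived thresholds, so the sample is labelled MSI-positive under the assumed cutoff.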
Affiliation(s)
- Heather L Hampel
  - Department of Internal Medicine, Division of Human Genetics, The Ohio State University, Columbus, OH
- Emily H Turner
  - Department of Laboratory Medicine, University of Washington, Seattle WA
- Colin C Pritchard
  - Department of Laboratory Medicine, University of Washington, Seattle WA
|