1
Cancer germline predisposing variants and late mortality from subsequent malignant neoplasms among long-term childhood cancer survivors: a report from the St Jude Lifetime Cohort and the Childhood Cancer Survivor Study. Lancet Oncol 2023; 24:1147-1156. [PMID: 37797633 PMCID: PMC10712938 DOI: 10.1016/s1470-2045(23)00403-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Revised: 08/08/2023] [Accepted: 08/09/2023] [Indexed: 10/07/2023]
Abstract
BACKGROUND Carriers of cancer predisposing variants are at an increased risk of developing subsequent malignant neoplasms among those who have survived childhood cancer. We aimed to investigate whether cancer predisposing variants contribute to the risk of subsequent malignant neoplasm-related late mortality (5 years or more after diagnosis). METHODS In this analysis, data were included from two retrospective cohort studies, St Jude Lifetime Cohort (SJLIFE) and the Childhood Cancer Survivor Study (CCSS), with prospective follow-up of patients who were alive for at least 5 years after diagnosis with childhood cancer (ie, long-term childhood cancer survivors) with corresponding germline whole genome or whole exome sequencing data. Cancer predisposing variants affecting 60 genes associated with well-established autosomal-dominant cancer-predisposition syndromes were characterised. Subsequent malignant neoplasms were graded using the National Cancer Institute Common Terminology Criteria for Adverse Events (CTCAE) version 4.03 with modifications. Cause-specific late mortality was based on linkage with the US National Death Index and systematic cohort follow up. Fine-Gray subdistribution hazard models were used to estimate subsequent malignant neoplasm-related late mortality starting from the first biospecimen collection, treating non-subsequent malignant neoplasm-related deaths as a competing risk, adjusting for genetic ancestry, sex, age at diagnosis, and cancer treatment exposures. SJLIFE (NCT00760656) and CCSS (NCT01120353) are registered with ClinicalTrials.gov. FINDINGS 12 469 (6172 male and 6297 female) participants were included, 4402 from the SJLIFE cohort (median follow-up time since collection of the first biospecimen 7·4 years [IQR 3·1-9·4]) and 8067 from the CCSS cohort (median follow-up time since collection of the first biospecimen 12·6 years [2·2-16·6]). 
641 (5·1%) of 12 469 participants carried cancer predisposing variants (294 [6·7%] in the SJLIFE cohort and 347 [4·3%] in the CCSS cohort), which were significantly associated with an increased severity of subsequent malignant neoplasms (CTCAE grade ≥4 vs grade <4: odds ratio 2·15, 95% CI 1·18-4·19, p=0·0085). 263 (2·1%) subsequent malignant neoplasm-related deaths (44 [1·0%] in the SJLIFE cohort; and 219 [2·7%] in the CCSS cohort) and 426 (3·4%) other-cause deaths (103 [2·3%] in SJLIFE; and 323 [4·0%] in CCSS) occurred. Cumulative subsequent malignant neoplasm-related mortality at 10 years after the first biospecimen collection in carriers of cancer predisposing variants was 3·7% (95% CI 1·2-8·5) in SJLIFE and 6·9% (4·1-10·7) in CCSS versus 1·5% (1·0-2·1) in SJLIFE and 2·1% (1·7-2·5) in CCSS in non-carriers. Carrying a cancer predisposing variant was associated with an increased risk of subsequent malignant neoplasm-related mortality (SJLIFE: subdistribution hazard ratio 3·40 [95% CI 1·37-8·43]; p=0·0082; CCSS: 3·58 [2·27-5·63]; p<0·0001). INTERPRETATION Identifying participants at increased risk of subsequent malignant neoplasms via genetic counselling and clinical genetic testing for cancer predisposing variants and implementing early personalised cancer surveillance and prevention strategies might reduce the substantial subsequent malignant neoplasm-related mortality burden. FUNDING American Lebanese Syrian Associated Charities and US National Institutes of Health.
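The FINDINGS above report cumulative subsequent malignant neoplasm-related mortality with other-cause death treated as a competing risk. As a rough illustration of that quantity only, the following Python sketch computes a nonparametric (Aalen-Johansen type) cumulative incidence curve. It is not the Fine-Gray regression used in the study and applies no covariate adjustment; the function name, the event coding (0 = censored, 1 = cause of interest, 2 = competing cause), and the toy follow-up data are assumptions made for illustration.

```python
from collections import Counter


def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence for `cause` under competing risks.

    times  : follow-up time for each participant
    events : 0 = censored, 1 = cause of interest, any other value = competing cause
             (hypothetical coding, chosen for this sketch)
    Returns a list of (time, cumulative incidence) pairs at each distinct time.
    """
    # Count event types at each distinct follow-up time.
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    event_free = 1.0  # probability of no event of any cause before the current time
    cif = 0.0
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_cause = counts[cause]
        d_any = sum(n for e, n in counts.items() if e != 0)
        if n_at_risk > 0 and d_any > 0:
            # Only participants still event-free can contribute a new event.
            cif += event_free * d_cause / n_at_risk
            event_free *= 1 - d_any / n_at_risk
        curve.append((t, cif))
        n_at_risk -= sum(counts.values())
    return curve


# Toy data (invented): years of follow-up and event codes for five participants.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 0]
toy_curve = cumulative_incidence(toy_times, toy_events)
```

Treating competing deaths as censored in a Kaplan-Meier estimate would overstate cause-specific mortality, which is why the event-free probability here is reduced by deaths from any cause.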
2
Abstract 642: Genomes for Kids: Comprehensive DNA and RNA sequencing defining the scope of actionable mutations in pediatric cancer. Cancer Res 2021. [DOI: 10.1158/1538-7445.am2021-642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Clinical genomic studies of pediatric cancer have primarily focused on specific tumor types or high-risk disease. In the Genomes for Kids study (NCT02530658) we used a three-platform sequencing approach, including whole genome (WGS), whole exome (WES) and RNA sequencing, to examine tumor and paired germline genomes from prospectively identified children with cancer. The goal of the study was to assess the potential of comprehensive next-generation sequencing to elucidate the molecular mechanisms underlying tumor formation and investigate the potential of this information to influence clinical decision-making.
The cohort, with a median age of 6 years (range 0-26 years), included 301 patients with newly diagnosed (85%) or relapsed/refractory (15%) cancers, unselected for tumor type or stage. Patients with hematologic malignancies accounted for 41% of cases, 31% had CNS tumors, and 28% had other non-CNS solid tumors. This cohort also included 18 patients with very rare tumor types, defined here as occurring in fewer than 2 cases per million persons per year.
Two hundred fifty-three patients (84%) had sufficient tumor for three-platform sequencing and all 301 had adequate paired germline samples. Following analysis, 86% of patients harbored diagnostic (53%), prognostic (57%), therapeutically relevant (25%), and/or cancer predisposing (18%) variants. The inclusion of WGS enabled detection of oncogenic gene fusions, as well as 22 cases in which oncogenes were activated through enhancer hijacking, a particularly frequent occurrence in hematologic malignancies. In addition, WGS effectively detected clinically relevant small intragenic deletions (15% of tumors) and a variety of mutational signatures, which were not detectable through analysis of whole exome data.
Evaluation of 56 pathogenic germline variants in the context of paired tumor sequence data helped establish the disease relevance of several genes that are not typically associated with the cancer type in question, providing critical insights on a case-by-case basis. Examples include a pathogenic germline variant in MUTYH in a patient with retinoblastoma whose tumor exhibited a mutation signature associated with reactive oxygen species indicative of loss of MUTYH function; and conversely, a likely pathogenic variant in PMS2 in a rare brain cancer, which did not exhibit a mutation signature associated with microsatellite instability. This study successfully demonstrated the power of this three-platform approach to interrogate and interpret the full range of genomic variants across newly diagnosed as well as relapsed/refractory pediatric cancers. As a result of these findings, we have incorporated this three-platform approach into our routine real-time clinical service at St. Jude Children's Hospital.
Citation Format: David A. Wheeler, Scott Newman, Joy Nakitandwe, Chimene A. Kesserwan, Elizabeth M. Azzato, Michael C. Rusch, Sheila Shurtleff, Armita Bahrami, Brent Orr, Jeffery M. Klco, Dale J. Hedges, Kayla V. Hamilton, Scott G. Foy, Michael N. Edmonson, Andrew Thrasher, Jiali Gu, Lynn W. Harrison, Lu Wang, Roya Mostafavi, Manish Kubal, Jamie Maciaszek, Michael Clay, Annastasia Ouma, Antonina Silkov, Yanling Liu, Zhaojie Zhang, Yu Liu, Samuel W. Brady, Xin Zhou, Mark Wilkinson, Delaram Rahbarinia, Jay Knight, Jian Wang, Charles G. Mullighan, Rose B. McGee, Emily A. Quinn, Elsie L. Gerhardt, Leslie M. Taylor, Regina Nuccio, Jessica M. Valdez, Stacy J. Hines-Dowell, Alberto Pappo, Giles Robinson, Liza-Marie Johnson, Ching-Hon Pui, David W. Ellison, James R. Downing, Jinghui Zhang, Kim E. Nichols. Genomes for Kids: Comprehensive DNA and RNA sequencing defining the scope of actionable mutations in pediatric cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 642.
3
Abstract 2289: Empowering point-and-click genomic analysis with large pediatric genomic reference data on St. Jude Cloud. Cancer Res 2021. [DOI: 10.1158/1538-7445.am2021-2289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Next-generation sequencing-based genomic profiling is now a mainstay of pediatric oncology research and clinical testing. Correlating genomic features of patient cancer genomes with curated data extracted from large reference cohorts is critical for identifying molecular subtypes and underlying mutagenesis processes. To facilitate such investigation, we developed two user-friendly workflows on St. Jude Cloud, a data sharing ecosystem hosting genomic data for >10,000 pediatric cancer patients and survivors. These workflows leverage St. Jude Cloud comprehensive pediatric cancer genomic data, including 1,616 RNA-seq of 135 cancer subtypes and 958 whole genome sequencing (WGS) of 35 subtypes, to enable user analysis of their data in the context of St. Jude Cloud cohorts without a need to download large datasets.
The RNA-Seq Expression Classification workflow enables a user to compare their patient RNA-Seq gene expression data with blood (832), brain (456), and solid tumor (319) pediatric cancer reference cohorts and PDX models (45), enabling subtype classification using t-Distributed Stochastic Neighbor Embedding (t-SNE). Reference cohorts include curated subtype-defining somatic alterations integrating genomic variant data with expression profile. Resulting interactive t-SNE plots can be explored and annotated, with options to highlight cancer subtypes or samples and display sample information (age of onset, clinical diagnosis, molecular driver). To demonstrate, we analyze PAWNXH, a Children's Oncology Group AML sample with a novel ZBTB7A-NUTM1 fusion, and find that it clusters with AML samples harboring KMT2A re-arrangements, suggesting a potential mechanism of pathogenesis. Integrating PDX samples enables model selection for functional experiments by connecting patient subtypes with mouse models.
The Mutational Signatures workflow identifies and quantifies COSMIC mutational signatures in user-uploaded somatic VCF files for comparison to reference pediatric cancer cohorts. The interactive interface enables rapid identification of signatures within the query cohort and facilitates comparison to the reference using a cohort-level summary view. Identified signatures may also be explored at the sample-level for both query and reference cohorts, enabling the user to identify samples with signatures of interest for further analysis. We show an example comparison of mutational signatures identified in pediatric and adult AML samples.
These workflows enable users to leverage curated pediatric cancer data to make discoveries in their own samples. Enabling point-and-click analysis in St. Jude Cloud removes the barrier for non-computational researchers and eliminates the need to download large reference datasets for local analysis. Both workflows utilize post-processed rather than raw genomic data, reducing transfer costs for uploading user data to the cloud.
Citation Format: Andrew Thrasher, Michael Macias, Alexander M. Gout, Delaram Rahbarinia, Xin Zhou, Samuel W. Brady, Clay McLeod, Michael C. Rusch, Xiaolong Chen, Soheil Meshinchi, Michael A. Dyer, Suzanne J. Baker, Martine F. Roussel, Jinghui Zhang. Empowering point-and-click genomic analysis with large pediatric genomic reference data on St. Jude Cloud [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 2289.
4
Exploration of Coding and Non-coding Variants in Cancer Using GenomePaint. Cancer Cell 2021; 39:83-95.e4. [PMID: 33434514 PMCID: PMC7884056 DOI: 10.1016/j.ccell.2020.12.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/21/2020] [Revised: 10/13/2020] [Accepted: 12/10/2020] [Indexed: 12/14/2022]
Abstract
GenomePaint (https://genomepaint.stjude.cloud/) is an interactive visualization platform for whole-genome, whole-exome, transcriptome, and epigenomic data of tumor samples. Its design captures the inter-relatedness between DNA variations and RNA expression, supporting in-depth exploration of both individual cancer genomes and full cohorts. Regulatory non-coding variants can be inspected and analyzed along with coding variants, and their functional impact further explored by examining 3D genome data from cancer cell lines. Further, GenomePaint correlates mutation and expression patterns with patient outcomes, and supports custom data upload. We used GenomePaint to unveil aberrant splicing that disrupts the RING domain of CREBBP, discover cis activation of the MYC oncogene by duplication of the NOTCH1-MYC enhancer in B-lineage acute lymphoblastic leukemia, and explore the inter- and intra-tumor heterogeneity at EGFR in adult glioblastomas. These examples demonstrate that deep multi-omics exploration of individual cancer genomes enabled by GenomePaint can lead to biological insights for follow-up validation.
5
Pathogenic Germline Mutations in DNA Repair Genes in Combination With Cancer Treatment Exposures and Risk of Subsequent Neoplasms Among Long-Term Survivors of Childhood Cancer. J Clin Oncol 2020; 38:2728-2740. [PMID: 32496904 PMCID: PMC7430217 DOI: 10.1200/jco.19.02760] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Accepted: 04/15/2020] [Indexed: 12/17/2022] Open
Abstract
PURPOSE To investigate cancer treatment plus pathogenic germline mutations (PGMs) in DNA repair genes (DRGs) for identification of childhood cancer survivors at increased risk of subsequent neoplasms (SNs). METHODS Whole-genome sequencing was performed on blood-derived DNA from survivors in the St Jude Lifetime Cohort. PGMs were evaluated in 127 genes from 6 major DNA repair pathways. Cumulative doses of chemotherapy and body region-specific radiotherapy (RT) were abstracted from medical records. Relative rates (RRs) and 95% CIs of SNs by mutation status were estimated using multivariable piecewise exponential models. RESULTS Of 4,402 survivors, 495 (11.2%) developed 1,269 SNs. We identified 538 PGMs in 98 DRGs (POLG, MUTYH, ERCC2, and BRCA2, among others) in 508 (11.5%) survivors. Mutations in homologous recombination (HR) genes were significantly associated with an increased rate of subsequent female breast cancer (RR, 3.7; 95% CI, 1.8 to 7.7), especially among survivors with chest RT ≥ 20 Gy (RR, 4.4; 95% CI, 1.6 to 12.4), or with a cumulative dose of anthracyclines in the second or third tertile (RR, 4.4; 95% CI, 1.7 to 11.4). Mutations in HR genes were also associated with an increased rate of subsequent sarcoma among those who received alkylating agent doses in the third tertile (RR, 14.9; 95% CI, 4.0 to 38.0). Mutations in nucleotide excision repair genes were associated with subsequent thyroid cancer for those treated with neck RT ≥ 30 Gy (RR, 12.9; 95% CI, 1.6 to 46.6) with marginal statistical significance. CONCLUSION Our study provides novel insights regarding the contribution of genetics, in combination with known treatment-related risks, for the development of SNs. These findings have the potential to facilitate identification of high-risk survivors who may benefit from genetic counseling and/or testing of DRGs, which may further inform personalized cancer surveillance and prevention strategies.
6
Abstract 5478: CICERO: An accurate method for detecting complex and diverse driver fusions using cancer transcriptome sequencing (RNA-seq) data. Cancer Res 2020. [DOI: 10.1158/1538-7445.am2020-5478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Gene fusions are important biomarkers for cancer diagnosis, subtype classification and therapeutic decision-making. While fusion detection using RNA-seq data has become a standard practice, existing computational methods primarily focus on identifying canonical exon-to-exon fusions. However, more complex events such as multi-partner fusions, truncations, enhancer hijacking and internal tandem duplications (ITD) can also lead to abnormal function or aberrant transcription of cancer driver genes.
To aid discovery of complex and diverse driver fusions, we developed CICERO (CICERO Is Clipping Extended for RNA Optimization), a local assembly-based algorithm that integrates RNA-seq reads bearing aberrant mapping signatures with extensive annotation for ranking candidate fusions. Our benchmark data set, designed to support the main application of RNA-seq fusion analysis, consists of 184 driver fusions from 170 pediatric leukemias, solid tumors and brain tumors, detected by paired tumor-normal WGS and orthogonally validated by capture sequencing, RT-PCR and/or FISH. CICERO detected 95% of these fusions with an average ranking of 1.9, whereas ChimeraScan, deFuse, FusionCatcher and STAR-Fusion detected only 63%, 66%, 77% and 63% with an average ranking of 37.0, 9.0, 18.1 and 4.4, respectively. Notably, events such as ITD and rearrangements involving the highly repetitive IGH locus were detected almost exclusively by CICERO.
Our re-analysis of 167 RNA-seq data sets from the TCGA Glioblastoma Multiforme (GBM) cohort unveiled 158 fusions of cancer genes that were not reported previously. These include kinase fusions (KLHL7-BRAF), ITD of the EGFR kinase domain and a 13% prevalence of EGFR C-terminal truncation compared to the 6% reported by the TCGA Network.
CICERO has greatly improved our ability to discover non-canonical fusions which are overlooked by existing fusion detection methods, and has been used to analyze >2,000 RNA-seq samples generated by the two largest pediatric cancer genomics initiatives: the St. Jude/Washington University Pediatric Cancer Genome Project (PCGP) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) project. We anticipate that CICERO will also improve fusion analysis for adult cancer RNA-seq data, as demonstrated through our re-analysis of TCGA-GBM and our recent discovery of MAP3K8 C-terminal truncation fusion in 2% of TCGA melanoma samples.
CICERO is accessible via standard (https://github.com/stjude/Cicero) or cloud-based (https://platform.stjude.cloud/tools/rapid_rna-seq) implementation. To further improve accuracy, fusions predicted by CICERO can be curated by FusionEditor (https://proteinpaint.stjude.org/FusionEditor/), an interactive viewer allowing inspection of protein domains involved in the fusion and the gene expression status of fusion-positive samples.
Citation Format: Liqing Tian, Yongjin Li, Michael N. Edmonson, Xin Zhou, Scott Newman, Clay McLeod, Yu Liu, Bo Tang, Michael C. Rusch, John Easton, Jing Ma, Austyn Trull, J. Robert Michael, Andrew Thrasher, Charles Mullighan, Suzanne J. Baker, James R. Downing, David W. Ellison, Jinghui Zhang. CICERO: An accurate method for detecting complex and diverse driver fusions using cancer transcriptome sequencing (RNA-seq) data [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 5478.
7
Discovery of regulatory noncoding variants in individual cancer genomes by using cis-X. Nat Genet 2020; 52:811-818. [PMID: 32632335 PMCID: PMC7679232 DOI: 10.1038/s41588-020-0659-5] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 07/16/2019] [Accepted: 06/05/2020] [Indexed: 12/30/2022]
Abstract
We developed cis-X, a computational method for discovering regulatory noncoding variants in cancer by integrating whole-genome and transcriptome sequencing data from a single cancer sample. cis-X first finds aberrantly cis-activated genes that exhibit allele-specific expression accompanied by an elevated outlier expression. It then searches for causal noncoding variants that may introduce aberrant transcription factor binding motifs or enhancer hijacking by structural variations. Analysis of 13 T-lineage acute lymphoblastic leukemias identified a recurrent intronic variant predicted to cis-activate the TAL1 oncogene, a finding validated in vivo by chromatin immunoprecipitation sequencing of a patient-derived xenograft. Candidate oncogenes include the prolactin receptor PRLR activated by a focal deletion that removes a CTCF-insulated neighborhood boundary. cis-X may be applied to pediatric and adult solid tumors that are aneuploid and heterogeneous. In contrast to existing approaches, which require large sample cohorts, cis-X enables the discovery of regulatory noncoding variants in individual cancer genomes.
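The abstract above relies on allele-specific expression as the first signal of cis-activation. One simple way to test for allelic imbalance at a single heterozygous site is an exact two-sided binomial test against a 50:50 expectation, sketched below in Python. This is an illustrative assumption, not the statistic documented for cis-X; the function name and read counts are hypothetical.

```python
from math import comb


def allelic_imbalance_p(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value for a 50:50 allelic ratio.

    Under the null hypothesis each RNA-seq read covering a heterozygous site
    comes from either allele with probability 0.5, so the probability of k
    reference reads out of n is comb(n, k) / 2**n. The p-value sums the
    probabilities of all outcomes that are no more likely than the observed one.
    """
    if ref_reads < 0 or alt_reads < 0:
        raise ValueError("read counts must be non-negative")
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    observed = comb(n, ref_reads)
    # Compare integer binomial coefficients to avoid floating-point ties.
    as_extreme = sum(c for c in (comb(n, k) for k in range(n + 1)) if c <= observed)
    return as_extreme / 2 ** n


# Hypothetical counts: 28 reads from one allele, 2 from the other.
p_value = allelic_imbalance_p(28, 2)
```

A small p-value alone would not indicate cis-activation; per the abstract, the imbalance must be accompanied by elevated outlier expression of the gene.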
8
Molecular Mechanism of Telomere Length Dynamics and Its Prognostic Value in Pediatric Cancers. J Natl Cancer Inst 2020; 112:756-764. [PMID: 31647544 DOI: 10.1093/jnci/djz210] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/12/2019] [Revised: 10/07/2019] [Accepted: 10/22/2019] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND We aimed to systematically evaluate telomere dynamics across a spectrum of pediatric cancers, search for underlying molecular mechanisms, and assess potential prognostic value. METHODS The fraction of telomeric reads was determined from whole-genome sequencing data for paired tumor and normal samples from 653 patients with 23 cancer types from the Pediatric Cancer Genome Project. Telomere dynamics were characterized as the ratio of telomere fractions between tumor and normal samples. Somatic mutations were gathered, RNA sequencing data for 330 patients were analyzed for gene expression, and Cox regression was used to assess the telomere dynamics on patient survival. RESULTS Telomere lengthening was observed in 28.7% of solid tumors, 10.5% of brain tumors, and 4.3% of hematological cancers. Among 81 samples with telomere lengthening, 26 had somatic mutations in alpha thalassemia/mental retardation syndrome X-linked gene, corroborated by a low level of the gene expression in the subset of tumors with RNA sequencing. Telomerase reverse transcriptase gene amplification and/or activation was observed in 10 tumors with telomere lengthening, including two leukemias of the E2A-PBX1 subtype. Among hematological cancers, pathway analysis for genes with expressions most negatively correlated with telomere fractions suggests the implication of a gene ontology process of antigen presentation by Major histocompatibility complex class II. A higher ratio of telomere fractions was statistically significantly associated with poorer survival for patients with brain tumors (hazard ratio = 2.18, 95% confidence interval = 1.37 to 3.46). CONCLUSION Because telomerase inhibitors are currently being explored as potential agents to treat pediatric cancer, these data are valuable because they identify a subpopulation of patients with reactivation of telomerase who are most likely to benefit from this novel therapeutic option.
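The METHODS above define telomere dynamics as the ratio of telomere fractions between tumor and normal samples, where each fraction is the share of sequencing reads that are telomeric. A minimal Python sketch of that arithmetic follows. How telomeric reads are identified and what ratio counts as lengthening are not stated in the abstract, so the read counts and function names here are hypothetical.

```python
def telomere_fraction(telomeric_reads: int, total_reads: int) -> float:
    """Fraction of whole-genome sequencing reads classified as telomeric."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= telomeric_reads <= total_reads:
        raise ValueError("telomeric_reads must lie between 0 and total_reads")
    return telomeric_reads / total_reads


def telomere_dynamics(tumor_telomeric: int, tumor_total: int,
                      normal_telomeric: int, normal_total: int) -> float:
    """Ratio of tumor to normal telomere fraction for one patient.

    Values above 1 indicate a higher telomeric read fraction in the tumor
    than in the paired normal sample.
    """
    normal = telomere_fraction(normal_telomeric, normal_total)
    if normal == 0:
        raise ValueError("normal sample has no telomeric reads")
    return telomere_fraction(tumor_telomeric, tumor_total) / normal


# Invented counts for one tumor/normal pair.
ratio = telomere_dynamics(300_000, 1_000_000_000, 200_000, 1_000_000_000)
```

Dividing by the paired normal sample normalizes for patient-specific baseline telomere content, so the ratio reflects change in the tumor rather than absolute length.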
9
CICERO: a versatile method for detecting complex and diverse driver fusions using cancer RNA sequencing data. Genome Biol 2020; 21:126. [PMID: 32466770 PMCID: PMC7325161 DOI: 10.1186/s13059-020-02043-x] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 12/12/2019] [Accepted: 05/13/2020] [Indexed: 02/08/2023] Open
Abstract
To discover driver fusions beyond canonical exon-to-exon chimeric transcripts, we develop CICERO, a local assembly-based algorithm that integrates RNA-seq read support with extensive annotation for candidate ranking. CICERO outperforms commonly used methods, achieving a 95% detection rate for 184 independently validated driver fusions including internal tandem duplications and other non-canonical events in 170 pediatric cancer transcriptomes. Re-analysis of TCGA glioblastoma RNA-seq unveils previously unreported kinase fusions (KLHL7-BRAF) and a 13% prevalence of EGFR C-terminal truncation. Accessible via standard or cloud-based implementation, CICERO enhances driver fusion detection for research and precision oncology. The CICERO source code is available at https://github.com/stjude/Cicero.
10
Shortened Leukocyte Telomere Length Associates with an Increased Prevalence of Chronic Health Conditions among Survivors of Childhood Cancer: A Report from the St. Jude Lifetime Cohort. Clin Cancer Res 2020; 26:2362-2371. [PMID: 31969337 DOI: 10.1158/1078-0432.ccr-19-2503] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 07/31/2019] [Revised: 11/11/2019] [Accepted: 01/17/2020] [Indexed: 12/19/2022]
Abstract
PURPOSE We aimed to analyze and compare leukocyte telomere length (LTL) and age-dependent LTL attrition between childhood cancer survivors and noncancer controls, and to evaluate the associations of LTL with treatment exposures, chronic health conditions (CHC), and health behaviors among survivors. EXPERIMENTAL DESIGN We included 2,427 survivors and 293 noncancer controls of European ancestry, drawn from the participants in St. Jude Lifetime Cohort Study (SJLIFE), a retrospective hospital-based study with prospective follow-up (2007-2016). Common nonneoplastic CHCs (59 types) and subsequent malignant neoplasms (5 types) were clinically assessed. LTL was measured with whole-genome sequencing data. RESULTS After adjusting for age at DNA sampling, gender, genetic risk score based on 9 SNPs known to be associated with telomere length, and eigenvectors, LTL among survivors was significantly shorter both overall [adjusted mean (AM) = 6.20 kb; SE = 0.03 kb] and across diagnoses than controls (AM = 6.69 kb; SE = 0.07 kb). Among survivors, specific treatment exposures associated with shorter LTL included chest or abdominal irradiation, glucocorticoid, and vincristine chemotherapies. Significant negative associations of LTL with 14 different CHCs, and a positive association with subsequent thyroid cancer occurring out of irradiation field were identified. Health behaviors were significantly associated with LTL among survivors aged 18 to 35 years (P trend = 0.03). CONCLUSIONS LTL is significantly shorter among childhood cancer survivors than noncancer controls, and is associated with CHCs and health behaviors, suggesting LTL as an aging biomarker may be a potential mechanistic target for future intervention studies designed to prevent or delay onset of CHCs in childhood cancer survivors. See related commentary by Walsh, p. 2281.
11
Pediatric Cancer Variant Pathogenicity Information Exchange (PeCanPIE): a cloud-based platform for curating and classifying germline variants. Genome Res 2019; 29:1555-1565. [PMID: 31439692 PMCID: PMC6724669 DOI: 10.1101/gr.250357.119] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/13/2019] [Accepted: 07/23/2019] [Indexed: 01/06/2023]
Abstract
Variant interpretation in the era of massively parallel sequencing is challenging. Although many resources and guidelines are available to assist with this task, few integrated end-to-end tools exist. Here, we present the Pediatric Cancer Variant Pathogenicity Information Exchange (PeCanPIE), a web- and cloud-based platform for annotation, identification, and classification of variations in known or putative disease genes. Starting from a set of variants in variant call format (VCF), variants are annotated, ranked by putative pathogenicity, and presented for formal classification using a decision-support interface based on published guidelines from the American College of Medical Genetics and Genomics (ACMG). The system can accept files containing millions of variants and handle single-nucleotide variants (SNVs), simple insertions/deletions (indels), multiple-nucleotide variants (MNVs), and complex substitutions. PeCanPIE has been applied to classify variant pathogenicity in cancer predisposition genes in two large-scale investigations involving >4000 pediatric cancer patients and serves as a repository for the expert-reviewed results. PeCanPIE was originally developed for pediatric cancer but can be easily extended for use for nonpediatric cancers and noncancer genetic diseases. Although PeCanPIE's web-based interface was designed to be accessible to non-bioinformaticians, its back-end pipelines may also be run independently on the cloud, facilitating direct integration and broader adoption. PeCanPIE is publicly available and free for research use.
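The abstract above lists the variant classes the system handles: SNVs, simple indels, MNVs, and complex substitutions. As a rough sketch of how those classes can be told apart from the REF and ALT columns of a VCF record, the Python below applies length and prefix rules. These rules and function names are assumptions for illustration, not PeCanPIE's actual logic; the sketch assumes normalized, left-anchored alleles and does not handle symbolic alleles such as `<DEL>` or `*`.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Assign a coarse variant class from VCF REF and ALT allele strings."""
    ref, alt = ref.upper(), alt.upper()
    if not ref or not alt or ref == alt:
        raise ValueError("REF and ALT must be non-empty and different")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) == len(alt):
        return "MNV"  # same length, more than one base replaced
    if len(ref) < len(alt) and alt.startswith(ref):
        return "insertion"  # ALT extends REF after the anchor base(s)
    if len(alt) < len(ref) and ref.startswith(alt):
        return "deletion"  # REF extends ALT after the anchor base(s)
    return "complex"  # length change combined with base substitution


def classify_vcf_line(line: str):
    """Classify every ALT allele on one tab-delimited VCF data line."""
    chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
    return [(chrom, int(pos), ref, alt, classify_variant(ref, alt))
            for alt in alts.split(",")]


# Hypothetical record with two ALT alleles at the same position.
example = classify_vcf_line("chr13\t100\t.\tA\tG,ATT\t50\tPASS\t.")
```

Classifying each ALT allele separately matters because a single VCF line may carry alleles of different classes at the same position.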
12
Abstract 4178: Germline mutations in BRCA2 and pediatric/adolescent non-Hodgkin's lymphoma: A report from the St. Jude Lifetime (SJLIFE) and Childhood Cancer Survivor Study (CCSS) cohorts. Cancer Res 2019. [DOI: 10.1158/1538-7445.am2019-4178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Pathogenic or likely pathogenic (P/LP) monoallelic germline mutations in BRCA2 increase risk of developing breast, ovarian, prostate, and pancreatic cancers. In a prior report from the SJLIFE study, BRCA2 was the third most frequently mutated gene (14 occurrences) among 3006 survivors of childhood cancer with the highest number observed among lymphoma survivors (7/586). To further investigate BRCA2 as a potential predisposition gene for pediatric/adolescent lymphoma, we analyzed 781 additional lymphoma survivors in the SJLIFE and CCSS cohorts with whole-genome sequencing (30X coverage). In the combined set of 1367 survivors (808 Hodgkin’s lymphoma [HL], 559 non-Hodgkin’s lymphoma [NHL]; 54% male; median age at diagnosis 12.6 [range 1.1-22.7] years), 13 P/LP mutations in BRCA2 were identified, with 7 mapped to the breast or ovarian cancer cluster regions defined by the Consortium of Investigators of Modifiers of BRCA1/2. Compared to reference controls in the Genome Aggregation Database (gnomAD) (Table 1), a significant association was observed between lymphoma and mutations in BRCA2 (odds ratio [OR], 3.1; 95% CI, 1.7-5.5) but not BRCA1. When stratified by diagnosis, the association was significant for NHL (OR, 4.8; 95% CI, 2.0-9.6) but not for HL. BRCA2 mutation carriers included a broad spectrum of NHL histological subtypes. Our findings support inclusion of pediatric/adolescent NHL in the spectrum of cancers associated with germline BRCA2 mutations. Approximately 1.4% of survivors of pediatric/adolescent NHL are carriers of a P/LP mutation in BRCA2, which may be the underlying etiology of their primary diagnosis. Clinically, counselling regarding BRCA2 mutation status should be considered for pediatric/adolescent NHL patients. Large scale genetic studies of newly diagnosed pediatric/adolescent lymphoma patients are warranted to replicate and refine diagnosis-specific risk estimates.
Table 1. Comparisons of Mutation Carriers for BRCA1/2 Genes Between Lymphoma Survivors and gnomAD Controls
Survivor columns: lymphoma survivors. gnomAD columns: gnomAD controls (Hu et al. JAMA 2018). Cancer risk (odds ratio and P value): Fisher's exact test.
Gene   Cancer Diagnosis  Survivor Carriers  Survivor Non-Carriers  gnomAD Carriers  gnomAD Non-Carriers  Odds Ratio (95% CI)  P Value
BRCA2  HL+NHL            13                 1354                   313              102426               3.1 (1.7-5.5)        0.00045
BRCA2  HL                5                  803                    313              102426               2.0 (0.7-4.8)        0.11
BRCA2  NHL               8                  551                    313              102426               4.8 (2.0-9.6)        0.00041
BRCA1  HL+NHL            3                  1364                   208              103914               1.1 (0.2-3.3)        0.76
BRCA1  HL                1                  807                    208              103914               0.6 (0.02-3.5)       1.0
BRCA1  NHL               2                  557                    208              103914               1.8 (0.2-6.6)        0.31
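The point estimates in Table 1 can be checked against the carrier counts with a short calculation. The sketch below is illustrative only: it computes the simple cross-product odds ratio, which is an assumption about how the estimate was formed (the abstract states only that Fisher's exact test was used, and a conditional estimate from that test can differ slightly). The function name `odds_ratio` and the dictionary layout are hypothetical.

```python
def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Cross-product (sample) odds ratio for a 2x2 carrier table."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


# Counts copied from Table 1, in the order:
# (survivor carriers, survivor non-carriers, gnomAD carriers, gnomAD non-carriers)
table1 = {
    ("BRCA2", "HL+NHL"): (13, 1354, 313, 102426),
    ("BRCA2", "HL"): (5, 803, 313, 102426),
    ("BRCA2", "NHL"): (8, 551, 313, 102426),
    ("BRCA1", "HL+NHL"): (3, 1364, 208, 103914),
    ("BRCA1", "HL"): (1, 807, 208, 103914),
    ("BRCA1", "NHL"): (2, 557, 208, 103914),
}

odds_ratios = {key: odds_ratio(*counts) for key, counts in table1.items()}

for (gene, diagnosis), value in odds_ratios.items():
    print(f"{gene} {diagnosis}: OR = {value:.1f}")
```

Rounded to one decimal place, these cross-product values agree with the odds ratios listed in the table; the confidence intervals and P values are not reproduced here.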
Citation Format: Zhaoming Wang, Ti-Cheng Chang, Carmen L. Wilson, Chimene A. Kesserwan, Todd M. Gibson, Nan Li, John Easton, Heather L. Mulder, Gang Wu, Michael N. Edmonson, Michael C. Rusch, James R. Downing, Kim E. Nichols, Smita Bhatia, Gregory T. Armstrong, Melissa M. Hudson, Jinghui Zhang, Yutaka Yasui, Leslie L. Robison. Germline mutations in BRCA2 and pediatric/adolescent non-Hodgkin's lymphoma: A report from the St. Jude Lifetime (SJLIFE) and Childhood Cancer Survivor Study (CCSS) cohorts [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 4178.
|
13
|
Correction to: H3.3 K27M depletion increases differentiation and extends latency of diffuse intrinsic pontine glioma growth in vivo. Acta Neuropathol 2019; 137:1021. [PMID: 30976974 DOI: 10.1007/s00401-019-02006-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2019] [Accepted: 04/03/2019] [Indexed: 11/25/2022]
Abstract
The original article can be found online.
|
14
|
Real-time sharing of comprehensive clinical genomics sequencing data in St. Jude Cloud. J Clin Oncol 2019. [DOI: 10.1200/jco.2019.37.15_suppl.10019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
10019 Background: As tumor and germline genomic data from pediatric cancer patients is scarce in existing genomic databases, there is an urgent need for more comprehensive datasets. Such data will allow us to fully assess the actionable pediatric cancer genome, facilitate biomarker discovery, and identify new clinical associations. Methods: We sequenced 1002 tumor/normal pairs as part of a real-time clinical genomics service including whole genome, exome and transcriptome for 775 and exome/transcriptome for 227 samples. Tumor types were representative of the common and rare diseases treated at our institution (37% hematological, 31% brain and 32% solid tumors). A multidisciplinary team assessed every case, and after clinical reporting was complete, genomics data and basic clinical information (primary diagnosis, age, sex, ethnicity, primary/relapse/metastasis status) were made securely available online through St. Jude Cloud (www.stjude.cloud). Results: Based on analysis of 253 initial cases from the Genomes for Kids study, our multi-platform sequencing approach uncovered diagnostic, prognostic and/or therapeutically relevant findings in 78% of patients. We estimate 11-16% of clinically-relevant gene mutations could be missed by less comprehensive sequencing approaches. One quarter of patients had a potentially druggable mutation. This surprisingly high proportion was driven, in part, by BRAF-fused low-grade gliomas and diverse JAK/STAT pathway alterations in B-cell acute lymphoblastic leukemias. Whole genome/transcriptome sequencing allowed us to detect rare and novel gene fusions in 8% of cases and facilitated discovery of a new recurrent fusion gene in pediatric melanoma. All data is available online for others to mine and it is likely that additional clinically-relevant mutations can be uncovered. Conclusions: These data demonstrate the value of incorporating comprehensive sequencing into clinical diagnostics and patient care. We endeavor to make this large and richly annotated dataset available to others in real time rather than holding it back for months or years until publication. We anticipate adding approximately 500 additional cases per year at regular intervals, and as the resource grows, expect users to identify new targetable alterations that may be incorporated into patient care.
|
15
|
Polygenic risk of subsequent thyroid cancer after childhood cancer: A report from St. Jude lifetime cohort (SJLIFE) and Childhood Cancer Survivor Study (CCSS). J Clin Oncol 2019. [DOI: 10.1200/jco.2019.37.15_suppl.10060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
10060 Background: Subsequent thyroid cancer (STC) is among the most common malignancies in childhood cancer survivors, especially those with thyroid exposure to radiotherapy (RT). Identification of genetic risk factors may inform screening practices. Methods: Twelve SNPs were previously identified as thyroid cancer risk loci in the general population of European ancestry. A polygenic risk score (PRS) was calculated as a sum of risk alleles carried by a survivor, weighted by the natural logarithm of the published per-allele odds ratios (range: 1.2-1.8). With piecewise exponential models, associations of STC rates with PRS were assessed, both overall and stratified by neck RT exposure. Models were adjusted for sex, age at primary diagnosis, attained age, neck RT dose, epipodophyllotoxin therapy, and eigenvectors within survivors of European ancestry from SJLIFE with whole-genome sequencing data and CCSS with SNP data imputed to Haplotype Reference Consortium. Results: Among 2,324 SJLIFE survivors, 61 (43 with, 18 without neck RT) developed STC. The rate of STC was increased by 5.3-fold (95% confidence interval (CI), 2.2-12.6) and 3.1-fold (CI, 1.3-7.7) for survivors in the third and second PRS tertiles, respectively, compared to those in the first tertile, with corresponding cumulative incidence at age 40 years of 5.3% (CI, 3.3-7.3%), 2.5% (CI, 1.1-3.9%), and 1.0% (CI, 0.005-2.0%), respectively. Stratified by neck RT, the corresponding rate increases were 7.6-fold (CI, 2.3-25.3) and 3.8-fold (CI, 1.1-13.4), respectively, among survivors exposed to neck RT; however, no association was observed among survivors without neck RT (only 18 STC cases). Replication was performed among 4,302 CCSS survivors, of whom 100 (61 with, 39 without neck RT) developed STC. The rates of STC were increased by 2.3-fold (CI, 1.4-3.9) and 1.7-fold (CI, 1.0-2.9) for survivors in the third and the second PRS tertiles, compared to those in the first tertile. Similar significant associations were observed in survivors with and without neck RT (Ptrend = 0.04 and 0.02, respectively). Conclusions: High PRS conferring STC risk can inform screening practices and help personalize and improve survivorship care.
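The PRS definition in the Methods (a sum of risk alleles weighted by the natural logarithm of the per-allele odds ratio) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the three-SNP example values are hypothetical, chosen only to fall inside the reported odds-ratio range of 1.2-1.8.

```python
import math


def polygenic_risk_score(risk_allele_counts, per_allele_odds_ratios):
    """Sum of risk-allele counts, each weighted by ln(per-allele odds ratio)."""
    if len(risk_allele_counts) != len(per_allele_odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(
        count * math.log(odds)
        for count, odds in zip(risk_allele_counts, per_allele_odds_ratios)
    )


# Hypothetical example: three SNPs (the abstract uses twelve).
example_odds_ratios = [1.2, 1.5, 1.8]  # assumed values within the reported range
example_counts = [2, 0, 1]             # risk alleles carried at each SNP (0, 1 or 2)

score = polygenic_risk_score(example_counts, example_odds_ratios)
print(f"PRS = {score:.3f}")
```

Weighting by the log odds ratio makes the score additive on the log-odds scale, so a SNP with a larger published effect contributes more per risk allele than one with a smaller effect.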
|
16
|
Polygenic Determinants for Subsequent Breast Cancer Risk in Survivors of Childhood Cancer: The St Jude Lifetime Cohort Study (SJLIFE). Clin Cancer Res 2018; 24:6230-6235. [PMID: 30366939 PMCID: PMC6295266 DOI: 10.1158/1078-0432.ccr-18-1775] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2018] [Revised: 07/28/2018] [Accepted: 09/05/2018] [Indexed: 01/19/2023]
Abstract
PURPOSE The risk of subsequent breast cancer among female childhood cancer survivors is markedly elevated. We aimed to determine genetic contributions to this risk, focusing on polygenic determinants implicated in breast cancer susceptibility in the general population. EXPERIMENTAL DESIGN Whole-genome sequencing (30×) was performed on survivors in the St Jude Lifetime Cohort, and germline mutations in breast cancer predisposition genes were classified for pathogenicity. A polygenic risk score (PRS) was constructed for each survivor using 170 established common risk variants. Relative rate (RR) and 95% confidence interval (95% CI) of subsequent breast cancer incidence were estimated using multivariable piecewise exponential regression. RESULTS The analysis included 1,133 female survivors of European ancestry (median age at last follow-up = 35.4 years; range, 8.4-67.4), of whom 47 were diagnosed with one or more subsequent breast cancers (median age at subsequent breast cancer = 40.3 years; range, 24.5-53.0). Adjusting for attained age, age at primary diagnosis, chest irradiation, doses of alkylating agents and anthracyclines, and genotype eigenvectors, RRs for survivors with PRS in the highest versus lowest quintiles were 2.7 (95% CI, 1.0-7.3), 3.0 (95% CI, 1.1-8.1), and 2.4 (95% CI, 0.1-81.1) for all survivors and survivors with and without chest irradiation, respectively. Similar associations were observed after excluding carriers of pathogenic/likely pathogenic mutations in breast cancer predisposition genes. Notably, the PRS was associated with the subsequent breast cancer rate under the age of 45 years (RR = 3.2; 95% CI, 1.2-8.3). CONCLUSIONS Genetic profiles comprised of small-effect common variants and large-effect predisposing mutations can inform personalized breast cancer risk and surveillance/intervention in female childhood cancer survivors.
|
17
|
Corrigendum: Inhibition of SF3B1 by molecules targeting the spliceosome results in massive aberrant exon skipping. RNA (NEW YORK, N.Y.) 2018; 24:1886. [PMID: 30446591 PMCID: PMC6239181 DOI: 10.1261/rna.068544.118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
|
18
|
Inhibition of SF3B1 by molecules targeting the spliceosome results in massive aberrant exon skipping. RNA (NEW YORK, N.Y.) 2018; 24:1056-1066. [PMID: 29844105 PMCID: PMC6049506 DOI: 10.1261/rna.065383.117] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/19/2017] [Accepted: 05/14/2018] [Indexed: 05/22/2023]
Abstract
The recent identification of compounds that interact with the spliceosome (sudemycins, spliceostatin A, and meayamycin) indicates that these molecules modulate aberrant splicing via SF3B1 inhibition. Through whole transcriptome sequencing, we have demonstrated that treatment of Rh18 cells with sudemycin leads to exon skipping as the predominant aberrant splicing event. This was also observed following reanalysis of published RNA-seq data sets derived from HeLa cells after spliceostatin A exposure. These results are in contrast to previous reports that indicate that intron retention was the major consequence of SF3B1 inhibition. Analysis of the exon junctions up-regulated by these small molecules indicated that these sequences were absent in annotated human genes, suggesting that aberrant splicing events yielded novel RNA transcripts. Interestingly, the length of preferred downstream exons was significantly longer than the skipped exons, although there was no difference between the lengths of introns flanking skipped exons. The reading frame of the aberrantly skipped exons maintained a ratio of 2:1:1, close to that of the cassette exons (3:1:1) present in naturally occurring isoforms, suggesting negative selection by the nonsense-mediated decay (NMD) machinery for out-of-frame transcripts. Accordingly, genes involved in NMD and RNAs encoding proteins involved in the splicing process were enriched in both data sets. Our findings, therefore, further elucidate the mechanisms by which SF3B1 inhibition modulates pre-mRNA splicing.
|
19
|
Abstract
Purpose Childhood cancer survivors are at increased risk of subsequent neoplasms (SNs), but the germline genetic contribution is largely unknown. We assessed the contribution of pathogenic/likely pathogenic (P/LP) mutations in cancer predisposition genes to their SN risk. Patients and Methods Whole-genome sequencing (30-fold) was performed on samples from childhood cancer survivors who were ≥ 5 years since initial cancer diagnosis and participants in the St Jude Lifetime Cohort Study, a retrospective hospital-based study with prospective clinical follow-up. Germline mutations in 60 genes known to be associated with autosomal dominant cancer predisposition syndromes with moderate to high penetrance were classified by their pathogenicity according to the American College of Medical Genetics and Genomics guidelines. Relative rates (RRs) and 95% CIs of SN occurrence by mutation status were estimated using multivariable piecewise exponential regression stratified by radiation exposure. Results Participants were 3,006 survivors (53% male; median age, 35.8 years [range, 7.1 to 69.8 years]; 56% received radiotherapy), 1,120 SNs were diagnosed among 439 survivors (14.6%), and 175 P/LP mutations were identified in 5.8% (95% CI, 5.0% to 6.7%) of survivors. Mutations were associated with significantly increased rates of breast cancer (RR, 13.9; 95% CI, 6.0 to 32.2) and sarcoma (RR, 10.6; 95% CI, 4.3 to 26.3) among irradiated survivors and with increased rates of developing any SN (RR, 4.7; 95% CI, 2.4 to 9.3), breast cancer (RR, 7.7; 95% CI, 2.4 to 24.4), nonmelanoma skin cancer (RR, 11.0; 95% CI, 2.9 to 41.4), and two or more histologically distinct SNs (RR, 18.6; 95% CI, 3.5 to 99.3) among nonirradiated survivors. 
Conclusion The findings support referral of all survivors for genetic counseling for potential clinical genetic testing, which should be prioritized for nonirradiated survivors with any SN and for those with breast cancer or sarcoma in the field of prior irradiation.
|
20
|
|
21
|
The landscape of somatic mutations in epigenetic regulators across 1,000 paediatric cancer genomes. Nat Commun 2014; 5:3630. [PMID: 24710217 DOI: 10.1038/ncomms4630] [Citation(s) in RCA: 288] [Impact Index Per Article: 28.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2013] [Accepted: 03/12/2014] [Indexed: 02/07/2023] Open
Abstract
Studies of paediatric cancers have shown a high frequency of mutation across epigenetic regulators. Here we sequence 633 genes, encoding the majority of known epigenetic regulatory proteins, in over 1,000 paediatric tumours to define the landscape of somatic mutations in epigenetic regulators in paediatric cancer. Our results demonstrate a marked variation in the frequency of gene mutations across 21 different paediatric cancer subtypes, with the highest frequency of mutations detected in high-grade gliomas, T-lineage acute lymphoblastic leukaemia and medulloblastoma, and a paucity of mutations in low-grade glioma and retinoblastoma. The most frequently mutated genes are H3F3A, PHF6, ATRX, KDM6A, SMARCA4, ASXL2, CREBBP, EZH2, MLL2, USP7, ASXL1, NSD2, SETD2, SMC1A and ZMYM3. We identify novel loss-of-function mutations in the ubiquitin-specific processing protease 7 (USP7) in paediatric leukaemia, which result in decreased deubiquitination activity. Collectively, our results help to define the landscape of mutations in epigenetic regulatory genes in paediatric cancer and yield a valuable new database for investigating the role of epigenetic dysregulations in cancer.
|
22
|
Abstract SY25-01: Analysis of next-generation sequencing data for cancer genomes: challenges and pitfalls. Cancer Res 2012. [DOI: 10.1158/1538-7445.am2012-sy25-01] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
The characterization of the landscape of genetic lesions that underlie cancer has been significantly advanced with the recent application of next-generation sequencing (NGS) technology. This methodology can be used to sequence selected subsets of genes, the whole exome, the whole genome, or the expressed transcriptome within a cancer cell. By comparing the acquired sequences from both a cancer and matched normal tissue sample from the same patients, one should be able to identify almost all somatic lesions within the cancer. As part of the St Jude Children's Research Hospital - Washington University Pediatric Cancer Genome Project (PCGP), we have undertaken the approach of performing whole genome sequencing (WGS) on 600 pediatric cancers and matched control tissue (1200 total genomes). Although the acquisition of the primary sequence is a formidable challenge, the analysis of these data is where the real work begins.
Unfortunately, the majority of published NGS analysis methods were developed to identify germ line variation and therefore perform sub-optimally when applied to the task of identifying somatic mutations in cancer genomes. This is in part a result of the distinct difference in logic that must be used to accurately identify all somatic lesions within a cancer. A cancer genome typically exists within a heterogenous DNA sample that is composed of normal cells admixed with an oligoclonal tumor sample. Moreover, the range of somatic lesions seen in cancer is broader than what is seen as part of germ line genetic variation, with some cancers having exceedingly complex genomes containing focal insertions, deletions, inversions, intra-chromosomal and inter-chromosomal rearrangements and large copy number abnormalities. The accurate identification of these lesions requires not only the presence of the lesions within the cancer DNA, but also their absence from the matched germ line sample.
To approach these problems, we, as well as others, have recently developed new analytical approaches to enhance our ability to identify the somatic mutations in cancer. The starting point for these analyses is ≥75 bp paired-end sequencing reads from patient matched tumor and normal DNA samples. Our goal is to identify all somatic single-nucleotide variation (SNV), small insertion/deletion (indel), copy number alteration (CNA) and structural variation (SV) that occur within the cancer DNA sample. Paired tumor-normal NGS data were analyzed together to ensure sensitivity for detecting DNA alterations in tumor and for confirming their absence in the matched normal sample. Somatic lesions initially identified by mapped NGS reads were further analyzed using more accurate algorithms to correct errors caused by suboptimal NGS mapping. The sensitivity of the methods we have developed depends on the read depth, but with WGS at 30X haploid coverage we are able to detect mono-allelic mutations present in as few as ∼25% of the analyzed cellular populations. This sensitivity can be significantly enhanced with greater read densities. Key among the methods we have developed are two new algorithms focused on identifying gross DNA alterations: CREST (Clipping REveals STructure) for SV analysis and CONSERTING (COpy Number SEgmentation by Regression Tree) for CNA analysis. CREST uses sequencing reads with partial alignments to the reference human genome (so-called soft-clipped reads) to directly map the breakpoints of somatic SVs. CONSERTING integrates read depth analysis with SV detection and adjusts for sequencing artifacts, coverage bias and germ line CNVs. Together, these methods identify somatic lesions with a high validation rate (92-98% of SNVs and indels, 80% for SVs).
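To make the soft-clipping idea concrete, the sketch below shows how a candidate breakpoint position can be read off a single alignment's SAM `POS` and `CIGAR` fields. It is not CREST itself and makes no claim about CREST's internals; it only illustrates why a partially aligned read points at a breakpoint. The function name and the example alignments are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REFERENCE_CONSUMING = set("MDN=X")


def soft_clip_breakpoints(pos, cigar):
    """Return (side, reference_position, clipped_length) for each soft clip.

    pos is the 1-based reference position of the first aligned base (SAM POS).
    A left clip points at the first aligned base; a right clip points at the
    last aligned base.
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    # Hard clips carry no read sequence and may flank soft clips, so drop them.
    core = [(length, op) for length, op in ops if op != "H"]
    breakpoints = []
    if core and core[0][1] == "S":
        breakpoints.append(("left", pos, core[0][0]))
    aligned_ref_length = sum(length for length, op in core if op in REFERENCE_CONSUMING)
    if len(core) > 1 and core[-1][1] == "S":
        breakpoints.append(("right", pos + aligned_ref_length - 1, core[-1][0]))
    return breakpoints


# Hypothetical alignments: (POS, CIGAR)
print(soft_clip_breakpoints(1000, "60M40S"))       # clip on the right end
print(soft_clip_breakpoints(2000, "35S65M"))       # clip on the left end
print(soft_clip_breakpoints(500, "10S80M2D8M5S"))  # clips on both ends
```

In practice many reads clipped at the same reference position, in the tumor but not in the matched normal sample, would be needed before a position is treated as a candidate somatic breakpoint; that aggregation step is outside this sketch.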
In this talk, I will highlight the NGS analytic pipeline we have developed and the recent discoveries that have emerged through its application to pediatric cancer genomes. In addition, I will point out some of the significant challenges that remain to be tackled in order for us to identify the full landscape and functional consequences of the somatic mutations in cancer.
Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 103rd Annual Meeting of the American Association for Cancer Research; 2012 Mar 31-Apr 4; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2012;72(8 Suppl):Abstract nr SY25-01. doi:1538-7445.AM2012-SY25-01
|
23
|
CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nat Methods 2011; 8:652-4. [PMID: 21666668 DOI: 10.1038/nmeth.1628] [Citation(s) in RCA: 390] [Impact Index Per Article: 30.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2010] [Accepted: 05/19/2011] [Indexed: 12/30/2022]
Abstract
We developed 'clipping reveals structure' (CREST), an algorithm that uses next-generation sequencing reads with partial alignments to a reference genome to directly map structural variations at the nucleotide level of resolution. Application of CREST to whole-genome sequencing data from five pediatric T-lineage acute lymphoblastic leukemias (T-ALLs) and a human melanoma cell line, COLO-829, identified 160 somatic structural variations. Experimental validation exceeded 80%, demonstrating that CREST had a high predictive accuracy.
|