201
Oesper L, Satas G, Raphael BJ. Quantifying tumor heterogeneity in whole-genome and whole-exome sequencing data. Bioinformatics 2014; 30:3532-40. [PMID: 25297070 DOI: 10.1093/bioinformatics/btu651] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Indexed: 02/06/2023]
Abstract
MOTIVATION: Most tumor samples are a heterogeneous mixture of cells, including admixture by normal (non-cancerous) cells and subpopulations of cancerous cells with different complements of somatic aberrations. This intra-tumor heterogeneity complicates the analysis of somatic aberrations in DNA sequencing data from tumor samples. RESULTS: We describe an algorithm called THetA2 that infers the composition of a tumor sample (including not only tumor purity but also the number and content of tumor subpopulations) directly from both whole-genome (WGS) and whole-exome (WXS) high-throughput DNA sequencing data. This algorithm builds on our earlier Tumor Heterogeneity Analysis (THetA) algorithm in several important directions. These include improved ability to analyze highly rearranged genomes using a variety of data types: both WGS (including low-coverage data, ∼7×) and WXS. We apply our improved THetA2 algorithm to WGS (including low-pass) and WXS sequence data from 18 samples from The Cancer Genome Atlas (TCGA). We find that the improved algorithm is substantially faster and identifies numerous tumor samples containing subclonal populations in the TCGA data, including in one highly rearranged sample for which other tumor purity estimation algorithms were unable to estimate tumor purity.
Affiliation(s)
- Layla Oesper
- Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
- Gryte Satas
- Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
- Benjamin J Raphael
- Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
202
Nadaf J, Majewski J, Fahiminiya S. ExomeAI: detection of recurrent allelic imbalance in tumors using whole-exome sequencing data. Bioinformatics 2014; 31:429-31. [PMID: 25297069 DOI: 10.1093/bioinformatics/btu665] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 12/19/2022]
Abstract
SUMMARY: Whole-exome sequencing (WES) has been used extensively in cancer genome studies; however, the use of WES data in the study of loss of heterozygosity or, more generally, allelic imbalance (AI) has so far been very limited, which highlights the need for user-friendly and flexible software that can handle low-quality datasets. We have developed a statistical approach, ExomeAI, for the detection of recurrent AI events using WES datasets, specifically where matched normal samples are not available. AVAILABILITY: ExomeAI is a web-based application, publicly available at: http://genomequebec.mcgill.ca/exomeai. CONTACT: JavadNadaf@gmail.com or somayyeh.fahiminiya@mcgill.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Javad Nadaf
- Department of Human Genetics, Faculty of Medicine, McGill University and Genome Quebec Innovation Center, Montreal, Quebec, Canada
- Jacek Majewski
- Department of Human Genetics, Faculty of Medicine, McGill University and Genome Quebec Innovation Center, Montreal, Quebec, Canada
- Somayyeh Fahiminiya
- Department of Human Genetics, Faculty of Medicine, McGill University and Genome Quebec Innovation Center, Montreal, Quebec, Canada
203
Bellos E, Kumar V, Lin C, Maggi J, Phua ZY, Cheng CY, Cheung CMG, Hibberd ML, Wong TY, Coin LJM, Davila S. cnvCapSeq: detecting copy number variation in long-range targeted resequencing data. Nucleic Acids Res 2014; 42:e158. [PMID: 25228465 PMCID: PMC4227763 DOI: 10.1093/nar/gku849] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Abstract
Targeted resequencing technologies have allowed for efficient and cost-effective detection of genomic variants in specific regions of interest. Although capture sequencing has been primarily used for investigating single nucleotide variants and indels, it has the potential to elucidate a broader spectrum of genetic variation, including copy number variants (CNVs). Various methods exist for detecting CNV in whole-genome and exome sequencing datasets. However, no algorithms have been specifically designed for contiguous target sequencing, despite its increasing importance in clinical and research applications. We have developed cnvCapSeq, a novel method for accurate and sensitive CNV discovery and genotyping in long-range targeted resequencing. cnvCapSeq was benchmarked using a simulated contiguous capture sequencing dataset comprising 21 genomic loci of various lengths. cnvCapSeq was shown to outperform the best existing exome CNV method by a wide margin both in terms of sensitivity (92.0 versus 48.3%) and specificity (99.8 versus 70.5%). We also applied cnvCapSeq to a real capture sequencing cohort comprising a contiguous 358 kb region that contains the Complement Factor H gene cluster. In this dataset, cnvCapSeq identified 41 samples with CNV, including two with duplications, with a genotyping accuracy of 99%, as ascertained by quantitative real-time PCR.
Affiliation(s)
- Evangelos Bellos
- Department of Genomics of Common Disease, School of Public Health, Imperial College London, London W12 0NN, UK
- Vikrant Kumar
- Genome Institute of Singapore, 60 Biopolis St., 138672, Singapore
- Clarabelle Lin
- Genome Institute of Singapore, 60 Biopolis St., 138672, Singapore
- Jordi Maggi
- Institute of Medical Molecular Genetics, University of Zurich, Wagistrasse 12, 8952 Schlieren, Switzerland
- Zai Yang Phua
- Genome Institute of Singapore, 60 Biopolis St., 138672, Singapore
- Ching-Yu Cheng
- Singapore Eye Research Institute, Singapore National Eye Center, 11 Third Hospital Avenue, 168751, Singapore; Department of Ophthalmology, National University of Singapore, 1E Kent Ridge Road, 119228, Singapore
- Chui Ming Gemmy Cheung
- Singapore Eye Research Institute, Singapore National Eye Center, 11 Third Hospital Avenue, 168751, Singapore; Department of Ophthalmology, National University of Singapore, 1E Kent Ridge Road, 119228, Singapore
- Martin L Hibberd
- Genome Institute of Singapore, 60 Biopolis St., 138672, Singapore; Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
- Tien Yin Wong
- Singapore Eye Research Institute, Singapore National Eye Center, 11 Third Hospital Avenue, 168751, Singapore; Department of Ophthalmology, National University of Singapore, 1E Kent Ridge Road, 119228, Singapore
- Lachlan J M Coin
- Department of Genomics of Common Disease, School of Public Health, Imperial College London, London W12 0NN, UK; Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD 4072, Australia
- Sonia Davila
- Genome Institute of Singapore, 60 Biopolis St., 138672, Singapore
204
Magi A, Tattini L, Cifola I, D'Aurizio R, Benelli M, Mangano E, Battaglia C, Bonora E, Kurg A, Seri M, Magini P, Giusti B, Romeo G, Pippucci T, De Bellis G, Abbate R, Gensini GF. EXCAVATOR: detecting copy number variants from whole-exome sequencing data. Genome Biol 2014; 14:R120. [PMID: 24172663 PMCID: PMC4053953 DOI: 10.1186/gb-2013-14-10-r120] [Citation(s) in RCA: 188] [Impact Index Per Article: 17.1] [Received: 06/15/2013] [Accepted: 10/30/2013] [Indexed: 12/11/2022]
Abstract
We developed a novel software tool, EXCAVATOR, for the detection of copy number variants (CNVs) from whole-exome sequencing data. EXCAVATOR combines a three-step normalization procedure with a novel heterogeneous hidden Markov model algorithm and a calling method that classifies genomic regions into five copy number states. We validate EXCAVATOR on three datasets and compare the results with three other methods. These analyses show that EXCAVATOR outperforms the other methods and is therefore a valuable tool for the investigation of CNVs in large-scale projects, as well as in clinical research and diagnostics. EXCAVATOR is freely available at http://sourceforge.net/projects/excavatortool/.
205
Liu B, Morrison CD, Johnson CS, Trump DL, Qin M, Conroy JC, Wang J, Liu S. Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges. Oncotarget 2014; 4:1868-81. [PMID: 24240121 PMCID: PMC3875755 DOI: 10.18632/oncotarget.1537] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Indexed: 12/27/2022]
Abstract
Accurate detection of somatic copy number variations (CNVs) is an essential part of cancer genome analysis, and plays an important role in oncotarget identifications. Next generation sequencing (NGS) holds the promise to revolutionize somatic CNV detection. In this review, we provide an overview of current analytic tools used for CNV detection in NGS-based cancer studies. We summarize the NGS data types used for CNV detection, decipher the principles for data preprocessing, segmentation, and interpretation, and discuss the challenges in somatic CNV detection. This review aims to provide a guide to the analytic tools used in NGS-based cancer CNV studies, and to discuss the important factors that researchers need to consider when analyzing NGS data for somatic CNV detections.
Affiliation(s)
- Biao Liu
- Center for Personalized Medicine, Roswell Park Cancer Institute, Buffalo, NY
206
Amarasinghe KC, Li J, Hunter SM, Ryland GL, Cowin PA, Campbell IG, Halgamuge SK. Inferring copy number and genotype in tumour exome data. BMC Genomics 2014; 15:732. [PMID: 25167919 PMCID: PMC4162913 DOI: 10.1186/1471-2164-15-732] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 03/12/2014] [Accepted: 08/18/2014] [Indexed: 12/20/2022]
Abstract
BACKGROUND: Using whole exome sequencing to predict aberrations in tumours is a cost-effective alternative to whole genome sequencing; however, it is predominantly used for variant detection and infrequently utilised for detection of somatic copy number variation. RESULTS: We propose a new method to infer copy number and genotypes using whole exome data from paired tumour/normal samples. Our algorithm uses two Hidden Markov Models to predict copy number and genotypes and computationally resolves polyploidy/aneuploidy, normal cell contamination and signal baseline shift. Our method explicitly detects chromosome-arm-level events, which are commonly found in tumour samples. The methods are combined into a package named ADTEx (Aberration Detection in Tumour Exome). We applied our algorithm to a cohort of 17 in-house generated and 18 TCGA paired ovarian cancer/normal exomes and evaluated the performance by comparing against the copy number variations and genotypes predicted using Affymetrix SNP 6.0 data of the same samples. Further, we carried out a comparison study to show that ADTEx outperformed its competitors in terms of precision and F-measure. CONCLUSIONS: Our proposed method, ADTEx, uses both depth of coverage ratios and B allele frequencies calculated from whole exome sequencing data to predict copy number variations along with their genotypes. ADTEx is implemented as a user-friendly software package using Python and the R statistical language. Source code and sample data are freely available under GNU license (GPLv3) at http://adtex.sourceforge.net/.
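The two input signals named in this abstract, depth of coverage (DOC) ratios and B allele frequencies (BAF), can be illustrated with a minimal sketch. This is not ADTEx code: the function names, the library-size scaling, and the use of a log2 ratio are assumptions made for illustration only.

```python
import math


def log2_doc_ratio(tumour_depth, normal_depth, tumour_total, normal_total):
    """Log2 tumour/normal depth ratio for one region (hypothetical helper).

    Each depth is divided by its sample's total so that differences in
    sequencing yield do not masquerade as copy number change.
    """
    scaled = (tumour_depth / tumour_total) / (normal_depth / normal_total)
    return math.log2(scaled)


def b_allele_frequency(ref_reads, alt_reads):
    """Fraction of reads at a heterozygous SNP that carry the B (alt) allele."""
    return alt_reads / (ref_reads + alt_reads)


if __name__ == "__main__":
    # A region with half the expected tumour depth and a skewed BAF.
    print(log2_doc_ratio(50, 100, 1000, 1000))
    print(b_allele_frequency(30, 10))
```

In a sketch like this, a negative log2 ratio together with a BAF far from 0.5 would be consistent with a loss, but the abstract does not state the thresholds or model parameters the authors use.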
Affiliation(s)
- Saman K Halgamuge
- Optimisation and Pattern Recognition group, Mechanical Engineering Department, Melbourne School of Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia.
207
Kadalayil L, Rafiq S, Rose-Zerilli MJJ, Pengelly RJ, Parker H, Oscier D, Strefford JC, Tapper WJ, Gibson J, Ennis S, Collins A. Exome sequence read depth methods for identifying copy number changes. Brief Bioinform 2014; 16:380-92. [PMID: 25169955 DOI: 10.1093/bib/bbu027] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Received: 04/20/2014] [Accepted: 07/10/2014] [Indexed: 01/04/2023]
Abstract
Copy number variants (CNVs) play important roles in a number of human diseases and in pharmacogenetics. Powerful methods exist for CNV detection in whole genome sequencing (WGS) data, but such data are costly to obtain. Many disease causal CNVs span or are found in genome coding regions (exons), which makes CNV detection using whole exome sequencing (WES) data attractive. If reliably validated against WGS-based CNVs, exome-derived CNVs have potential applications in a clinical setting. Several algorithms have been developed to exploit exome data for CNV detection and comparisons made to find the most suitable methods for particular data samples. The results are not consistent across studies. Here, we review some of the exome CNV detection methods based on depth of coverage profiles and examine their performance to identify problems contributing to discrepancies in published results. We also present a streamlined strategy that uses a single metric, the likelihood ratio, to compare exome methods, and we demonstrated its utility using the VarScan 2 and eXome Hidden Markov Model (XHMM) programs using paired normal and tumour exome data from chronic lymphocytic leukaemia patients. We use array-based somatic CNV (SCNV) calls as a reference standard to compute prevalence-independent statistics, such as sensitivity, specificity and likelihood ratio, for validation of the exome-derived SCNVs. We also account for factors known to influence the performance of exome read depth methods, such as CNV size and frequency, while comparing our findings with published results.
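The prevalence-independent statistics mentioned in this abstract can be sketched from a simple confusion table. The formulas below are the standard definitions of sensitivity, specificity and the positive and negative likelihood ratios; how the authors count true and false calls against the array reference is not stated here, so the counts in the example are hypothetical.

```python
def diagnostic_stats(tp, fp, tn, fn):
    """Sensitivity, specificity and likelihood ratios from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ is how much more often a positive call occurs in true CNV regions
    # than in non-CNV regions; LR- is the analogous ratio for negative calls.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


if __name__ == "__main__":
    # Hypothetical counts of exome calls scored against array-based calls.
    for name, value in diagnostic_stats(tp=80, fp=10, tn=90, fn=20).items():
        print(f"{name}: {value:.3f}")
```

Because none of these quantities depends on how common CNVs are in the cohort, they can be compared across datasets with different CNV frequencies.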
208
Samarakoon PS, Sorte HS, Kristiansen BE, Skodje T, Sheng Y, Tjønnfjord GE, Stadheim B, Stray-Pedersen A, Rødningen OK, Lyle R. Identification of copy number variants from exome sequence data. BMC Genomics 2014; 15:661. [PMID: 25102989 PMCID: PMC4132917 DOI: 10.1186/1471-2164-15-661] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 02/17/2014] [Accepted: 07/01/2014] [Indexed: 12/30/2022]
Abstract
BACKGROUND With advances in next generation sequencing technologies and genomic capture techniques, exome sequencing has become a cost-effective approach for mutation detection in genetic diseases. However, computational prediction of copy number variants (CNVs) from exome sequence data is a challenging task. Whilst numerous programs are available, they have different sensitivities, and have low sensitivity to detect smaller CNVs (1-4 exons). Additionally, exonic CNV discovery using standard aCGH has limitations due to the low probe density over exonic regions. The goal of our study was to develop a protocol to detect exonic CNVs (including shorter CNVs that cover 1-4 exons), combining computational prediction algorithms and a high-resolution custom CGH array. RESULTS We used six published CNV prediction programs (ExomeCNV, CONTRA, ExomeCopy, ExomeDepth, CoNIFER, XHMM) and an in-house modification to ExomeCopy and ExomeDepth (ExCopyDepth) for computational CNV prediction on 30 exomes from the 1000 genomes project and 9 exomes from primary immunodeficiency patients. CNV predictions were tested using a custom CGH array designed to capture all exons (exaCGH). After this validation, we next evaluated the computational prediction of shorter CNVs. ExomeCopy and the in-house modified algorithm, ExCopyDepth, showed the highest capability in detecting shorter CNVs. Finally, the performance of each computational program was assessed by calculating the sensitivity and false positive rate. CONCLUSIONS In this paper, we assessed the ability of 6 computational programs to predict CNVs, focussing on short (1-4 exon) CNVs. We also tested these predictions using a custom array targeting exons. Based on these results, we propose a protocol to identify and confirm shorter exonic CNVs combining computational prediction algorithms and custom aCGH experiments.
Affiliation(s)
- Robert Lyle
- Department of Medical Genetics, University of Oslo, Oslo, Norway.
209
Improved molecular diagnosis by the detection of exonic deletions with target gene capture and deep sequencing. Genet Med 2014; 17:99-107. [PMID: 25032985 PMCID: PMC4338802 DOI: 10.1038/gim.2014.80] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Received: 01/24/2014] [Accepted: 05/29/2014] [Indexed: 12/25/2022]
Abstract
Purpose: We aimed to demonstrate the detection of exonic deletions using target capture and deep sequencing data. Methods: Sequence data from target gene capture followed by massively parallel sequencing were analyzed for the detection of exonic deletions using the normalized mean coverage of individual exons. We compared the results with those obtained from high-density exon-targeted array comparative genomic hybridization and applied similar analysis to examine samples from patients with pathogenic exonic deletions. Results: Thirty-eight samples, each containing 2,134, 2,833, or 4,688 coding exons from different panels, with a total of 103,863 exons, were analyzed by capture-massively parallel sequencing and array comparative genomic hybridization. Ten deletions detected by array comparative genomic hybridization were all detected by massively parallel sequencing, whereas only two of three duplications were detected. We were able to detect all pathogenic exonic deletions in 11 positive cases. Thirty-one exonic copy number changes from nine prospective clinical samples were also identified. Conclusion: Our results demonstrated the feasibility of using the same set of sequence data to detect both point mutations and exonic deletions, thus improving the diagnostic power of massively parallel sequencing-based assays.
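The idea of comparing the normalized mean coverage of individual exons can be sketched as follows. This is an illustrative assumption rather than the authors' pipeline: the normalization by per-sample total, the use of a reference-sample median as baseline, and the 0.7 ratio cut-off are all hypothetical choices.

```python
from statistics import median


def normalize(exon_coverage):
    """Scale per-exon mean coverage by the sample total."""
    total = sum(exon_coverage)
    return [c / total for c in exon_coverage]


def flag_low_coverage_exons(sample, references, threshold=0.7):
    """Return (exon_index, ratio) for exons well below the reference baseline.

    `threshold` is a hypothetical cut-off; a heterozygous deletion would be
    expected to give a ratio near 0.5 under ideal conditions.
    """
    sample_norm = normalize(sample)
    refs_norm = [normalize(r) for r in references]
    flagged = []
    for i, value in enumerate(sample_norm):
        baseline = median(r[i] for r in refs_norm)
        if baseline > 0 and value / baseline < threshold:
            flagged.append((i, value / baseline))
    return flagged


if __name__ == "__main__":
    references = [[100, 100, 100, 100], [200, 200, 200, 200], [150, 150, 150, 150]]
    sample = [100, 50, 100, 100]  # second exon at half the expected depth
    print(flag_low_coverage_exons(sample, references))
```

Normalizing within each sample makes exons comparable across runs with different sequencing depth, which is why the ratio, not the raw coverage, is compared.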
210
Boeva V, Popova T, Lienard M, Toffoli S, Kamal M, Le Tourneau C, Gentien D, Servant N, Gestraud P, Rio Frio T, Hupé P, Barillot E, Laes JF. Multi-factor data normalization enables the detection of copy number aberrations in amplicon sequencing data. Bioinformatics 2014; 30:3443-50. [PMID: 25016581 PMCID: PMC4253825 DOI: 10.1093/bioinformatics/btu436] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.0] [Indexed: 12/01/2022]
Abstract
Motivation: Because of its low cost, amplicon sequencing, also known as ultra-deep targeted sequencing, is now becoming widely used in oncology for detection of actionable mutations, i.e. mutations influencing cell sensitivity to targeted therapies. Amplicon sequencing is based on the polymerase chain reaction amplification of the regions of interest, a process that considerably distorts the information on copy numbers initially present in the tumor DNA. Therefore, additional experiments such as single nucleotide polymorphism (SNP) or comparative genomic hybridization (CGH) arrays often complement amplicon sequencing in clinics to identify copy number status of genes whose amplification or deletion has direct consequences on the efficacy of a particular cancer treatment. So far, there has been no proven method to extract the information on gene copy number aberrations based solely on amplicon sequencing. Results: Here we present ONCOCNV, a method that includes a multifactor normalization and annotation technique enabling the detection of large copy number changes from amplicon sequencing data. We validated our approach on high and low amplicon density datasets and demonstrated that ONCOCNV can achieve a precision comparable with that of array CGH techniques in detecting copy number aberrations. Thus, ONCOCNV applied on amplicon sequencing data would make the use of additional array CGH or SNP array experiments unnecessary. Availability and implementation: http://oncocnv.curie.fr/ Contact: valentina.boeva@curie.fr Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Valentina Boeva
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
- Tatiana Popova
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
- Maxime Lienard
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
- Sebastien Toffoli
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
- Maud Kamal
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
- Christophe Le Tourneau
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
- David Gentien
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
- Nicolas Servant
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
- Pierre Gestraud
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
| | - Thomas Rio Frio
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
| | - Philippe Hupé
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
| | - Emmanuel Barillot
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
| | - Jean-François Laes
- Inserm, U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie, Centre de Recherche, 26 rue d'Ulm, Paris 75248, Mines ParisTech, Fontainebleau 77300, Inserm, U830, Genetics and Biology of Cancers, Paris 75248, France, Institut de Pathologie et de Génétique, Gosselies 6041, Belgium, Clinical Research Department, Department of Medical Oncology, Plateforme de Génomique, Département de recherche translationnelle, Centre de recherche, Next-generation sequencing platform, Institut Curie, CNRS, UMR144, Subcellular Structure and cellular Dynamics, Paris 75248, France and OncoDNA, Gosselies 6041, Belgium
| |
|
211
|
Jeck WR, Parker J, Carson CC, Shields JM, Sambade MJ, Peters EC, Burd CE, Thomas NE, Chiang DY, Liu W, Eberhard DA, Ollila D, Grilley-Olson J, Moschos S, Neil Hayes D, Sharpless NE. Targeted next generation sequencing identifies clinically actionable mutations in patients with melanoma. Pigment Cell Melanoma Res 2014; 27:653-63. [PMID: 24628946 PMCID: PMC4121659 DOI: 10.1111/pcmr.12238] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 05/20/2013] [Revised: 01/08/2014] [Indexed: 12/30/2022]
Abstract
Somatic sequencing of cancers has produced new insight into tumorigenesis, tumor heterogeneity, and disease progression, but the vast majority of genetic events identified are of indeterminate clinical significance. Here, we describe a NextGen sequencing approach to fully analyzing 248 genes, including all those of known clinical significance in melanoma. This strategy features solution capture of DNA followed by multiplexed, high-throughput sequencing and was evaluated in 31 melanoma cell lines and 18 tumor tissues from patients with metastatic melanoma. Mutations in melanoma cell lines correlated with their sensitivity to corresponding small molecule inhibitors, confirming, for example, lapatinib sensitivity in ERBB4 mutant lines and identifying a novel activating mutation of BRAF. The latter event would not have been identified by clinical sequencing and was associated with responsiveness to a BRAF kinase inhibitor. This approach identified focal copy number changes of PTEN not found by standard methods, such as comparative genomic hybridization (CGH). Actionable mutations were found in 89% of the tumor tissues analyzed, 56% of which would not be identified by standard-of-care approaches. This work shows that targeted sequencing is an attractive approach for clinical use in melanoma.
Affiliation(s)
- William R Jeck
- Department of Genetics, University of North Carolina School of Medicine, Chapel Hill, NC, USA
|
212
|
Lacey S, Chung JY, Lin H. A comparison of whole genome sequencing with exome sequencing for family-based association studies. BMC Proc 2014; 8:S38. [PMID: 25519383 PMCID: PMC4143706 DOI: 10.1186/1753-6561-8-s1-s38] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/17/2022] [Open access]
Abstract
As the cost of DNA sequencing decreases, association studies based on whole genome sequencing are now becoming feasible. It is still unclear, however, how much more we could gain from whole genome sequencing compared to exome sequencing, which has been widely used to study a variety of diseases. In this project, we performed a comparison between whole genome sequencing and exome sequencing for family-based association analysis using data from Genetic Analysis Workshop 18. Whole genome sequencing was able to identify several significant hits within intergenic regions. However, the increased cost of multiple testing counteracted the benefits and resulted in a higher false discovery rate. Our results suggest that exome sequencing is a cost-effective way to identify disease-related variants. With the decreasing sequencing cost and accumulating knowledge of the human genome, whole genome sequencing has the potential to identify important variants in regulatory regions typically inaccessible for exome sequencing.
Affiliation(s)
- Sean Lacey
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue 3rd Floor, Boston, MA 02118, USA
| | - Jae Yoon Chung
- Bioinformatics Program, Boston University, 44 Cummington Mall, Boston, MA 02215, USA
| | - Honghuang Lin
- Department of Medicine, Boston University School of Medicine, 72 East Concord Street, Boston, MA 02118, USA
| |
|
213
|
Teer JK. An improved understanding of cancer genomics through massively parallel sequencing. Transl Cancer Res 2014; 3:243-259. [PMID: 26146607 PMCID: PMC4486294 DOI: 10.3978/j.issn.2218-676x.2014.05.05] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/21/2022]
Abstract
DNA sequencing technology advances have enabled genetic investigation of more samples in a shorter time than has previously been possible. Furthermore, the ability to analyze and understand large sequencing datasets has improved due to concurrent advances in sequence data analysis methods and software tools. Constant improvements to both technology and analytic approaches in this fast moving field are evidenced by many recent publications of computational methods, as well as biological results linking genetic events to human disease. Cancer in particular has been the subject of intense investigation, owing to the genetic underpinnings of this complex collection of diseases. New massively-parallel sequencing (MPS) technologies have enabled the investigation of thousands of samples, divided across tens of different tumor types, resulting in new driver gene identification, mutagenic pattern characterization, and other newly uncovered features of tumor biology. This review will focus both on methods and recent results: current analytical approaches to DNA and RNA sequencing will be presented followed by a review of recent pan-cancer sequencing studies. This overview of methods and results will not only highlight the recent advances in cancer genomics, but also the methods and tools used to accomplish these advancements in a constantly and rapidly improving field.
Affiliation(s)
- Jamie K Teer
- H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Dr., Tampa, FL 33612, Tel: 813-745-2650
| |
|
214
|
Wang C, Evans JM, Bhagwate AV, Prodduturi N, Sarangi V, Middha M, Sicotte H, Vedell PT, Hart SN, Oliver GR, Kocher JPA, Maurer MJ, Novak AJ, Slager SL, Cerhan JR, Asmann YW. PatternCNV: a versatile tool for detecting copy number changes from exome sequencing data. Bioinformatics 2014; 30:2678-80. [PMID: 24876377 PMCID: PMC4155258 DOI: 10.1093/bioinformatics/btu363] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 11/13/2022]
Abstract
Motivation: Exome sequencing (exome-seq) data, which are typically used for calling exonic mutations, have also been utilized in detecting DNA copy number variations (CNVs). Despite the existence of several CNV detection tools, there is still a great need for a sensitive and accurate CNV-calling algorithm that has built-in QC steps and does not require a paired reference for each sample. Results: We developed a novel method named PatternCNV, which (i) accounts for the read coverage variations between exons while leveraging the consistencies of this variability across different samples; (ii) reduces alignment BAM files to WIG format and therefore greatly accelerates computation; (iii) incorporates multiple QC measures designed to identify outlier samples and batch effects; and (iv) provides a variety of visualization options including chromosome, gene and exon-level views of CNVs, along with a tabular summarization of the exon-level CNVs. Compared with other CNV-calling algorithms using data from a lymphoma exome-seq study, PatternCNV has higher sensitivity and specificity. Availability and implementation: The software for PatternCNV is implemented using Perl and R, and can be used in Mac or Linux environments. Software and user manual are available at http://bioinformaticstools.mayo.edu/research/patterncnv/, and R package at https://github.com/topsoil/patternCNV/. Contact: Asmann.Yan@mayo.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chen Wang
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Jared M Evans
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Aditya V Bhagwate
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Naresh Prodduturi
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Vivekananda Sarangi
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Mridu Middha
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Hugues Sicotte
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Peter T Vedell
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Steven N Hart
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Gavin R Oliver
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Jean-Pierre A Kocher
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Matthew J Maurer
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Anne J Novak
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Susan L Slager
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - James R Cerhan
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| | - Yan W Asmann
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Division of Epidemiology, Department of Health Sciences Research, Division of Hematology, Department of Internal Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 and Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville, FL 32224, USA
| |
|
215
|
Yu Z, Liu Y, Shen Y, Wang M, Li A. CLImAT: accurate detection of copy number alteration and loss of heterozygosity in impure and aneuploid tumor samples using whole-genome sequencing data. Bioinformatics 2014; 30:2576-83. [PMID: 24845652 PMCID: PMC4155249 DOI: 10.1093/bioinformatics/btu346] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Indexed: 12/26/2022]
Abstract
Motivation: Whole-genome sequencing of tumor samples has been demonstrated as an efficient approach for comprehensive analysis of genomic aberrations in cancer genomes. Critical issues such as tumor impurity and aneuploidy, GC-content and mappability bias have been reported to complicate identification of copy number alteration and loss of heterozygosity in complex tumor samples. Therefore, efficient computational methods are required to address these issues. Results: We introduce CLImAT (CNA and LOH Assessment in Impure and Aneuploid Tumors), a bioinformatics tool for identification of genomic aberrations from tumor samples using whole-genome sequencing data. Without requiring a matched normal sample, CLImAT performs an integrated analysis of read depth and allelic frequency and provides extensive data processing procedures including GC-content and mappability correction of read depth and quantile normalization of B-allele frequency. CLImAT accurately identifies copy number alteration and loss of heterozygosity even for highly impure tumor samples with aneuploidy. We evaluate CLImAT on both simulated and real DNA sequencing data to demonstrate its ability to infer tumor impurity and ploidy and identify genomic aberrations in complex tumor samples. Availability and implementation: The CLImAT software package can be freely downloaded at http://bioinformatics.ustc.edu.cn/CLImAT/. Contact: aoli@ustc.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
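Illustrative note: the abstract mentions GC-content correction of read depth without giving the procedure. The sketch below shows one common generic approach (rescaling each window by the median depth of windows with similar GC fraction). It is an assumption made for illustration, not CLImAT's actual implementation; the function name `gc_correct` and the `bin_width` parameter are hypothetical.

```python
from collections import defaultdict
from statistics import median


def gc_correct(depths, gc_fractions, bin_width=0.05):
    """Median-based GC correction (generic sketch, not CLImAT's code).

    depths       -- read depth per genomic window
    gc_fractions -- GC fraction (0..1) of the same windows
    bin_width    -- width of the GC bins used to group windows (assumed value)
    """
    overall = median(depths)

    # Group window depths by GC bin.
    by_bin = defaultdict(list)
    for depth, gc in zip(depths, gc_fractions):
        by_bin[int(gc / bin_width)].append(depth)
    bin_median = {b: median(values) for b, values in by_bin.items()}

    # Rescale each window so that every GC bin shares the overall median.
    corrected = []
    for depth, gc in zip(depths, gc_fractions):
        m = bin_median[int(gc / bin_width)]
        corrected.append(depth * overall / m if m > 0 else depth)
    return corrected


if __name__ == "__main__":
    # Toy data: low-GC windows are under-covered, high-GC windows over-covered.
    print(gc_correct([80, 80, 120, 120], [0.32, 0.33, 0.52, 0.53]))
```

After correction, depth differences that remain between windows are less likely to reflect GC bias alone, which is the precondition for interpreting them as copy number signal.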
Affiliation(s)
- Zhenhua Yu
- School of Information Science and Technology and Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH230027, China
| | - Yuanning Liu
- School of Information Science and Technology and Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH230027, China
| | - Yi Shen
- School of Information Science and Technology and Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH230027, China
| | - Minghui Wang
- School of Information Science and Technology and Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH230027, China
| | - Ao Li
- School of Information Science and Technology and Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH230027, China
| |
|
216
|
Tan R, Wang Y, Kleinstein SE, Liu Y, Zhu X, Guo H, Jiang Q, Allen AS, Zhu M. An evaluation of copy number variation detection tools from whole-exome sequencing data. Hum Mutat 2014; 35:899-907. [PMID: 24599517 DOI: 10.1002/humu.22537] [Citation(s) in RCA: 151] [Impact Index Per Article: 13.7] [Received: 11/07/2013] [Accepted: 02/21/2014] [Indexed: 01/11/2023]
Abstract
Copy number variation (CNV) has been found to play an important role in human disease. Next-generation sequencing technology, including whole-genome sequencing (WGS) and whole-exome sequencing (WES), has become a primary strategy for studying the genetic basis of human disease. Several CNV calling tools have recently been developed on the basis of WES data. However, the comparative performance of these tools using real data remains unclear. An objective evaluation study of these tools in practical research situations would be beneficial. Here, we evaluated four well-known WES-based CNV detection tools (XHMM, CoNIFER, ExomeDepth, and CONTRA) using real data generated in house. After evaluation using six metrics, we found that the sensitive and accurate detection of CNVs in WES data remains challenging despite the many algorithms available. Each algorithm has its own strengths and weaknesses. None of the exome-based CNV calling methods performed well in all situations; in particular, compared with CNVs identified from high coverage WGS data from the same samples, all tools suffered from limited power. Our evaluation provides a comprehensive and objective comparison of several well-known detection tools designed for WES data, which will assist researchers in choosing the most suitable tools for their research needs.
Affiliation(s)
- Renjie Tan
- Center for Biomedical Informatics, School of Computer Science and Technology, Harbin Institute Technology, Harbin, Heilongjiang, China; Center for Human Genome Variation, Duke University School of Medicine, Durham, North Carolina
|
217
|
Backenroth D, Homsy J, Murillo LR, Glessner J, Lin E, Brueckner M, Lifton R, Goldmuntz E, Chung WK, Shen Y. CANOES: detecting rare copy number variants from whole exome sequencing data. Nucleic Acids Res 2014; 42:e97. [PMID: 24771342 PMCID: PMC4081054 DOI: 10.1093/nar/gku345] [Citation(s) in RCA: 103] [Impact Index Per Article: 9.4] [Indexed: 11/25/2022] [Open access]
Abstract
We present CANOES, an algorithm for the detection of rare copy number variants from exome sequencing data. CANOES models read counts using a negative binomial distribution and estimates variance of the read counts using a regression-based approach based on selected reference samples in a given dataset. We test CANOES on a family-based exome sequencing dataset, and show that its sensitivity and specificity are comparable to those of XHMM. Moreover, the method is complementary to Gaussian approximation-based methods (e.g. XHMM or CoNIFER). When CANOES is used in combination with these methods, it will be possible to produce high accuracy calls, as demonstrated by a much reduced and more realistic de novo rate in results from trio data.
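Illustrative note: the idea of scoring an exon's read count under a negative binomial model can be sketched as follows. This is a simplified per-exon likelihood comparison under stated assumptions, not the CANOES implementation: the names `nb_logpmf` and `most_likely_state`, the three copy-number states, and the choice to hold the dispersion (size) parameter fixed across states are all hypothetical, and the reference mean and variance are assumed to have been estimated elsewhere.

```python
import math


def nb_logpmf(k, mean, size):
    """Log PMF of a negative binomial with the given mean and size (dispersion).

    Variance under this parameterisation is mean + mean**2 / size.
    """
    p = size / (size + mean)  # success probability
    return (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
            + size * math.log(p) + k * math.log1p(-p))


def most_likely_state(count, ref_mean, ref_var):
    """Pick the copy-number state that best explains one exon's read count.

    ref_mean, ref_var -- mean and variance of the count in reference samples
                         (requires ref_var > ref_mean, i.e. overdispersion).
    """
    if ref_var <= ref_mean:
        raise ValueError("negative binomial requires variance > mean")
    size = ref_mean ** 2 / (ref_var - ref_mean)

    # Expected depth scales with copy number relative to the diploid state.
    scale = {"deletion": 0.5, "diploid": 1.0, "duplication": 1.5}
    loglik = {state: nb_logpmf(count, ref_mean * s, size)
              for state, s in scale.items()}
    return max(loglik, key=loglik.get), loglik


if __name__ == "__main__":
    for count in (50, 100, 150):
        state, _ = most_likely_state(count, ref_mean=100.0, ref_var=400.0)
        print(count, state)
```

A real caller would combine such per-exon likelihoods across neighbouring exons (for example with a hidden Markov model) rather than classifying each exon independently.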
Affiliation(s)
- Daniel Backenroth
- Departments of Systems Biology and Biomedical Informatics, Columbia University Medical Center, New York, NY 10032, USA; JP Sulzberger Columbia Genome Center, Columbia University Medical Center, New York, NY 10032, USA
| | - Jason Homsy
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Laura R Murillo
- Departments of Pediatrics and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Joe Glessner
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Edwin Lin
- Departments of Systems Biology and Biomedical Informatics, Columbia University Medical Center, New York, NY 10032, USA; Departments of Pediatrics and Medicine, Columbia University Medical Center, New York, NY 10032, USA
| | - Martina Brueckner
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06510, USA
| | - Richard Lifton
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06510, USA; Howard Hughes Medical Institute, Yale University, New Haven, CT 06510, USA
| | - Elizabeth Goldmuntz
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Wendy K Chung
- Departments of Pediatrics and Medicine, Columbia University Medical Center, New York, NY 10032, USA
| | - Yufeng Shen
- Departments of Systems Biology and Biomedical Informatics, Columbia University Medical Center, New York, NY 10032, USA; JP Sulzberger Columbia Genome Center, Columbia University Medical Center, New York, NY 10032, USA
| |
|
218
|
Fromer M, Purcell SM. Using XHMM Software to Detect Copy Number Variation in Whole-Exome Sequencing Data. Curr Protoc Hum Genet 2014; 81:7.23.1-7.23.21. [PMID: 24763994 PMCID: PMC4065038 DOI: 10.1002/0471142905.hg0723s81] [Citation(s) in RCA: 104] [Impact Index Per Article: 9.5] [Indexed: 01/06/2023]
Abstract
Copy number variation (CNV) has emerged as an important genetic component in human diseases, which are increasingly being studied for large numbers of samples by sequencing the coding regions of the genome, i.e., exome sequencing. Nonetheless, detecting this variation from such targeted sequencing data is a difficult task, involving sorting out signal from noise, for which we have recently developed a set of statistical and computational tools called XHMM. In this unit, we give detailed instructions on how to run XHMM and how to use the resulting CNV calls in biological analyses.
Affiliation(s)
- Menachem Fromer
- Division of Psychiatric Genomics and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Stanley Center for Psychiatric Research and Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA; Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Shaun M. Purcell
- Division of Psychiatric Genomics and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Stanley Center for Psychiatric Research and Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA; Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| |
|
219
|
Li J, Doyle MA, Saeed I, Wong SQ, Mar V, Goode DL, Caramia F, Doig K, Ryland GL, Thompson ER, Hunter SM, Halgamuge SK, Ellul J, Dobrovic A, Campbell IG, Papenfuss AT, McArthur GA, Tothill RW. Bioinformatics pipelines for targeted resequencing and whole-exome sequencing of human and mouse genomes: a virtual appliance approach for instant deployment. PLoS One 2014; 9:e95217. [PMID: 24752294 PMCID: PMC3994043 DOI: 10.1371/journal.pone.0095217] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 01/13/2014] [Accepted: 03/25/2014] [Indexed: 12/30/2022] [Open access]
Abstract
Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/.
Affiliation(s)
- Jason Li
- Bioinformatics, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
- Department of Mechanical Engineering, The University of Melbourne, Parkville, VIC, Australia
| | - Maria A. Doyle
- Bioinformatics, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Isaam Saeed
- Department of Mechanical Engineering, The University of Melbourne, Parkville, VIC, Australia
- YourGene Biosciences Australia, Southbank, VIC, Australia
| | - Stephen Q. Wong
- Molecular Pathology Research and Development Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Victoria Mar
- Victorian Melanoma Service, Alfred Hospital, Prahran, VIC, Australia
- Department of Epidemiology and Preventive Medicine, Monash University, Clayton, VIC, Australia
- Molecular Oncology Laboratory, Oncogenic Signaling and Growth Control Program, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - David L. Goode
- Sarcoma Genetics and Genomics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Franco Caramia
- Bioinformatics, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Ken Doig
- Bioinformatics, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Georgina L. Ryland
- Cancer Genetics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Ella R. Thompson
- Cancer Genetics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Sally M. Hunter
- Cancer Genetics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Saman K. Halgamuge
- Department of Mechanical Engineering, The University of Melbourne, Parkville, VIC, Australia
| | - Jason Ellul
- Bioinformatics, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Alexander Dobrovic
- Molecular Pathology Research and Development Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
- Translational Genomics & Epigenomics Laboratory, Ludwig Institute for Cancer Research, Heidelberg, VIC, Australia
| | - Ian G. Campbell
- Cancer Genetics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
| | - Anthony T. Papenfuss
- Bioinformatics division, The Walter and Eliza Hall Institute for Medical Research, Parkville, VIC, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
| | - Grant A. McArthur
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC, Australia
- Molecular Oncology Laboratory, Oncogenic Signaling and Growth Control Program, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
- Translational Research Laboratory, Cancer Therapeutics Program, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
- Department of Medicine, St. Vincent’s Hospital, Fitzroy, VIC, Australia
- Department of Pathology, University of Melbourne, Parkville, VIC, Australia
| | - Richard W. Tothill
- Translational Research Laboratory, Cancer Therapeutics Program, Peter MacCallum Cancer Centre, East Melbourne, VIC, Australia
- Department of Pathology, University of Melbourne, Parkville, VIC, Australia
|
220
|
Watson CT, Marques-Bonet T, Sharp AJ, Mefford HC. The genetics of microdeletion and microduplication syndromes: an update. Annu Rev Genomics Hum Genet 2014; 15:215-244. [PMID: 24773319 DOI: 10.1146/annurev-genom-091212-153408] [Citation(s) in RCA: 115] [Impact Index Per Article: 10.5] [Indexed: 12/19/2022]
Abstract
Chromosomal abnormalities, including microdeletions and microduplications, have long been associated with abnormal developmental outcomes. Early discoveries relied on a common clinical presentation and the ability to detect chromosomal abnormalities by standard karyotype analysis or specific assays such as fluorescence in situ hybridization. Over the past decade, the development of novel genomic technologies has allowed more comprehensive, unbiased discovery of microdeletions and microduplications throughout the human genome. The ability to quickly interrogate large cohorts using chromosome microarrays and, more recently, next-generation sequencing has led to the rapid discovery of novel microdeletions and microduplications associated with disease, including very rare but clinically significant rearrangements. In addition, the observation that some microdeletions are associated with risk for several neurodevelopmental disorders contributes to our understanding of shared genetic susceptibility for such disorders. Here, we review current knowledge of microdeletion/duplication syndromes, with a particular focus on recurrent rearrangement syndromes.
Affiliation(s)
- Corey T Watson
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029
| | - Tomas Marques-Bonet
- Institut de Biologia Evolutiva, Universitat Pompeu Fabra/CSIC, 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Spain; Centro Nacional de Análisis Genómico, 08023 Barcelona, Spain
| | - Andrew J Sharp
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029
| | - Heather C Mefford
- Department of Pediatrics, University of Washington, Seattle, Washington 98195
|
221
|
Natrajan R, Wilkerson PM, Marchiò C, Piscuoglio S, Ng CKY, Wai P, Lambros MB, Samartzis EP, Dedes KJ, Frankum J, Bajrami I, Kopec A, Mackay A, A'hern R, Fenwick K, Kozarewa I, Hakas J, Mitsopoulos C, Hardisson D, Lord CJ, Kumar-Sinha C, Ashworth A, Weigelt B, Sapino A, Chinnaiyan AM, Maher CA, Reis-Filho JS. Characterization of the genomic features and expressed fusion genes in micropapillary carcinomas of the breast. J Pathol 2014; 232:553-65. [PMID: 24395524 PMCID: PMC4013428 DOI: 10.1002/path.4325] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.0] [Received: 08/01/2013] [Revised: 12/04/2013] [Accepted: 12/29/2013] [Indexed: 12/30/2022]
Abstract
Micropapillary carcinoma (MPC) is a rare histological special type of breast cancer, characterized by an aggressive clinical behaviour and a pattern of copy number aberrations (CNAs) distinct from that of grade- and oestrogen receptor (ER)-matched invasive carcinomas of no special type (IC-NSTs). The aims of this study were to determine whether MPCs are underpinned by a recurrent fusion gene(s) or mutations in 273 genes recurrently mutated in breast cancer. Sixteen MPCs were subjected to microarray-based comparative genomic hybridization (aCGH) analysis and Sequenom OncoCarta mutation analysis. Eight and five MPCs were subjected to targeted capture and RNA sequencing, respectively. aCGH analysis confirmed our previous observations about the repertoire of CNAs of MPCs. Sequencing analysis revealed a spectrum of mutations similar to those of luminal B IC-NSTs, and recurrent mutations affecting mitogen-activated protein kinase family genes and NBPF10. RNA-sequencing analysis identified 17 high-confidence fusion genes, eight of which were validated and two of which were in-frame. No recurrent fusions were identified in an independent series of MPCs and IC-NSTs. Forced expression of in-frame fusion genes (SLC2A1-FAF1 and BCAS4-AURKA) resulted in increased viability of breast cancer cells. In addition, genomic disruption of CDK12 caused by out-of-frame rearrangements was found in one MPC and in 13% of HER2-positive breast cancers, identified through a re-analysis of publicly available massively parallel sequencing data. In vitro analyses revealed that CDK12 gene disruption results in sensitivity to PARP inhibition, and forced expression of wild-type CDK12 in a CDK12-null cell line model resulted in relative resistance to PARP inhibition. Our findings demonstrate that MPCs are neither defined by highly recurrent mutations in the 273 genes tested, nor underpinned by a recurrent fusion gene. 
Although seemingly private genetic events, some of the fusion transcripts found in MPCs may play a role in maintenance of a malignant phenotype and potentially offer therapeutic opportunities.
Affiliation(s)
- Rachael Natrajan
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Paul M Wilkerson
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | | | - Salvatore Piscuoglio
- Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
| | - Charlotte KY Ng
- Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
| | - Patty Wai
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Maryou B Lambros
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | | | | | - Jessica Frankum
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Ilirjana Bajrami
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Alicja Kopec
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Alan Mackay
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Roger A'hern
- Cancer Research UK Clinical Trials Unit, The Institute of Cancer Research, Sutton, UK
| | - Kerry Fenwick
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Iwanka Kozarewa
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Jarle Hakas
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Costas Mitsopoulos
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - David Hardisson
- Department of Pathology, Hospital Universitario La Paz, Universidad Autonoma de Madrid, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
| | - Christopher J Lord
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Chandan Kumar-Sinha
- Michigan Center for Translational Pathology (MCTP), Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Alan Ashworth
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Britta Weigelt
- Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
| | - Anna Sapino
- Department of Medical Sciences, University of Turin, Turin, Italy
| | - Arul M Chinnaiyan
- Michigan Center for Translational Pathology (MCTP), Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Christopher A Maher
- Washington University Genome Institute, Washington University, St Louis, MO, USA
| | - Jorge S Reis-Filho
- Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
|
222
|
Alkodsi A, Louhimo R, Hautaniemi S. Comparative analysis of methods for identifying somatic copy number alterations from deep sequencing data. Brief Bioinform 2014; 16:242-54. [DOI: 10.1093/bib/bbu004] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 01/04/2023]
|
223
|
Johnson AK, Gaudio DD. Clinical utility of next-generation sequencing for the molecular diagnosis of monogenic diabetes. Per Med 2014; 11:155-165. [PMID: 29751380 DOI: 10.2217/pme.13.111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Monogenic diabetes resulting from mutations that primarily reduce insulin-secreting pancreatic β-cell function accounts for 1-2% of all cases of diabetes, and is genetically and clinically heterogeneous. Currently, genetic testing for monogenic diabetes relies on selection of the appropriate gene for analysis based on the availability of comprehensive phenotypic information, which can be time consuming, costly and can limit the differential diagnosis to a few selected genes. In recent years, the exponential growth in the field of high-throughput capture and sequencing technology has made it possible and cost effective to sequence many genes simultaneously, making it an efficient diagnostic tool for clinically and genetically heterogeneous disorders such as monogenic diabetes. Making a diagnosis of monogenic diabetes is important as it enables more appropriate treatment, better prediction of disease prognosis and progression, and counseling and screening of family members. We provide a concise overview of the genetic etiology of some forms of monogenic diabetes, as well as a discussion of the clinical utility of genetic testing by comprehensive multigene panel using next-generation sequencing methodologies.
Affiliation(s)
- Amy Knight Johnson
- Department of Human Genetics, University of Chicago, 5841 S Maryland MC0077, Chicago, IL 60637, USA
| | - Daniela Del Gaudio
- Department of Human Genetics, University of Chicago, 5841 S Maryland MC0077, Chicago, IL 60637, USA
|
224
|
Jin M, Zhu S, Hu P, Liu D, Li Q, Li Z, Zhang X, Xie Y, Chen X. Genomic and epigenomic analyses of monozygotic twins discordant for congenital renal agenesis. Am J Kidney Dis 2014; 64:119-22. [PMID: 24583054 DOI: 10.1053/j.ajkd.2014.01.423] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 10/21/2013] [Accepted: 01/06/2014] [Indexed: 11/11/2022]
Abstract
Monozygotic twins have been widely studied to distinguish genetic and environmental factors in the pathogenesis of human diseases. For renal agenesis, the one-sided absence of renal tissue, the relative contributions of genetic and environmental factors to its pathogenesis are still unclear. In this study of a pair of monozygotic twins discordant for congenital renal agenesis, the genomic profile was analyzed from a set of blood samples using high-throughput exome-capture sequencing to detect single-nucleotide polymorphisms (SNPs), copy number variations (CNVs), and insertions and deletions (indels). Also, an epigenomic analysis used reduced-representation bisulfite sequencing to detect differentially methylated regions (DMRs). No discordant SNPs, CNVs, or indels were confirmed, but 514 DMRs were detected. KEGG analysis indicated the DMRs localized to 10 signaling pathways and 25 genes, including the mitogen-activated protein kinase pathway and 6 genes (FGF18, FGF12, PDGFRA, MAPK11, AMH, CTBP1) involved in organ development. Although methylation results from our adult patient and her sister may not represent the pattern that was present during kidney development, we could at least confirm a lack of obvious differences at the genome level, which suggests that nongenetic factors may be involved in the pathogenesis of renal agenesis.
Affiliation(s)
- Meiling Jin
- State Key Laboratory of Kidney Disease, Institute of Nephrology, Chinese PLA General Hospital, Beijing, China; Medical College, Nankai University, Tianjin, China
| | | | - Panpan Hu
- State Key Laboratory of Kidney Disease, Institute of Nephrology, Chinese PLA General Hospital, Beijing, China
| | | | - Qinggang Li
- State Key Laboratory of Kidney Disease, Institute of Nephrology, Chinese PLA General Hospital, Beijing, China
| | - Zuoxiang Li
- State Key Laboratory of Kidney Disease, Institute of Nephrology, Chinese PLA General Hospital, Beijing, China
| | - Xueguang Zhang
- State Key Laboratory of Kidney Disease, Institute of Nephrology, Chinese PLA General Hospital, Beijing, China
| | - Yuansheng Xie
- State Key Laboratory of Kidney Disease, Institute of Nephrology, Chinese PLA General Hospital, Beijing, China.
| | - Xiangmei Chen
- State Key Laboratory of Kidney Disease, Institute of Nephrology, Chinese PLA General Hospital, Beijing, China.
|
225
|
Ku CS, Wu M, Cooper DN, Naidoo N, Pawitan Y, Pang B, Iacopetta B, Soong R. Exome versus transcriptome sequencing in identifying coding region variants. Expert Rev Mol Diagn 2014; 12:241-51. [DOI: 10.1586/erm.12.10] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 12/23/2022]
|
226
|
Hirsch CD, Evans J, Buell CR, Hirsch CN. Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes. Brief Funct Genomics 2014; 13:257-67. [PMID: 24395692 DOI: 10.1093/bfgp/elt051] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022]
Abstract
Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security.
|
227
|
Bansal V, Dorn C, Grunert M, Klaassen S, Hetzer R, Berger F, Sperling SR. Outlier-based identification of copy number variations using targeted resequencing in a small cohort of patients with Tetralogy of Fallot. PLoS One 2014; 9:e85375. [PMID: 24400131 PMCID: PMC3882271 DOI: 10.1371/journal.pone.0085375] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 09/12/2013] [Accepted: 11/26/2013] [Indexed: 12/22/2022]
Abstract
Copy number variations (CNVs) are one of the main sources of variability in the human genome. Many CNVs are associated with various diseases including cardiovascular disease. In addition to hybridization-based methods, next-generation sequencing (NGS) technologies are increasingly used for CNV discovery. However, respective computational methods applicable to NGS data are still limited. We developed a novel CNV calling method based on outlier detection applicable to small cohorts, which is of particular interest for the discovery of individual CNVs within families, de novo CNVs in trios and/or small cohorts of specific phenotypes like rare diseases. Approximately 7,000 rare diseases are currently known, which collectively affect ∼6% of the population. For our method, we applied the Dixon's Q test to detect outliers and used a Hidden Markov Model for their assessment. The method can be used for data obtained by exome and targeted resequencing. We evaluated our outlier-based method in comparison to the CNV calling tool CoNIFER using eight HapMap exome samples and subsequently applied both methods to targeted resequencing data of patients with Tetralogy of Fallot (TOF), the most common cyanotic congenital heart disease. In both the HapMap samples and the TOF cases, our method is superior to CoNIFER, such that it identifies more true positive CNVs. Called CNVs in TOF cases were validated by qPCR and HapMap CNVs were confirmed with available array-CGH data. In the TOF patients, we found four copy number gains affecting three genes, of which two are important regulators of heart development (NOTCH1, ISL1) and one is located in a region associated with cardiac malformations (PRODH at 22q11). In summary, we present a novel CNV calling method based on outlier detection, which will be of particular interest for the analysis of de novo or individual CNVs in trios or cohorts up to 30 individuals, respectively.
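The outlier step named in this abstract can be illustrated with a minimal Python sketch of Dixon's Q test. This is not the authors' implementation: the function names, the sample depths, and the choice to pass the critical value as a parameter are all assumptions made for illustration, and the Hidden Markov Model assessment stage is not shown.

```python
def dixon_q(values):
    """Return (q_low, q_high): Dixon's Q statistic for the smallest
    and the largest observation (gap to nearest neighbour / total range)."""
    xs = sorted(values)
    if len(xs) < 3:
        raise ValueError("Dixon's Q test needs at least 3 observations")
    spread = xs[-1] - xs[0]
    if spread == 0:
        return 0.0, 0.0  # all values identical, nothing can be an outlier
    q_low = (xs[1] - xs[0]) / spread
    q_high = (xs[-1] - xs[-2]) / spread
    return q_low, q_high


def flag_outlier(values, q_crit):
    """Return 'low', 'high', or None depending on which extreme (if any)
    exceeds the critical value q_crit supplied by the caller."""
    q_low, q_high = dixon_q(values)
    if max(q_low, q_high) <= q_crit:
        return None
    return "low" if q_low > q_high else "high"


# Hypothetical normalized read depths for one target region in 7 samples.
depths = [0.98, 1.02, 1.01, 0.97, 1.03, 1.00, 1.52]
# 0.568 is the commonly tabulated 95% critical value for n = 7;
# it is passed in explicitly so the caller controls the confidence level.
call = flag_outlier(depths, q_crit=0.568)
```

Here the sample with depth 1.52 stands apart from the rest of the cohort, which is the kind of signal that could indicate a copy number gain in that individual. Because the test compares each sample only with the others in the same region, it suits the small cohorts the abstract describes.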
Affiliation(s)
- Vikas Bansal
- Department of Cardiovascular Genetics, Experimental and Clinical Research Center, Charité - Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin, Germany
- Department of Mathematics and Computer Science, Free University of Berlin, Berlin, Germany
| | - Cornelia Dorn
- Department of Cardiovascular Genetics, Experimental and Clinical Research Center, Charité - Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin, Germany
- Department of Biology, Chemistry, and Pharmacy, Free University of Berlin, Berlin, Germany
| | - Marcel Grunert
- Department of Cardiovascular Genetics, Experimental and Clinical Research Center, Charité - Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin, Germany
| | - Sabine Klaassen
- For the National Register for Congenital Heart Defects, Berlin, Germany
- Experimental and Clinical Research Center, Charité - Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin, Germany
- Department of Pediatric Cardiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Roland Hetzer
- Department of Cardiac Surgery, German Heart Institute Berlin, Berlin, Germany
| | - Felix Berger
- Department of Pediatric Cardiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Department of Pediatric Cardiology, German Heart Institute Berlin, Berlin, Germany
| | - Silke R. Sperling
- Department of Cardiovascular Genetics, Experimental and Clinical Research Center, Charité - Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin, Germany
- Department of Biology, Chemistry, and Pharmacy, Free University of Berlin, Berlin, Germany
|
228
|
Ozer HG, Usubalieva A, Dorrance A, Yilmaz AS, Caligiuri M, Marcucci G, Huang K. Identification of medium-sized copy number alterations in whole-genome sequencing. Cancer Inform 2014; 13:105-11. [PMID: 25788829 PMCID: PMC4356486 DOI: 10.4137/cin.s14023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2014] [Revised: 12/29/2014] [Accepted: 01/04/2015] [Indexed: 11/05/2022]
Abstract
Genome-wide discoveries, such as the detection of copy number alterations (CNAs) from high-throughput whole-genome sequencing data, have enabled new developments in personalized medicine. CNAs have been reported to be associated with various diseases and cancers, including acute myeloid leukemia. However, there are multiple challenges to the use of current CNA detection tools that lead to high false-positive rates and thus impede widespread use of such tools in cancer research. In this paper, we discuss these issues and propose possible solutions. First, since the entire genome cannot be mapped because some regions lack sequence uniqueness, current methods cannot be appropriately adjusted to handle these regions in the analyses. Thus, detection of medium-sized CNAs is also directly affected by these mappability problems. The requirement for matching control samples is also an important limitation because acquiring matching controls might not be possible or might not be cost efficient. Here we present an approach that addresses these issues and detects medium-sized CNAs in cancer genomes by (1) masking unmappable regions during the initial CNA detection phase, (2) using a pool of a few normal samples as control, and (3) employing median filtering to adjust CNA ratios to their surrounding coverage and eliminate false positives.
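Step (3) of this abstract, median filtering, can be sketched in Python as a sliding-window median over per-bin coverage ratios. This is an illustration only, not the authors' code: the function name, the window size, the edge handling (windows are truncated at the ends), and the example ratios are assumptions.

```python
from statistics import median


def median_filter(ratios, window=3):
    """Replace each per-bin ratio with the median of the bins in a
    centred window; windows are truncated at the ends of the sequence."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(ratios)):
        lo = max(0, i - half)
        hi = min(len(ratios), i + half + 1)
        smoothed.append(median(ratios[lo:hi]))
    return smoothed


# Hypothetical tumor/control coverage ratios per genomic bin:
# one isolated spike (bin 2) and a sustained loss (bins 5 to 9).
ratios = [1.0, 1.0, 3.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5]
smoothed = median_filter(ratios, window=3)
```

With these inputs the single-bin spike is pulled back to the level of its neighbours, while the run of low ratios survives, which is the behaviour that makes a median filter useful for suppressing isolated false positives without erasing longer alterations. A wider window would also suppress short genuine events, so the window size has to be chosen against the minimum CNA size of interest.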
Affiliation(s)
- Hatice Gulcin Ozer
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Aisulu Usubalieva
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Adrienne Dorrance
- Division of Hematology, Department of Medicine, The Ohio State University, Columbus, OH, USA
| | - Ayse Selen Yilmaz
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Michael Caligiuri
- Division of Hematology, Department of Medicine, The Ohio State University, Columbus, OH, USA
| | - Guido Marcucci
- Division of Hematology, Department of Medicine, The Ohio State University, Columbus, OH, USA
| | - Kun Huang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
|
229
|
Abstract
Cancer is a complex disease driven by multiple mutations acquired over the lifetime of the cancer cells. These alterations, termed somatic mutations to distinguish them from inherited germline mutations, can include single-nucleotide substitutions, insertions, deletions, copy number alterations, and structural rearrangements. A patient's cancer can contain a combination of these aberrations, and the ability to generate a comprehensive genetic profile should greatly improve patient diagnosis and treatment. Next-generation sequencing has become the tool of choice to uncover multiple cancer mutations from a single tumor source, and the falling costs of this rapid high-throughput technology are encouraging its transition from basic research into a clinical setting. However, the detection of mutations in sequencing data is still an evolving area and cancer genomic data requires some special considerations. This chapter discusses these aspects and gives an overview of current bioinformatics methods for the detection of somatic mutations in cancer sequencing data.
|
230
|
Shi H, Hugo W, Kong X, Hong A, Koya RC, Moriceau G, Chodon T, Guo R, Johnson DB, Dahlman KB, Kelley MC, Kefford RF, Chmielowski B, Glaspy JA, Sosman JA, van Baren N, Long GV, Ribas A, Lo RS. Acquired resistance and clonal evolution in melanoma during BRAF inhibitor therapy. Cancer Discov 2013; 4:80-93. [PMID: 24265155 DOI: 10.1158/2159-8290.cd-13-0642] [Citation(s) in RCA: 772] [Impact Index Per Article: 64.3] [Indexed: 01/08/2023]
Abstract
BRAF inhibitors elicit rapid antitumor responses in the majority of patients with BRAF(V600)-mutant melanoma, but acquired drug resistance is almost universal. We sought to identify the core resistance pathways and the extent of tumor heterogeneity during disease progression. We show that mitogen-activated protein kinase reactivation mechanisms were detected among 70% of disease-progressive tissues, with RAS mutations, mutant BRAF amplification, and alternative splicing being most common. We also detected PI3K-PTEN-AKT-upregulating genetic alterations among 22% of progressive melanomas. Distinct molecular lesions in both core drug escape pathways were commonly detected concurrently in the same tumor or among multiple tumors from the same patient. Beyond harboring extensively heterogeneous resistance mechanisms, melanoma regrowth emerging from BRAF inhibitor selection displayed branched evolution marked by altered mutational spectra/signatures and increased fitness. Thus, melanoma genomic heterogeneity contributes significantly to BRAF inhibitor treatment failure, implying upfront, cotargeting of two core pathways as an essential strategy for durable responses.
Affiliation(s)
- Hubing Shi
- Division of Dermatology, Department of Medicine; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Willy Hugo
- Division of Dermatology, Department of Medicine; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Xiangju Kong
- Division of Dermatology, Department of Medicine; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Aayoung Hong
- Division of Dermatology, Department of Medicine; Department of Molecular and Medical Pharmacology; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Richard C Koya
- Division of Surgical Oncology, Department of Surgery; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Gatien Moriceau
- Division of Dermatology, Department of Medicine; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Thinle Chodon
- Division of Hematology & Oncology, Department of Medicine; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Rongqing Guo
- Division of Hematology & Oncology, Department of Medicine; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Douglas B Johnson
- Department of Medicine; Vanderbilt-Ingram Cancer Center, Nashville, TN 37232
| | - Kimberly B Dahlman
- Department of Cancer Biology; Vanderbilt-Ingram Cancer Center, Nashville, TN 37232
| | - Mark C Kelley
- Department of Surgery; Vanderbilt-Ingram Cancer Center, Nashville, TN 37232
| | - Richard F Kefford
- Melanoma Institute of Australia, Westmead Millennium Institute, Westmead Hospital, University of Sydney, New South Wales, Australia
| | - Bartosz Chmielowski
- Division of Hematology & Oncology, Department of Medicine; Jonsson Comprehensive Cancer Center; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - John A Glaspy
- Division of Hematology & Oncology, Department of Medicine; Jonsson Comprehensive Cancer Center; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Jeffrey A Sosman
- Department of Medicine; Vanderbilt-Ingram Cancer Center, Nashville, TN 37232
| | | | - Georgina V Long
- Melanoma Institute of Australia, Westmead Millennium Institute, Westmead Hospital, University of Sydney, New South Wales, Australia
| | - Antoni Ribas
- Division of Dermatology, Department of Medicine; Division of Hematology & Oncology, Department of Medicine; Jonsson Comprehensive Cancer Center; Department of Molecular and Medical Pharmacology; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
| | - Roger S Lo
- Division of Dermatology, Department of Medicine; Jonsson Comprehensive Cancer Center; Department of Molecular and Medical Pharmacology; David Geffen School of Medicine, University of California, LA, California 90095-1662 USA
|
231
|
Abel HJ, Duncavage EJ. Detection of structural DNA variation from next generation sequencing data: a review of informatic approaches. Cancer Genet 2013; 206:432-40. [PMID: 24405614 DOI: 10.1016/j.cancergen.2013.11.002] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.3] [Received: 06/29/2013] [Revised: 11/06/2013] [Accepted: 11/15/2013] [Indexed: 10/26/2022]
Abstract
Next generation sequencing (NGS), or massively parallel sequencing, refers to a collective group of methods in which numerous sequencing reactions take place simultaneously, resulting in enormous amounts of sequencing data for a small fraction of the cost of Sanger sequencing. Typically short (50-250 bp), NGS reads are first mapped to a reference genome, and then variants are called from the mapped data. While most NGS applications focus on the detection of single nucleotide variants (SNVs) or small insertions/deletions (indels), structural variation, including translocations, larger indels, and copy number variation (CNV), can be identified from the same data. Structural variation detection can be performed from whole genome NGS data or "targeted" data including exomes or gene panels. However, while targeted sequencing greatly increases sequencing coverage or depth of particular genes, it may introduce biases in the data that require specialized informatic analyses. In the past several years, there have been considerable advances in methods used to detect structural variation, and a full range of variants from SNVs to balanced translocations to CNV can now be detected with reasonable sensitivity from either whole genome or targeted NGS data. Such methods are being rapidly applied to clinical testing where they can supplement or in some cases replace conventional fluorescence in situ hybridization or array-based testing. Here we review some of the informatics approaches used to detect structural variation from NGS data.
Affiliation(s)
- Haley J Abel
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| | - Eric J Duncavage
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, USA.
|
232
|
SomatiCA: identifying, characterizing and quantifying somatic copy number aberrations from cancer genome sequencing data. PLoS One 2013; 8:e78143. [PMID: 24265680 PMCID: PMC3827077 DOI: 10.1371/journal.pone.0078143] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 07/31/2013] [Accepted: 09/07/2013] [Indexed: 11/19/2022]
Abstract
Whole genome sequencing of matched tumor-normal sample pairs is becoming routine in cancer research. However, analysis of somatic copy-number changes from sequencing data is still challenging because of insufficient sequencing coverage, unknown tumor sample purity and subclonal heterogeneity. Here we describe a computational framework, named SomatiCA, which explicitly accounts for tumor purity and subclonality in the analysis of somatic copy-number profiles. Taking read depths (RD) and lesser allele frequencies (LAF) as input, SomatiCA will output 1) admixture rate for each tumor sample, 2) somatic allelic copy-number for each genomic segment, 3) fraction of tumor cells with subclonal change in each somatic copy number aberration (SCNA), and 4) a list of substantial genomic aberration events including gain, loss and LOH. SomatiCA is available as a Bioconductor R package at http://www.bioconductor.org/packages/2.13/bioc/html/SomatiCA.html.
|
233
|
Comparative study of exome copy number variation estimation tools using array comparative genomic hybridization as control. Biomed Res Int 2013; 2013:915636. [PMID: 24303503 PMCID: PMC3835197 DOI: 10.1155/2013/915636] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 09/03/2013] [Accepted: 09/24/2013] [Indexed: 11/24/2022]
Abstract
Exome sequencing using next-generation sequencing technologies is a cost-efficient approach to selectively sequencing coding regions of the human genome for detection of disease variants. One of the lesser known yet important applications of exome sequencing data is to identify copy number variation (CNV). There have been many exome CNV tools developed over the last few years, but the performance and accuracy of these programs have not been thoroughly evaluated. In this study, we systematically compared four popular exome CNV tools (CoNIFER, cn.MOPS, exomeCopy, and ExomeDepth) and evaluated their effectiveness against array comparative genome hybridization (array CGH) platforms. We found that exome CNV tools are capable of identifying CNVs, but they can have problems such as high false positives, low sensitivity, and duplication bias when compared to array CGH platforms. While exome CNV tools do serve their purpose for data mining, careful evaluation and additional validation are highly recommended. Based on all these results, we recommend CoNIFER and cn.MOPS for nonpaired exome CNV detection over the other two tools due to a low false-positive rate, although none of the four exome CNV tools performed at an outstanding level when compared to array CGH.
|
234
|
Andor N, Harness JV, Müller S, Mewes HW, Petritsch C. EXPANDS: expanding ploidy and allele frequency on nested subpopulations. Bioinformatics 2013; 30:50-60. [PMID: 24177718 PMCID: PMC3866558 DOI: 10.1093/bioinformatics/btt622] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.0] [Indexed: 01/16/2023]
Abstract
Motivation: Several cancer types consist of multiple genetically and phenotypically distinct subpopulations. The underlying mechanism for this intra-tumoral heterogeneity can be explained by the clonal evolution model, whereby growth advantageous mutations cause the expansion of cancer cell subclones. The recurrent phenotype of many cancers may be a consequence of these coexisting subpopulations responding unequally to therapies. Methods to computationally infer tumor evolution and subpopulation diversity are emerging and they hold the promise to improve the understanding of genetic and molecular determinants of recurrence. Results: To address cellular subpopulation dynamics within human tumors, we developed a bioinformatic method, EXPANDS. It estimates the proportion of cells harboring specific mutations in a tumor. By modeling cellular frequencies as probability distributions, EXPANDS predicts mutations that accumulate in a cell before its clonal expansion. We assessed the performance of EXPANDS on one whole genome sequenced breast cancer and performed SP analyses on 118 glioblastoma multiforme samples obtained from TCGA. Our results inform about the extent of subclonal diversity in primary glioblastoma, subpopulation dynamics during recurrence and provide a set of candidate genes mutated in the most well-adapted subpopulations. In summary, EXPANDS predicts tumor purity and subclonal composition from sequencing data. Availability and implementation: EXPANDS is available for download at http://code.google.com/p/expands (matlab version - used in this manuscript) and http://cran.r-project.org/web/packages/expands (R version). Contact: claudia.petritsch@ucsf.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Noemi Andor
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, USA, Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764 Neuherberg, Germany, Brain Tumor Research Center, University of California San Francisco, San Francisco, CA 94158, USA, Department of Neurology, University of California San Francisco, San Francisco, CA 94143, USA, Department of Pediatrics, University of California San Francisco, San Francisco, CA 94143, USA, Chair of Genome Oriented Bioinformatics, Center of Life and Food Science, Freising-Weihenstephan, Technische Universität München, 80333, Munich, Germany, Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158 and Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California San Francisco, San Francisco, CA 94143, USA
|
235
|
Fang LT, Lee S, Choi H, Kim HK, Jew G, Kang HC, Chen L, Jablons D, Kim IJ. Comprehensive genomic analyses of a metastatic colon cancer to the lung by whole exome sequencing and gene expression analysis. Int J Oncol 2013; 44:211-21. [PMID: 24172857 DOI: 10.3892/ijo.2013.2150] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 08/19/2013] [Accepted: 09/16/2013] [Indexed: 11/05/2022]
Abstract
We performed whole exome sequencing and gene expression analysis on a metastatic colon cancer to the lung, along with the adjacent normal tissue of the lung. Whole exome sequencing uncovered 71 high-confidence non‑synonymous mutations. We selected 16 mutation candidates, and 13 out of 16 mutations were validated by targeted deep sequencing using the Ion Torrent PGM customized AmpliSeq panel. By integrating mutation, copy number and gene expression microarray data, we identified a JAZF1 mutation with a gain-of-copy, suggesting its oncogenic potential for the lung metastasis from colon cancer. Our pathway analyses showed that the identified mutations closely reflected characteristics of the metastatic site (lung) while mRNA gene expression patterns kept genetic information of its primary tumor (colon). The most significant gene expression network was the 'Colorectal Cancer Metastasis Signaling', containing 6 (ADCY2, ADCY9, APC, GNB5, K-ras and LRP6) out of the 71 mutated genes. Some of these mutated genes (ADCY9, ADCY2, GNB5, K-ras, HDAC6 and ARHGEF17) also belong to the 'Phospholipase C Signaling' network, which suggests that this pathway and its mutated genes may contribute to a lung metastasis from colon cancer.
Affiliation(s)
- Li Tai Fang
- Thoracic Oncology Laboratory, Department of Surgery, University of California, San Francisco, San Francisco, CA, USA
|
236
|
Shi CH, Schisler JC, Rubel CE, Tan S, Song B, McDonough H, Xu L, Portbury AL, Mao CY, True C, Wang RH, Wang QZ, Sun SL, Seminara SB, Patterson C, Xu YM. Ataxia and hypogonadism caused by the loss of ubiquitin ligase activity of the U box protein CHIP. Hum Mol Genet 2013; 23:1013-24. [PMID: 24113144 DOI: 10.1093/hmg/ddt497] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.9] [Indexed: 11/13/2022]
Abstract
Gordon Holmes syndrome (GHS) is a rare Mendelian neurodegenerative disorder characterized by ataxia and hypogonadism. Recently, it was suggested that disordered ubiquitination underlies GHS through the discovery of exome mutations in the E3 ligase RNF216 and deubiquitinase OTUD4. We performed exome sequencing in a family with two of three siblings afflicted with ataxia and hypogonadism and identified a homozygous mutation in STUB1 (NM_005861) c.737C→T, p.Thr246Met, a gene that encodes the protein CHIP (C-terminus of HSC70-interacting protein). CHIP plays a central role in regulating protein quality control, in part through its ability to function as an E3 ligase. Loss of CHIP function has long been associated with protein misfolding and aggregation in several genetic mouse models of neurodegenerative disorders; however, a role for CHIP in human neurological disease has yet to be identified. Introduction of the Thr246Met mutation into CHIP results in a loss of ubiquitin ligase activity measured directly using recombinant proteins as well as in cell culture models. Loss of CHIP function in mice resulted in behavioral and reproductive impairments that mimic human ataxia and hypogonadism. We conclude that GHS can be caused by a loss-of-function mutation in CHIP. Our findings further highlight the role of disordered ubiquitination and protein quality control in the pathogenesis of neurodegenerative disease and demonstrate the utility of combining whole-exome sequencing with molecular analyses and animal models to define causal disease polymorphisms.
Affiliation(s)
- Chang-He Shi
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450000, China
|
237
|
Piazza R, Magistroni V, Pirola A, Redaelli S, Spinelli R, Redaelli S, Galbiati M, Valletta S, Giudici G, Cazzaniga G, Gambacorti-Passerini C. CEQer: a graphical tool for copy number and allelic imbalance detection from whole-exome sequencing data. PLoS One 2013; 8:e74825. [PMID: 24124457 PMCID: PMC3790773 DOI: 10.1371/journal.pone.0074825] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 05/14/2013] [Accepted: 08/06/2013] [Indexed: 11/24/2022]
Abstract
Copy number alterations (CNA) are common events occurring in leukaemias and solid tumors. Comparative Genome Hybridization (CGH) is currently the gold standard technique to analyze CNAs; however, CGH analysis requires dedicated instruments and is able to perform only low resolution Loss of Heterozygosity (LOH) analyses. Here we present CEQer (Comparative Exome Quantification analyzer), a new graphical, event-driven tool for CNA/allelic-imbalance (AI) coupled analysis of exome sequencing data. By using case-control matched exome data, CEQer performs a comparative digital exonic quantification to generate CNA data and couples this information with exome-wide LOH and allelic imbalance detection. This data is used to build mixed statistical/heuristic models allowing the identification of CNA/AI events. To test our tool, we initially used in silico generated data, then we performed whole-exome sequencing from 20 leukemic specimens and corresponding matched controls and we analyzed the results using CEQer. Taken globally, these analyses showed that the combined use of comparative digital exon quantification and LOH/AI allows generating very accurate CNA data. Therefore, we propose CEQer as an efficient, robust and user-friendly graphical tool for the identification of CNA/AI in the context of whole-exome sequencing data.
Affiliation(s)
- Rocco Piazza
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Vera Magistroni
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Alessandra Pirola
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Sara Redaelli
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Roberta Spinelli
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Serena Redaelli
- Department of Neurosciences and Biomedical Technologies, University of Milano-Bicocca, Monza, Italy
| | - Marta Galbiati
- Tettamanti Research Center, University of Milano-Bicocca, San Gerardo Hospital, Monza, Italy
| | - Simona Valletta
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Giovanni Giudici
- Tettamanti Research Center, University of Milano-Bicocca, San Gerardo Hospital, Monza, Italy
| | - Giovanni Cazzaniga
- Tettamanti Research Center, University of Milano-Bicocca, San Gerardo Hospital, Monza, Italy
|
238
|
Abstract
Human genetic mosaicism is the presence of two or more cellular populations with distinct genotypes in an individual who developed from a single fertilized ovum. While initially observed across a spectrum of rare genetic disorders, detailed assessment of data from genome-wide association studies now reveal that detectable clonal mosaicism involving large structural alterations (>2 Mb) can also be seen in populations of apparently healthy individuals. The first generation of descriptive studies has generated new interest in understanding the molecular basis of the affected genomic regions, percent of the cellular subpopulation involved, and developmental timing of the underlying mutational event, which could reveal new insights into the initiation, clonal expansion, and phenotypic manifestations of mosaic events. Early evidence indicates detectable clonal mosaicism increases in frequency with age and could preferentially occur in males. The observed pattern of recurrent events affecting specific chromosomal regions indicates some regions are more susceptible to these events, which could reflect inter-individual differences in genomic stability. Moreover, it is also plausible that the presence of large structural events could be associated with cancer risk. The characterization of detectable genetic mosaicism reveals that there could be important dynamic changes in the human genome associated with the aging process, which could be associated with risk for common disorders, such as cancer, cardiovascular disease, diabetes, and neurological disorders.
Affiliation(s)
- Mitchell J. Machiela
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland 20892-4605, USA
| | - Stephen J. Chanock
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland 20892-4605, USA
|
239
|
Abstract
MOTIVATION: Data quality is a critical issue in the analyses of DNA copy number alterations obtained from microarrays. It is commonly assumed that copy number alteration data can be modeled as piecewise constant and the measurement errors of different probes are independent. However, these assumptions do not always hold in practice. In some published datasets, we find that measurement errors are highly correlated between probes that interrogate nearby genomic loci, and the piecewise-constant model does not fit the data well. The correlated errors cause problems in downstream analysis, leading to a large number of DNA segments falsely identified as having copy number gains and losses. METHOD: We developed a simple tool, called autocorrelation scanning profile, to assess the dependence of measurement error between neighboring probes. RESULTS: Autocorrelation scanning profile can be used to check data quality and refine the analysis of DNA copy number data, which we demonstrate in some typical datasets. CONTACT: lzhangli@mdanderson.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Liangcai Zhang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77230, USA and Department of Biophysics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
|
240
|
Zhao M, Wang Q, Wang Q, Jia P, Zhao Z. Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives. BMC Bioinformatics 2013; 14 Suppl 11:S1. [PMID: 24564169 PMCID: PMC3846878 DOI: 10.1186/1471-2105-14-s11-s1] [Citation(s) in RCA: 347] [Impact Index Per Article: 28.9] [Indexed: 01/11/2023]
Abstract
Copy number variation (CNV) is a prevalent form of critical genetic variation that leads to an abnormal number of copies of large genomic regions in a cell. Microarray-based comparative genome hybridization (arrayCGH) and genotyping arrays were the standard technologies for detecting large regions subject to copy number changes in genomes until recently, when high-resolution sequence data from next-generation sequencing (NGS) became available for analysis. During the last several years, NGS-based analysis has been widely applied to identify CNVs in both healthy and diseased individuals. Correspondingly, the strong demand for NGS-based CNV analyses has fuelled development of numerous computational methods and tools for CNV detection. In this article, we review the recent advances in computational methods pertaining to CNV detection using whole genome and whole exome sequencing data. Additionally, we discuss their strengths and weaknesses and suggest directions for future development.
|
241
|
Lim YH, Ovejero D, Sugarman JS, Deklotz CMC, Maruri A, Eichenfield LF, Kelley PK, Jüppner H, Gottschalk M, Tifft CJ, Gafni RI, Boyce AM, Cowen EW, Bhattacharyya N, Guthrie LC, Gahl WA, Golas G, Loring EC, Overton JD, Mane SM, Lifton RP, Levy ML, Collins MT, Choate KA. Multilineage somatic activating mutations in HRAS and NRAS cause mosaic cutaneous and skeletal lesions, elevated FGF23 and hypophosphatemia. Hum Mol Genet 2013; 23:397-407. [PMID: 24006476 DOI: 10.1093/hmg/ddt429] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.9] [Indexed: 12/13/2022]
Abstract
Pathologically elevated serum levels of fibroblast growth factor-23 (FGF23), a bone-derived hormone that regulates phosphorus homeostasis, result in renal phosphate wasting and lead to rickets or osteomalacia. Rarely, elevated serum FGF23 levels are found in association with mosaic cutaneous disorders that affect large proportions of the skin and appear in patterns corresponding to the migration of ectodermal progenitors. The cause and source of elevated serum FGF23 are unknown. In those conditions, such as epidermal and large congenital melanocytic nevi, skin lesions are variably associated with other abnormalities in the eye, brain and vasculature. The wide distribution of involved tissues and the appearance of multiple segmental skin and bone lesions suggest that these conditions result from early embryonic somatic mutations. We report five such cases with elevated serum FGF23 and bone lesions, four with large epidermal nevi and one with a giant congenital melanocytic nevus. Exome sequencing of blood and affected skin tissue identified somatic activating mutations of HRAS or NRAS in each case without recurrent secondary mutation, and we further found that the same mutation is present in dysplastic bone. Our finding of somatic activating RAS mutation in bone, the endogenous source of FGF23, provides the first evidence that elevated serum FGF23 levels, hypophosphatemia and osteomalacia are associated with pathologic Ras activation and may provide insight into the heretofore limited understanding of the regulation of FGF23.
|
242
|
Targeted next-generation sequencing reveals further genetic heterogeneity in axonal Charcot-Marie-Tooth neuropathy and a mutation in HSPB1. Eur J Hum Genet 2013; 22:522-7. [PMID: 23963299 DOI: 10.1038/ejhg.2013.190] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 02/26/2013] [Revised: 07/19/2013] [Accepted: 07/24/2013] [Indexed: 12/11/2022]
Abstract
Charcot-Marie-Tooth disease (CMT) is a group of hereditary peripheral neuropathies. The dominantly inherited axonal CMT2 displays striking genetic heterogeneity, with 17 presently known disease genes. The large number of candidate genes, combined with lack of genotype-phenotype correlations, has made genetic diagnosis in CMT2 time-consuming and costly. In Finland, 25% of dominant CMT2 is explained by either a GDAP1 founder mutation or private MFN2 mutations but the rest of the families have remained without molecular diagnosis. Whole-exome and genome sequencing are powerful techniques to find disease mutations for CMT patients but they require large amounts of sequencing to confidently exclude heterozygous variants in all candidate genes, and they generate a vast amount of irrelevant data for diagnostic needs. Here we tested a targeted next-generation sequencing approach to screen the CMT2 genes. In total, 15 unrelated patients from dominant CMT2 families from Finland, in whom MFN2 and GDAP1 mutations had been excluded, participated in the study. The targeted approach produced sufficient sequence coverage for 95% of the 309 targeted exons, the rest we excluded by Sanger sequencing. Unexpectedly, the screen revealed a disease mutation only in one family, in the HSPB1 gene. Thus, new disease genes underlie CMT2 in the remaining families, indicating further genetic heterogeneity. We conclude that targeted next-generation sequencing is an efficient tool for genetic screening in CMT2 that also aids in the selection of patients for genome-wide approaches.
|
243
|
Zheng C, Miao X, Li Y, Huang Y, Ruan J, Ma X, Wang L, Wu CI, Cai J. Determination of genomic copy number alteration emphasizing a restriction site-based strategy of genome re-sequencing. Bioinformatics 2013; 29:2813-21. [DOI: 10.1093/bioinformatics/btt481] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/29/2022]
|
244
|
Bromberg Y. Building a genome analysis pipeline to predict disease risk and prevent disease. J Mol Biol 2013; 425:3993-4005. [PMID: 23928561 DOI: 10.1016/j.jmb.2013.07.038] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 06/03/2013] [Revised: 07/26/2013] [Accepted: 07/28/2013] [Indexed: 12/24/2022]
Abstract
Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (set of all expressed phenotypes) encoded by each individual variome (full set of genome variants) in the context of the given environment. Numerous tools exist for the computational identification of the functional effects of a single variant. However, the pipelines taking advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at the building of methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges for incorporating such a pipeline into everyday medical practice.
Affiliation(s)
- Y Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08873, USA.
|
245
|
Woo HM, Park HJ, Baek JI, Park MH, Kim UK, Sagong B, Koo SK. Whole-exome sequencing identifies MYO15A mutations as a cause of autosomal recessive nonsyndromic hearing loss in Korean families. BMC Med Genet 2013; 14:72. [PMID: 23865914 PMCID: PMC3727941 DOI: 10.1186/1471-2350-14-72] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 01/08/2013] [Accepted: 07/09/2013] [Indexed: 01/05/2023]
Abstract
Background: The genetic heterogeneity of hearing loss makes genetic diagnosis expensive and time consuming using available methods. Whole-exome sequencing has recently been introduced as an alternative approach to identifying causative mutations in Mendelian disorders. Methods: To identify the hidden mutations that cause autosomal recessive nonsyndromic hearing loss (ARNSHL), we performed whole-exome sequencing of 13 unrelated Korean small families with ARNSHL who were negative for GJB2 or SLC26A4 mutations. Results: We found two novel compound heterozygous mutations, IVS11 + 1 and p.R2146Q, of MYO15A in one (SR903 family) of the 13 families with ARNSHL. In addition to these causative mutations, 13 nonsynonymous variants, including variants with uncertain pathogenicity (SR285 family), were identified in the coding exons of MYO15A from Korean exomes. Conclusion: This is the first report of MYO15A mutations in an East Asian population. We suggest that close attention should be paid to this gene when performing genetic testing of patients with hearing loss in East Asia. The present results also indicate that whole-exome sequencing is a valuable method for comprehensive medical diagnosis of a genetically heterogeneous recessive disease, especially in small-sized families.
Affiliation(s)
- Hae-Mi Woo
- Division of Intractable Diseases, Center for Biomedical Sciences, National Institute of Health, Chungcheongbuk-do 363-951, South Korea
|
246
|
Spinelli R, Pirola A, Redaelli S, Sharma N, Raman H, Valletta S, Magistroni V, Piazza R, Gambacorti-Passerini C. Identification of novel point mutations in splicing sites integrating whole-exome and RNA-seq data in myeloproliferative diseases. Mol Genet Genomic Med 2013; 1:246-59. [PMID: 24498620 PMCID: PMC3865592 DOI: 10.1002/mgg3.23] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 02/22/2013] [Revised: 05/22/2013] [Accepted: 05/24/2013] [Indexed: 12/13/2022]
Abstract
Point mutations in intronic regions near mRNA splice junctions can affect the splicing process. To identify novel splicing variants from exome sequencing data, we developed a bioinformatics splice-site prediction procedure to analyze next-generation sequencing (NGS) data (SpliceFinder). SpliceFinder integrates two functional annotation tools for NGS, ANNOVAR and MutationTaster, and two canonical splice site prediction programs for single mutation analysis, SSPNN and NetGene2. Using SpliceFinder, we identified somatic mutations affecting RNA splicing in a colon cancer sample, in eight atypical chronic myeloid leukemia (aCML) patients, and in eight CML patients. A novel homozygous splicing mutation was found in APC (NM_000038.4:c.1312+5G>A) and six heterozygous mutations in GNAQ (NM_002072.2:c.735+1C>T), ABCC3 (NM_003786.3:c.1783-1G>A), KLHDC1 (NM_172193.1:c.568-2A>G), HOOK1 (NM_015888.4:c.1662-1G>A), SMAD9 (NM_001127217.2:c.1004-1C>T), and DNAH9 (NM_001372.3:c.10242+5G>A). Integrating whole-exome and RNA sequencing in aCML and CML, we assessed the phenotypic effect of mutations on mRNA splicing for GNAQ, ABCC3, HOOK1. In ABCC3 and HOOK1, RNA-Seq showed the presence of aberrant transcripts with activation of a cryptic splice site or intron retention, validated by the reverse transcription-polymerase chain reaction (RT-PCR) in the case of HOOK1. In GNAQ, RNA-Seq showed 22% of wild-type transcript and 78% of mRNA skipping exon 5, resulting in a 4–6 frameshift fusion confirmed by RT-PCR. The pipeline can be useful to identify intronic variants affecting RNA sequence by complementing conventional exome analysis.
Affiliation(s)
- Roberta Spinelli
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Alessandra Pirola
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Sara Redaelli
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Nitesh Sharma
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Hima Raman
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Simona Valletta
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Vera Magistroni
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Rocco Piazza
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Carlo Gambacorti-Passerini
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy; Hematology and Clinical Research Unit, San Gerardo Hospital, Monza, Italy
|
247
|
|
248
|
Nocq J, Celton M, Gendron P, Lemieux S, Wilhelm BT. Harnessing virtual machines to simplify next-generation DNA sequencing analysis. Bioinformatics 2013; 29:2075-83. [PMID: 23786767 DOI: 10.1093/bioinformatics/btt352] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 01/05/2023]
Abstract
MOTIVATION: The growth of next-generation sequencing (NGS) has not only dramatically accelerated the pace of research in the field of genomics, but it has also opened the door to personalized medicine and diagnostics. The resulting flood of data has led to the rapid development of large numbers of bioinformatic tools for data analysis, creating a challenging situation for researchers when choosing and configuring a variety of software for their analysis, and for other researchers trying to replicate their analysis. As NGS technology continues to expand from the research environment into clinical laboratories, the challenges associated with data analysis have the potential to slow the adoption of this technology. RESULTS: Here we discuss the potential of virtual machines (VMs) to be used as a method for sharing entire installations of NGS software (bioinformatic 'pipelines'). VMs are created by programs designed to allow multiple operating systems to co-exist on a single physical machine, and they can be made following the object-oriented paradigm of encapsulating data and methods together. This allows NGS data to be distributed within a VM, along with the pre-configured software for its analysis. Although VMs have historically suffered from poor performance relative to native operating systems, we present benchmarking results demonstrating that this reduced performance can now be minimized. We further discuss the many potential benefits of VMs as a solution for NGS analysis and describe several published examples. Lastly, we consider the benefits of VMs in facilitating the introduction of NGS technology into the clinical environment. CONTACT: brian.wilhelm@umontreal.ca.
Affiliation(s)
- Julie Nocq
- Institute for Research in Immunology and Cancer, Laboratory for High-Throughput Genomics, Department of Medicine, University of Montreal, QC, Canada
|
249
|
Simon R, Roychowdhury S. Implementing personalized cancer genomics in clinical trials. Nat Rev Drug Discov 2013; 12:358-69. [PMID: 23629504 DOI: 10.1038/nrd3979] [Citation(s) in RCA: 220] [Impact Index Per Article: 18.3] [Indexed: 12/15/2022]
Abstract
The recent surge in high-throughput sequencing of cancer genomes has supported an expanding molecular classification of cancer. These studies have identified putative predictive biomarkers signifying aberrant oncogene pathway activation and may provide a rationale for matching patients with molecularly targeted therapies in clinical trials. Here, we discuss some of the challenges of adapting these data for rare cancers or molecular subsets of certain cancers, which will require aligning the availability of investigational agents, rapid turnaround of clinical grade sequencing, molecular eligibility and reconsidering clinical trial design and end points.
Affiliation(s)
- Richard Simon
- Biometric Research Branch, US National Cancer Institute, Bethesda, Maryland 20892-7434, USA
|
250
|
Jia P, Jin H, Meador CB, Xia J, Ohashi K, Liu L, Pirazzoli V, Dahlman KB, Politi K, Michor F, Zhao Z, Pao W. Next-generation sequencing of paired tyrosine kinase inhibitor-sensitive and -resistant EGFR mutant lung cancer cell lines identifies spectrum of DNA changes associated with drug resistance. Genome Res 2013; 23:1434-45. [PMID: 23733853 PMCID: PMC3759720 DOI: 10.1101/gr.152322.112] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Indexed: 01/13/2023]
Abstract
Somatic mutations in kinase genes are associated with sensitivity of solid tumors to kinase inhibitors, but patients with metastatic cancer eventually develop disease progression. In EGFR mutant lung cancer, modeling of acquired resistance (AR) with drug-sensitive cell lines has identified clinically relevant EGFR tyrosine kinase inhibitor (TKI) resistance mechanisms such as the second-site mutation, EGFR T790M, amplification of the gene encoding an alternative kinase, MET, and epithelial-mesenchymal transition (EMT). The full spectrum of DNA changes associated with AR remains unknown. We used next-generation sequencing to characterize mutational changes associated with four populations of EGFR mutant drug-sensitive and five matched drug-resistant cell lines. Comparing resistant cells with parental counterparts, 18-91 coding SNVs/indels were predicted to be acquired and 1-27 were lost; few SNVs/indels were shared across resistant lines. Comparison of two related parental lines revealed no unique coding SNVs/indels, suggesting that changes in the resistant lines were due to drug selection. Surprisingly, we observed more CNV changes across all resistant lines, and the line with EMT displayed significantly higher levels of CNV changes than the other lines with AR. These results demonstrate a framework for studying the evolution of AR and provide the first genome-wide spectrum of mutations associated with the development of cellular drug resistance in an oncogene-addicted cancer. Collectively, the data suggest that CNV changes may play a larger role than previously appreciated in the acquisition of drug resistance and highlight that resistance may be heterogeneous in the context of different tumor cell backgrounds.
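The acquired and lost SNV/indel counts in this abstract come from comparing each resistant line with its parental counterpart. A minimal sketch of that kind of comparison, assuming each call is reduced to a (chromosome, position, ref, alt) key, is given below. It is not the authors' pipeline; the function name `compare_variant_calls` is hypothetical and the variant tuples are placeholders, not real calls.

```python
def compare_variant_calls(parental, resistant):
    """Partition variant keys into acquired, lost, and shared sets.

    Hypothetical helper for illustration. Each variant is assumed to be a
    hashable key such as (chromosome, position, ref, alt).
    """
    parental_set = set(parental)
    resistant_set = set(resistant)
    return {
        # Present only in the resistant line: candidate acquired changes.
        "acquired": resistant_set - parental_set,
        # Present only in the parental line: changes lost under selection.
        "lost": parental_set - resistant_set,
        # Present in both lines.
        "shared": parental_set & resistant_set,
    }


if __name__ == "__main__":
    # Placeholder keys for illustration only; not data from the study.
    parental_calls = [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"),
                      ("chr3", 300, "G", "A")]
    resistant_calls = [("chr1", 100, "A", "G"), ("chr4", 400, "T", "C"),
                       ("chr5", 500, "G", "C")]
    result = compare_variant_calls(parental_calls, resistant_calls)
    print({label: len(keys) for label, keys in result.items()})
```

Exact key matching is a simplification: in practice, calls absent from one sample may reflect low coverage rather than a true difference, so a real comparison would also check read depth at each site.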
Affiliation(s)
- Peilin Jia
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, USA
|