1
Guille A, Adélaïde J, Finetti P, Andre F, Birnbaum D, Mamessier E, Bertucci F, Chaffanet M. A benchmarking study of individual somatic variant callers and voting-based ensembles for whole-exome sequencing. Brief Bioinform 2024; 26:bbae697. [PMID: 39828270; PMCID: PMC11790059; DOI: 10.1093/bib/bbae697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 11/22/2024] [Indexed: 01/22/2025] Open
Abstract
By identifying somatic mutations, whole-exome sequencing (WES) has become a technology of choice for diagnosing many cancers and guiding their treatment. Despite advances in the field of somatic variant detection and the emergence of sophisticated tools incorporating machine learning, accurately identifying somatic variants remains challenging. Each new somatic variant caller is often accompanied by claims of superior performance compared to predecessors. Furthermore, most comparative studies focus on a limited set of tools and reference datasets, leading to inconsistent results and making it difficult for laboratories to select the optimal solution. Our study comprehensively evaluated 20 somatic variant callers across four reference WES datasets. We subsequently assessed the performance of ensemble approaches by exploring all possible combinations of these callers, generating 8178 and 1013 combinations for single-nucleotide variants (SNVs) and indels, respectively, with varying voting thresholds. Our analysis identified five high-performing individual somatic variant callers: Muse, Mutect2, Dragen, TNScope, and NeuSomatic. For somatic SNVs, an ensemble combining LoFreq, Muse, Mutect2, SomaticSniper, Strelka, and Lancet outperformed the top-performing caller (Dragen) by >3.6% (mean F1 score = 0.927). Similarly, for somatic indels, an ensemble of Mutect2, Strelka, Varscan2, and Pindel outperformed the best individual caller (NeuSomatic) by >3.5% (mean F1 score = 0.867). By considering the computational costs of each combination, we were able to identify an optimal solution involving four somatic variant callers (Muse, Mutect2, and Strelka for SNVs; Mutect2, Strelka, and Varscan2 for indels), enabling accurate and cost-effective somatic variant detection in whole-exome data.
Affiliation(s)
- Arnaud Guille
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- José Adélaïde
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- Pascal Finetti
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- Fabrice Andre
- Department of Medical Oncology, Gustave Roussy, University Paris-Saclay, 94805 Villejuif, France
- Daniel Birnbaum
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- Emilie Mamessier
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- François Bertucci
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- Medical Oncology, Institut Paoli-Calmettes, 13009, Marseille, France
- Max Chaffanet
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
2
Martín R, Gaitán N, Jarlier F, Feuerbach L, de Soyres H, Arbonés M, Gutman T, Puiggròs M, Ferriz A, Gonzalez A, Estelles L, Gut I, Capella-Gutierrez S, Stein LD, Brors B, Royo R, Hupé P, Torrents D. ONCOLINER: A new solution for monitoring, improving, and harmonizing somatic variant calling across genomic oncology centers. Cell Genomics 2024; 4:100639. [PMID: 39216474; PMCID: PMC11480849; DOI: 10.1016/j.xgen.2024.100639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 06/13/2024] [Accepted: 08/07/2024] [Indexed: 09/04/2024]
Abstract
The characterization of somatic genomic variation associated with the biology of tumors is fundamental for cancer research and personalized medicine, as it guides the reliability and impact of cancer studies and genomic-based decisions in clinical oncology. However, the quality and scope of tumor genome analysis across cancer research centers and hospitals are currently highly heterogeneous, limiting the consistency of tumor diagnoses across hospitals and the possibilities of data sharing and data integration across studies. With the aim of providing users with actionable and personalized recommendations for the overall enhancement and harmonization of somatic variant identification across research and clinical environments, we have developed ONCOLINER. Using specifically designed mosaic and tumorized genomes for the analysis of recall and precision across somatic SNVs, insertions or deletions (indels), and structural variants (SVs), we demonstrate that ONCOLINER is capable of improving and harmonizing genome analysis across three state-of-the-art variant discovery pipelines in genomic oncology.
Affiliation(s)
- Rodrigo Martín
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Nicolás Gaitán
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Frédéric Jarlier
- Institut Curie, Paris, France; U900, Paris, France; PSL Research University, Paris, France; Mines Paris Tech, Fontainebleau, France
- Lars Feuerbach
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Henri de Soyres
- Institut Curie, Paris, France; U900, Paris, France; PSL Research University, Paris, France; Mines Paris Tech, Fontainebleau, France
- Marc Arbonés
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Tom Gutman
- Institut Curie, Paris, France; U900, Paris, France; PSL Research University, Paris, France; Mines Paris Tech, Fontainebleau, France
- Montserrat Puiggròs
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Alvaro Ferriz
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Asier Gonzalez
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Ivo Gut
- Centro Nacional de Análisis Genómico, Barcelona, Spain
- Lincoln D Stein
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Ontario Institute for Cancer Research, Toronto, ON, Canada
- Benedikt Brors
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany; German Cancer Consortium (DKTK), Heidelberg, Germany
- Romina Royo
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Philippe Hupé
- Institut Curie, Paris, France; U900, Paris, France; PSL Research University, Paris, France; Mines Paris Tech, Fontainebleau, France; UMR144, CNRS, Paris, France
- David Torrents
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
3
Heo DH, Kim I, Seo H, Kim SG, Kim M, Park J, Park H, Kang S, Kim J, Paik S, Hong SE. DEEPOMICS FFPE, a deep neural network model, identifies DNA sequencing artifacts from formalin fixed paraffin embedded tissue with high accuracy. Sci Rep 2024; 14:2559. [PMID: 38297116; PMCID: PMC10831091; DOI: 10.1038/s41598-024-53167-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 01/29/2024] [Indexed: 02/02/2024] Open
Abstract
Formalin-fixed, paraffin-embedded (FFPE) tissue specimens are routinely used in pathological diagnosis, but their large number of artifactual mutations complicates the evaluation of companion diagnostics and analysis of next-generation sequencing data. Identification of variants with low allele frequencies is challenging because existing FFPE filtering tools label all low-frequency variants as artifacts. To address this problem, we aimed to develop DEEPOMICS FFPE, an AI model that can distinguish a true variant from an artifact. Paired whole exome sequencing data from fresh frozen and FFPE samples from 24 tumors were obtained from public sources and used as training and validation sets at a ratio of 7:3. A deep neural network model with three hidden layers was trained with input features using outputs of the MuTect2 caller. Contributing features were identified using the SHapley Additive exPlanations algorithm and optimized based on training results. The performance of the final model (DEEPOMICS FFPE) was compared with those of existing models (MuTect filter, FFPolish, and SOBDetector) by using well-defined test datasets. We found 41 discriminating properties for FFPE artifacts. Optimization of property quantification improved the model performance. DEEPOMICS FFPE removed 99.6% of artifacts while maintaining 87.1% of true variants, with an F1-score of 88.3 in the entire dataset not used for training, which is significantly higher than those of existing tools. Its performance was maintained even for low-allele-fraction variants with a specificity of 0.995, suggesting that it can be used to identify subclonal variants. Unlike existing methods, DEEPOMICS FFPE identified most of the sequencing artifacts in the FFPE samples while retaining more of the true variants, including those of low allele frequencies. The newly developed tool DEEPOMICS FFPE may be useful in designing capture panels for personalized circulating tumor DNA assay and identifying candidate neoepitopes for personalized vaccine design. DEEPOMICS FFPE is freely available on the web ( http://deepomics.co.kr/ffpe ) for research.
Affiliation(s)
- Dong-Hyuk Heo
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Inyoung Kim
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Heejae Seo
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Seong-Gwang Kim
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Minji Kim
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Jiin Park
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Hongsil Park
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Seungmo Kang
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Juhee Kim
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Soonmyung Paik
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea
- Seong-Eui Hong
- Theragen Bio Co., Ltd., Seongnam, Gyeonggi-do, 13488, Republic of Korea.
4
Homan CC, Drazer MW, Yu K, Lawrence DM, Feng J, Arriola-Martinez L, Pozsgai MJ, McNeely KE, Ha T, Venugopal P, Arts P, King-Smith SL, Cheah J, Armstrong M, Wang P, Bödör C, Cantor AB, Cazzola M, Degelman E, DiNardo CD, Duployez N, Favier R, Fröhling S, Rio-Machin A, Klco JM, Krämer A, Kurokawa M, Lee J, Malcovati L, Morgan NV, Natsoulis G, Owen C, Patel KP, Preudhomme C, Raslova H, Rienhoff H, Ripperger T, Schulte R, Tawana K, Velloso E, Yan B, Kim E, Sood R, NISC Comparative Sequencing Program, Hsu AP, Holland SM, Phillips K, Poplawski NK, Babic M, Wei AH, Forsyth C, Mar Fan H, Lewis ID, Cooney J, Susman R, Fox LC, Blombery P, Singhal D, Hiwase D, Phipson B, Schreiber AW, Hahn CN, Scott HS, Liu P, Godley LA, Brown AL. Somatic mutational landscape of hereditary hematopoietic malignancies caused by germline variants in RUNX1, GATA2, and DDX41. Blood Adv 2023; 7:6092-6107. [PMID: 37406166; PMCID: PMC10582382; DOI: 10.1182/bloodadvances.2023010045] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 02/23/2023] [Revised: 05/22/2023] [Accepted: 06/19/2023] [Indexed: 07/07/2023] Open
Abstract
Individuals with germ line variants associated with hereditary hematopoietic malignancies (HHMs) have a highly variable risk for leukemogenesis. Gaps in our understanding of premalignant states in HHMs have hampered efforts to design effective clinical surveillance programs, provide personalized preemptive treatments, and inform appropriate counseling for patients. We used the largest known comparative international cohort of germline RUNX1, GATA2, or DDX41 variant carriers without and with hematopoietic malignancies (HMs) to identify patterns of genetic drivers that are unique to each HHM syndrome before and after leukemogenesis. These patterns included striking heterogeneity in rates of early-onset clonal hematopoiesis (CH), with a high prevalence of CH in RUNX1 and GATA2 variant carriers who did not have malignancies (carriers-without HM). We observed a paucity of CH in DDX41 carriers-without HM. In RUNX1 carriers-without HM with CH, we detected variants in TET2, PHF6, and, most frequently, BCOR. These genes were recurrently mutated in RUNX1-driven malignancies, suggesting CH is a direct precursor to malignancy in RUNX1-driven HHMs. Leukemogenesis in RUNX1 and DDX41 carriers was often driven by second hits in RUNX1 and DDX41, respectively. This study may inform the development of HHM-specific clinical trials and gene-specific approaches to clinical monitoring. For example, trials investigating the potential benefits of monitoring DDX41 carriers-without HM for low-frequency second hits in DDX41 may now be beneficial. Similarly, trials monitoring carriers-without HM with RUNX1 germ line variants for the acquisition of somatic variants in BCOR, PHF6, and TET2 and second hits in RUNX1 are warranted.
Affiliation(s)
- Claire C. Homan
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Michael W. Drazer
- Departments of Medicine and Human Genetics, Section of Hematology/Oncology, Center for Clinical Cancer Genetics, and The University of Chicago Comprehensive Cancer Center, The University of Chicago, Chicago, IL
- Kai Yu
- Division of Intramural Research, Oncogenesis and Development Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- David M. Lawrence
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- ACRF Genomics Facility, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, SA, Australia
- Jinghua Feng
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- ACRF Genomics Facility, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, SA, Australia
- Luis Arriola-Martinez
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Matthew J. Pozsgai
- Departments of Medicine and Human Genetics, Section of Hematology/Oncology, Center for Clinical Cancer Genetics, and The University of Chicago Comprehensive Cancer Center, The University of Chicago, Chicago, IL
- Kelsey E. McNeely
- Departments of Medicine and Human Genetics, Section of Hematology/Oncology, Center for Clinical Cancer Genetics, and The University of Chicago Comprehensive Cancer Center, The University of Chicago, Chicago, IL
- Thuong Ha
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Parvathy Venugopal
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Peer Arts
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Sarah L. King-Smith
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Jesse Cheah
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Mark Armstrong
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Paul Wang
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- ACRF Genomics Facility, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, SA, Australia
- Csaba Bödör
- HCEMM-SE Molecular Oncohematology Research Group, 1st Department of Pathology and Experimental Cancer Research, Semmelweis University, Budapest, Hungary
- Alan B. Cantor
- Division of Hematology/Oncology, Boston Children's Hospital and Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA
- Mario Cazzola
- Department of Molecular Medicine, University of Pavia, Pavia, Italy
- Department of Hematology Oncology, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
- Erin Degelman
- Alberta Children’s Hospital, Calgary, Alberta, Canada
- Courtney D. DiNardo
- Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, TX
- Nicolas Duployez
- Laboratory of Hematology, Biology and Pathology Center, Centre Hospitalier Regional Universitaire de Lille, Lille, France
- Jean-Pierre Aubert Research Center, INSERM, Universitaire de Lille, Lille, France
- Remi Favier
- Assistance Publique-Hôpitaux de Paris, Armand Trousseau Children's Hospital, Paris, France
- Stefan Fröhling
- Department of Translational Medical Oncology, National Center for Tumor Diseases and German Cancer Research Center (DKFZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Ana Rio-Machin
- Centre for Haemato-Oncology, Barts Cancer Institute, Queen Mary University of London, London, United Kingdom
- Alwin Krämer
- Clinical Cooperation Unit Molecular Hematology/Oncology, German Cancer Research Center (DKFZ) and Department of Internal Medicine V, University of Heidelberg, Heidelberg, Germany
- Mineo Kurokawa
- Department of Hematology & Oncology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Joanne Lee
- Department of Haematology-Oncology, National University Cancer Institute, National University Health System, Singapore
- Luca Malcovati
- Department of Molecular Medicine, University of Pavia, Pavia, Italy
- Department of Hematology Oncology, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
- Neil V. Morgan
- Institute of Cardiovascular Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
- Carolyn Owen
- Division of Hematology and Hematological Malignancies, Foothills Medical Centre, Calgary, AB, Canada
- Keyur P. Patel
- Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, TX
- Claude Preudhomme
- Laboratory of Hematology, Biology and Pathology Center, Centre Hospitalier Regional Universitaire de Lille, Lille, France
- Jean-Pierre Aubert Research Center, INSERM, Universitaire de Lille, Lille, France
- Hana Raslova
- Institut Gustave Roussy, Université Paris Sud, Equipe Labellisée par la Ligue Nationale Contre le Cancer, Villejuif, France
- Tim Ripperger
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Rachael Schulte
- Division of Pediatric Hematology and Oncology, Riley Children’s Hospital, Indiana University School of Medicine, Indianapolis, IN
- Kiran Tawana
- Department of Haematology, Addenbrooke’s Hospital, Cambridge, United Kingdom
- Elvira Velloso
- Service of Hematology, Transfusion and Cell Therapy and Laboratory of Medical Investigation in Pathogenesis and Directed Therapy in Onco-Immuno-Hematology (LIM-31) HCFMUSP, University of Sao Paulo Medical School, Sao Paulo, Brazil
- Genetics Laboratory, Hospital Israelita Albert Einstein, Sao Paulo, Brazil
- Benedict Yan
- Department of Haematology-Oncology, National University Cancer Institute, National University Health System, Singapore
- Erika Kim
- National Cancer Institute, National Institutes of Health, Rockville, MD
- Raman Sood
- Division of Intramural Research, Oncogenesis and Development Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- Amy P. Hsu
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD
- Steven M. Holland
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD
- Kerry Phillips
- Adult Genetics Unit, Royal Adelaide Hospital, Adelaide, SA, Australia
- Nicola K. Poplawski
- Adult Genetics Unit, Royal Adelaide Hospital, Adelaide, SA, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, SA, Australia
- Milena Babic
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Andrew H. Wei
- Department of Haematology, Peter MacCallum Cancer Centre, Royal Melbourne Hospital, Walter and Eliza Hall Institute of Medical Research, The University of Melbourne, Melbourne, VIC, Australia
- Cecily Forsyth
- Central Coast Haematology, North Gosford, NSW, Australia
- Helen Mar Fan
- Department of Medicine, The University of Queensland, Brisbane, QLD, Australia
- Ian D. Lewis
- Adelaide Oncology & Haematology, North Adelaide, SA, Australia
- Julian Cooney
- Department of Haematology, Fiona Stanley Hospital, Murdoch, WA, Australia
- Rachel Susman
- Genetic Health Queensland, Royal Brisbane and Women’s Hospital, Brisbane, QLD, Australia
- Lucy C. Fox
- Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Piers Blombery
- Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Deepak Singhal
- Department of Haematology, SA Pathology, Adelaide, SA, Australia
- Devendra Hiwase
- Adelaide Medical School, The University of Adelaide, Adelaide, SA, Australia
- Department of Haematology, SA Pathology, Adelaide, SA, Australia
- Belinda Phipson
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- Department of Paediatrics and Department of Molecular Biology, The University of Melbourne, Melbourne, VIC, Australia
- Andreas W. Schreiber
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- ACRF Genomics Facility, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, SA, Australia
- School of Biological Sciences, The University of Adelaide, Adelaide, SA, Australia
- Christopher N. Hahn
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, SA, Australia
- Hamish S. Scott
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- ACRF Genomics Facility, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, SA, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, SA, Australia
- Paul Liu
- Division of Intramural Research, Oncogenesis and Development Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- Lucy A. Godley
- Departments of Medicine and Human Genetics, Section of Hematology/Oncology, Center for Clinical Cancer Genetics, and The University of Chicago Comprehensive Cancer Center, The University of Chicago, Chicago, IL
- Anna L. Brown
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An alliance between SA Pathology and the University of South Australia, Adelaide, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, SA, Australia
5
Li X, Wang Y, Deng S, Zhu G, Wang C, Johnson NA, Zhang Z, Tirado CR, Xu Y, Metang LA, Gonzalez J, Mukherji A, Ye J, Yang Y, Peng W, Tang Y, Hofstad M, Xie Z, Yoon H, Chen L, Liu X, Chen S, Zhu H, Strand D, Liang H, Raj G, He HH, Mendell JT, Li B, Wang T, Mu P. Loss of SYNCRIP unleashes APOBEC-driven mutagenesis, tumor heterogeneity, and AR-targeted therapy resistance in prostate cancer. Cancer Cell 2023; 41:1427-1449.e12. [PMID: 37478850; PMCID: PMC10530398; DOI: 10.1016/j.ccell.2023.06.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 10/02/2022] [Revised: 05/24/2023] [Accepted: 06/29/2023] [Indexed: 07/23/2023]
Abstract
Tumor mutational burden and heterogeneity have been suggested to fuel resistance to many targeted therapies. The cytosine deaminase APOBEC proteins have been implicated in the mutational signatures of more than 70% of human cancers. However, the mechanism by which cancer cells hijack the APOBEC-mediated mutagenesis machinery to promote tumor heterogeneity, and thereby foster therapy resistance, remains unclear. We identify SYNCRIP as an endogenous molecular brake that suppresses APOBEC-driven mutagenesis in prostate cancer (PCa). Overactivated APOBEC3B, in SYNCRIP-deficient PCa cells, is a key mutator, representing the molecular source of driver mutations in some frequently mutated genes in PCa, including FOXA1 and EP300. Functional screening identifies eight crucial drivers for androgen receptor (AR)-targeted therapy resistance in PCa that are mutated by APOBEC3B: BRD7, CBX8, EP300, FOXA1, HDAC5, HSF4, STAT3, and AR. These results uncover a cell-intrinsic mechanism that unleashes APOBEC-driven mutagenesis, which plays a significant role in conferring AR-targeted therapy resistance in PCa.
Affiliation(s)
- Xiaoling Li
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Yunguan Wang
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, UT Southwestern Medical Center, Dallas, TX, USA; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Su Deng
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Guanghui Zhu
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, ON, Canada
- Choushi Wang
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Nickolas A Johnson
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Zeda Zhang
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Yaru Xu
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Lauren A Metang
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Julisa Gonzalez
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Atreyi Mukherji
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Jianfeng Ye
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX, USA
- Yuqiu Yang
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, UT Southwestern Medical Center, Dallas, TX, USA
- Wei Peng
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Yitao Tang
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX, USA
- Mia Hofstad
- Department of Urology, UT Southwestern Medical Center, Dallas, TX, USA
- Zhiqun Xie
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, UT Southwestern Medical Center, Dallas, TX, USA
- Heewon Yoon
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Liping Chen
- Department of Urology, UT Southwestern Medical Center, Dallas, TX, USA
- Xihui Liu
- Department of Urology, UT Southwestern Medical Center, Dallas, TX, USA
- Sujun Chen
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, ON, Canada
- Hong Zhu
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, UT Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA
| | - Douglas Strand
- Department of Urology, UT Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA
| | - Han Liang
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX, USA; Department of Systems Biology, MD Anderson Cancer Center, Houston, TX, USA
| | - Ganesh Raj
- Department of Urology, UT Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA
| | - Housheng Hansen He
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, ON, Canada
| | - Joshua T Mendell
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA; Howard Hughes Medical Institute, Chevy Chase, MD, USA; Hamon Center for Regenerative Science and Medicine, UT Southwestern Medical Center, Dallas, TX, USA
| | - Bo Li
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX, USA
| | - Tao Wang
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, UT Southwestern Medical Center, Dallas, TX, USA
| | - Ping Mu
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA; Hamon Center for Regenerative Science and Medicine, UT Southwestern Medical Center, Dallas, TX, USA.
|
6
|
Vaisband M, Schubert M, Gassner FJ, Geisberger R, Greil R, Zaborsky N, Hasenauer J. Validation of genetic variants from NGS data using deep convolutional neural networks. BMC Bioinformatics 2023; 24:158. [PMID: 37081386 PMCID: PMC10116675 DOI: 10.1186/s12859-023-05255-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 03/27/2023] [Indexed: 04/22/2023] Open
Abstract
Accurate somatic variant calling from next-generation sequencing data is one of the most important tasks in personalised cancer therapy. The sophistication of the available technologies is ever-increasing, yet manual candidate refinement is still a necessary step in state-of-the-art processing pipelines. This limits reproducibility and introduces a bottleneck with respect to scalability. We demonstrate that the validation of genetic variants can be improved using a machine learning approach resting on a Convolutional Neural Network, trained using existing human annotation. In contrast to existing approaches, we introduce a way in which contextual data from sequencing tracks can be included in the automated assessment. A rigorous evaluation shows that the resulting model is robust and performs on par with trained researchers following a published standard operating procedure.
Affiliation(s)
- Marc Vaisband
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
- Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
| | - Maria Schubert
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Franz Josef Gassner
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Roland Geisberger
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Richard Greil
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Nadja Zaborsky
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Jan Hasenauer
- Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
|
7
|
Wasilewska K, Gambin T, Rydzanicz M, Szczałuba K, Płoski R. Postzygotic mutations and where to find them - Recent advances and future implications in the field of non-neoplastic somatic mosaicism. Mutation Research. Reviews in Mutation Research 2022; 790:108426. [PMID: 35690331 DOI: 10.1016/j.mrrev.2022.108426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2021] [Revised: 05/05/2022] [Accepted: 06/03/2022] [Indexed: 01/01/2023]
Abstract
The technological progress of massively parallel sequencing (MPS) has triggered a remarkable development in the research on postzygotic mutations. Although the overwhelming majority of studies in the field focus on oncogenesis, non-neoplastic diseases are attracting more and more attention. The aim of this review was to summarize some of the most recent findings in the field of somatic mosaicism in diseases other than neoplastic events. We discuss the abundance and role of postzygotic mutations, with a special emphasis on disorders which occur only in a mosaic form (obligatory mosaic diseases; OMDs). Based on the list of OMDs compiled from the published literature and three databases (OMIM, Orphanet and MosaicBase), we demonstrate the prevalence of cancer-related genes across OMDs and suggest other sources to further explore OMDs and OMD-related genes. Additionally, we comment on some practical aspects related to mosaic diseases, such as approaches to tissue sampling, the MPS coverage required to detect variants at a very low frequency, as well as on bioinformatic and molecular tools dedicated to detect somatic mutations in MPS data.
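The coverage question raised at the end of this abstract can be illustrated with a simple binomial model. The following is only a sketch under stated assumptions: reads are treated as independent draws, sequencing error is ignored, and the function names, the minimum number of supporting reads, and the target detection probability are all hypothetical choices, not values taken from the review.

```python
from math import comb

def detection_probability(depth, vaf, min_alt_reads):
    """Probability of observing at least `min_alt_reads` variant reads
    at a site covered by `depth` reads, for a variant allele fraction `vaf`.
    Assumes independent reads and no sequencing error (a simplification)."""
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer

def min_depth(vaf, min_alt_reads, power=0.95, max_depth=100_000):
    """Smallest depth at which the detection probability reaches `power`.
    Returns None if no depth up to `max_depth` is sufficient."""
    for depth in range(min_alt_reads, max_depth + 1):
        if detection_probability(depth, vaf, min_alt_reads) >= power:
            return depth
    return None
```

Under this model, lowering the allele fraction or raising the required number of supporting reads increases the depth needed, which is the trade-off the review alludes to.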
Affiliation(s)
- Krystyna Wasilewska
- Department of Medical Genetics, Medical University of Warsaw, ul. Pawińskiego 3c, 02-106 Warsaw, Poland
| | - Tomasz Gambin
- Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland
| | - Małgorzata Rydzanicz
- Department of Medical Genetics, Medical University of Warsaw, ul. Pawińskiego 3c, 02-106 Warsaw, Poland
| | - Krzysztof Szczałuba
- Department of Medical Genetics, Medical University of Warsaw, ul. Pawińskiego 3c, 02-106 Warsaw, Poland
| | - Rafał Płoski
- Department of Medical Genetics, Medical University of Warsaw, ul. Pawińskiego 3c, 02-106 Warsaw, Poland.
|
8
|
Zemet R, Van den Veyver IB, Stankiewicz P. Parental mosaicism for apparent de novo genetic variants: Scope, detection, and counseling challenges. Prenat Diagn 2022; 42:811-821. [PMID: 35394072 PMCID: PMC9995893 DOI: 10.1002/pd.6144] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/13/2022] [Revised: 04/04/2022] [Accepted: 04/04/2022] [Indexed: 11/07/2022]
Abstract
The disease burden of de novo mutations (DNMs) has been evidenced only recently when the common application of next-generation sequencing technologies enabled their reliable and affordable detection through family-based clinical exome or genome sequencing. Implementation of exome sequencing into prenatal diagnostics revealed that up to 63% of pathogenic or likely pathogenic variants associated with fetal structural anomalies are apparently de novo, primarily for autosomal dominant disorders. Apparent DNMs have been considered to primarily occur as germline or zygotic events, with consequently negligible recurrence risks. However, there is now evidence that a considerable proportion of them are in fact inherited from a parent mosaic for the variant. Here, we review the burden of DNMs in prenatal diagnostics and the influence of parental mosaicism on the interpretation of apparent DNMs and discuss the challenges with detecting and quantifying parental mosaicism and its effect on recurrence risk. We also describe new bioinformatic and technological tools developed to assess mosaicism and discuss how they improve the accuracy of reproductive risk counseling when parental mosaicism is detected.
Affiliation(s)
- Roni Zemet
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
| | - Ignatia B Van den Veyver
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA; Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, Texas, USA; Texas Children's Hospital, Houston, Texas, USA
| | - Paweł Stankiewicz
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
|
9
|
Dodani DD, Nguyen MH, Morin RD, Marra MA, Corbett RD. Combinatorial and Machine Learning Approaches for Improved Somatic Variant Calling From Formalin-Fixed Paraffin-Embedded Genome Sequence Data. Front Genet 2022; 13:834764. [PMID: 35571031 PMCID: PMC9092826 DOI: 10.3389/fgene.2022.834764] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/13/2021] [Accepted: 03/18/2022] [Indexed: 11/13/2022] Open
Abstract
Formalin fixation of paraffin-embedded tissue samples is a well-established method for preserving tissue and is routinely used in clinical settings. Although formalin-fixed, paraffin-embedded (FFPE) tissues are deemed crucial for research and clinical applications, the fixation process results in molecular damage to nucleic acids, thus confounding their use in genome sequence analysis. Methods to improve genomic data quality from FFPE tissues have emerged, but there remains significant room for improvement. Here, we use whole-genome sequencing (WGS) data from matched Fresh Frozen (FF) and FFPE tissue samples to optimize a sensitive and precise FFPE single nucleotide variant (SNV) calling approach. We present methods to reduce the prevalence of false-positive SNVs by applying combinatorial techniques to five publicly available variant callers. We also introduce FFPolish, a novel variant classification method that efficiently classifies FFPE-specific false-positive variants. Our combinatorial and statistical techniques improve precision and F1 scores compared to the results of publicly available tools when tested individually.
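The idea of combining several callers to suppress false positives can be sketched as a simple vote. This is an illustrative sketch only, not the authors' method or FFPolish: it assumes each caller's output has already been reduced to a set of `(chrom, pos, ref, alt)` tuples, and the function names and the vote threshold are hypothetical.

```python
from collections import Counter

def vote_calls(calls_by_caller, min_votes):
    """Keep variants reported by at least `min_votes` callers.

    `calls_by_caller` maps a caller name to an iterable of variant keys,
    here assumed to be (chrom, pos, ref, alt) tuples."""
    counts = Counter()
    for variants in calls_by_caller.values():
        counts.update(set(variants))  # each caller votes at most once per variant
    return {variant for variant, n in counts.items() if n >= min_votes}

def precision_recall_f1(called, truth):
    """Precision, recall and F1 of a call set against a truth set."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Raising `min_votes` generally trades recall for precision, which is why such schemes are usually scored with F1 against a matched reference such as the fresh frozen calls mentioned in the abstract.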
Affiliation(s)
- Dollina D Dodani
- The Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
| | - Matthew H Nguyen
- The Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
| | - Ryan D Morin
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada; Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
| | - Marco A Marra
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada; Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Richard D Corbett
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada
|
10
|
Kapur P, Gao M, Zhong H, Chintalapati S, Mitui M, Barnes S, Zhou Q, Miyata J, Carrillo D, Malladi V, Rakheja D, Pedrosa I, Xu L, Kinch L, Brugarolas J. Germline and sporadic mTOR pathway mutations in low-grade oncocytic tumor of the kidney. Mod Pathol 2022; 35:333-343. [PMID: 34538873 PMCID: PMC9817016 DOI: 10.1038/s41379-021-00896-6] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 03/15/2021] [Revised: 08/03/2021] [Accepted: 08/04/2021] [Indexed: 01/11/2023]
Abstract
Low-grade oncocytic tumor (LOT) of the kidney is a recently described entity with poorly understood pathogenesis. Using next-generation sequencing (NGS) and complementary approaches, we provide insight into its biology. We describe 22 LOT corresponding to 7 patients presenting with a median age of 75 years (range 63-86 years) and male to female ratio 2:5. All 22 tumors demonstrated prototypical microscopic features. Tumors were well-circumscribed and solid. They were composed of sheets of tumor cells in compact nests. Tumor cells had eosinophilic cytoplasm, round to oval nuclei (without nuclear membrane irregularities), focal subtle perinuclear halos, and occasional binucleation. Sharply delineated edematous stromal islands were often observed. Tumor cells were positive for PAX8, negative for CD117, and exhibited diffuse and strong cytokeratin-7 expression. Six patients presented with pT1 tumors. At a median follow-up of 29 months, four patients were alive without recurrence (three patients had died from unrelated causes). All tumors were originally classified as chromophobe renal cell carcinoma, eosinophilic variant (chRCC-eo). While none of the patients presented with known syndromic features, one patient with multiple bilateral LOTs was subsequently found to have a likely pathogenic germline TSC1 mutation. Somatic, likely activating, mutations in MTOR and RHEB were identified in all other evaluable LOTs. As assessed by phospho-S6 and phospho-4E-BP1, mTOR complex 1 (mTORC1) was activated across all cases but to different extents. MTOR mutant LOT exhibited lower levels of mTORC1 activation, possibly related to mTORC1 dimerization and the preservation of a wild-type MTOR copy (retained chromosome 1). Supporting its distinction from related entities, gene expression analyses showed that LOT clustered separately from classic chRCC, chRCC-eo, and RO. In summary, converging mTORC1 pathway mutations, mTORC1 complex activation, and a distinctive gene expression signature along with characteristic phenotypic features support LOT designation as a distinct entity with both syndromic and non-syndromic cases associated with an indolent course.
Affiliation(s)
- Payal Kapur
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Urology, University of Texas Southwestern Medical Center, Dallas, TX, USA; Kidney Cancer Program at Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Ming Gao
- Kidney Cancer Program at Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390; Department of Hematology-Oncology Division of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Hua Zhong
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, 75390; Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Suneetha Chintalapati
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Midori Mitui
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Spencer Barnes
- Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Qinbo Zhou
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Jeffrey Miyata
- Kidney Cancer Program at Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390; Department of Hematology-Oncology Division of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Deyssy Carrillo
- Kidney Cancer Program at Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390; Department of Hematology-Oncology Division of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Venkat Malladi
- Kidney Cancer Program at Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390; Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Dinesh Rakheja
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, 75390; Kidney Cancer Program at Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Ivan Pedrosa
- Kidney Cancer Program at Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390; Department of Radiology, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Lin Xu
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - Lisa Kinch
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, 75390
| | - James Brugarolas
- Kidney Cancer Program at Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Hematology-Oncology Division of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA
|
11
|
Rudd ML, Hansen NF, Zhang X, Urick ME, Zhang S, Merino MJ, National Institutes of Health Intramural Sequencing Center Comparative Sequencing Program, Mullikin JC, Brody LC, Bell DW. KLF3 and PAX6 are candidate driver genes in late-stage, MSI-hypermutated endometrioid endometrial carcinomas. PLoS One 2022; 17:e0251286. [PMID: 35081118 PMCID: PMC8791453 DOI: 10.1371/journal.pone.0251286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2021] [Accepted: 11/05/2021] [Indexed: 12/24/2022] Open
Abstract
Endometrioid endometrial carcinomas (EECs) are the most common histological subtype of uterine cancer. Late-stage disease is an adverse prognosticator for EEC. The purpose of this study was to analyze EEC exome mutation data to identify late-stage-specific statistically significantly mutated genes (SMGs), which represent candidate driver genes potentially associated with disease progression. We exome sequenced 15 late-stage (stage III or IV) non-ultramutated EECs and paired non-tumor DNAs; somatic variants were called using Strelka, Shimmer, SomaticSniper and MuTect. Additionally, somatic mutation calls were extracted from The Cancer Genome Atlas (TCGA) data for 66 late-stage and 270 early-stage (stage I or II) non-ultramutated EECs. MutSigCV (v1.4) was used to annotate SMGs in the two late-stage cohorts and to derive p-values for all mutated genes in the early-stage cohort. To test whether late-stage SMGs are statistically significantly mutated in early-stage tumors, q-values for late-stage SMGs were re-calculated from the MutSigCV (v1.4) early-stage p-values, adjusting for the number of late-stage SMGs tested. We identified 14 SMGs in the combined late-stage EEC cohorts. When the 14 late-stage SMGs were examined in the TCGA early-stage data, only Krüppel-like factor 3 (KLF3) and Paired box 6 (PAX6) failed to reach significance as early-stage SMGs, despite the inclusion of enough early-stage cases to ensure adequate statistical power. Within TCGA, nonsynonymous mutations in KLF3 and PAX6 were, respectively, exclusive or nearly exclusive to the microsatellite instability (MSI)-hypermutated molecular subgroup and were dominated by insertions-deletions at homopolymer tracts. In conclusion, our findings are hypothesis-generating and suggest that KLF3 and PAX6, which encode transcription factors, are MSI target genes and late-stage-specific SMGs in EEC.
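The q-value recalculation described in this abstract, restricting the multiple-testing adjustment to the late-stage SMGs, can be sketched as follows. This is a hedged illustration rather than the authors' code: the abstract does not name the correction procedure, so Benjamini-Hochberg is assumed here, and the function names and gene labels are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg q-values for a dict mapping gene -> p-value.
    (Assumed correction; the abstract does not state which one was used.)"""
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotone q-values.
    ranked = sorted(pvalues.items(), key=lambda item: item[1], reverse=True)
    qvalues = {}
    running_min = 1.0
    for i, (gene, p) in enumerate(ranked):
        rank = m - i  # ascending rank of this p-value (1 = smallest)
        running_min = min(running_min, p * m / rank)
        qvalues[gene] = running_min
    return qvalues

def requalify(early_stage_pvalues, late_stage_genes):
    """Recompute q-values using only the genes tested as late-stage SMGs,
    so the adjustment reflects that smaller number of hypotheses."""
    subset = {
        gene: early_stage_pvalues[gene]
        for gene in late_stage_genes
        if gene in early_stage_pvalues
    }
    return bh_qvalues(subset)
```

Because the adjustment is over a handful of genes rather than every mutated gene, a gene that still fails to reach significance under this more lenient test is a stronger candidate for being late-stage-specific.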
Affiliation(s)
- Meghan L. Rudd
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Nancy F. Hansen
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Xiaolu Zhang
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Mary Ellen Urick
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Suiyuan Zhang
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Maria J. Merino
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | | | - James C. Mullikin
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, Maryland, United States of America
| | - Lawrence C. Brody
- Medical Genomics and Metabolic Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Daphne W. Bell
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
|
12
|
Chang TC, Xu K, Cheng Z, Wu G. Somatic and Germline Variant Calling from Next-Generation Sequencing Data. Advances in Experimental Medicine and Biology 2022; 1361:37-54. [DOI: 10.1007/978-3-030-91836-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
|
13
|
Ahmed Z, Renart EG, Zeeshan S. Genomics pipelines to investigate susceptibility in whole genome and exome sequenced data for variant discovery, annotation, prediction and genotyping. PeerJ 2021; 9:e11724. [PMID: 34395068 PMCID: PMC8320519 DOI: 10.7717/peerj.11724] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/29/2021] [Accepted: 06/14/2021] [Indexed: 12/12/2022] Open
Abstract
Over the last few decades, genomics has been leading toward an audacious future and changing our views about conducting biomedical research, studying diseases, and understanding diversity in our society across the human species. Whole genome and exome sequencing (WGS/WES) are two of the most popular next-generation sequencing (NGS) methodologies currently used to detect genetic variations of clinical significance. Investigating WGS/WES data for variant discovery and genotyping is based on the nexus of different data analytic applications. Although several bioinformatics applications have been developed, and many of those are freely available and published, timely finding and interpreting genetic variants are still challenging tasks among diagnostic laboratories and clinicians. In this study, we are interested in understanding, evaluating, and reporting the current state of solutions available to process NGS data of variable lengths and types for the identification of variants, alleles, and haplotypes. Within this scope, we consulted high-quality peer-reviewed literature published in the last 10 years. We focused on the standalone and networked bioinformatics applications proposed to efficiently process WGS and WES data and to support downstream analysis for gene-variant discovery, annotation, prediction, and interpretation. We discuss our findings in this manuscript, which include but are not limited to the set of operations, workflow, data handling, involved tools, technologies and algorithms, and limitations of the assessed applications.
Affiliation(s)
- Zeeshan Ahmed
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA; Department of Medicine, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
| | - Eduard Gibert Renart
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
| | - Saman Zeeshan
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
|
14
|
Pemov A, Hansen NF, Sindiri S, Patidar R, Higham CS, Dombi E, Miettinen MM, Fetsch P, Brems H, Chandrasekharappa SC, Jones K, Zhu B, Wei JS, Mullikin JC, Wallace MR, Khan J, Legius E, Widemann BC, Stewart DR. Low mutation burden and frequent loss of CDKN2A/B and SMARCA2, but not PRC2, define premalignant neurofibromatosis type 1-associated atypical neurofibromas. Neuro Oncol 2021; 21:981-992. [PMID: 30722027 DOI: 10.1093/neuonc/noz028] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Neurofibromatosis type 1 (NF1) is a tumor-predisposition disorder caused by germline mutations in NF1. NF1 patients have an 8-16% lifetime risk of developing a malignant peripheral nerve sheath tumor (MPNST), a highly aggressive soft-tissue sarcoma, often arising from preexisting benign plexiform neurofibromas (PNs) and atypical neurofibromas (ANFs). ANFs are distinct from both PN and MPNST, representing an intermediate step in malignant transformation. METHODS In the first comprehensive genomic analysis of ANF originating from multiple patients, we performed tumor/normal whole-exome sequencing (WES) of 16 ANFs. In addition, we conducted WES of 3 MPNSTs, copy-number meta-analysis of 26 ANFs and 28 MPNSTs, and whole transcriptome sequencing analysis of 5 ANFs and 5 MPNSTs. RESULTS We identified a low number of mutations (median 1, range 0-5) in the exomes of ANFs (only NF1 somatic mutations were recurrent), and frequent deletions of CDKN2A/B (69%) and SMARCA2 (42%). We determined that polycomb repressor complex 2 (PRC2) genes EED and SUZ12 were frequently mutated, deleted, or downregulated in MPNSTs but not in ANFs. Our pilot gene expression study revealed upregulated NRAS, MDM2, CCND1/2/3, and CDK4/6 in ANFs and MPNSTs, and overexpression of EZH2 in MPNSTs only. CONCLUSIONS The PN-ANF transition is primarily driven by the deletion of CDKN2A/B. Further progression from ANF to MPNST likely involves broad chromosomal rearrangements and frequent inactivation of the PRC2 genes, loss of the DNA repair genes, and copy-number increase of signal transduction and cell-cycle and pluripotency self-renewal genes.
Affiliation(s)
- Alexander Pemov
- Clinical Genetics Branch, DCEG, NCI, National Institutes of Health (NIH), Rockville, Maryland, USA
| | - Nancy F Hansen
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, NIH, Rockville, Maryland, USA
| | - Sivasish Sindiri
- Genetics Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland, USA
| | - Rajesh Patidar
- Genetics Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland, USA; Molecular Characterization & Clinical Assay Development Laboratory, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc, Frederick, Maryland, USA
| | - Christine S Higham
- Children's National Medical Center, Washington, DC, USA; Pediatric Oncology Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland, USA
| | - Eva Dombi
- Pediatric Oncology Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland, USA
| | | | | | - Hilde Brems
- Department of Human Genetics, Catholic University Leuven, Leuven, Belgium
| | - Settara C Chandrasekharappa
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, NIH, Rockville, Maryland, USA
| | - Kristine Jones
- Cancer Genomics Research Laboratory, DCEG, NIH, Rockville, Maryland, USA
| | - Bin Zhu
- Cancer Genomics Research Laboratory, DCEG, NIH, Rockville, Maryland, USA
| | - Jun S Wei
- Genetics Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland, USA
| | | | | | - James C Mullikin
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, NIH, Rockville, Maryland, USA; NISC, National Human Genome Research Institute, NIH, Rockville, Maryland, USA
| | - Margaret R Wallace
- Department of Molecular Genetics and Microbiology, UF Genetics Institute, UF Health Cancer Center, University of Florida, Gainesville, Florida, USA
| | - Javed Khan
- Genetics Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland, USA
| | - Eric Legius
- Department of Human Genetics, Catholic University Leuven, Leuven, Belgium
| | - Brigitte C Widemann
- Pediatric Oncology Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland, USA
| | - Douglas R Stewart
- Clinical Genetics Branch, DCEG, NCI, National Institutes of Health (NIH), Rockville, Maryland, USA
|
15
|
SoRelle JA, Wachsmann M, Cantarel BL. Assembling and Validating Bioinformatic Pipelines for Next-Generation Sequencing Clinical Assays. Arch Pathol Lab Med 2020; 144:1118-1130. [PMID: 32045276 DOI: 10.5858/arpa.2019-0476-ra] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Accepted: 12/09/2019] [Indexed: 11/06/2022]
Abstract
CONTEXT.— Clinical next-generation sequencing (NGS) is being rapidly adopted, but analysis and interpretation of large data sets prompt new challenges for a clinical laboratory setting. Clinical NGS results rely heavily on the bioinformatics pipeline for identifying genetic variation in complex samples. The choice of bioinformatics algorithms, genome assembly, and genetic annotation databases is important for determining genetic alterations associated with disease. The analysis methods are often tuned to the assay to maximize accuracy. Once a pipeline has been developed, it must be validated to determine accuracy and reproducibility for samples similar to real-world cases. In silico proficiency testing or institutional data exchange will ensure consistency among clinical laboratories. OBJECTIVE.— To provide molecular pathologists a step-by-step guide to bioinformatics analysis and validation design in order to navigate the regulatory and validation standards of implementing a bioinformatic pipeline as a part of a new clinical NGS assay. DATA SOURCES.— This guide uses published studies on genomic analysis, bioinformatics methods, and methods comparison studies to inform the reader on what resources, including open source software tools and databases, are available for genetic variant detection and interpretation. CONCLUSIONS.— This review covers 4 key concepts: (1) bioinformatic analysis design for detecting genetic variation, (2) the resources for assessing genetic effects, (3) analysis validation assessment experiments and data sets, including a diverse set of samples to mimic real-world challenges that assess accuracy and reproducibility, and (4) whether concordance between clinical laboratories will be improved by proficiency testing designed to test bioinformatic pipelines.
Affiliation(s)
- Jeffrey A SoRelle
- Department of Pathology (SoRelle, Wachsmann), University of Texas Southwestern Medical Center, Dallas
| | - Megan Wachsmann
- Department of Pathology (SoRelle, Wachsmann), University of Texas Southwestern Medical Center, Dallas
| | - Brandi L Cantarel
- Bioinformatics Core Facility (Cantarel), University of Texas Southwestern Medical Center, Dallas; Department of Bioinformatics (Cantarel), University of Texas Southwestern Medical Center, Dallas; University of Texas Southwestern Medical Center, Dallas
|
16
|
Zhu M, Li L, Lu T, Yoo H, Zhu J, Gopal P, Wang SC, Porembka MR, Rich NE, Kagan S, Odewole M, Renteria V, Waljee AK, Wang T, Singal AG, Yopp AC, Zhu H. Uncovering Biological Factors That Regulate Hepatocellular Carcinoma Growth Using Patient-Derived Xenograft Assays. Hepatology 2020; 72:1085-1101. [PMID: 31899548 PMCID: PMC7332388 DOI: 10.1002/hep.31096] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/09/2019] [Accepted: 12/07/2019] [Indexed: 12/29/2022]
Abstract
BACKGROUND AND AIMS: Several major factors limit our understanding of hepatocellular carcinoma (HCC). First, human HCCs are infrequently biopsied for diagnosis and thus are not often biologically interrogated. Second, HCC initiation and progression are strongly influenced by the cirrhotic microenvironment, and the exact contributions of intrinsic and extrinsic tumor factors are unclear. A powerful approach to examine the personalized biology of liver cancers and the influence of host tissues is with patient-derived xenograft (PDX) models. In Asia, HCCs from patients with hepatitis B virus have been efficiently converted into PDXs, but few parallel efforts from the West have been reported. APPROACH AND RESULTS: In a large-scale analysis, we implanted 93 HCCs and 8 cholangiocarcinomas (CCAs) to systematically analyze host factors and to define an optimized platform for PDX development from both surgical and biopsy samples. NOD Scid IL-2Rγ-/- (NSG) mice that had undergone partial hepatectomy (PHx) represented the best combination of engraftability, growth, and passageability, but overall rates were low and indicative of a unique intrinsic biology for HCCs in the United States. PDX models preserved the histology and genetic features of parental tumors, and ultimately, eight models were usable for preclinical studies. Intriguingly, HCC PDXs were differentially sensitive to regorafenib and sorafenib, and CCA PDXs were also highly sensitive to regorafenib. CONCLUSIONS: PDX models functionalize early and advanced stage HCCs and revealed unique biological features of liver cancers from the United States.
Affiliation(s)
- Min Zhu
- Children’s Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Lin Li
- Children’s Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Tianshi Lu
- Children’s Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Hyesun Yoo
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA. Michigan Integrated Center for Health Analytics and Medical Prediction (MiCHAMP), Ann Arbor, MI, USA
| | - Ji Zhu
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA. Michigan Integrated Center for Health Analytics and Medical Prediction (MiCHAMP), Ann Arbor, MI, USA
| | - Purva Gopal
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Sam C. Wang
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Matthew R. Porembka
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Nicole E. Rich
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Sofia Kagan
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Mobolaji Odewole
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Veronica Renteria
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Akbar K. Waljee
- VA Center for Clinical Management Research, VA Ann Arbor Health Care System, Ann Arbor, MI, USA
- Department of Internal Medicine, Division of Gastroenterology and Hepatology, Michigan Medicine and Michigan Integrated Center for Health Analytics and Medical Prediction (MiCHAMP), Ann Arbor, MI, USA
| | - Tao Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Amit G. Singal
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Adam C. Yopp
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Hao Zhu
- Children’s Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Lead contact: Hao Zhu, Phone: (214) 648-2850
|
17
|
Chung AS, Mettlen M, Ganguly D, Lu T, Wang T, Brekken RA, Hsiehchen D, Zhu H. Immune Checkpoint Inhibition is Safe and Effective for Liver Cancer Prevention in a Mouse Model of Hepatocellular Carcinoma. Cancer Prev Res (Phila) 2020; 13:911-922. [PMID: 32839204 DOI: 10.1158/1940-6207.capr-20-0200] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/25/2020] [Revised: 06/24/2020] [Accepted: 08/11/2020] [Indexed: 12/24/2022]
Abstract
Cirrhosis is a high-risk state for hepatocellular carcinoma (HCC) development and represents an opportunity to prevent cancer. In the precancerous state of cirrhosis, there is an accumulation of neoantigens that may be specifically targetable through immunotherapy. We asked whether immune checkpoint inhibition could prevent tumorigenesis in a mouse model of diethylnitrosamine and carbon tetrachloride-induced HCC. We found that initiation of anti-PD-1 therapy prior to tumorigenesis could prevent up to 46% of liver tumors. This significant reduction in tumor burden was accompanied by infiltration of CD4+ Th cells and CD8+ cytotoxic T cells into the liver parenchyma. Importantly, anti-PD-1 therapy did not exacerbate liver dysfunction or worsen overall health in this liver disease model. Given the safety and preservation of quality of life observed with long-term immunotherapy use, an immunotherapy chemoprevention strategy is likely associated with a low risk-to-benefit ratio and high value care in select patients. These results encourage a prevention trial in cirrhotic patients with the highest risk of developing HCC. See related Spotlight by Mohammed et al., p. 897.
Affiliation(s)
- Andrew S Chung
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Marcel Mettlen
- Department of Cell Biology, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Debolina Ganguly
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Tianshi Lu
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Tao Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas; Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Rolf A Brekken
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, Texas
| | - David Hsiehchen
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Hao Zhu
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, Texas.
|
18
|
Pemov A, Dewan R, Hansen NF, Chandrasekharappa SC, Ray-Chaudhury A, Jones K, Luo W, Heiss JD, Mullikin JC, Chittiboina P, Stewart DR, Asthagiri AR. Comparative clinical and genomic analysis of neurofibromatosis type 2-associated cranial and spinal meningiomas. Sci Rep 2020; 10:12563. [PMID: 32724039 PMCID: PMC7387487 DOI: 10.1038/s41598-020-69074-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/02/2020] [Accepted: 05/19/2020] [Indexed: 12/31/2022] Open
Abstract
Neurofibromatosis type 2 (NF2) is an autosomal dominant Mendelian tumor predisposition disorder caused by germline pathogenic variants in the tumor suppressor NF2. Meningiomas are the second most common neoplasm in NF2, often occurring in multiple intracranial and spinal locations within the same patient. In this prospective longitudinal study, we assessed volumes and growth rates of ten spinal and ten cranial benign meningiomas in seven NF2 patients that concluded with surgical resection and performed whole-exome sequencing and copy-number variant (CNV) analysis of the tumors. Our comparison of the volume and the growth rate of NF2-associated spinal and cranial meningiomas points to differences in the timing of tumor initiation and/or in tumor progression (e.g., non-linear, saltatory growth) at these two anatomical locations. Genomic investigation of these tumors revealed that somatic inactivation of NF2 is the principal and perhaps the only driver of tumor initiation, and that tumor progression likely occurs via accumulation of CNVs rather than point mutations. Results of this study contribute to a better understanding of the clinical behavior of NF2-associated meningiomas and their genetic underpinnings.
Affiliation(s)
- Alexander Pemov
- Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
| | - Ramita Dewan
- Surgical Neurology Branch, National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA; Neuromuscular Disease Research Section, National Institute On Aging, NIH, Bethesda, MD, USA
| | - Nancy F Hansen
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA
| | - Settara C Chandrasekharappa
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA
| | - Abhik Ray-Chaudhury
- Surgical Neurology Branch, National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA
| | - Kristine Jones
- Frederick National Laboratory for Cancer Research, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
| | - Wen Luo
- Frederick National Laboratory for Cancer Research, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
| | - John D Heiss
- Surgical Neurology Branch, National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA
| | - James C Mullikin
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA; NIH Intramural Sequencing Center, National Human Genome Research Institute, NIH, Rockville, MD, USA
| | - Prashant Chittiboina
- Surgical Neurology Branch, National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA
| | - Douglas R Stewart
- Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA.
| | - Ashok R Asthagiri
- Surgical Neurology Branch, National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA; Department of Neurological Surgery, University of Virginia School of Medicine, Charlottesville, VA, USA
|
19
|
de Schaetzen van Brienen L, Larmuseau M, Van der Eecken K, De Ryck F, Robbe P, Schuh A, Fostier J, Ost P, Marchal K. Comparative analysis of somatic variant calling on matched FF and FFPE WGS samples. BMC Med Genomics 2020; 13:94. [PMID: 32631411 PMCID: PMC7336445 DOI: 10.1186/s12920-020-00746-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/29/2019] [Accepted: 06/22/2020] [Indexed: 02/04/2023] Open
Abstract
Background: Research grade Fresh Frozen (FF) DNA material is not yet routinely collected in clinical practice. Many hospitals, however, collect and store Formalin Fixed Paraffin Embedded (FFPE) tumor samples. Consequently, the sample size of whole genome cancer cohort studies could be increased tremendously by including FFPE samples, although the presence of artefacts might obfuscate the variant calling. To assess whether FFPE material can be used for cohort studies, we performed an in-depth comparison of somatic SNVs called on matching FF and FFPE Whole Genome Sequence (WGS) samples extracted from the same tumor. Methods: Four variant callers (i.e. Strelka2, Mutect2, VarScan2 and Shimmer) were used to call somatic variants on matching FF and FFPE WGS samples from a metastatic prostate tumor. Using the variants identified by these callers, we developed a heuristic to maximize the overlap between the FF and its FFPE counterpart in terms of sensitivity and precision. The proposed variant calling approach was then validated on nine matched primary samples. Finally, we assessed what fraction of the discrepancy could be attributed to intra-tumor heterogeneity (ITH), by comparing the overlap in clonal and subclonal somatic variants. Results: We first compared variants between an FF and an FFPE sample from a metastatic prostate tumor, showing that on average 50% of the calls in the FF are recovered in the FFPE sample, with notable differences between callers. Combining the variants of the different callers using a simple heuristic increases both the precision and the sensitivity of the variant calling. Validating the heuristic on nine additional matched FF-FFPE samples resulted in an average F1-score of 0.58, outperforming each of the individual callers. In addition, we could show that part of the discrepancy between the FF and the FFPE samples can be attributed to ITH.
Conclusion: This study illustrates that when using the correct variant calling strategy, the majority of clonal SNVs can be recovered in an FFPE sample with high precision and sensitivity. These results suggest that somatic variants derived from WGS of FFPE material can be used in cohort studies.
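Editor's illustration: the abstract above scores the FF/FFPE overlap by precision, sensitivity and F1 after combining callers. The sketch below shows how such metrics could be computed from sets of calls. The abstract does not describe the authors' heuristic, so the majority-style vote used here is an assumption, and the function names `vote` and `overlap_metrics` and the toy variants are hypothetical.

```python
from collections import Counter


def vote(call_sets, min_votes):
    """Keep variants reported by at least `min_votes` callers (assumed rule)."""
    counts = Counter(v for calls in call_sets for v in set(calls))
    return {v for v, n in counts.items() if n >= min_votes}


def overlap_metrics(reference, query):
    """Return (precision, sensitivity, F1) of `query` measured against `reference`."""
    tp = len(reference & query)  # calls present in both sets
    precision = tp / len(query) if query else 0.0
    sensitivity = tp / len(reference) if reference else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


# Toy data only: each variant is (chrom, pos, ref, alt).
ff_calls = {
    ("chr1", 100, "A", "T"),
    ("chr1", 200, "G", "C"),
    ("chr2", 50, "C", "T"),
    ("chr3", 10, "G", "A"),
}
ffpe_by_caller = [
    {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr5", 5, "C", "T")},
    {("chr1", 100, "A", "T"), ("chr2", 50, "C", "T"), ("chr6", 7, "C", "T")},
    {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr2", 50, "C", "T")},
]

# Treat the FF calls as the reference and score the FFPE consensus against them.
ffpe_consensus = vote(ffpe_by_caller, min_votes=2)
precision, sensitivity, f1 = overlap_metrics(ff_calls, ffpe_consensus)
print(f"precision={precision:.3f} sensitivity={sensitivity:.3f} F1={f1:.3f}")
```

In this toy case the vote drops the two single-caller FFPE calls, which raises precision at the cost of missing one FF variant that no FFPE caller reported.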
Affiliation(s)
- Louise de Schaetzen van Brienen
- Department of Plant Biotechnology and Bioinformatics, Department of Information Technology, IDLab, imec, iGent Toren, Ghent, Belgium
| | - Maarten Larmuseau
- Department of Plant Biotechnology and Bioinformatics, Department of Information Technology, IDLab, imec, iGent Toren, Ghent, Belgium
| | - Kim Van der Eecken
- Department of Human Structure and Repair, Ghent University Hospital, Ghent, Belgium
| | - Frederic De Ryck
- Department of Vascular Surgery, Ghent University Hospital, Ghent, Belgium
| | - Pauline Robbe
- Oxford National Institute of Health Research (NIHR) Biomedical Research Centre, University of Oxford, Oxford, United Kingdom; Division of Genomic Medicine, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Anna Schuh
- Oxford National Institute of Health Research (NIHR) Biomedical Research Centre, University of Oxford, Oxford, United Kingdom
| | - Jan Fostier
- Department of Plant Biotechnology and Bioinformatics, Department of Information Technology, IDLab, imec, iGent Toren, Ghent, Belgium
| | - Piet Ost
- Department of Radiotherapy, Ghent University Hospital, Ghent, Belgium
| | - Kathleen Marchal
- Department of Plant Biotechnology and Bioinformatics, Department of Information Technology, IDLab, imec, iGent Toren, Ghent, Belgium; Department of Genetics, University of Pretoria, Pretoria, SA, South Africa
|
20
|
Cao C, Mak L, Jin G, Gordon P, Ye K, Long Q. PRESM: personalized reference editor for somatic mutation discovery in cancer genomics. Bioinformatics 2019; 35:1445-1452. [PMID: 30247633 DOI: 10.1093/bioinformatics/bty812] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/22/2018] [Revised: 08/27/2018] [Accepted: 09/19/2018] [Indexed: 12/16/2022] Open
Abstract
MOTIVATION: Accurate detection of somatic mutations is a crucial step toward understanding cancer. Various tools have been developed to detect somatic mutations from cancer genome sequencing data by mapping reads to a universal reference genome and inferring likelihoods from complex statistical models. However, read mapping is frequently obstructed by mismatches between germline and somatic mutations on a read and the reference genome. Previous attempts to develop personalized genome tools are not compatible with downstream statistical models for somatic mutation detection. RESULTS: We present PRESM, a tool that builds personalized reference genomes by integrating germline mutations into the reference genome. The aforementioned obstacle is circumvented by using a two-step germline substitution procedure, maintaining positional fidelity using an innovative workaround. Reads derived from tumor tissue can be positioned more accurately along a personalized reference than a universal reference due to the reduced genetic distance between the subject (tumor genome) and the target (the personalized genome). Application of PRESM's personalized genome reduced false-positive (FP) somatic mutation calls by as much as 55.5%, and facilitated the discovery of a novel somatic point mutation on a germline insertion in PDE1A, a phosphodiesterase associated with melanoma. Moreover, all improvements in calling accuracy were achieved without parameter optimization, as PRESM itself is parameter-free. Hence, similar increases in read mapping and decreases in the FP rate will persist when PRESM-built genomes are applied to any user-provided dataset. AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/precisionomics/PRESM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chen Cao
- Departments of Biochemistry & Molecular Biology and Medical Genetics, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Canada
| | - Lauren Mak
- Departments of Biochemistry & Molecular Biology and Medical Genetics, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Canada
| | - Guangxu Jin
- Department of Cancer Biology, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Paul Gordon
- Departments of Biochemistry & Molecular Biology and Medical Genetics, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Canada
| | - Kai Ye
- Department of Bioinformatics, Electronic and Information Engineering School, Xi'an Jiaotong University, Xi'an, China
| | - Quan Long
- Departments of Biochemistry & Molecular Biology and Medical Genetics, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Canada
|
21
|
Singla N, Xie Z, Zhang Z, Gao M, Yousuf Q, Onabolu O, McKenzie T, Tcheuyap VT, Ma Y, Choi J, McKay R, Christie A, Torras OR, Bowman IA, Margulis V, Pedrosa I, Przybycin C, Wang T, Kapur P, Rini B, Brugarolas J. Pancreatic tropism of metastatic renal cell carcinoma. JCI Insight 2020; 5:134564. [PMID: 32271170 DOI: 10.1172/jci.insight.134564] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 10/28/2019] [Accepted: 03/04/2020] [Indexed: 12/30/2022] Open
Abstract
Renal cell carcinoma (RCC) is characterized by a particularly broad metastatic swath, and, enigmatically, when the pancreas is a destination, the disease is associated with improved survival. Intrigued by this observation, we sought to characterize the clinical behavior, therapeutic implications, and underlying biology. While pancreatic metastases (PM) are infrequent, we identified 31 patients across 2 institutional cohorts and show that improved survival is independent of established prognostic variables, that these tumors are exquisitely sensitive to antiangiogenic agents and resistant to immune checkpoint inhibitors (ICIs), and that they are characterized by a distinctive biology. Primary tumors of patients with PM exhibited frequent PBRM1 mutations, 3p loss, and 5q amplification, along with a lower frequency of aggressive features such as BAP1 mutations and loss of 9p, 14q, and 4q. Gene expression analyses revealed constrained evolution with remarkable uniformity, reduced effector T cell gene signatures, and increased angiogenesis. Similar findings were observed histopathologically. Thus, RCC metastatic to the pancreas is characterized by indolent biology, heightened angiogenesis, and an uninflamed stroma, likely underlying its good prognosis, sensitivity to antiangiogenic therapies, and refractoriness to ICI. These data suggest that metastatic organotropism may be an indicator of a particular biology with prognostic and treatment implications for patients.
Affiliation(s)
- Nirmish Singla
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Department of Urology
| | - Zhiqun Xie
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Ze Zhang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Ming Gao
- Kidney Cancer Program, Simmons Comprehensive Cancer Center
| | - Yuanqing Ma
- Kidney Cancer Program, Simmons Comprehensive Cancer Center
| | - Jacob Choi
- Department of Hematology and Medical Oncology, Cleveland Clinic Taussig Cancer Institute, Cleveland, Ohio, USA
| | - Renee McKay
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Department of Internal Medicine
| | - Alana Christie
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Division of Biostatistics, Department of Clinical Sciences
| | - Isaac A Bowman
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Department of Internal Medicine
| | - Vitaly Margulis
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Department of Urology
| | - Ivan Pedrosa
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Department of Urology; Department of Radiology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Christopher Przybycin
- Department of Pathology, Cleveland Clinic Lerner College of Medicine, Cleveland, Ohio, USA
| | - Tao Wang
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Payal Kapur
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Department of Pathology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Brian Rini
- Department of Hematology and Medical Oncology, Cleveland Clinic Taussig Cancer Institute, Cleveland, Ohio, USA
| | - James Brugarolas
- Kidney Cancer Program, Simmons Comprehensive Cancer Center; Department of Internal Medicine
|
22
|
Zhang W, Williams TA, Bhagwath AS, Hiermann JS, Peacock CD, Watkins DN, Ding P, Park JY, Montgomery EA, Forastiere AA, Jie C, Cantarel BL, Pham TH, Wang DH. GEAMP, a novel gastroesophageal junction carcinoma cell line derived from a malignant pleural effusion. Lab Invest 2020; 100:16-26. [PMID: 31292541 PMCID: PMC6920545 DOI: 10.1038/s41374-019-0278-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/14/2019] [Revised: 05/06/2019] [Accepted: 05/06/2019] [Indexed: 02/06/2023] Open
Abstract
Gastroesophageal junction (GEJ) cancer remains a clinically significant disease in Western countries due to its increasing incidence, which mirrors that of esophageal cancer, and poor prognosis. To develop novel and effective approaches for prevention, early detection, and treatment of patients with GEJ cancer, a better understanding of the mechanisms driving pathogenesis and malignant progression of this disease is required. These efforts have been limited by the small number of available cell lines and appropriate preclinical animal models for in vitro and in vivo studies. We have established and characterized a novel GEJ cancer cell line, GEAMP, derived from the malignant pleural effusion of a previously treated GEJ cancer patient. Comprehensive genetic analyses confirmed a clonal relationship between GEAMP cells and the primary tumor. Targeted next-generation sequencing identified 56 nonsynonymous alterations in 51 genes including TP53 and APC, which are commonly altered in GEJ cancer. In addition, multiple copy-number alterations were found including EGFR and K-RAS gene amplifications and loss of CDKN2A and CDKN2B. Histological examination of subcutaneous flank xenografts in nude and NOD-SCID mice showed a carcinoma with mixed squamous and glandular differentiation, suggesting GEAMP cells contain a subpopulation with multipotent potential. Finally, pharmacologic inhibition of the EGFR signaling pathway led to downregulation of key downstream kinases and inhibition of cell proliferation in vitro. Thus, GEAMP represents a valuable addition to the limited number of bona fide GEJ cancer cell lines.
Affiliation(s)
- Wei Zhang
- Esophageal Diseases Center and Division of Hematology-Oncology, Department of Internal Medicine and the Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Taylor A. Williams
- Esophageal Diseases Center and Division of Hematology-Oncology, Department of Internal Medicine and the Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Ankur S. Bhagwath
- Esophageal Diseases Center and Division of Hematology-Oncology, Department of Internal Medicine and the Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Jared S. Hiermann
- Department of Biomedical Engineering, University of Minnesota, Minneapolis, MN, USA
| | - Craig D. Peacock
- Translational Hematology and Oncology Research, Cleveland Clinic, Cleveland, OH, USA
| | - D. Neil Watkins
- The Kinghorn Cancer Centre, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
| | - Peiguo Ding
- Esophageal Diseases Center and Division of Hematology-Oncology, Department of Internal Medicine and the Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Jason Y. Park
- Department of Pathology and the Eugene McDermott Center for Human Growth and Development, UT Southwestern Medical Center, Dallas, TX, USA
| | - Elizabeth A. Montgomery
- Division of Gastrointestinal and Liver Pathology, Department of Pathology, Johns Hopkins Hospital, Baltimore, MD, USA
| | - Arlene A. Forastiere
- Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Chunfa Jie
- Department of Biochemistry and Nutrition, Des Moines University, Des Moines, IA, USA
| | - Brandi L. Cantarel
- Bioinformatics Core Facility, Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Thai H. Pham
- Esophageal Diseases Center and Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA; VA North Texas Health Care System, Dallas, TX, USA
| | - David H. Wang
- Esophageal Diseases Center and Division of Hematology-Oncology, Department of Internal Medicine and the Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA; VA North Texas Health Care System, Dallas, TX, USA
|
23
|
Bartha Á, Győrffy B. Comprehensive Outline of Whole Exome Sequencing Data Analysis Tools Available in Clinical Oncology. Cancers (Basel) 2019; 11:E1725. [PMID: 31690036 PMCID: PMC6895801 DOI: 10.3390/cancers11111725] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 09/30/2019] [Revised: 10/31/2019] [Accepted: 11/01/2019] [Indexed: 12/17/2022] Open
Abstract
Whole exome sequencing (WES) enables the analysis of all protein coding sequences in the human genome. This technology enables the investigation of cancer-related genetic aberrations that are predominantly located in the exonic regions. WES delivers high-throughput results at a reasonable price. Here, we review analysis tools enabling utilization of WES data in clinical and research settings. Technically, WES initially allows the detection of single nucleotide variants (SNVs) and copy number variations (CNVs), and data obtained through these methods can be combined and further utilized. Variant calling algorithms for SNVs range from standalone tools to machine learning-based combined pipelines. Tools for CNV detection compare the number of reads aligned to a dedicated segment. Both SNVs and CNVs help to identify mutations resulting in pharmacologically druggable alterations. The identification of homologous recombination deficiency enables the use of PARP inhibitors. Determining microsatellite instability and tumor mutation burden helps to select patients eligible for immunotherapy. To pave the way for clinical applications, we have to recognize some limitations of WES, including its restricted ability to detect CNVs, low coverage compared to targeted sequencing, and the missing consensus regarding references and minimal application requirements. Recently, Galaxy became the leading platform in non-command line-based WES data processing. The maturation of next-generation sequencing is reinforced by Food and Drug Administration (FDA)-approved methods for cancer screening, detection, and follow-up. WES is on the verge of becoming an affordable and sufficiently evolved technology for everyday clinical use.
Affiliation(s)
- Áron Bartha
- Semmelweis University, Department of Bioinformatics and 2nd Department of Pediatrics, H-1094 Budapest, Hungary.
- TTK Cancer Biomarker Research Group, Institute of Enzymology, Magyar tudósok körútja 2., H-1117 Budapest, Hungary.
| | - Balázs Győrffy
- Semmelweis University, Department of Bioinformatics and 2nd Department of Pediatrics, H-1094 Budapest, Hungary.
- TTK Cancer Biomarker Research Group, Institute of Enzymology, Magyar tudósok körútja 2., H-1117 Budapest, Hungary.
|
24
|
Richters MM, Xia H, Campbell KM, Gillanders WE, Griffith OL, Griffith M. Best practices for bioinformatic characterization of neoantigens for clinical utility. Genome Med 2019; 11:56. [PMID: 31462330 PMCID: PMC6714459 DOI: 10.1186/s13073-019-0666-2] [Citation(s) in RCA: 147] [Impact Index Per Article: 24.5] [Received: 05/08/2019] [Accepted: 08/16/2019] [Indexed: 12/13/2022] Open
Abstract
Neoantigens are newly formed peptides created from somatic mutations that are capable of inducing tumor-specific T cell recognition. Recently, researchers and clinicians have leveraged next generation sequencing technologies to identify neoantigens and to create personalized immunotherapies for cancer treatment. To create a personalized cancer vaccine, neoantigens must be computationally predicted from matched tumor-normal sequencing data, and then ranked according to their predicted capability in stimulating a T cell response. This candidate neoantigen prediction process involves multiple steps, including somatic mutation identification, HLA typing, peptide processing, and peptide-MHC binding prediction. The general workflow has been utilized for many preclinical and clinical trials, but there is no current consensus approach and few established best practices. In this article, we review recent discoveries, summarize the available computational tools, and provide analysis considerations for each step, including neoantigen prediction, prioritization, delivery, and validation methods. In addition to reviewing the current state of neoantigen analysis, we provide practical guidance, specific recommendations, and extensive discussion of critical concepts and points of confusion in the practice of neoantigen characterization for clinical use. Finally, we outline necessary areas of development, including the need to improve HLA class II typing accuracy, to expand software support for diverse neoantigen sources, and to incorporate clinical response data to improve neoantigen prediction algorithms. The ultimate goal of neoantigen characterization workflows is to create personalized vaccines that improve patient outcomes in diverse cancer types.
Affiliation(s)
- Megan M Richters
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA
| | - Huiming Xia
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA
| | - Katie M Campbell
- Division of Hematology and Oncology, Medical Plaza Driveway, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, 90024, USA
| | - William E Gillanders
- Department of Surgery, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Obi L Griffith
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA.
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- Department of Genetics, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA.
| | - Malachi Griffith
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- McDonnell Genome Institute, Forest Park Avenue, Washington University School of Medicine, St. Louis, MO, 63108, USA.
- Siteman Cancer Center, Parkview Place, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- Department of Genetics, South Euclid Avenue, Washington University School of Medicine, St. Louis, MO, 63110, USA.
|
25
|
Zhu M, Lu T, Jia Y, Luo X, Gopal P, Li L, Odewole M, Renteria V, Singal AG, Jang Y, Ge K, Wang SC, Sorouri M, Parekh JR, MacConmara MP, Yopp AC, Wang T, Zhu H. Somatic Mutations Increase Hepatic Clonal Fitness and Regeneration in Chronic Liver Disease. Cell 2019; 177:608-621.e12. [PMID: 30955891 PMCID: PMC6519461 DOI: 10.1016/j.cell.2019.03.026] [Citation(s) in RCA: 175] [Impact Index Per Article: 29.2] [Received: 11/27/2018] [Revised: 02/20/2019] [Accepted: 03/11/2019] [Indexed: 12/19/2022]
Abstract
Normal tissues accumulate genetic changes with age, but it is unknown if somatic mutations promote clonal expansion of non-malignant cells in the setting of chronic degenerative diseases. Exome sequencing of diseased liver samples from 82 patients revealed a complex mutational landscape in cirrhosis. Additional ultra-deep sequencing identified recurrent mutations in PKD1, PPARGC1B, KMT2D, and ARID1A. The number and size of mutant clones increased as a function of fibrosis stage and tissue damage. To interrogate the functional impact of mutated genes, a pooled in vivo CRISPR screening approach was established. In agreement with sequencing results, examination of 147 genes again revealed that loss of Pkd1, Kmt2d, and Arid1a promoted clonal expansion. Conditional heterozygous deletion of these genes in mice was also hepatoprotective in injury assays. Pre-malignant somatic alterations are often viewed through the lens of cancer, but we show that mutations can promote regeneration, likely independent of carcinogenesis.
Affiliation(s)
- Min Zhu
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Tianshi Lu
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA, 75390
| | - Yuemeng Jia
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Xin Luo
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Purva Gopal
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Lin Li
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Mobolaji Odewole
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Veronica Renteria
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Amit G Singal
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | | | - Kai Ge
- NIDDK, NIH, Bethesda, MD 20892, USA
| | - Sam C Wang
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Mahsa Sorouri
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Justin R Parekh
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Malcolm P MacConmara
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Adam C Yopp
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Tao Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA, 75390; Kidney Cancer Program, Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA, 75390.
| | - Hao Zhu
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
|
26
|
Calling Variants in the Clinic: Informed Variant Calling Decisions Based on Biological, Clinical, and Laboratory Variables. Comput Struct Biotechnol J 2019; 17:561-569. [PMID: 31049166 PMCID: PMC6482431 DOI: 10.1016/j.csbj.2019.04.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/21/2018] [Revised: 03/12/2019] [Accepted: 04/03/2019] [Indexed: 01/10/2023] Open
Abstract
Deep sequencing genomic analysis is becoming increasingly common in clinical research and practice, enabling accurate identification of diagnostic, prognostic, and predictive determinants. Variant calling, distinguishing between true mutations and experimental errors, is a central task of genomic analysis and often requires sophisticated statistical, computational, and/or heuristic techniques. Although variant callers seek to overcome noise inherent in biological experiments, variant calling can be significantly affected by outside factors including those used to prepare, store, and analyze samples. The goal of this review is to discuss known experimental features, such as sample preparation, library preparation, and sequencing, alongside diverse biological and clinical variables, and evaluate their effect on variant caller selection and optimization.
|
27
|
Saini N, Gordenin DA. Somatic mutation load and spectra: A record of DNA damage and repair in healthy human cells. Environ Mol Mutagen 2018; 59:672-686. [PMID: 30152078 PMCID: PMC6188803 DOI: 10.1002/em.22215] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 02/16/2018] [Revised: 06/07/2018] [Accepted: 06/11/2018] [Indexed: 05/31/2023]
Abstract
Somatic genome instability is a hallmark of cancer genomes and has been linked to aging and a variety of other pathologies. Large-scale cancer genome and exome sequencing have revealed that mutation load and spectra in cancers can be influenced by environmental exposures, the anatomical site of exposures, and tissue type. There is now an abundance of data favoring the hypothesis that a substantial portion of the mutations in cancers originate prior to carcinogenesis in stem cells of the healthy individual. Rapid advances in sequencing of noncancer cells from healthy humans have shown that their mutation loads and spectra resemble cancer data. Similar to cancer genomes, mutation profiles of healthy cells show marked intra-individual variation, thus providing a metric of the various factors (environmental and endogenous) involved in mutagenesis in these individuals. This review focuses on the current methodologies to measure mutation loads and to determine mutation signatures for evaluating the environmental and endogenous sources of DNA damage in human somatic cells. We anticipate that in the future, such large-scale studies aimed at exploring the landscapes of somatic mutations across different cell types in healthy people would provide a valuable resource for designing personalized preventative strategies against diseases associated with somatic genome instability. Environ. Mol. Mutagen. 59:672-686, 2018. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
Affiliation(s)
- Natalie Saini
- Genome Integrity and Structural Biology Laboratory, National Institute of Environmental Health Sciences, US National Institutes of Health, Research Triangle Park, North Carolina, USA
| | - Dmitry A. Gordenin
- Genome Integrity and Structural Biology Laboratory, National Institute of Environmental Health Sciences, US National Institutes of Health, Research Triangle Park, North Carolina, USA
|
28
|
Robbe P, Popitsch N, Knight SJL, Antoniou P, Becq J, He M, Kanapin A, Samsonova A, Vavoulis DV, Ross MT, Kingsbury Z, Cabes M, Ramos SDC, Page S, Dreau H, Ridout K, Jones LJ, Tuff-Lacey A, Henderson S, Mason J, Buffa FM, Verrill C, Maldonado-Perez D, Roxanis I, Collantes E, Browning L, Dhar S, Damato S, Davies S, Caulfield M, Bentley DR, Taylor JC, Turnbull C, Schuh A. Clinical whole-genome sequencing from routine formalin-fixed, paraffin-embedded specimens: pilot study for the 100,000 Genomes Project. Genet Med 2018; 20:1196-1205. [PMID: 29388947 PMCID: PMC6520241 DOI: 10.1038/gim.2017.241] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Received: 05/19/2017] [Accepted: 11/06/2017] [Indexed: 12/16/2022] Open
Abstract
PURPOSE: Fresh-frozen (FF) tissue is the optimal source of DNA for whole-genome sequencing (WGS) of cancer patients. However, it is not always available, limiting the widespread application of WGS in clinical practice. We explored the viability of using formalin-fixed, paraffin-embedded (FFPE) tissues, available routinely for cancer patients, as a source of DNA for clinical WGS. METHODS: We conducted a prospective study using DNAs from matched FF, FFPE, and peripheral blood germ-line specimens collected from 52 cancer patients (156 samples) following routine diagnostic protocols. We compared somatic variants detected in FFPE and matching FF samples. RESULTS: We found the single-nucleotide variant agreement reached 71% across the genome and somatic copy-number alterations (CNAs) detection from FFPE samples was suboptimal (0.44 median correlation with FF) due to nonuniform coverage. CNA detection was improved significantly with lower reverse crosslinking temperature in FFPE DNA extraction (80 °C or 65 °C depending on the methods). Our final data showed somatic variant detection from FFPE for clinical decision making is possible. We detected 98% of clinically actionable variants (including 30/31 CNAs). CONCLUSION: We present the first prospective WGS study of cancer patients using FFPE specimens collected in a routine clinical environment proving WGS can be applied in the clinic.
Affiliation(s)
- Pauline Robbe
- Oxford Molecular Diagnostics Centre, Radcliffe Department of Medicine, University of Oxford, Oxford, UK.
| | - Niko Popitsch
- Wellcome Trust Centre of Human Genetics, University of Oxford, Old Road Campus Research Building, Oxford, UK
| | - Samantha J L Knight
- Wellcome Trust Centre of Human Genetics, University of Oxford, Old Road Campus Research Building, Oxford, UK
| | - Pavlos Antoniou
- Oxford Molecular Diagnostics Centre, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Jennifer Becq
- Illumina Cambridge Ltd., Chesterford Research Park, Saffron Walden, UK
| | - Miao He
- Illumina Cambridge Ltd., Chesterford Research Park, Saffron Walden, UK
| | | | | | - Dimitrios V Vavoulis
- Oxford Molecular Diagnostics Centre, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Mark T Ross
- Illumina Cambridge Ltd., Chesterford Research Park, Saffron Walden, UK
| | - Zoya Kingsbury
- Illumina Cambridge Ltd., Chesterford Research Park, Saffron Walden, UK
| | - Maite Cabes
- Oxford Molecular Diagnostics Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Trust, Oxford, UK
| | - Sara D C Ramos
- Oxford Molecular Diagnostics Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Trust, Oxford, UK
| | - Suzanne Page
- Oxford Molecular Diagnostics Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Trust, Oxford, UK
| | - Helene Dreau
- Oxford Molecular Diagnostics Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Trust, Oxford, UK
| | - Kate Ridout
- Oxford Molecular Diagnostics Centre, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Louise J Jones
- Genomics England, William Harvey Research Institute, Queen Mary University of London, London, UK
| | - Alice Tuff-Lacey
- Genomics England, William Harvey Research Institute, Queen Mary University of London, London, UK
| | - Shirley Henderson
- Oxford Molecular Diagnostics Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Trust, Oxford, UK
| | - Joanne Mason
- Genomics England, William Harvey Research Institute, Queen Mary University of London, London, UK
| | - Francesca M Buffa
- Computational Biology and Integrative Genomics, Department of Oncology, University of Oxford, Oxford, UK
| | - Clare Verrill
- Nuffield Department of Surgical Sciences, University of Oxford, John Radcliffe Hospital, Oxford, UK
| | - David Maldonado-Perez
- Department of Cellular Pathology, Oxford University Hospital Foundation Trust, Oxford, UK
| | - Ioannis Roxanis
- Department of Cellular Pathology, Oxford University Hospital Foundation Trust, Oxford, UK
| | - Elena Collantes
- Department of Cellular Pathology, Oxford University Hospital Foundation Trust, Oxford, UK
| | - Lisa Browning
- Department of Cellular Pathology, Oxford University Hospital Foundation Trust, Oxford, UK
| | - Sunanda Dhar
- Department of Cellular Pathology, Oxford University Hospital Foundation Trust, Oxford, UK
| | - Stephen Damato
- Department of Cellular Pathology, Oxford University Hospital Foundation Trust, Oxford, UK
| | - Susan Davies
- Department of Cellular Pathology, Oxford University Hospital Foundation Trust, Oxford, UK
| | - Mark Caulfield
- Genomics England, William Harvey Research Institute, Queen Mary University of London, London, UK
- NIHR Biomedical Research Centre at Barts Health NHS Trust, London, UK
| | - David R Bentley
- Illumina Cambridge Ltd., Chesterford Research Park, Saffron Walden, UK
| | - Jenny C Taylor
- Wellcome Trust Centre of Human Genetics, University of Oxford, Old Road Campus Research Building, Oxford, UK
- NIHR Comprehensive Biomedical Research Centre, Oxford, UK
| | - Clare Turnbull
- Genomics England, William Harvey Research Institute, Queen Mary University of London, London, UK
- Department of Cellular Pathology, Oxford University Hospital Foundation Trust, Oxford, UK
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - Anna Schuh
- Oxford Molecular Diagnostics Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Trust, Oxford, UK
- NIHR Comprehensive Biomedical Research Centre, Oxford, UK
- Oxford Molecular Diagnostics Centre, Department of Oncology, University of Oxford, Oxford, UK
|
29
|
Semeraro R, Orlandini V, Magi A. Xome-Blender: A novel cancer genome simulator. PLoS One 2018; 13:e0194472. [PMID: 29621252 PMCID: PMC5886411 DOI: 10.1371/journal.pone.0194472] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 09/06/2017] [Accepted: 02/05/2018] [Indexed: 11/18/2022] Open
Abstract
The adoption of next generation sequencing based methods in cancer research allowed for the investigation of the complex genetic structure of tumor samples. In the last few years, considerable importance was given to the research of somatic variants and several computational approaches were developed for this purpose. Despite continuous improvements to these programs, the validation of their results is a hard challenge due to multiple sources of error. To overcome this drawback, different simulation approaches are used to generate synthetic samples, but they are often based on the addition of artificial mutations that mimic the complexity of genomic variations. For these reasons, we developed a novel software, Xome-Blender, that generates synthetic cancer genomes with user defined features such as the number of subclones, the number of somatic variants and the presence of copy number alterations (CNAs), without the addition of any synthetic element. The singularity of our method is the “morphological approach” used to generate mutation events. To demonstrate the power of our tool we used it to address the hard challenge of evaluating the performance of nine state-of-the-art somatic variant calling methods for small and large variants (VarScan2, MuTect, Shimmer, BCFtools, Strelka, EXCAVATOR2, Control-FREEC and CopywriteR). Through these analyses we observed that by using Xome-Blender data it is possible to appraise small differences between their performance and we have designated VarScan2 and EXCAVATOR2 as the best tools for this kind of application. Xome-Blender is unix-based, licensed under the GPLv3 and freely available at https://github.com/rsemeraro/XomeBlender.
Affiliation(s)
- Roberto Semeraro
- Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
| | - Valerio Orlandini
- Medical Genetics Unit, Meyer Children’s University Hospital, Florence, Italy
| | - Alberto Magi
- Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
|
30
|
Chen YC, Sudre G, Sharp W, Donovan F, Chandrasekharappa SC, Hansen N, Elnitski L, Shaw P. Neuroanatomic, epigenetic and genetic differences in monozygotic twins discordant for attention deficit hyperactivity disorder. Mol Psychiatry 2018; 23:683-690. [PMID: 28322272 PMCID: PMC5914518 DOI: 10.1038/mp.2017.45] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 06/15/2016] [Revised: 01/10/2017] [Accepted: 01/17/2017] [Indexed: 12/18/2022]
Abstract
The study of monozygotic twins discordant for attention deficit hyperactivity disorder can elucidate mechanisms that contribute to the disorder, which affects ~7% of children. First, using in vivo neuroanatomic imaging on 14 pairs of monozygotic twins (mean age 9.7, s.d. 1.9 years), we found that discordance for the disorder is mirrored by differing dimensions of deep brain structures (the striatum and cerebellum), but not the cerebral cortex. Next, using whole-blood DNA from the same twins, we found a significant enrichment of epigenetic differences in genes expressed in these 'discordant' brain structures. Specifically, there is differential methylation of probes lying in the shore and shelf and enhancer regions of striatal and cerebellar genes. Notably, gene sets pertaining to the cerebral cortex (which did not differ in volume between affected and unaffected twins) were not enriched by differentially methylated probes. Genotypic differences between the twin pairs (such as copy number and rare, single-nucleotide variants) did not contribute to phenotypic discordance. Pathway analyses of the genes implicated by the most differentially methylated probes implicated γ-aminobutyric acid (GABA), dopamine and serotonin neurotransmitter systems. The study illustrates how neuroimaging can help guide the search for epigenomic mechanisms in neurodevelopmental disorders.
Affiliation(s)
- Yun-Ching Chen
- Genomic Functional Analysis Section, Translational and Functional Genomics Branch, NHGRI/NIH, Bethesda
| | - Gustavo Sudre
- Neurobehavioral Clinical Research Section, Social and Behavioral Research Branch, NHGRI/NIH, Bethesda
| | - Wendy Sharp
- Neurobehavioral Clinical Research Section, Social and Behavioral Research Branch, NHGRI/NIH, Bethesda
| | - Frank Donovan
- Genomics Core and Cancer Genomics Unit, Cancer Genetics and Comparative Genomics Branch, NHGRI/NIH, Bethesda
| | | | | | - Laura Elnitski
- Genomic Functional Analysis Section, Translational and Functional Genomics Branch, NHGRI/NIH, Bethesda
| | - Philip Shaw
- Neurobehavioral Clinical Research Section, Social and Behavioral Research Branch, NHGRI/NIH, Bethesda
|
31
|
Xu C. A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data. Comput Struct Biotechnol J 2018; 16:15-24. [PMID: 29552334 PMCID: PMC5852328 DOI: 10.1016/j.csbj.2018.01.003] [Citation(s) in RCA: 153] [Impact Index Per Article: 21.9] [Received: 09/08/2017] [Revised: 01/20/2018] [Accepted: 01/28/2018] [Indexed: 02/06/2023] Open
Abstract
Detection of somatic mutations holds great potential in cancer treatment and has been a very active research field in the past few years, especially since the breakthrough of the next-generation sequencing technology. A collection of variant calling pipelines has been developed with different underlying models, filters, input data requirements, and targeted applications. This review aims to enumerate these unique features of the state-of-the-art variant callers, in the hope of providing a practical guide for selecting the appropriate pipeline for specific applications. We will focus on the detection of somatic single nucleotide variants, ranging from traditional variant callers based on whole genome or exome sequencing of paired tumor-normal samples to recent low-frequency variant callers designed for targeted sequencing protocols with unique molecular identifiers. The variant callers have been extensively benchmarked with inconsistent performances across these studies. We will review the reference materials, datasets, and performance metrics that have been used in the benchmarking studies. In the end, we will discuss emerging trends and future directions of the variant calling algorithms.
Affiliation(s)
- Chang Xu
- Life Science Research and Foundation, Qiagen Sciences, Inc., 6951 Executive Way, Frederick, Maryland 21703, USA
|
32
|
Krøigård AB, Larsen MJ, Lænkholm AV, Knoop AS, Jensen JD, Bak M, Mollenhauer J, Thomassen M, Kruse TA. Identification of metastasis driver genes by massive parallel sequencing of successive steps of breast cancer progression. PLoS One 2018; 13:e0189887. [PMID: 29293529 PMCID: PMC5749725 DOI: 10.1371/journal.pone.0189887] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 07/14/2017] [Accepted: 12/04/2017] [Indexed: 12/17/2022] Open
Abstract
Cancer results from alterations at essential genomic sites and is characterized by uncontrolled cell proliferation, invasion and metastasis. Identification of driver genes of metastatic progression is essential, as metastases, not primary tumors, are fatal. To gain insight into the mutational concordance between different steps of malignant progression we performed exome sequencing and validation with targeted deep sequencing of successive steps of malignant progression from pre-invasive stages to asynchronous distant metastases in six breast cancer patients. Using the ratio of non-synonymous to synonymous mutations, a surprisingly large number of cancer driver genes, ranging between 3 and 145, were estimated to confer a selective advantage in the studied primary tumors. We report a substantial amount of metastasis specific mutations and a number of novel putative metastasis driver genes. Most notable are the DCC, ABCA13, TIAM2, CREBBP, BCL6B and ZNF185 genes, mainly mutated exclusively in metastases and highly likely driver genes of metastatic progression. We find different genes and pathways to be affected at different steps of malignant progression. The Adherens junction pathway is affected in four of the six studied patients and this pathway most likely plays a vital role in the metastatic process.
Affiliation(s)
- Anne Bruun Krøigård
- Department of Clinical Genetics, Odense University Hospital, Odense, Denmark
- Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
| | - Martin Jakob Larsen
- Department of Clinical Genetics, Odense University Hospital, Odense, Denmark
- Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
| | | | - Ann S. Knoop
- Department of Oncology, Rigshospitalet, Copenhagen, Denmark
| | | | - Martin Bak
- Department of Pathology, Odense University Hospital, Odense, Denmark
| | - Jan Mollenhauer
- Lundbeckfonden Center of Excellence NanoCAN, University of Southern Denmark, Odense, Denmark
- Molecular Oncology Group, Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
| | - Mads Thomassen
- Department of Clinical Genetics, Odense University Hospital, Odense, Denmark
- Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
- Lundbeckfonden Center of Excellence NanoCAN, University of Southern Denmark, Odense, Denmark
| | - Torben A. Kruse
- Department of Clinical Genetics, Odense University Hospital, Odense, Denmark
- Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
- Lundbeckfonden Center of Excellence NanoCAN, University of Southern Denmark, Odense, Denmark
|
33
|
Bohnert R, Vivas S, Jansen G. Comprehensive benchmarking of SNV callers for highly admixed tumor data. PLoS One 2017; 12:e0186175. [PMID: 29020110 PMCID: PMC5636151 DOI: 10.1371/journal.pone.0186175] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 01/20/2017] [Accepted: 09/26/2017] [Indexed: 12/30/2022] Open
Abstract
Precision medicine attempts to individualize cancer therapy by matching tumor-specific genetic changes with effective targeted therapies. A crucial first step in this process is the reliable identification of cancer-relevant variants, which is considerably complicated by the impurity and heterogeneity of clinical tumor samples. We compared the impact of admixture of non-cancerous cells and low somatic allele frequencies on the sensitivity and precision of 19 state-of-the-art SNV callers. We studied both whole exome and targeted gene panel data and up to 13 distinct parameter configurations for each tool. We found vast differences among callers. Based on our comprehensive analyses we recommend joint tumor-normal calling with MuTect, EBCall or Strelka for whole exome somatic variant calling, and HaplotypeCaller or FreeBayes for whole exome germline calling. For targeted gene panel data on a single tumor sample, LoFreqStar performed best. We further found that tumor impurity and admixture had a negative impact on precision, and in particular, sensitivity in whole exome experiments. At admixture levels of 60% to 90% sometimes seen in pathological biopsies, sensitivity dropped significantly, even when variants were originally present in the tumor at 100% allele frequency. Sensitivity to low-frequency SNVs improved with targeted panel data, but whole exome data allowed more efficient identification of germline variants. Effective somatic variant calling requires high-quality pathological samples with minimal admixture, a consciously selected sequencing strategy, and the appropriate variant calling tool with settings optimized for the chosen type of data.
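As a rough illustration of the quantities discussed in this abstract, the following minimal Python sketch shows how admixture with non-cancerous cells dilutes the observed allele frequency and how sensitivity and precision are computed against a truth set. It is not the benchmarking code of the study: the function names and the toy variant tuples are hypothetical, and the dilution formula assumes equal copy number in tumor and normal cells at the locus.

```python
def observed_vaf(tumor_af: float, admixture: float) -> float:
    """Expected variant allele fraction in a mixed sample.

    tumor_af  : allele frequency of the variant within tumor cells (0..1)
    admixture : fraction of non-cancerous cells in the sample (0..1)
    Assumption: tumor and normal cells carry the same copy number here.
    """
    return tumor_af * (1.0 - admixture)


def sensitivity_precision(called: set, truth: set) -> tuple:
    """Sensitivity (recall) and precision of a call set against a truth set."""
    true_positives = len(called & truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(called) if called else 0.0
    return sensitivity, precision


# A variant at 100% allele frequency in the tumor, at 60% and 90% admixture.
vaf_60 = observed_vaf(1.0, 0.60)  # about 0.40
vaf_90 = observed_vaf(1.0, 0.90)  # about 0.10

# Toy variants keyed by (chromosome, position, ref, alt); values are made up.
truth = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
         ("chr3", 300, "C", "A"), ("chr4", 400, "T", "G")}
called = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
          ("chr5", 500, "G", "A")}

sens, prec = sensitivity_precision(called, truth)
print(f"VAF at 60% admixture: {vaf_60:.2f}, at 90%: {vaf_90:.2f}")
print(f"sensitivity={sens:.2f}, precision={prec:.2f}")
```

Under these assumptions, a variant fully present in the tumor is expected at only about 10% to 40% of reads in the admixture range the abstract mentions, which is consistent with the reported drop in sensitivity.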
|
34
|
Teer JK, Zhang Y, Chen L, Welsh EA, Cress WD, Eschrich SA, Berglund AE. Evaluating somatic tumor mutation detection without matched normal samples. Hum Genomics 2017; 11:22. [PMID: 28870239 PMCID: PMC5584341 DOI: 10.1186/s40246-017-0118-2] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 05/26/2017] [Accepted: 08/24/2017] [Indexed: 12/30/2022]
Abstract
Background Observations of recurrent somatic mutations in tumors have led to identification and definition of signaling and other pathways that are important for cancer progression and therapeutic targeting. As tumor cells contain both an individual’s inherited genetic variants and somatic mutations, challenges arise in distinguishing these events in massively parallel sequencing datasets. Typically, both a tumor sample and a “normal” sample from the same individual are sequenced and compared; variants observed only in the tumor are considered to be somatic mutations. However, this approach requires two samples for each individual. Results We evaluate a method of detecting somatic mutations in tumor samples for which only a subset of normal samples is available. We describe tuning of the method for detection of mutations in tumors, filtering to remove inherited variants, and comparison of detected mutations to several matched tumor/normal analysis methods. Filtering steps include the use of population variation datasets to remove inherited variants as well as a subset of normal samples to remove technical artifacts. We then directly compare mutation detection with tumor-only and tumor-normal approaches using the same sets of samples. Comparisons are performed using an internal targeted gene sequencing dataset (n = 3380) as well as whole exome sequencing data from The Cancer Genome Atlas project (n = 250). Tumor-only mutation detection shows recall (43–60%) similar to, but precision (20–21%) lower than, current matched tumor/normal approaches (recall 43–73%, precision 30–82%) when compared to a “gold-standard” tumor/normal approach. The inclusion of a small pool of normal samples improves precision, although many variants are still uniquely detected in the tumor-only analysis.
Conclusions A detailed method for somatic mutation detection without matched normal samples enables study of larger numbers of tumor samples, as well as tumor samples for which a matched normal is not available. As sensitivity/recall is similar to tumor/normal mutation detection but precision is lower, tumor-only detection is more appropriate for classification of samples based on known mutations. Although matched tumor-normal analysis is preferred due to higher precision, we demonstrate that mutation detection without matched normal samples is possible for certain applications.
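The comparison reported in this abstract rests on two set operations: removing likely inherited variants and technical artifacts from tumor-only calls, then scoring what remains against a reference call set. The following is a minimal Python sketch, assuming variants are represented as `(chrom, pos, ref, alt)` tuples; the function names and the toy variants are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def filter_tumor_only(
    calls: Iterable[Variant],
    population_variants: Iterable[Variant],
    normal_pool_variants: Iterable[Variant],
) -> Set[Variant]:
    """Drop calls seen in a population dataset or in a pool of normal samples."""
    return set(calls) - set(population_variants) - set(normal_pool_variants)


def precision_recall(calls: Iterable[Variant], truth: Iterable[Variant]) -> Tuple[float, float]:
    """Precision and recall of a call set against a reference ("gold-standard") set.

    Returns 0.0 for a metric whose denominator is empty (a convention chosen here).
    """
    calls, truth = set(calls), set(truth)
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data, invented purely for illustration.
tumor_calls = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
               ("chr2", 300, "T", "G"), ("chr3", 400, "C", "A")}
population = {("chr1", 200, "G", "C")}   # assumed inherited variant
normal_pool = {("chr3", 400, "C", "A")}  # assumed recurrent artifact
gold = {("chr1", 100, "A", "T"), ("chr2", 500, "G", "A")}

filtered = filter_tumor_only(tumor_calls, population, normal_pool)
print(precision_recall(filtered, gold))
```

Exact-match comparison on the four-field key is the simplest choice; real pipelines usually normalize indel representation before comparing call sets.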
Affiliation(s)
- Jamie K Teer: Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Yonghong Zhang: Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Lu Chen: Department of Molecular Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Eric A Welsh: Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- W Douglas Cress: Department of Molecular Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Steven A Eschrich: Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Anders E Berglund: Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
|
35
|
Krøigård AB, Larsen MJ, Brasch-Andersen C, Lænkholm AV, Knoop AS, Jensen JD, Bak M, Mollenhauer J, Thomassen M, Kruse TA. Genomic Analyses of Breast Cancer Progression Reveal Distinct Routes of Metastasis Emergence. Sci Rep 2017; 7:43813. [PMID: 28276460 PMCID: PMC5343450 DOI: 10.1038/srep43813] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 02/23/2016] [Accepted: 01/31/2017] [Indexed: 01/06/2023]
Abstract
A main controversy in cancer research is whether metastatic abilities are present in the most advanced clone of the primary tumor or result from independently acquired aberrations in early disseminated cancer cells as suggested by the linear and the parallel progression models, respectively. The genetic concordance between different steps of malignant progression is mostly unexplored as very few studies have included cancer samples separated by both space and time. We applied whole exome sequencing and targeted deep sequencing to 26 successive samples from six patients with metastatic estrogen receptor (ER)-positive breast cancer. Our data provide support for both linear and parallel progression towards metastasis. We report for the first time evidence of metastasis-to-metastasis seeding in breast cancer. Our results point to three distinct routes of metastasis emergence. This may have profound clinical implications and provides substantial novel molecular insights into the timing and mutational evolution of breast cancer metastasis.
Affiliation(s)
- Anne Bruun Krøigård: Department of Clinical Genetics, Odense University Hospital, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
- Martin Jakob Larsen: Department of Clinical Genetics, Odense University Hospital, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
- Charlotte Brasch-Andersen: Department of Clinical Genetics, Odense University Hospital, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
- Ann S Knoop: Department of Oncology, Rigshospitalet, Copenhagen, Denmark
- Martin Bak: Department of Pathology, Odense University Hospital, Odense, Denmark
- Jan Mollenhauer: Lundbeckfonden Center of Excellence NanoCAN, Odense, Denmark; Molecular Oncology Group, Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Mads Thomassen: Department of Clinical Genetics, Odense University Hospital, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark; Lundbeckfonden Center of Excellence NanoCAN, Odense, Denmark
- Torben A Kruse: Department of Clinical Genetics, Odense University Hospital, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Odense, Denmark; Lundbeckfonden Center of Excellence NanoCAN, Odense, Denmark
|
36
|
Kansler ER, Verma A, Langdon EM, Simon-Vermot T, Yin A, Lee W, Attiyeh M, Elemento O, White RM. Melanoma genome evolution across species. BMC Genomics 2017; 18:136. [PMID: 28173755 PMCID: PMC5297047 DOI: 10.1186/s12864-017-3518-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 09/24/2016] [Accepted: 01/26/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Cancer genomes evolve in both space and time, which contributes to the genetic heterogeneity that underlies tumor progression and drug resistance. In human melanoma, identifying mechanistically important events in tumor evolution is hampered due to the high background mutation rate from ultraviolet (UV) light. Cross-species oncogenomics is a powerful tool for identifying these core events, in which transgenically well-defined animal models of cancer are compared to human cancers to identify key conserved alterations. RESULTS We use a zebrafish model of tumor progression and drug resistance for cross-species genomic analysis in melanoma. Zebrafish transgenic tumors are initiated with just 2 genetic lesions, BRAFV600E and p53-/-, yet take 4-6 months to appear, at which time whole genome sequencing demonstrated >3,000 new mutations. An additional 4-month exposure to the BRAF inhibitor vemurafenib resulted in a highly drug resistant tumor that showed 3 additional new DNA mutations in the genes BUB1B, PINK1, and COL16A1. These genetic changes in drug resistance are accompanied by a massive reorganization of the transcriptome, with differential RNA expression of over 800 genes, centered on alterations in cAMP and PKA signaling. By comparing both the DNA and mRNA changes to a large panel of human melanomas, we find that there is a highly significant enrichment of these alterations in human patients with vemurafenib resistant disease. CONCLUSIONS Our results suggest that targeting of alterations that are conserved between zebrafish and humans may offer new avenues for therapeutic intervention. The approaches described here will be broadly applicable to the diverse array of cancer models available in the zebrafish, which can be used to inform human cancer genomics.
Affiliation(s)
- Emily R Kansler: Memorial Sloan Kettering Cancer Center, Cancer Biology & Genetics, New York, USA
- Akanksha Verma: Weill-Cornell Medical College, Institute for Computational Biomedicine, New York, USA
- Erin M Langdon: Memorial Sloan Kettering Cancer Center, Cancer Biology & Genetics, New York, USA
- Theresa Simon-Vermot: Memorial Sloan Kettering Cancer Center, Cancer Biology & Genetics, New York, USA
- Alexandra Yin: Memorial Sloan Kettering Cancer Center, Cancer Biology & Genetics, New York, USA
- William Lee: Memorial Sloan Kettering Cancer Center, Computational Biology, New York, USA
- Marc Attiyeh: Memorial Sloan Kettering Cancer Center, The David M. Rubenstein Center for Pancreatic Cancer Research, New York, USA
- Olivier Elemento: Weill-Cornell Medical College, Institute for Computational Biomedicine, New York, USA
- Richard M White: Memorial Sloan Kettering Cancer Center, Cancer Biology & Genetics, New York, USA; Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, USA
|
37
|
iPSCs and fibroblast subclones from the same fibroblast population contain comparable levels of sequence variations. Proc Natl Acad Sci U S A 2017; 114:1964-1969. [PMID: 28167771 DOI: 10.1073/pnas.1616035114] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 12/22/2022]
Abstract
Genome integrity of induced pluripotent stem cells (iPSCs) has been extensively studied in recent years, but it is still unclear whether iPSCs contain more genomic variations than cultured somatic cells. One important question is the origin of genomic variations detected in iPSCs, specifically whether iPSC reprogramming induces such variations. Here, we undertook a unique approach by deriving fibroblast subclones and clonal iPSC lines from the same fibroblast population and applied next-generation sequencing to compare genomic variations in these lines. Targeted deep sequencing of parental fibroblasts revealed that most variants detected in clonal iPSCs and fibroblast subclones were rare variants inherited from the parental fibroblasts. Only a small number of variants remained undetectable in the parental fibroblasts, which were thus likely to be de novo. Importantly, the clonal iPSCs and fibroblast subclones contained comparable numbers of de novo variants. Collectively, our data suggest that iPSC reprogramming is not mutagenic.
|
38
|
do Valle ÍF, Giampieri E, Simonetti G, Padella A, Manfrini M, Ferrari A, Papayannidis C, Zironi I, Garonzi M, Bernardi S, Delledonne M, Martinelli G, Remondini D, Castellani G. Optimized pipeline of MuTect and GATK tools to improve the detection of somatic single nucleotide polymorphisms in whole-exome sequencing data. BMC Bioinformatics 2016; 17:341. [PMID: 28185561 PMCID: PMC5123378 DOI: 10.1186/s12859-016-1190-7] [Citation(s) in RCA: 79] [Impact Index Per Article: 8.8] [Indexed: 12/30/2022]
Abstract
BACKGROUND Detecting somatic mutations in whole exome sequencing data of cancer samples has become a popular approach for profiling cancer development, progression and chemotherapy resistance. Several studies have proposed software packages, filters and parametrizations. However, many research groups reported low concordance among different methods. We aimed to develop a pipeline which detects a wide range of single nucleotide mutations with high validation rates. We combined two standard tools, the Genome Analysis Toolkit (GATK) and MuTect, to create the GATK-LODN method. As proof of principle, we applied our pipeline to exome sequencing data of hematological (Acute Myeloid and Acute Lymphoblastic Leukemias) and solid (Gastrointestinal Stromal Tumor and Lung Adenocarcinoma) tumors. We performed experiments on simulated data to test the sensitivity and specificity of our pipeline. RESULTS MuTect had the highest validation rate (90%) for mutation detection but detected a limited number of somatic mutations. GATK detected a high number of mutations but with low specificity. GATK-LODN increased the performance of GATK variant detection (from 5 of 14 to 3 of 4 confirmed variants), while preserving mutations not detected by MuTect. However, GATK-LODN filtered more variants in the hematological samples than in the solid tumors. Experiments on simulated data demonstrated that GATK-LODN increased both the specificity and the sensitivity of GATK results. CONCLUSION We presented a pipeline that detects a wide range of somatic single nucleotide variants, with good validation rates, from exome sequencing data of cancer samples. We also showed the advantage of combining standard algorithms to create the GATK-LODN method, which increased the specificity and sensitivity of GATK results. This pipeline can be helpful in discovery studies that aim to profile the somatic mutational landscape of cancer genomes.
Affiliation(s)
- Ítalo Faria do Valle: Department of Physics and Astronomy, University of Bologna, Bologna, Italy; CAPES Foundation, Ministry of Education of Brazil, Brasília, DF, Brazil
- Enrico Giampieri: Department of Physics and Astronomy, University of Bologna, Bologna, Italy
- Giorgia Simonetti: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
- Antonella Padella: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
- Marco Manfrini: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
- Anna Ferrari: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
- Cristina Papayannidis: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
- Isabella Zironi: Department of Physics and Astronomy, University of Bologna, Bologna, Italy
- Marianna Garonzi: Department of Biotechnology, University of Verona, Verona, Italy
- Simona Bernardi: Unit of Blood Diseases and Stem Cell Transplantation, Department of Clinical and Experimental Sciences, University of Brescia, Brescia, Italy
- Massimo Delledonne: Department of Biotechnology, University of Verona, Verona, Italy; Personal Genomics, Verona, Italy
- Giovanni Martinelli: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
- Daniel Remondini: Department of Physics and Astronomy, University of Bologna, Bologna, Italy
- Gastone Castellani: Department of Physics and Astronomy, University of Bologna, Bologna, Italy
|
39
|
Krøigård AB, Thomassen M, Lænkholm AV, Kruse TA, Larsen MJ. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data. PLoS One 2016; 11:e0151664. [PMID: 27002637 PMCID: PMC4803342 DOI: 10.1371/journal.pone.0151664] [Citation(s) in RCA: 115] [Impact Index Per Article: 12.8] [Received: 10/26/2015] [Accepted: 03/02/2016] [Indexed: 02/03/2023]
Abstract
Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had a markedly diverse impact on individual callers; for some callers, increased sequencing depth greatly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths.
Affiliation(s)
- Anne Bruun Krøigård (corresponding author): Department of Clinical Genetics, Odense University Hospital, Sdr. Boulevard 29, 5000, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Winsløvparken 19, 5000, Odense, Denmark
- Mads Thomassen: Department of Clinical Genetics, Odense University Hospital, Sdr. Boulevard 29, 5000, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Winsløvparken 19, 5000, Odense, Denmark
- Anne-Vibeke Lænkholm: Department of Pathology, Slagelse Hospital, Ingemannsvej 18, 4200, Slagelse, Denmark
- Torben A. Kruse: Department of Clinical Genetics, Odense University Hospital, Sdr. Boulevard 29, 5000, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Winsløvparken 19, 5000, Odense, Denmark
- Martin Jakob Larsen: Department of Clinical Genetics, Odense University Hospital, Sdr. Boulevard 29, 5000, Odense, Denmark; Human Genetics, Institute of Clinical Research, University of Southern Denmark, Winsløvparken 19, 5000, Odense, Denmark
|
40
|
Budeus B, Timm J, Hoffmann D. SeqFeatR for the Discovery of Feature-Sequence Associations. PLoS One 2016; 11:e0146409. [PMID: 26731669 PMCID: PMC4701496 DOI: 10.1371/journal.pone.0146409] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 08/19/2015] [Accepted: 12/15/2015] [Indexed: 12/20/2022]
Abstract
Specific selection pressures often lead to specifically mutated genomes. The open source software SeqFeatR has been developed to identify associations between mutation patterns in biological sequences and specific selection pressures ("features"). For instance, SeqFeatR has been used to discover in viral protein sequences new T cell epitopes for hosts of given HLA types. SeqFeatR supports frequentist and Bayesian methods for the discovery of statistical sequence-feature associations. Moreover, it offers novel ways to visualize results of the statistical analyses and to relate them to further properties. In this article we demonstrate various functions of SeqFeatR with real data. The most frequently used set of functions is also provided by a web server. SeqFeatR is implemented as R package and freely available from the R archive CRAN (http://cran.r-project.org/web/packages/SeqFeatR/index.html). The package includes a tutorial vignette. The software is distributed under the GNU General Public License (version 3 or later). The web server URL is https://seqfeatr.zmb.uni-due.de.
Affiliation(s)
- Bettina Budeus: Research Group Bioinformatics, Faculty of Biology, University of Duisburg-Essen, Essen, NRW, Germany
- Jörg Timm: Institute for Virology, University Hospital Düsseldorf, Düsseldorf, NRW, Germany
- Daniel Hoffmann (corresponding author): Research Group Bioinformatics, Faculty of Biology, University of Duisburg-Essen, Essen, NRW, Germany
|
41
|
Big Data and Cancer Research. BIG DATA ANALYTICS 2016. [DOI: 10.1007/978-81-322-3628-3_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
42
|
Bao R, Hernandez K, Huang L, Kang W, Bartom E, Onel K, Volchenboum S, Andrade J. ExScalibur: A High-Performance Cloud-Enabled Suite for Whole Exome Germline and Somatic Mutation Identification. PLoS One 2015; 10:e0135800. [PMID: 26271043 PMCID: PMC4535852 DOI: 10.1371/journal.pone.0135800] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/02/2015] [Accepted: 07/27/2015] [Indexed: 12/30/2022]
Abstract
Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly-available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modulated pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignment and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open-source and is also available as a public image on Amazon cloud.
Affiliation(s)
- Riyue Bao: Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
- Kyle Hernandez: Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
- Lei Huang: Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
- Wenjun Kang: Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
- Elizabeth Bartom: Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
- Kenan Onel: Department of Pediatrics, The University of Chicago, Chicago, Illinois, United States of America
- Samuel Volchenboum (corresponding author): Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America; Department of Pediatrics, The University of Chicago, Chicago, Illinois, United States of America; Computation Institute, The University of Chicago, Chicago, Illinois, United States of America
- Jorge Andrade (corresponding author): Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
|
43
|
Bao R, Huang L, Andrade J, Tan W, Kibbe WA, Jiang H, Feng G. Review of current methods, applications, and data management for the bioinformatics analysis of whole exome sequencing. Cancer Inform 2014; 13:67-82. [PMID: 25288881 PMCID: PMC4179624 DOI: 10.4137/cin.s13779] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.7] [Received: 04/22/2014] [Revised: 07/06/2014] [Accepted: 07/07/2014] [Indexed: 12/21/2022]
Abstract
The advent of next-generation sequencing technologies has greatly promoted advances in the study of human diseases at the genomic, transcriptomic, and epigenetic levels. Exome sequencing, where the coding region of the genome is captured and sequenced at a deep level, has proven to be a cost-effective method to detect disease-causing variants and discover gene targets. In this review, we outline the general framework of whole exome sequence data analysis. We focus on established bioinformatics tools and applications that support five analytical steps: raw data quality assessment, pre-processing, alignment, post-processing, and variant analysis (detection, annotation, and prioritization). We evaluate the performance of open-source alignment programs and variant calling tools using simulated and benchmark datasets, and highlight the challenges posed by the lack of concordance among variant detection tools. Based on these results, we recommend adopting multiple tools and resources to reduce false positives and increase the sensitivity of variant calling. In addition, we briefly discuss the current status and solutions for big data management, analysis, and summarization in the field of bioinformatics.
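The recommendation to combine several tools can be made concrete with a simple k-of-n vote over call sets. This is a generic sketch rather than the procedure used in the review: the function name `vote`, the caller labels, and the representation of variants as hashable keys are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Set


def vote(call_sets: Dict[str, Iterable[Hashable]], min_votes: int) -> Set[Hashable]:
    """Keep every variant reported by at least `min_votes` callers.

    min_votes=1 gives the union (favours sensitivity);
    min_votes=len(call_sets) gives the intersection (favours precision).
    """
    if not 1 <= min_votes <= len(call_sets):
        raise ValueError("min_votes must be between 1 and the number of callers")
    # set() guards against a caller reporting the same variant twice.
    counts = Counter(v for calls in call_sets.values() for v in set(calls))
    return {variant for variant, n in counts.items() if n >= min_votes}


# Hypothetical caller names and variant keys.
calls = {
    "caller_a": {"chr1:100:A>T", "chr1:200:G>C", "chr2:300:T>G"},
    "caller_b": {"chr1:200:G>C", "chr2:300:T>G"},
    "caller_c": {"chr2:300:T>G", "chr3:400:C>A"},
}
for k in (1, 2, 3):
    print(k, sorted(vote(calls, k)))
```

Raising `min_votes` trades sensitivity for fewer false positives, so the threshold is normally tuned against a benchmark dataset.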
Affiliation(s)
- Riyue Bao: Center for Research Informatics, The University of Chicago, Chicago, IL, USA
- Lei Huang: Center for Research Informatics, The University of Chicago, Chicago, IL, USA
- Jorge Andrade: Center for Research Informatics, The University of Chicago, Chicago, IL, USA
- Wei Tan: IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA
- Warren A Kibbe: Biomedical Informatics Center (NUBIC), Clinical and Translational Sciences Institute (NUCATS), Northwestern University, Chicago, IL, USA
- Hongmei Jiang: Department of Statistics, Northwestern University, Evanston, IL, USA
- Gang Feng: Biomedical Informatics Center (NUBIC), Clinical and Translational Sciences Institute (NUCATS), Northwestern University, Chicago, IL, USA
|
44
|
Teer JK. An improved understanding of cancer genomics through massively parallel sequencing. Transl Cancer Res 2014; 3:243-259. [PMID: 26146607 PMCID: PMC4486294 DOI: 10.3978/j.issn.2218-676x.2014.05.05] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/21/2022]
Abstract
DNA sequencing technology advances have enabled genetic investigation of more samples in a shorter time than has previously been possible. Furthermore, the ability to analyze and understand large sequencing datasets has improved due to concurrent advances in sequence data analysis methods and software tools. Constant improvements to both technology and analytic approaches in this fast moving field are evidenced by many recent publications of computational methods, as well as biological results linking genetic events to human disease. Cancer in particular has been the subject of intense investigation, owing to the genetic underpinnings of this complex collection of diseases. New massively-parallel sequencing (MPS) technologies have enabled the investigation of thousands of samples, divided across tens of different tumor types, resulting in new driver gene identification, mutagenic pattern characterization, and other newly uncovered features of tumor biology. This review will focus both on methods and recent results: current analytical approaches to DNA and RNA sequencing will be presented followed by a review of recent pan-cancer sequencing studies. This overview of methods and results will not only highlight the recent advances in cancer genomics, but also the methods and tools used to accomplish these advancements in a constantly and rapidly improving field.
Affiliation(s)
- Jamie K Teer: H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Dr., Tampa, FL 33612, Tel: 813-745-2650
|
45
|
Kim SY, Jacob L, Speed TP. Combining calls from multiple somatic mutation-callers. BMC Bioinformatics 2014; 15:154. [PMID: 24885750 PMCID: PMC4035752 DOI: 10.1186/1471-2105-15-154] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 02/16/2014] [Accepted: 05/12/2014] [Indexed: 11/29/2022]
Abstract
Background Accurate somatic mutation-calling is essential for insightful mutation analyses in cancer studies. Several mutation-callers are publicly available and more are likely to appear. Nonetheless, mutation-calling is still challenging and there is unlikely to be one established caller that systematically outperforms all others. Therefore, fully utilizing multiple callers can be a powerful way to construct a list of final calls for one’s research. Results Using a set of mutations from multiple callers that are impartially validated, we present a statistical approach for building a combined caller, which can be applied to combine calls in a wider dataset generated using a similar protocol. Using the mutation outputs and the validation data from The Cancer Genome Atlas endometrial study (6,746 sites), we demonstrate how to build a statistical model that predicts the probability of each call being a somatic mutation, based on the detection status of multiple callers and a few associated features. Conclusion The approach allows us to build a combined caller across the full range of stringency levels, which outperforms all of the individual callers.
Affiliation(s)
- Su Yeon Kim: Department of Statistics, University of California at Berkeley, Berkeley CA 94720, USA
|
46
|
Cantarel BL, Weaver D, McNeill N, Zhang J, Mackey AJ, Reese J. BAYSIC: a Bayesian method for combining sets of genome variants with improved specificity and sensitivity. BMC Bioinformatics 2014; 15:104. [PMID: 24725768 PMCID: PMC3999887 DOI: 10.1186/1471-2105-15-104] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 10/10/2013] [Accepted: 03/31/2014] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Accurate genomic variant detection is an essential step in gleaning medically useful information from genome data. However, low concordance among variant-calling methods reduces confidence in the clinical validity of whole genome and exome sequence data, and confounds downstream analysis for applications in genome medicine. Here we describe BAYSIC (BAYeSian Integrated Caller), which combines SNP variant calls produced by different methods (e.g. GATK, FreeBayes, Atlas, SamTools, etc.) into a more accurate set of variant calls. BAYSIC differs from majority voting, consensus or other ad hoc intersection-based schemes for combining sets of genome variant calls. Unlike other classification methods, the underlying BAYSIC model does not require training using a "gold standard" of true positives. Rather, with each new dataset, BAYSIC performs an unsupervised, fully Bayesian latent class analysis to estimate false positive and false negative error rates for each input method. The user specifies a posterior probability threshold according to the user's tolerance for false positive and false negative errors; lowering the posterior probability threshold allows the user to trade specificity for sensitivity while raising the threshold increases specificity in exchange for sensitivity. RESULTS: We assessed the performance of BAYSIC in comparison to other variant detection methods using ten low coverage (~5X) samples from The 1000 Genomes Project, a tumor/normal exome pair (40X), and exome sequences (40X) from positive control samples previously identified to contain clinically relevant SNPs. We demonstrated BAYSIC's superior variant-calling accuracy, both for somatic mutation detection and germline variant detection. CONCLUSIONS: BAYSIC provides a method for combining sets of SNP variant calls produced by different variant calling programs. The integrated set of SNP variant calls produced by BAYSIC improves the sensitivity and specificity of the variant calls used as input. In addition to combining sets of germline variants, BAYSIC can also be used to combine sets of somatic mutations detected in the context of tumor/normal sequencing experiments.
|
47
|
Bromberg Y. Building a genome analysis pipeline to predict disease risk and prevent disease. J Mol Biol 2013; 425:3993-4005. [PMID: 23928561 DOI: 10.1016/j.jmb.2013.07.038] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 06/03/2013] [Revised: 07/26/2013] [Accepted: 07/28/2013] [Indexed: 12/24/2022]
Abstract
Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (set of all expressed phenotypes) encoded by each individual variome (full set of genome variants) in the context of the given environment. Numerous tools exist for the computational identification of the functional effects of a single variant. However, the pipelines taking advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at the building of methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges for incorporating such a pipeline into everyday medical practice.
Affiliation(s)
- Y Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08873, USA.
|