1
Chung YS, Kang S, Kim J, Lee S, Kim S. CLEMENT: genomic decomposition and reconstruction of non-tumor subclones. Nucleic Acids Res 2024; 52:e62. [PMID: 38922688 DOI: 10.1093/nar/gkae527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 05/27/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024] Open
Abstract
Genome-level clonal decomposition of a single specimen has been widely studied; however, it is mostly limited to cancer research. In this study, we developed a new algorithm CLEMENT, which conducts accurate decomposition and reconstruction of multiple subclones in genome sequencing of non-tumor (normal) samples. CLEMENT employs the Expectation-Maximization (EM) algorithm with optimization strategies specific to non-tumor subclones, including false variant call identification, non-disparate clone fuzzy clustering, and clonal allele fraction confinement. In the simulation and in vitro cell line mixture data, CLEMENT outperformed current cancer decomposition algorithms in estimating the number of clones (root-mean-square-error = 0.58-0.78 versus 1.43-3.34) and in the variant-clone membership agreement (∼85.5% versus 70.1-76.7%). Additional testing on human multi-clonal normal tissue sequencing confirmed the accurate identification of subclones that originated from different cell types. Clone-level analysis, including mutational burden and signatures, provided a new understanding of normal-tissue composition. We expect that CLEMENT will serve as a crucial tool in the currently emerging field of non-tumor genome analysis.
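The clone-count accuracy in this abstract is reported as a root-mean-square error (RMSE). As a minimal sketch of that metric only, the function below computes RMSE between true and estimated clone numbers; the function name and the sample counts are invented for illustration and are not taken from the paper or from CLEMENT itself.

```python
import math


def rmse(true_counts, estimated_counts):
    """Root-mean-square error between paired true and estimated values."""
    if len(true_counts) != len(estimated_counts) or not true_counts:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - e) ** 2 for t, e in zip(true_counts, estimated_counts)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: true vs. estimated number of clones in four samples.
true_k = [3, 4, 5, 6]
est_k = [3, 5, 5, 5]
print(round(rmse(true_k, est_k), 3))
```

A lower RMSE means the estimated number of clones is, on average, closer to the true number.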
Affiliation(s)
- Young-Soo Chung
- Department of Biomedical Systems Informatics, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- Seungseok Kang
- Department of Biomedical Systems Informatics, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- Jisu Kim
- DataShape team, Inria Saclay Île-De-France, Palaiseau 91120, France
- Department of Statistics, Seoul National University, Seoul 08826, Republic of Korea
- Sangbo Lee
- Department of Biomedical Systems Informatics, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- Sangwoo Kim
- Department of Biomedical Systems Informatics, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
2
Chen B, Yang Y, Wang X, Yang W, Lu Y, Wang D, Zhuo E, Tang Y, Su J, Tang G, Shao S, Gu K. mRNA vaccine development and applications: A special focus on tumors (Review). Int J Oncol 2024; 65:81. [PMID: 38994758 PMCID: PMC11251742 DOI: 10.3892/ijo.2024.5669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Accepted: 05/20/2024] [Indexed: 07/13/2024] Open
Abstract
Cancer is characterized by unlimited proliferation and metastasis, and traditional therapeutic strategies usually result in the acquisition of drug resistance, thus highlighting the need for more personalized treatment. mRNA vaccines transfer the gene sequences of exogenous target antigens into human cells through transcription and translation to stimulate the body to produce specific immune responses against the encoded proteins, so as to enable the body to obtain immune protection against said antigens; this approach may be adopted for personalized cancer therapy. Since the recent coronavirus pandemic, the development of mRNA vaccines has seen substantial progress and widespread adoption. In the present review, the development of mRNA vaccines, their mechanisms of action, factors influencing their function and the current clinical applications of the vaccine are discussed. A focus is placed on the application of mRNA vaccines in cancer, with the aim of highlighting unique advances and the remaining challenges of this novel and promising therapeutic approach.
Affiliation(s)
- Bangjie Chen
- Department of Oncology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230022, P.R. China
- Yipin Yang
- Department of Oncology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230022, P.R. China
- Xinyi Wang
- Department of Radiation Oncology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230022, P.R. China
- Wenzhi Yang
- First Clinical Medical College, Anhui Medical University, Hefei, Anhui 230022, P.R. China
- You Lu
- First Clinical Medical College, Anhui Medical University, Hefei, Anhui 230022, P.R. China
- Daoyue Wang
- Department of Oncology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230022, P.R. China
- Enba Zhuo
- Department of Anesthesiology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230022, P.R. China
- Yanchao Tang
- Department of Anesthesiology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230022, P.R. China
- Junhong Su
- Department of Rehabilitation, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230022, P.R. China
- Guozheng Tang
- Department of Orthopedics, Lu'an Hospital of Anhui Medical University, Lu'an, Anhui 237008, P.R. China
- Song Shao
- Department of Orthopedics, Lu'an Hospital of Anhui Medical University, Lu'an, Anhui 237008, P.R. China
- Kangsheng Gu
- Department of Oncology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230022, P.R. China
3
Atzeni R, Massidda M, Pieroni E, Rallo V, Pisu M, Angius A. A Novel Affordable and Reliable Framework for Accurate Detection and Comprehensive Analysis of Somatic Mutations in Cancer. Int J Mol Sci 2024; 25:8044. [PMID: 39125613 PMCID: PMC11311285 DOI: 10.3390/ijms25158044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 07/11/2024] [Accepted: 07/22/2024] [Indexed: 08/12/2024] Open
Abstract
Accurate detection and analysis of somatic variants in cancer involve multiple third-party tools with complex dependencies and configurations, leading to laborious, error-prone, and time-consuming data conversions. This approach lacks accuracy, reproducibility, and portability, limiting clinical application. Musta was developed to address these issues as an end-to-end pipeline for detecting, classifying, and interpreting cancer mutations. Musta is based on a Python command-line tool designed to manage tumor-normal samples for precise somatic mutation analysis. The core is a Snakemake-based workflow that covers all key cancer genomics steps, including variant calling, mutational signature deconvolution, variant annotation, driver gene detection, pathway analysis, and tumor heterogeneity estimation. Musta is easy to install on any system via Docker, with a Makefile handling installation, configuration, and execution, allowing for full or partial pipeline runs. Musta has been validated at the CRS4-NGS Core facility and tested on large datasets from The Cancer Genome Atlas and the Beijing Institute of Genomics. Musta has proven robust and flexible for somatic variant analysis in cancer. It is user-friendly, requiring no specialized programming skills, and enables data processing with a single command line. Its reproducibility ensures consistent results across users following the same protocol.
Affiliation(s)
- Rossano Atzeni
- Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09050 Pula, Italy; (R.A.); (E.P.); (M.P.)
- Matteo Massidda
- Department of Medical, Surgical and Experimental Sciences, University of Sassari, 07100 Sassari, Italy;
- Enrico Pieroni
- Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09050 Pula, Italy; (R.A.); (E.P.); (M.P.)
- Vincenzo Rallo
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cittadella Universitaria di Cagliari, 09042 Monserrato, Italy;
- Massimo Pisu
- Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09050 Pula, Italy; (R.A.); (E.P.); (M.P.)
- Andrea Angius
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cittadella Universitaria di Cagliari, 09042 Monserrato, Italy;
4
Xiao Y, Zhu Y, Chen J, Wu M, Wang L, Su L, Feng F, Hou Y. Overexpression of SYNGAP1 suppresses the proliferation of rectal adenocarcinoma via Wnt/β-Catenin signaling pathway. Discov Oncol 2024; 15:135. [PMID: 38679635 PMCID: PMC11056356 DOI: 10.1007/s12672-024-00997-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 04/24/2024] [Indexed: 05/01/2024] Open
Abstract
Rectal adenocarcinoma (READ) is a common malignant tumor of the digestive tract. A growing number of studies have confirmed that Ras GTPase-activating proteins are involved in the progression of several tumors. This study aimed to explore the expression and function of Ras GTPase-activating proteins in READ. We analyzed RNA sequencing data from 165 patients with READ and 789 normal tissue samples, identifying 5603 differentially expressed genes (DEGs), including 2937 upregulated genes and 2666 downregulated genes. We also identified two dysregulated genes, RASA4 and SYNGAP1, among six Ras GTPase-activating proteins. High NF1 expression was associated with longer overall survival, while high SYNGAP1 expression showed a trend towards extended overall survival. Further analysis revealed the mutation frequency and copy number variations of Ras GTPase-activating proteins in various cancer samples. Additionally, DNA methylation analysis demonstrated a negative correlation between DNA methylation of Ras GTPase-activating proteins and their expression. Among the Ras GTPase-activating proteins, we focused on SYNGAP1, and experimental validation confirmed that overexpression of SYNGAP1 significantly suppressed READ cell proliferation and increased apoptosis by regulating the Wnt/β-Catenin signaling pathway. These findings underscore the potential significance of SYNGAP1 in READ and provide new insights for further research and treatment.
Affiliation(s)
- Yun Xiao
- Department of Oncology and Hematology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
- Ying Zhu
- Department of Oncology and Hematology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
- Jiaojiao Chen
- Department of Oncology and Hematology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
- Mei Wu
- Department of Oncology and Hematology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
- Lan Wang
- Department of Oncology and Hematology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
- Li Su
- Department of Oncology and Hematology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
- Fei Feng
- Department of Oncology and Hematology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China.
- Yanli Hou
- Department of Oncology and Hematology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China.
5
Fan Y, Guo Y, Liu Y, Xiao S, Gao G, Zhang X, Wu G. Study on the aging status of insulators based on hyperspectral imaging technology. OPTICS EXPRESS 2024; 32:5072-5087. [PMID: 38439243 DOI: 10.1364/oe.506030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 01/19/2024] [Indexed: 03/06/2024]
Abstract
An acidic environment is one of the main factors leading to the aging of silicone rubber (SiR) insulators. Aging can reduce the surface hydrophobicity and pollution flashover resistance of insulators, threatening the safe and stable operation of the power grid. Therefore, evaluating the aging state of insulators is essential to prevent flashover accidents on the transmission line. This paper uses optical hyperspectral imaging (HSI) technology for pixel-level assessment of insulator aging status. First, the SiR samples were artificially aged in three typical acidic solutions with different concentrations of HNO3, H2SO4, and HCl, and six aging grades of SiR samples were prepared. The hyperspectral image of the SiR at each aging grade was acquired using a hyperspectral imager. To reduce the calculation complexity and eliminate the interference of useless information in the band, this paper proposes a joint random forest-principal component analysis (RF-PCA) dimensionality reduction method to reduce the original 256-dimensional hyperspectral data to 7 dimensions. Finally, to capture local features in hyperspectral images more effectively and retain the most significant information of the spectral lines, a convolutional neural network (CNN) was used to build a classification model for pixel-level assessment of the SiR's aging state and visual prediction of insulators' defects. The research method in this paper provides an important guarantee for the timely detection of safety hazards in the power grid.
6
Pablo-Fontecha V, Hernández-Illán E, Reparaz A, Asensio E, Morata J, Tonda R, Lahoz S, Parra C, Lozano JJ, García-Heredia A, Martínez-Roca A, Beltran S, Balaguer F, Jover R, Castells A, Trullàs R, Podlesniy P, Camps J. Quantification of rare somatic single nucleotide variants by droplet digital PCR using SuperSelective primers. Sci Rep 2023; 13:18997. [PMID: 37923774 PMCID: PMC10624686 DOI: 10.1038/s41598-023-39874-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 08/01/2023] [Indexed: 11/06/2023] Open
Abstract
Somatic single-nucleotide variants (SNVs) occur every time a cell divides, appearing even in healthy tissues at low frequencies. These mutations may accumulate as neutral variants during aging, or eventually, promote the development of neoplasia. Here, we present the SP-ddPCR, a droplet digital PCR (ddPCR) based approach that utilizes customized SuperSelective primers aiming at quantifying the proportion of rare SNVs. For that purpose, we selected five potentially pathogenic variants identified by whole-exome sequencing (WES) occurring at low variant allele frequency (VAF) in at-risk colon healthy mucosa of patients diagnosed with colorectal cancer or advanced adenoma. Additionally, two APC SNVs detected in two cancer lesions were added to the study for WES-VAF validation. SuperSelective primers were designed to quantify SNVs at low VAFs both in silico and in clinical samples. In addition to the two APC SNVs in colonic lesions, SP-ddPCR confirmed the presence of three out of five selected SNVs in the normal colonic mucosa with allelic frequencies ≤ 5%. Moreover, SP-ddPCR showed the presence of two potentially pathogenic variants in the distal normal mucosa of patients with colorectal carcinoma. In summary, SP-ddPCR offers a rapid and feasible methodology to validate next-generation sequencing data and accurately quantify rare SNVs, thus providing a potential tool for diagnosis and stratification of at-risk patients based on their mutational profiling.
Affiliation(s)
- Verónica Pablo-Fontecha
- Translational Colorectal Cancer Genomics, Gastrointestinal and Pancreatic Oncology Team, Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clínic de Barcelona, Rosselló 149-153, 4th Floor, 08036, Barcelona, Spain
- Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), 28029, Madrid, Spain
- Eva Hernández-Illán
- Translational Colorectal Cancer Genomics, Gastrointestinal and Pancreatic Oncology Team, Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clínic de Barcelona, Rosselló 149-153, 4th Floor, 08036, Barcelona, Spain
- Andrea Reparaz
- Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), 28029, Madrid, Spain
- Neurobiology Unit, Institut d'Investigacions Biomèdiques de Barcelona (IIBB-CSIC), Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036, Barcelona, Spain
- Elena Asensio
- Translational Colorectal Cancer Genomics, Gastrointestinal and Pancreatic Oncology Team, Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clínic de Barcelona, Rosselló 149-153, 4th Floor, 08036, Barcelona, Spain
- Jordi Morata
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08028, Barcelona, Spain
- Raúl Tonda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08028, Barcelona, Spain
- Sara Lahoz
- Translational Colorectal Cancer Genomics, Gastrointestinal and Pancreatic Oncology Team, Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clínic de Barcelona, Rosselló 149-153, 4th Floor, 08036, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), 28029, Madrid, Spain
- Carolina Parra
- Translational Colorectal Cancer Genomics, Gastrointestinal and Pancreatic Oncology Team, Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clínic de Barcelona, Rosselló 149-153, 4th Floor, 08036, Barcelona, Spain
- Juan José Lozano
- Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), 28029, Madrid, Spain
- Anabel García-Heredia
- Servicio de Medicina Digestiva, Hospital General Universitario de Alicante, Instituto de Investigación Sanitaria y Biomédica de Alicante (ISABIAL), 03010, Alicante, Spain
- Alejandro Martínez-Roca
- Servicio de Medicina Digestiva, Hospital General Universitario de Alicante, Instituto de Investigación Sanitaria y Biomédica de Alicante (ISABIAL), 03010, Alicante, Spain
- Sergi Beltran
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08028, Barcelona, Spain
- Francesc Balaguer
- Translational Colorectal Cancer Genomics, Gastrointestinal and Pancreatic Oncology Team, Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clínic de Barcelona, Rosselló 149-153, 4th Floor, 08036, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), 28029, Madrid, Spain
- Rodrigo Jover
- Servicio de Medicina Digestiva, Hospital General Universitario de Alicante, Instituto de Investigación Sanitaria y Biomédica de Alicante (ISABIAL), 03010, Alicante, Spain
- Antoni Castells
- Translational Colorectal Cancer Genomics, Gastrointestinal and Pancreatic Oncology Team, Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clínic de Barcelona, Rosselló 149-153, 4th Floor, 08036, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), 28029, Madrid, Spain
- Ramon Trullàs
- Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), 28029, Madrid, Spain
- Neurobiology Unit, Institut d'Investigacions Biomèdiques de Barcelona (IIBB-CSIC), Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036, Barcelona, Spain
- Petar Podlesniy
- Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), 28029, Madrid, Spain
- Neurobiology Unit, Institut d'Investigacions Biomèdiques de Barcelona (IIBB-CSIC), Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036, Barcelona, Spain
- Jordi Camps
- Translational Colorectal Cancer Genomics, Gastrointestinal and Pancreatic Oncology Team, Institut D'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clínic de Barcelona, Rosselló 149-153, 4th Floor, 08036, Barcelona, Spain.
- Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), 28029, Madrid, Spain.
- Unitat de Biologia Cel·lular i Genètica Mèdica, Departament de Biologia Cel·lular, Fisiologia i Immunologia, Facultat de Medicina, Universitat Autònoma de Barcelona, 08193, Bellaterra, Spain.
7
Fan R, Ji X, Li J, Cui Q, Cui C. Defining the single base importance of human mRNAs and lncRNAs. Brief Bioinform 2023; 24:bbad321. [PMID: 37668090 DOI: 10.1093/bib/bbad321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 07/28/2023] [Accepted: 08/16/2023] [Indexed: 09/06/2023] Open
Abstract
As the fundamental unit of a gene and its transcripts, nucleotides have enormous impacts on the gene function and evolution, and thus on phenotypes and diseases. In order to identify the key nucleotides of one specific gene, it is quite crucial to quantitatively measure the importance of each base on the gene. However, there are still no sequence-based methods of doing that. Here, we proposed Base Importance Calculator (BIC), an algorithm to calculate the importance score of each single base based on sequence information of human mRNAs and long noncoding RNAs (lncRNAs). We then confirmed its power by applying BIC to three different tasks. Firstly, we revealed that BIC can effectively evaluate the pathogenicity of both genes and single bases through single nucleotide variations. Moreover, the BIC score in The Cancer Genome Atlas somatic mutations is able to predict the prognosis of some cancers. Finally, we show that BIC can also precisely predict the transmissibility of SARS-CoV-2. The above results indicate that BIC is a useful tool for evaluating the single base importance of human mRNAs and lncRNAs.
Affiliation(s)
- Rui Fan
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, State Key Lab of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Xiangwen Ji
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, State Key Lab of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, State Key Lab of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- School of Sports Medicine, Wuhan Institute of Physical Education, No.461 Luoyu Rd. Wuchang District, Wuhan 430079, Hubei Province, China
- Chunmei Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, State Key Lab of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
8
Zhou S, Jin Q, Yao H, Ying J, Tian L, Jiang X, Yang Y, Jiang X, Gao W, Zhang W, Zhu Y, Cao W. Pain-Related Gene Solute Carrier Family 24 Member 3 Is a Prognostic Biomarker and Correlated with Immune Infiltrates in Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma: A Study via Integrated Bioinformatics Analyses and Experimental Verification. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2023; 2023:4164232. [PMID: 36798148 PMCID: PMC9928512 DOI: 10.1155/2023/4164232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 10/13/2022] [Accepted: 11/24/2022] [Indexed: 02/10/2023]
Abstract
The aim of this study was to explore cervical carcinoma and screen for a suitable gene to serve as a biomarker for prognosis evaluation and pain therapy. Low expression of solute carrier family 24 member 3 (SLC24A3) has been implicated in the onset and development of numerous malignancies. Nevertheless, the prognostic value of SLC24A3 expression in patients with cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) remains uncertain. In the present study, SLC24A3 expression data in CESC were retrieved from the TCGA, GEO, and MSigDB databases. Based on the TCGA and GEO profiles, we performed survival and differential expression analyses of SLC24A3 in two GEO cohorts (GSE44001 and GSE63514) and the TCGA-CESC cohort (all p < 0.05), which indicated that SLC24A3 was expressed at low levels in tumors and associated with higher overall survival in CESC patients. Additionally, we performed a series of analyses, including genomic profiling, enrichment analysis, immune infiltration analysis, and therapy-related analysis, to identify the mechanism of SLC24A3 in the development of CESC. Meanwhile, qRT-PCR was used to validate that the expression of SLC24A3 mRNA in HeLa and SiHa cell lines was significantly lower than in PANC-1 and HUCEC cell lines. Our findings indicate that SLC24A3, a cellular sodium-calcium regulator, is an indispensable factor that can significantly influence the prognosis of patients with CESC, and they provide novel clinical evidence that it could serve as a potential biological indicator for future diagnosis and pain therapy.
Affiliation(s)
- Shuguang Zhou
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Linquan Maternity and Child Healthcare Hospital, Fuyang, Anhui 236400, China
- Qinqin Jin
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Hui Yao
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Jie Ying
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Lu Tian
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Xiya Jiang
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Yinting Yang
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Xiaomin Jiang
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Wei Gao
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Weiyu Zhang
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Yuting Zhu
- Department of Gynecology, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Department of Gynecology, Anhui Medical University Affiliated Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
- Wujun Cao
- Department of Clinical Laboratory, Anhui Province Maternity and Child Healthcare Hospital, Hefei, Anhui 230001, China
9
Li B, Hong L, Luo Y, Zhang B, Yu Z, Li W, Cao N, Huang Y, Xu D, Li Y, Tian Y. LPS-Induced Liver Injury of Magang Geese through Toll-like Receptor and MAPK Signaling Pathway. Animals (Basel) 2022; 13:ani13010127. [PMID: 36611736 PMCID: PMC9817723 DOI: 10.3390/ani13010127] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/27/2022] [Revised: 12/24/2022] [Accepted: 12/26/2022] [Indexed: 12/30/2022] Open
Abstract
Lipopolysaccharide (LPS) is one of the main virulence factors of Gram-negative bacteria. In waterfowl breeding, LPS infection readily produces an inflammatory reaction, which leads to a decline in waterfowl performance. The liver plays a vital role in the immune response and the removal of toxic components. Therefore, it is necessary to study the mechanism of LPS-induced liver injury in geese. In this study, a total of 100 1-day-old goslings were randomly divided into a control group and an LPS group after 3 days of pre-feeding. On days 21, 23, and 25 of the formal experiment, the control group was intraperitoneally injected with 0.5 mL normal saline, and the LPS group was intraperitoneally injected with LPS at 2 mg/kg body weight once a day. On day 25 of the experiment, liver samples were collected 3 h after the injection of saline or LPS. Histopathology and biochemical indexes showed that the livers of the LPS group had disrupted morphological structure and inflammatory cell infiltration, and that the levels of ALT and AST were increased. Next, RNA sequencing analysis was used to determine the abundances and characteristics of the transcripts, as well as the associated somatic mutations and alternative splicing. We screened 727 differentially expressed genes (DEGs) using p < 0.05 and |log2(Fold Change)| ≥ 1 as the thresholds; GO and KEGG enrichment analysis showed that LPS-induced liver injury may involve the Toll-like receptor, MAPK, NOD-like receptor, FoxO, and PPAR signaling pathways. Finally, we intersected the genes enriched in the key pathways of LPS-induced liver injury with the top 50 key genes in protein-protein interaction networks to obtain 28 more critical genes. Among them, 17 genes were enriched in the Toll-like receptor signaling pathway and the MAPK signaling pathway.
Therefore, these results suggest that LPS-induced liver injury in geese may be the result of the joint action of the Toll-like receptor, MAPK, NOD-like receptor, FoxO, and PPAR signaling pathways. Among them, the TLR7-mediated MAPK signaling pathway plays a major role.
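The DEG screen described in this abstract applies two thresholds at once (p < 0.05 and |log2(Fold Change)| ≥ 1). A minimal sketch of that rule is given below; the function name, the gene names, and the example values are hypothetical and are not data from the study, and the abstract does not say whether the p-value was adjusted for multiple testing.

```python
import math

def is_deg(p_value: float, fold_change: float,
           p_cutoff: float = 0.05, lfc_cutoff: float = 1.0) -> bool:
    """Return True when a gene passes both thresholds quoted in the abstract."""
    return p_value < p_cutoff and abs(math.log2(fold_change)) >= lfc_cutoff

# Hypothetical (gene, p-value, fold change) records, for illustration only.
genes = [
    ("geneA", 0.001, 4.0),   # log2FC = 2, p < 0.05: passes
    ("geneB", 0.200, 8.0),   # large change, but p >= 0.05: fails
    ("geneC", 0.010, 1.5),   # p < 0.05, but |log2FC| < 1: fails
    ("geneD", 0.030, 0.25),  # log2FC = -2 (down-regulated), p < 0.05: passes
]

degs = [name for name, p, fc in genes if is_deg(p, fc)]
print(degs)
```

Taking the absolute value of the log2 fold change is what lets the same cutoff capture both up-regulated and down-regulated genes.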
Affiliation(s)
- Bingxin Li
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
- Longsheng Hong
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
  - College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642, China
- Yindan Luo
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
- Bingqi Zhang
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
- Ziyu Yu
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
- Wanyan Li
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
- Nan Cao
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
- Yunmao Huang
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
- Danning Xu
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
- Yugu Li
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China
  - College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642, China
- Yunbo Tian
  - College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
  - Guangdong Province Key Laboratory of Waterfowl Healthy Breeding, Guangzhou 510225, China

10
Kang SY, Kim DG, Kim H, Cho YA, Ha SY, Kwon GY, Jang KT, Kim KM. Direct comparison of the next-generation sequencing and iTERT PCR methods for the diagnosis of TERT hotspot mutations in advanced solid cancers. BMC Med Genomics 2022; 15:25. [PMID: 35135543 PMCID: PMC8827275 DOI: 10.1186/s12920-022-01175-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/03/2021] [Accepted: 02/02/2022] [Indexed: 01/12/2023] Open
Abstract
Background Mutations in the telomerase reverse transcriptase (TERT) promoter region have been proposed as novel mechanisms for the transcriptional activation of telomerase. Two recurrent mutations in the TERT promoter, C228T and C250T, are prognostic biomarkers. Herein, we directly compared the commercially available iTERT PCR kit with NGS-based deep sequencing to validate the NGS results and determine the analytical sensitivity of the PCR kit.
Methods Of the 2032 advanced solid tumors diagnosed using the TruSight Oncology 500 NGS test, mutations in the TERT promoter region were detected in 103 cases, with 79 cases of C228T, 22 cases of C250T, and 2 cases of C228A hotspot mutations. TERT promoter mutations were detected from 31 urinary bladder, 19 pancreato-biliary, 22 hepatic, 12 malignant melanoma, and 12 other tumor samples. Results In all 103 TERT-mutated cases detected using NGS, the same DNA samples were also tested with the iTERT PCR/Sanger sequencing. PCR successfully verified the presence of the same mutations in all cases with 100% agreement. The average read depth of the TERT promoter region was 320.4, which was significantly lower than that of the other genes (mean, 743.5). Interestingly, NGS read depth was significantly higher at C250 compared to C228 (p < 0.001). Conclusions The NGS test results were validated by a PCR test and iTERT PCR/Sanger sequencing is sensitive for the identification of the TERT promoter mutations.
Affiliation(s)
- So Young Kang
  - Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, #81, Irwon-ro, Gangnam-Gu, Seoul, 06351, Korea
- Deok Geun Kim
  - Department of Clinical Genomic Center, Samsung Medical Center, Seoul, Korea
  - Department of Digital Health, Samsung Advanced Institute of Health Science and Technology, Sungkyunkwan University, Seoul, Korea
- Hyunjin Kim
  - Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, #81, Irwon-ro, Gangnam-Gu, Seoul, 06351, Korea
  - Center of Companion Diagnostics, Samsung Medical Center, Seoul, Republic of Korea
- Yoon Ah Cho
  - Department of Pathology, Hallym University Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Republic of Korea
- Sang Yun Ha
  - Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, #81, Irwon-ro, Gangnam-Gu, Seoul, 06351, Korea
- Ghee Young Kwon
  - Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, #81, Irwon-ro, Gangnam-Gu, Seoul, 06351, Korea
- Kee-Taek Jang
  - Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, #81, Irwon-ro, Gangnam-Gu, Seoul, 06351, Korea
- Kyoung-Mee Kim
  - Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, #81, Irwon-ro, Gangnam-Gu, Seoul, 06351, Korea
  - Department of Clinical Genomic Center, Samsung Medical Center, Seoul, Korea
  - Center of Companion Diagnostics, Samsung Medical Center, Seoul, Republic of Korea

11
Farswan A, Jena L, Kaur G, Gupta A, Gupta R, Rani L, Sharma A, Kumar L. Branching clonal evolution patterns predominate mutational landscape in multiple myeloma. Am J Cancer Res 2021; 11:5659-5679. [PMID: 34873486 PMCID: PMC8640818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2021] [Accepted: 09/27/2021] [Indexed: 06/13/2023] Open
Abstract
Multiple Myeloma (MM) arises from malignant transformation and deregulated proliferation of clonal plasma cells (PCs) harbouring heterogeneous molecular anomalies. The effect of evolving mutations on clone fitness and their cellular prevalence shapes the progressing myeloma genome and impacts clinical outcomes. Although clonal heterogeneity in MM is well established, which subclonal mutations emerge/persist/perish with progression in MM and which of these can be targeted therapeutically remains an open question. In line with this, we have sequenced pairwise whole exomes of 62 MM patients collected at two time points, i.e., at diagnosis and on progression. Somatic variants were called using a novel ensemble approach where a consensus was deduced from four variant callers (Illumina's Dragen, Strelka2, SomaticSniper and SpeedSeq) and actionable/druggable gene targets were identified. A marked intraclonal heterogeneity was observed. Branching evolution was observed among 72.58% patients, of whom 64.51% had low TMBs (<10) and 61.29% had 2 or more founder clones. The hypermutator patients (with high TMB levels ≥10 to ≤100) showed a significant decrease in their TMBs from diagnosis (median TMB 77.11) to progression (median TMB 31.22). A distinct temporal fall in subclonal driver mutations was identified recurrently across diagnosis to progression e.g., in PABPC1, BRAF, KRAS, CR1, DIS3 and ATM genes in 3 or more patients suggesting such patients could be treated early with target specific drugs like Vemurafenib/Cobimetinib. An analogous rise in driver mutations was observed in KMT2C, FOXD4L1, SP140, NRAS and other genes. A few drivers such as FAT4, IGLL5 and CDKN1A retained consistent distribution patterns at two time points. 
These findings are clinically relevant and point at consideration of evaluating multi time point subclonal mutational landscapes for designing better risk stratification strategies and tailoring time to time risk adapted combination therapies in future.
Affiliation(s)
- Akanksha Farswan
  - SBILab, Department of Electronics and Communication Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-D), Delhi 110020, India
- Lingaraja Jena
  - Laboratory Oncology Unit, Dr. B.R.A. IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi 110029, India
- Gurvinder Kaur
  - Laboratory Oncology Unit, Dr. B.R.A. IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi 110029, India
- Anubha Gupta
  - SBILab, Department of Electronics and Communication Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-D), Delhi 110020, India
- Ritu Gupta
  - Laboratory Oncology Unit, Dr. B.R.A. IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi 110029, India
- Lata Rani
  - Laboratory Oncology Unit, Dr. B.R.A. IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi 110029, India
- Atul Sharma
  - Department of Medical Oncology, Dr. B.R.A. IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi 110029, India
- Lalit Kumar
  - Department of Medical Oncology, Dr. B.R.A. IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi 110029, India

12
Chen J, Guo JT. Structural and functional analysis of somatic coding and UTR indels in breast and lung cancer genomes. Sci Rep 2021; 11:21178. [PMID: 34707120 PMCID: PMC8551294 DOI: 10.1038/s41598-021-00583-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Accepted: 10/14/2021] [Indexed: 11/24/2022] Open
Abstract
Insertions and deletions (Indels) represent one of the major variation types in the human genome and have been implicated in diseases including cancer. To study the features of somatic indels in different cancer genomes, we investigated the indels from two large samples of cancer types: invasive breast carcinoma (BRCA) and lung adenocarcinoma (LUAD). Besides mapping somatic indels in both coding and untranslated regions (UTRs) from the cancer whole exome sequences, we investigated the overlap between these indels and transcription factor binding sites (TFBSs), the key elements for regulation of gene expression that have been found in both coding and non-coding sequences. Compared to the germline indels in healthy genomes, somatic indels contain more coding indels with higher than expected frame-shift (FS) indels in cancer genomes. LUAD has a higher ratio of deletions and higher coding and FS indel rates than BRCA. More importantly, these somatic indels in cancer genomes tend to locate in sequences with important functions, which can affect the core secondary structures of proteins and have a bigger overlap with predicted TFBSs in coding regions than the germline indels. The somatic CDS indels are also enriched in highly conserved nucleotides when compared with germline CDS indels.
Affiliation(s)
- Jing Chen
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Jun-Tao Guo
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA

13
Abstract
Minimizing false positives is a critical issue when variant calling as no method is without error. It is common practice to post-process a variant-call file (VCF) using hard filter criteria intended to discriminate true-positive (TP) from false-positive (FP) calls. These are applied on the simple principle that certain characteristics are disproportionately represented among the set of FP calls and that a user-chosen threshold can maximize the number detected. To provide guidance on this issue, this study empirically characterized all false SNP and indel calls made using real Illumina sequencing data from six disparate species and 166 variant-calling pipelines (the combination of 14 read aligners with up to 13 different variant callers, plus four ‘all-in-one’ pipelines). We did not seek to optimize filter thresholds but instead to draw attention to those filters of greatest efficacy and the pipelines to which they may most usefully be applied. In this respect, this study acts as a coda to our previous benchmarking evaluation of bacterial variant callers, and provides general recommendations for effective practice. The results suggest that, of the pipelines analysed in this study, the most straightforward way of minimizing false positives would simply be to use Snippy. We also find that a disproportionate number of false calls, irrespective of the variant-calling pipeline, are located in the vicinity of indels, and highlight this as an issue for future development.
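The hard-filter post-processing this abstract refers to can be pictured as a per-call predicate over VCF fields. The sketch below is only an illustration of that idea: it uses the standard VCF `QUAL` column and the `DP` (depth) INFO key, but the cutoff values, the record layout, and the call identifiers are hypothetical and are not taken from the study, which explicitly did not seek to optimize thresholds.

```python
def passes_hard_filters(call: dict, min_qual: float = 30.0, min_depth: int = 10) -> bool:
    """Keep a call only if it clears every hard threshold.

    The cutoffs are illustrative assumptions, not recommendations.
    """
    return call["QUAL"] >= min_qual and call["DP"] >= min_depth

# Hypothetical calls, already parsed from a VCF into plain dictionaries.
calls = [
    {"id": "snp1", "QUAL": 50.0, "DP": 40},    # clears both thresholds
    {"id": "snp2", "QUAL": 12.0, "DP": 35},    # low call quality
    {"id": "indel1", "QUAL": 60.0, "DP": 4},   # low read depth
]

kept = [c["id"] for c in calls if passes_hard_filters(c)]
print(kept)
```

Because each threshold is chosen by the user, a filter like this trades false positives against lost true positives, which is the trade-off the study characterizes empirically.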
Affiliation(s)
- Stephen J Bush
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK

14
Kısakol B, Sarıhan Ş, Ergün MA, Baysan M. Detailed evaluation of cancer sequencing pipelines in different microenvironments and heterogeneity levels. Turk J Biol 2021; 45:114-126. [PMID: 33907494 PMCID: PMC8068765 DOI: 10.3906/biy-2008-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/06/2020] [Accepted: 02/03/2021] [Indexed: 11/25/2022]
Abstract
The importance of next-generation sequencing (NGS) in cancer research is rising as this key technology becomes more accessible to researchers. The sequence data created by NGS technologies must be processed by various bioinformatics algorithms within a pipeline in order to convert raw data to meaningful information. Mapping and variant calling are the two main steps of these analysis pipelines, and many algorithms are available for these steps. Therefore, detailed benchmarking of these algorithms in different scenarios is crucial for the efficient utilization of sequencing technologies. In this study, we compared the performance of twelve pipelines (three mapping and four variant discovery algorithms) with recommended settings to capture single nucleotide variants. We observed significant discrepancy in variant calls among the tested pipelines for different heterogeneity levels in real and simulated samples, with overall high specificity and low sensitivity. In addition to the individual evaluation of pipelines, we also constructed pipeline combinations and tested their performance. In these analyses, we observed that certain pipelines complement each other much better than others and outperform individual pipelines. This suggests that adhering to a single pipeline is not optimal for cancer sequencing analysis and that sample heterogeneity should be considered in algorithm optimization.
Affiliation(s)
- Batuhan Kısakol
  - Department of Physiology and Medical Physics, Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin, Ireland
- Şahin Sarıhan
  - Computer Engineering Department, Faculty of Engineering, Marmara University, İstanbul, Turkey
- Mehmet Arif Ergün
  - Computer Engineering Department, Faculty of Computer and Informatics Engineering, İstanbul Technical University, İstanbul, Turkey
- Mehmet Baysan
  - Computer Engineering Department, Faculty of Computer and Informatics Engineering, İstanbul Technical University, İstanbul, Turkey

15
Esprit A, de Mey W, Bahadur Shahi R, Thielemans K, Franceschini L, Breckpot K. Neo-Antigen mRNA Vaccines. Vaccines (Basel) 2020; 8:E776. [PMID: 33353155 PMCID: PMC7766040 DOI: 10.3390/vaccines8040776] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 11/22/2020] [Revised: 12/14/2020] [Accepted: 12/16/2020] [Indexed: 12/12/2022] Open
Abstract
Therapeutic cancer vaccines have attracted enormous attention in recent years due to several breakthroughs in cancer research, among them the findings that successful checkpoint blockade treatments reinvigorate neo-antigen-specific T cells and that successful adoptive cell therapies are directed towards neo-antigens. Neo-antigens are cancer-specific antigens that develop from somatic mutations in the cancer cell genome; they can be highly immunogenic and are not subjected to central tolerance. As the majority of neo-antigens are unique to each patient's cancer, a vaccine technology that is flexible and potent is required to develop personalized neo-antigen vaccines. In vitro transcribed mRNA is such a technology platform and has been evaluated for delivery of neo-antigens to professional antigen-presenting cells both ex vivo and in vivo. In addition, strategies that support the activity of T cells in the tumor microenvironment have been developed. These represent a unique opportunity to ensure durable T cell activity upon vaccination. Here, we comprehensively review recent progress in mRNA-based neo-antigen vaccines, summarizing critical milestones that made it possible to bring the promise of therapeutic cancer vaccines within reach.
Affiliation(s)
- Karine Breckpot
  - Laboratory for Molecular and Cellular Therapy (LMCT), Department of Biomedical Sciences, Vrije Universiteit Brussel, B-1090 Brussels, Belgium; (A.E.); (W.d.M.); (R.B.S.); (K.T.); (L.F.)

16
Chen J, Guo JT. Comparative assessments of indel annotations in healthy and cancer genomes with next-generation sequencing data. BMC Med Genomics 2020; 13:170. [PMID: 33167946 PMCID: PMC7653722 DOI: 10.1186/s12920-020-00818-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/18/2020] [Accepted: 10/29/2020] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Insertion and deletion (indel) is one of the major variation types in human genomes. Accurate annotation of indels is of paramount importance in genetic variation analysis and investigation of their roles in human diseases. Previous studies revealed a high number of false positives from existing indel calling methods, which limits downstream analyses of the effects of indels on both healthy and disease genomes. In this study, we evaluated seven commonly used general indel calling programs for germline indels and four somatic indel calling programs through comparative analysis to investigate their common features and differences and to explore ways to improve indel annotation accuracy. METHODS In our comparative analysis, we adopted a more stringent evaluation approach by considering both the indel positions and the indel types (insertion or deletion sequences) between the samples and the reference set. In addition, we applied an efficient way to use a benchmark for improved performance comparisons for the general indel calling programs. RESULTS We found that germline indels in healthy genomes derived by combining several indel calling tools could help remove a large number of false positive indels from individual programs without compromising the number of true positives. The performance comparisons of somatic indel calling programs are more complicated due to the lack of a reliable and comprehensive benchmark. Nevertheless, our results revealed large variations among the programs and among cancer types. CONCLUSIONS While more accurate indel calling programs are needed, we found that the performance for germline indel annotations can be improved by combining the results from several programs. In addition, well-designed benchmarks for both germline and somatic indels are key in program development and evaluations.
Affiliation(s)
- Jing Chen
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA
- Jun-Tao Guo
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA

17
Wang M, Luo W, Jones K, Bian X, Williams R, Higson H, Wu D, Hicks B, Yeager M, Zhu B. SomaticCombiner: improving the performance of somatic variant calling based on evaluation tests and a consensus approach. Sci Rep 2020; 10:12898. [PMID: 32732891 PMCID: PMC7393490 DOI: 10.1038/s41598-020-69772-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 03/08/2020] [Accepted: 07/16/2020] [Indexed: 02/06/2023] Open
Abstract
It is challenging to identify somatic variants from high-throughput sequence reads due to tumor heterogeneity, sub-clonality, and sequencing artifacts. In this study, we evaluated the performance of eight primary somatic variant callers and multiple ensemble methods using both real and synthetic whole-genome sequencing, whole-exome sequencing, and deep targeted sequencing datasets with the NA12878 cell line. The test results showed that a simple consensus approach can significantly improve performance even with a limited number of callers and is more robust and stable than machine learning based ensemble approaches. To fully exploit the multi-callers, we also developed a software package, SomaticCombiner, that can combine multiple callers and integrates a new variant allelic frequency (VAF) adaptive majority voting approach, which can maintain sensitive detection for variants with low VAFs.
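The consensus idea in this abstract, majority voting across callers with a vote threshold that adapts to variant allelic frequency, can be sketched as follows. This is an illustration under stated assumptions and not SomaticCombiner's actual rule, which the abstract does not specify: the function name, the 5% low-VAF boundary, and the "one vote fewer for low-VAF variants" relaxation are all hypothetical.

```python
from collections import Counter

def consensus_calls(calls_by_caller: dict, vaf: dict, low_vaf: float = 0.05) -> set:
    """Keep variants supported by enough callers.

    Assumed rule (hypothetical): a strict majority of callers is required,
    relaxed by one vote (but never below two) for variants whose VAF is
    under `low_vaf`, so that low-frequency variants are not voted out.
    """
    n_callers = len(calls_by_caller)
    votes = Counter(v for calls in calls_by_caller.values() for v in calls)
    majority = n_callers // 2 + 1
    kept = set()
    for variant, n_votes in votes.items():
        is_low_vaf = vaf.get(variant, 1.0) < low_vaf
        needed = max(2, majority - 1) if is_low_vaf else majority
        if n_votes >= needed:
            kept.add(variant)
    return kept

# Hypothetical call sets from four callers and hypothetical VAFs.
calls_by_caller = {
    "callerA": {"v1", "v2", "v3"},
    "callerB": {"v1", "v2"},
    "callerC": {"v1", "v3"},
    "callerD": {"v1", "v4"},
}
vaf = {"v1": 0.40, "v2": 0.03, "v3": 0.30, "v4": 0.02}

kept = consensus_calls(calls_by_caller, vaf)
print(sorted(kept))
```

In this toy input, `v2` and `v3` both have two votes out of four, but only the low-VAF `v2` is kept, which shows how an adaptive threshold preserves sensitivity at low allele fractions while still rejecting the single-caller `v4`.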
Affiliation(s)
- Mingyi Wang
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA
- Wen Luo
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA
- Kristine Jones
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA
- Xiaopeng Bian
  - Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, 20850, USA
- Russell Williams
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA
- Herbert Higson
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA
- Dongjing Wu
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA
- Belynda Hicks
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA
- Meredith Yeager
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA
- Bin Zhu
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, Frederick National Laboratory for Cancer Research, Frederick, MD, 20877, USA

18
Wu C, Zhao X, Welsh M, Costello K, Cao K, Abou Tayoun A, Li M, Sarmady M. Using Machine Learning to Identify True Somatic Variants from Next-Generation Sequencing. Clin Chem 2020; 66:239-246. [PMID: 31672855 DOI: 10.1373/clinchem.2019.308213] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/06/2019] [Accepted: 08/19/2019] [Indexed: 12/25/2022]
Abstract
BACKGROUND Molecular profiling has become essential for tumor risk stratification and treatment selection. However, cancer genome complexity and technical artifacts make identification of real variants a challenge. Currently, clinical laboratories rely on manual screening, which is costly, subjective, and not scalable. We present a machine learning-based method to distinguish artifacts from bona fide single-nucleotide variants (SNVs) detected by next-generation sequencing from nonformalin-fixed paraffin-embedded tumor specimens. METHODS A cohort of 11278 SNVs identified through clinical sequencing of tumor specimens was collected and divided into training, validation, and test sets. Each SNV was manually inspected and labeled as either real or artifact as part of clinical laboratory workflow. A 3-class (real, artifact, and uncertain) model was developed on the training set, fine-tuned with the validation set, and then evaluated on the test set. Prediction intervals reflecting the certainty of the classifications were derived during the process to label "uncertain" variants. RESULTS The optimized classifier demonstrated 100% specificity and 97% sensitivity over 5587 SNVs of the test set. Overall, 1252 of 1341 true-positive variants were identified as real, 4143 of 4246 false-positive calls were deemed artifacts, whereas only 192 (3.4%) SNVs were labeled as "uncertain," with zero misclassification between the true positives and artifacts in the test set. CONCLUSIONS We presented a computational classifier to identify variant artifacts detected from tumor sequencing. Overall, 96.6% of the SNVs received definitive labels and thus were exempt from manual review. This framework could improve quality and efficiency of the variant review process in clinical laboratories.
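The test-set counts quoted in this abstract are internally consistent, which a short calculation makes explicit. The sketch below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Counts quoted in the abstract for the test set.
true_total, true_called_real = 1341, 1252              # true-positive variants
artifact_total, artifact_called_artifact = 4246, 4143  # false-positive calls
uncertain_reported = 192

test_total = true_total + artifact_total
definitive = true_called_real + artifact_called_artifact

# With zero misclassification, every call without a definitive label
# must fall in the "uncertain" class.
uncertain_derived = test_total - definitive

uncertain_pct = 100 * uncertain_derived / test_total
definitive_pct = 100 * definitive / test_total

print(test_total, uncertain_derived)
print(f"{uncertain_pct:.1f}% uncertain, {definitive_pct:.1f}% definitive")
```

The derived values match the reported test-set size (5587 SNVs), the 192 "uncertain" SNVs (3.4%), and the 96.6% of SNVs that received definitive labels.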
Affiliation(s)
- Chao Wu
  - Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA
- Xiaonan Zhao
  - Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA
- Mark Welsh
  - Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA
- Kajia Cao
  - Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA
- Ahmad Abou Tayoun
  - Department of Genetics, Al Jalila Children's Specialty Hospital, Dubai, UAE
- Marilyn Li
  - Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA
  - Department of Pathology & Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, PA
- Mahdi Sarmady
  - Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA
  - Department of Pathology & Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, PA

19
Huang W, Guo YA, Muthukumar K, Baruah P, Chang MM, Jacobsen Skanderup A. SMuRF: portable and accurate ensemble prediction of somatic mutations. Bioinformatics 2020; 35:3157-3159. [PMID: 30649191 PMCID: PMC6735703 DOI: 10.1093/bioinformatics/btz018] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 08/01/2018] [Revised: 11/26/2018] [Accepted: 01/07/2019] [Indexed: 12/22/2022] Open
Abstract
Summary Somatic Mutation calling method using a Random Forest (SMuRF) integrates predictions and auxiliary features from multiple somatic mutation callers using a supervised machine learning approach. SMuRF is trained on community-curated matched tumor and normal whole genome sequencing data. SMuRF predicts both SNVs and indels with high accuracy in genome or exome-level sequencing data. Furthermore, the method is robust across multiple tested cancer types and predicts low allele frequency variants with high accuracy. In contrast to existing ensemble-based somatic mutation calling approaches, SMuRF works out-of-the-box and is orders of magnitude faster. Availability and implementation The method is implemented in R and available at https://github.com/skandlab/SMuRF. SMuRF operates as an add-on to the community-developed bcbio-nextgen somatic variant calling pipeline. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Weitai Huang
  - Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore
  - Graduate School of Integrative Sciences and Engineering, National University of Singapore, Singapore, Singapore
- Yu Amanda Guo
  - Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore
- Karthik Muthukumar
  - Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore
- Probhonjon Baruah
  - Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore
- Mei Mei Chang
  - Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore
- Anders Jacobsen Skanderup
  - Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore

20
Abstract
A standard strategy to discover somatic mutations in a cancer genome is to use next-generation sequencing (NGS) technologies to sequence the tumor tissue and its matched normal (commonly blood or adjacent normal tissue) for side-by-side comparison. However, when interrogating entire genomes (or even just the coding regions), the number of sequencing errors easily exceeds the number of real somatic mutations by orders of magnitude. Here, we describe SomaticSeq, which incorporates multiple somatic mutation detection algorithms and then uses machine learning to vastly improve the accuracy of the somatic mutation call sets.

21
Wang Q, Kotoula V, Hsu PC, Papadopoulou K, Ho JWK, Fountzilas G, Giannoulatou E. Comparison of somatic variant detection algorithms using Ion Torrent targeted deep sequencing data. BMC Med Genomics 2019; 12:181. [PMID: 31874647 PMCID: PMC6929331 DOI: 10.1186/s12920-019-0636-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/11/2019] [Accepted: 11/25/2019] [Indexed: 12/20/2022] Open
Abstract
Background The application of next-generation sequencing in cancer has revealed the genomic landscape of many tumour types and is nowadays routinely used in research and clinical settings. Multiple algorithms have been developed to detect somatic variation from sequencing data using either paired tumour-blood or tumour-only samples. Most of these methods have been developed and evaluated for the identification of somatic variation using Illumina sequencing datasets of moderate coverage. However, a comprehensive evaluation of somatic variant detection algorithms on Ion Torrent targeted deep sequencing data has not been performed. Methods We applied three somatic detection algorithms, Torrent Variant Caller, MuTect2 and VarScan2, to a large cohort of ovarian cancer patients comprising 208 paired tumour-blood samples and 253 tumour-only samples sequenced deeply on the Ion Torrent Proton platform across 330 amplicons. Subsequently, the concordance and performance of the three somatic variant callers were assessed. Results We observed low concordance across the algorithms, with only 0.5% of SNV and 0.02% of INDEL calls in common across all three methods. The intersection of all methods showed better performance when assessed using correlation with known mutational signatures, overlap with COSMIC variation and by examining the variant characteristics. The Torrent Variant Caller also performed well, with the advantage of not eliminating a high number of variants, which could lead to high type II error. Conclusions Our results suggest that caution should be taken when applying state-of-the-art somatic variant algorithms to Ion Torrent targeted deep sequencing data. Better quality control procedures and strategies that combine results from multiple methods should ensure that higher accuracy is achieved. This is essential to ensure that results from bioinformatics pipelines using Ion Torrent deep sequencing can be robustly applied in cancer research and in the clinic.
Affiliation(s)
- Qing Wang
- Victor Chang Cardiac Research Institute, Darlinghurst, Australia
| | - Vassiliki Kotoula
- Department of Pathology, School of Health Sciences, Faculty of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece.,Laboratory of Molecular Oncology, Hellenic Foundation for Cancer Research, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - Pei-Chen Hsu
- Victor Chang Cardiac Research Institute, Darlinghurst, Australia.,School of Computer Science and Engineering, UNSW, Sydney, Australia
| | - Kyriaki Papadopoulou
- Laboratory of Molecular Oncology, Hellenic Foundation for Cancer Research, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - Joshua W K Ho
- Victor Chang Cardiac Research Institute, Darlinghurst, Australia.,School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, China.,St Vincent's Clinical School, UNSW, Sydney, Australia
| | - George Fountzilas
- Laboratory of Molecular Oncology, Hellenic Foundation for Cancer Research, Aristotle University of Thessaloniki, Thessaloniki, Greece.,Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - Eleni Giannoulatou
- Victor Chang Cardiac Research Institute, Darlinghurst, Australia.,St Vincent's Clinical School, UNSW, Sydney, Australia
|
22
|
Cho Y, Lee S, Hong JH, Kim BJ, Hong WY, Jung J, Lee HB, Sung J, Kim HN, Kim HL, Jung J. Development of the variant calling algorithm, ADIScan, and its use to estimate discordant sequences between monozygotic twins. Nucleic Acids Res 2019; 46:e92. [PMID: 29873758 PMCID: PMC6125643 DOI: 10.1093/nar/gky445] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/16/2017] [Accepted: 05/15/2018] [Indexed: 12/30/2022] Open
Abstract
Calling variants from next-generation sequencing (NGS) data or discovering discordant sequences between two NGS data sets is challenging. We developed a computer algorithm, ADIScan1, to call variants by comparing the fractions of allelic reads in a tester to the universal reference genome. We then created ADIScan2 by modifying the algorithm to directly compare two sets of NGS data and predict discordant sequences between two testers. ADIScan1 detected >99.7% of variants called by GATK with an additional 724 393 SNVs. ADIScan2 identified ∼500 candidates of discordant sequences in each of two pairs of the monozygotic twins. About 200 of these candidates were included in the ∼2800 predicted by VarScan2. We verified 66 true discordant sequences among the candidates that ADIScan2 and VarScan2 exclusively predicted. ADIScan2 detected many discordant sequences overlooked by VarScan2 and Mutect, which specialize in detecting low frequency mutations in genetically heterogeneous cancerous tissues. Numbers of verified sequences alone were >5 times more than expected based on recently estimated mutation rates from whole genome sequences. Estimated post-zygotic mutation rates were 1.68 × 10⁻⁷ in this study. ADIScan1 and 2 would complement existing tools in screening causative mutations of diverse genetic diseases and comparing two sets of genome sequences, respectively.
Affiliation(s)
- Yangrae Cho
- Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon 34025, Republic of Korea.,DFTBA, CALS, Chonnam National University, Gwangju 61186, Republic of Korea
| | - Sunho Lee
- Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon 34025, Republic of Korea.,School of Computer Science and Engineering, Seoul National University, Seoul, 151-742, Republic of Korea
| | - Jong Hui Hong
- Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon 34025, Republic of Korea.,Research Institute of Pharmaceutical Sciences, College of Pharmacy, Seoul National University, Seoul 08826, Republic of Korea
| | - Byong Joon Kim
- Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon 34025, Republic of Korea
| | - Woon-Young Hong
- Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon 34025, Republic of Korea
| | - Jongcheol Jung
- Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon 34025, Republic of Korea
| | - Hyang Burm Lee
- DFTBA, CALS, Chonnam National University, Gwangju 61186, Republic of Korea
| | - Joohon Sung
- Complex Disease and Genome Epidemiology Branch, Department of Epidemiology, School of Public Health, Seoul National University, Seoul 08826, Republic of Korea
| | - Han-Na Kim
- Department of Biochemistry, School of Medicine, Ewha Woman's University, Seoul 07985, Republic of Korea
| | - Hyung-Lae Kim
- Department of Biochemistry, School of Medicine, Ewha Woman's University, Seoul 07985, Republic of Korea
| | - Jongsun Jung
- Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon 34025, Republic of Korea
|
23
|
Peng M, Mo Y, Wang Y, Wu P, Zhang Y, Xiong F, Guo C, Wu X, Li Y, Li X, Li G, Xiong W, Zeng Z. Neoantigen vaccine: an emerging tumor immunotherapy. Mol Cancer 2019; 18:128. [PMID: 31443694 PMCID: PMC6708248 DOI: 10.1186/s12943-019-1055-6] [Citation(s) in RCA: 438] [Impact Index Per Article: 73.0] [Received: 04/19/2019] [Accepted: 08/14/2019] [Indexed: 12/24/2022] Open
Abstract
Genetic instability of tumor cells often leads to the occurrence of a large number of mutations, and expression of non-synonymous mutations can produce tumor-specific antigens called neoantigens. Neoantigens are highly immunogenic as they are not expressed in normal tissues. They can activate CD4+ and CD8+ T cells to generate immune response and have the potential to become new targets of tumor immunotherapy. The development of bioinformatics technology has accelerated the identification of neoantigens. The combination of different algorithms to identify and predict the affinity of neoantigens to major histocompatibility complexes (MHCs) or the immunogenicity of neoantigens is mainly based on the whole-exome sequencing technology. Tumor vaccines targeting neoantigens mainly include nucleic acid, dendritic cell (DC)-based, tumor cell, and synthetic long peptide (SLP) vaccines. The combination with immune checkpoint inhibition therapy or radiotherapy and chemotherapy might achieve better therapeutic effects. Currently, several clinical trials have demonstrated the safety and efficacy of these vaccines. Further development of sequencing technologies and bioinformatics algorithms, as well as an improvement in our understanding of the mechanisms underlying tumor development, will expand the application of neoantigen vaccines in the future.
Affiliation(s)
- Miao Peng
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.,Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Yongzhen Mo
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Yian Wang
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Pan Wu
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Yijie Zhang
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Fang Xiong
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Can Guo
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Xu Wu
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Yong Li
- DEPARTMENT OF MEDICINE, Comprehensive Cancer Center Baylor College of Medicine, Alkek Building, RM N720, Houston, Texas, USA
| | - Xiaoling Li
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Guiyuan Li
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.,Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Wei Xiong
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.,Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Zhaoyang Zeng
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.,Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
|
24
|
Kumaran M, Subramanian U, Devarajan B. Performance assessment of variant calling pipelines using human whole exome sequencing and simulated data. BMC Bioinformatics 2019; 20:342. [PMID: 31208315 PMCID: PMC6580603 DOI: 10.1186/s12859-019-2928-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 07/14/2018] [Accepted: 05/31/2019] [Indexed: 12/30/2022] Open
Abstract
Background: Whole exome sequencing (WES) is a cost-effective method that identifies clinical variants, but it demands accurate variant caller tools. Currently available tools have variable accuracy in predicting specific clinical variants. However, it may be possible to find the best combination of aligner-variant caller tools for detecting accurate single nucleotide variants (SNVs) and small insertions and deletions (InDels) separately. Moreover, many important aspects of InDel detection are overlooked while comparing the performance of tools, particularly their base-pair length. Results: We assessed the performance of variant calling pipelines using the combinations of four variant callers and five aligners on human NA12878 and simulated exome data. We used high confidence variant calls from the Genome in a Bottle (GiaB) consortium for validation, and GRCh37 and GRCh38 as the human reference genome. Based on the performance metrics, both BWA and Novoalign aligners performed better with DeepVariant and SAMtools callers for detecting SNVs, and with DeepVariant and GATK for InDels. Furthermore, we obtained similar results on human NA24385 and NA24631 exome data from GiaB. Conclusion: In this study, DeepVariant with BWA and Novoalign performed best for detecting accurate SNVs and InDels. The accuracy of variant calling was improved by merging the top performing pipelines. The results of our study provide useful recommendations for analysis of WES data in clinical genomics.
Affiliation(s)
- Manojkumar Kumaran
- Department of Bioinformatics, Aravind Medical Research Foundation, Madurai, Tamil Nadu, 625020, India.,School of Chemical and Biotechnology, SASTRA (Deemed to be University), Thanjavur, Tamil Nadu, 613401, India
| | - Umadevi Subramanian
- Department of Bioinformatics, Aravind Medical Research Foundation, Madurai, Tamil Nadu, 625020, India
| | - Bharanidharan Devarajan
- Department of Bioinformatics, Aravind Medical Research Foundation, Madurai, Tamil Nadu, 625020, India.
|
25
|
Anzar I, Sverchkova A, Stratford R, Clancy T. NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer. BMC Med Genomics 2019; 12:63. [PMID: 31096972 PMCID: PMC6524241 DOI: 10.1186/s12920-019-0508-5] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 08/28/2018] [Accepted: 04/22/2019] [Indexed: 12/30/2022] Open
Abstract
Background: The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially tumor genetic intra-heterogeneity coupled with the problem of sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting of different algorithms, have previously helped to increase specificity, although this comes at the cost of sensitivity. Methods: In light of this, we have developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples. Results: A robust and exhaustive evaluation of NeoMutate’s performance based on 5-fold cross validation experiments, in addition to 3 independent tests, demonstrated a substantially improved variant detection accuracy compared to any of its individual composite variant callers and consensus calling of multiple tools. Conclusions: We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer.
Affiliation(s)
- Irantzu Anzar
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
| | - Angelina Sverchkova
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
| | - Richard Stratford
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
| | - Trevor Clancy
- OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway.
|
26
|
Kim J, Kim D, Lim JS, Maeng JH, Son H, Kang HC, Nam H, Lee JH, Kim S. The use of technical replication for detection of low-level somatic mutations in next-generation sequencing. Nat Commun 2019; 10:1047. [PMID: 30837471 PMCID: PMC6400950 DOI: 10.1038/s41467-019-09026-y] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 07/23/2018] [Accepted: 02/07/2019] [Indexed: 01/16/2023] Open
Abstract
Accurate genome-wide detection of somatic mutations with low variant allele frequency (VAF, <1%) has proven difficult, for which generalized, scalable methods are lacking. Herein, we describe a new computational method, called RePlow, that we developed to detect low-VAF somatic mutations based on simple, library-level replicates for next-generation sequencing on any platform. Through joint analysis of replicates, RePlow is able to remove prevailing background errors in next-generation sequencing analysis, facilitating remarkable improvement in the detection accuracy for low-VAF somatic mutations (up to ~99% reduction in false positives). The method is validated in independent cancer panel and brain tissue sequencing data. Our study suggests a new paradigm with which to exploit an overwhelming abundance of sequencing data for accurate variant detection. Somatic mutations of low allele frequencies are often difficult to detect. Here, the authors develop RePlow, a computational method that leverages technical replication for detecting low-level somatic mutations using next-generation sequencing.
Affiliation(s)
- Junho Kim
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Dachan Kim
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Jae Seok Lim
- Graduate School of Medical Science and Engineering, KAIST, Daejeon, 34141, South Korea
| | - Ju Heon Maeng
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Hyeonju Son
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Hoon-Chul Kang
- Department of Pediatrics, Division of Pediatric Neurology, Pediatric Epilepsy Clinics, Severance Children's Hospital, Epilepsy Research Institute, Yonsei University College of Medicine, Seoul, 03722, South Korea
| | - Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, South Korea
| | - Jeong Ho Lee
- Graduate School of Medical Science and Engineering, KAIST, Daejeon, 34141, South Korea.
| | - Sangwoo Kim
- Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea.
|
27
|
Deep convolutional neural networks for accurate somatic mutation detection. Nat Commun 2019; 10:1041. [PMID: 30833567 PMCID: PMC6399298 DOI: 10.1038/s41467-019-09027-x] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Received: 11/14/2018] [Accepted: 02/14/2019] [Indexed: 12/14/2022] Open
Abstract
Accurate detection of somatic mutations is still a challenge in cancer analysis. Here we present NeuSomatic, the first convolutional neural network approach for somatic mutation detection, which significantly outperforms previous methods on different sequencing platforms, sequencing strategies, and tumor purities. NeuSomatic summarizes sequence alignments into small matrices and incorporates more than a hundred features to capture mutation signals effectively. It can be used universally as a stand-alone somatic mutation detection method or with an ensemble of existing methods to achieve the highest accuracy.
|
28
|
isma: an R package for the integrative analysis of mutations detected by multiple pipelines. BMC Bioinformatics 2019; 20:107. [PMID: 30819096 PMCID: PMC6394085 DOI: 10.1186/s12859-019-2701-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/03/2019] [Accepted: 02/22/2019] [Indexed: 01/07/2023] Open
Abstract
Background: Recent comparative studies have brought to our attention how somatic mutation detection from next-generation sequencing data is still an open issue in bioinformatics, because different pipelines result in a low consensus. In this context, it is suggested to integrate results from multiple calling tools, but this operation is not trivial and the burden of merging, comparing, filtering and explaining the results demands appropriate software. Results: We developed isma (integrative somatic mutation analysis), an R package for the integrative analysis of somatic mutations detected by multiple pipelines for matched tumor-normal samples. The package provides a series of functions to quantify the consensus, estimate the variability, underline outliers, integrate evidence from publicly available mutation catalogues and filter sites. We illustrate the capabilities of isma by analysing breast cancer somatic mutations generated by The Cancer Genome Atlas (TCGA) using four pipelines. Conclusions: By comparing different “points of view” on the same data, isma generates a unique mutation catalogue and a series of reports that underline common patterns, variability, as well as sites already catalogued by other studies (e.g. TCGA), so as to design and apply filtering strategies to screen more reliable sites. The package is available for non-commercial users at the URL https://www.itb.cnr.it/isma.
|
29
|
Abstract
This chapter contains a step-by-step protocol for identifying somatic SNPs and small Indels from next-generation sequencing data of tumor samples and matching normal samples. The workflow presented here is largely based on the Broad Institute's "Best Practices" guidelines and makes use of their Genome Analysis Toolkit (GATK) platform. Variants are annotated with population allele frequencies and curated resources such as GnomAD and ClinVar and curated effect predictions from dbNSFP using VCFtools, SnpEff, and SnpSift.
Affiliation(s)
- Peter J Ulintz
- BRCF Bioinformatics Core, University of Michigan, Ann Arbor, MI, USA.
- Division of Hematology and Oncology, Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA.
| | - Weisheng Wu
- BRCF Bioinformatics Core, University of Michigan, Ann Arbor, MI, USA
| | - Chris M Gates
- BRCF Bioinformatics Core, University of Michigan, Ann Arbor, MI, USA
|
30
|
Brody Y, Kimmerling RJ, Maruvka YE, Benjamin D, Elacqua JJ, Haradhvala NJ, Kim J, Mouw KW, Frangaj K, Koren A, Getz G, Manalis SR, Blainey PC. Quantification of somatic mutation flow across individual cell division events by lineage sequencing. Genome Res 2018; 28:1901-1918. [PMID: 30459213 PMCID: PMC6280753 DOI: 10.1101/gr.238543.118] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 04/16/2018] [Accepted: 10/27/2018] [Indexed: 02/06/2023]
Abstract
Mutation data reveal the dynamic equilibrium between DNA damage and repair processes in cells and are indispensable to the understanding of age-related diseases, tumor evolution, and the acquisition of drug resistance. However, available genome-wide methods have a limited ability to resolve rare somatic variants and the relationships between these variants. Here, we present lineage sequencing, a new genome sequencing approach that enables somatic event reconstruction by providing quality somatic mutation call sets with resolution as high as the single-cell level in subject lineages. Lineage sequencing entails sampling single cells from a population and sequencing subclonal sample sets derived from these cells such that knowledge of relationships among the cells can be used to jointly call variants across the sample set. This approach integrates data from multiple sequence libraries to support each variant and precisely assigns mutations to lineage segments. We applied lineage sequencing to a human colon cancer cell line with a DNA polymerase epsilon (POLE) proofreading deficiency (HT115) and a human retinal epithelial cell line immortalized by constitutive telomerase expression (RPE1). Cells were cultured under continuous observation to link observed single-cell phenotypes with single-cell mutation data. The high sensitivity, specificity, and resolution of the data provide a unique opportunity for quantitative analysis of variation in mutation rate, spectrum, and correlations among variants. Our data show that mutations arrive with nonuniform probability across sublineages and that DNA lesion dynamics may cause strong correlations between certain mutations.
Affiliation(s)
- Yehuda Brody
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Robert J Kimmerling
- MIT Department of Biological Engineering, Cambridge, Massachusetts 02139, USA
- Koch Institute for Integrative Cancer Research, MIT, Cambridge, Massachusetts 02139, USA
| | - Yosef E Maruvka
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- MGH Cancer Center and Department of Pathology, Boston, Massachusetts 02114, USA
| | - David Benjamin
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Joshua J Elacqua
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- MIT Department of Biological Engineering, Cambridge, Massachusetts 02139, USA
| | - Nicholas J Haradhvala
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- MGH Cancer Center and Department of Pathology, Boston, Massachusetts 02114, USA
| | - Jaegil Kim
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Kent W Mouw
- Harvard Medical School, Boston, Massachusetts 02115, USA
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
| | - Kristjana Frangaj
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
| | - Amnon Koren
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
| | - Gad Getz
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- MGH Cancer Center and Department of Pathology, Boston, Massachusetts 02114, USA
| | - Scott R Manalis
- MIT Department of Biological Engineering, Cambridge, Massachusetts 02139, USA
- Koch Institute for Integrative Cancer Research, MIT, Cambridge, Massachusetts 02139, USA
| | - Paul C Blainey
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- MIT Department of Biological Engineering, Cambridge, Massachusetts 02139, USA
|
31
|
Semeraro R, Orlandini V, Magi A. Xome-Blender: A novel cancer genome simulator. PLoS One 2018; 13:e0194472. [PMID: 29621252 PMCID: PMC5886411 DOI: 10.1371/journal.pone.0194472] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 09/06/2017] [Accepted: 02/05/2018] [Indexed: 11/18/2022] Open
Abstract
The adoption of next generation sequencing based methods in cancer research allowed for the investigation of the complex genetic structure of tumor samples. In the last few years, considerable importance was given to the research of somatic variants and several computational approaches were developed for this purpose. Despite continuous improvements to these programs, the validation of their results is a hard challenge due to multiple sources of error. To overcome this drawback, different simulation approaches are used to generate synthetic samples, but they are often based on the addition of artificial mutations that mimic the complexity of genomic variations. For these reasons, we developed a novel software, Xome-Blender, that generates synthetic cancer genomes with user defined features such as the number of subclones, the number of somatic variants and the presence of copy number alterations (CNAs), without the addition of any synthetic element. The singularity of our method is the “morphological approach” used to generate mutation events. To demonstrate the power of our tool, we used it to address the hard challenge of evaluating the performance of nine state-of-the-art somatic variant calling methods for small and large variants (VarScan2, MuTect, Shimmer, BCFtools, Strelka, EXCAVATOR2, Control-FREEC and CopywriteR). Through these analyses we observed that by using Xome-Blender data it is possible to appraise small differences between their performance, and we have designated VarScan2 and EXCAVATOR2 as the best tools for this kind of application. Xome-Blender is unix-based, licensed under the GPLv3 and freely available at https://github.com/rsemeraro/XomeBlender.
Affiliation(s)
- Roberto Semeraro
- Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
| | - Valerio Orlandini
- Medical Genetics Unit, Meyer Children’s University Hospital, Florence, Italy
| | - Alberto Magi
- Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
|
32
|
Xu C. A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data. Comput Struct Biotechnol J 2018; 16:15-24. [PMID: 29552334 PMCID: PMC5852328 DOI: 10.1016/j.csbj.2018.01.003] [Citation(s) in RCA: 153] [Impact Index Per Article: 21.9] [Received: 09/08/2017] [Revised: 01/20/2018] [Accepted: 01/28/2018] [Indexed: 02/06/2023] Open
Abstract
Detection of somatic mutations holds great potential in cancer treatment and has been a very active research field in the past few years, especially since the breakthrough of the next-generation sequencing technology. A collection of variant calling pipelines have been developed with different underlying models, filters, input data requirements, and targeted applications. This review aims to enumerate these unique features of the state-of-the-art variant callers, in the hope to provide a practical guide for selecting the appropriate pipeline for specific applications. We will focus on the detection of somatic single nucleotide variants, ranging from traditional variant callers based on whole genome or exome sequencing of paired tumor-normal samples to recent low-frequency variant callers designed for targeted sequencing protocols with unique molecular identifiers. The variant callers have been extensively benchmarked with inconsistent performances across these studies. We will review the reference materials, datasets, and performance metrics that have been used in the benchmarking studies. In the end, we will discuss emerging trends and future directions of the variant calling algorithms.
Affiliation(s)
- Chang Xu
- Life Science Research and Foundation, Qiagen Sciences, Inc., 6951 Executive Way, Frederick, Maryland 21703, USA
| |
|
33
|
Kotelnikova EA, Pyatnitskiy M, Paleeva A, Kremenetskaya O, Vinogradov D. Practical aspects of NGS-based pathways analysis for personalized cancer science and medicine. Oncotarget 2018; 7:52493-52516. [PMID: 27191992 PMCID: PMC5239569 DOI: 10.18632/oncotarget.9370] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/02/2015] [Accepted: 04/18/2016] [Indexed: 12/17/2022] Open
Abstract
Nowadays, the personalized approach to health care and cancer care in particular is becoming more and more popular and is taking an important place in the translational medicine paradigm. In some cases, detection of the patient-specific individual mutations that point to a targeted therapy has already become a routine practice for clinical oncologists. Wider panels of genetic markers are also on the market which cover a greater number of possible oncogenes including those with lower reliability of resulting medical conclusions. In light of the large availability of high-throughput technologies, it is very tempting to use complete patient-specific New Generation Sequencing (NGS) or other "omics" data for cancer treatment guidance. However, there are still no gold standard methods and protocols to evaluate them. Here we will discuss the clinical utility of each of the data types and describe a systems biology approach adapted for single patient measurements. We will try to summarize the current state of the field focusing on the clinically relevant case-studies and practical aspects of data processing.
Affiliation(s)
- Ekaterina A Kotelnikova
- Personal Biomedicine, Moscow, Russia.,A. A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia.,Institute Biomedical Research August Pi Sunyer (IDIBAPS), Hospital Clinic of Barcelona, Barcelona, Spain
| | - Mikhail Pyatnitskiy
- Personal Biomedicine, Moscow, Russia.,Orekhovich Institute of Biomedical Chemistry, Moscow, Russia.,Pirogov Russian National Research Medical University, Moscow, Russia
| | | | - Olga Kremenetskaya
- Personal Biomedicine, Moscow, Russia.,Center for Theoretical Problems of Physicochemical Pharmacology, Russian Academy of Sciences, Moscow, Russia
| | - Dmitriy Vinogradov
- Personal Biomedicine, Moscow, Russia.,A. A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia.,Lomonosov Moscow State University, Moscow, Russia
| |
|
34
|
Bräunlein E, Krackhardt AM. Identification and Characterization of Neoantigens As Well As Respective Immune Responses in Cancer Patients. Front Immunol 2017; 8:1702. [PMID: 29250075 PMCID: PMC5714868 DOI: 10.3389/fimmu.2017.01702] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 09/11/2017] [Accepted: 11/17/2017] [Indexed: 12/16/2022] Open
Abstract
Cancer immunotherapy has recently emerged as a powerful tool for the treatment of diverse advanced malignancies. In particular, therapeutic application of immune checkpoint modulators, such as anti-CTLA4 or anti-PD-1/PD-L1 antibodies, have shown efficacy in a broad range of malignant diseases. Although pharmacodynamics of these immune modulators are complex, recent studies strongly support the notion that altered peptide ligands presented on tumor cells representing neoantigens may play an essential role in tumor rejection by T cells activated by anti-CTLA4 and anti-PD-1 antibodies. Neoantigens may have diverse sources as viral and mutated proteins. Moreover, posttranslational modifications and altered antigen processing may also contribute to the neoantigenic peptide ligand landscape. Different approaches of target identification are currently applied in combination with subsequent characterization of autologous and non-self T-cell responses against such neoantigens. Additional efforts are required to elucidate key characteristics and interdependences of neoantigens, immunodominance, respective T-cell responses, and the tumor microenvironment in order to define decisive determinants involved in effective T-cell-mediated tumor rejection. This review focuses on our current knowledge of identification and characterization of such neoantigens as well as respective T-cell responses. It closes with challenges to be addressed in future relevant for further improvement of immunotherapeutic strategies in malignant diseases.
Affiliation(s)
- Eva Bräunlein
- Medizinische Klinik III, Klinikum rechts der Isar, Technische Universität München, Munich, Germany
| | - Angela M Krackhardt
- Medizinische Klinik III, Klinikum rechts der Isar, Technische Universität München, Munich, Germany.,German Cancer Consortium of Translational Cancer Research (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
| |
|
35
|
Bohnert R, Vivas S, Jansen G. Comprehensive benchmarking of SNV callers for highly admixed tumor data. PLoS One 2017; 12:e0186175. [PMID: 29020110 PMCID: PMC5636151 DOI: 10.1371/journal.pone.0186175] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 01/20/2017] [Accepted: 09/26/2017] [Indexed: 12/30/2022] Open
Abstract
Precision medicine attempts to individualize cancer therapy by matching tumor-specific genetic changes with effective targeted therapies. A crucial first step in this process is the reliable identification of cancer-relevant variants, which is considerably complicated by the impurity and heterogeneity of clinical tumor samples. We compared the impact of admixture of non-cancerous cells and low somatic allele frequencies on the sensitivity and precision of 19 state-of-the-art SNV callers. We studied both whole exome and targeted gene panel data and up to 13 distinct parameter configurations for each tool. We found vast differences among callers. Based on our comprehensive analyses we recommend joint tumor-normal calling with MuTect, EBCall or Strelka for whole exome somatic variant calling, and HaplotypeCaller or FreeBayes for whole exome germline calling. For targeted gene panel data on a single tumor sample, LoFreqStar performed best. We further found that tumor impurity and admixture had a negative impact on precision, and in particular, sensitivity in whole exome experiments. At admixture levels of 60% to 90% sometimes seen in pathological biopsies, sensitivity dropped significantly, even when variants were originally present in the tumor at 100% allele frequency. Sensitivity to low-frequency SNVs improved with targeted panel data, but whole exome data allowed more efficient identification of germline variants. Effective somatic variant calling requires high-quality pathological samples with minimal admixture, a consciously selected sequencing strategy, and the appropriate variant calling tool with settings optimized for the chosen type of data.
|
36
|
Madubata CJ, Roshan-Ghias A, Chu T, Resnick S, Zhao J, Arnes L, Wang J, Rabadan R. Identification of potentially oncogenic alterations from tumor-only samples reveals Fanconi anemia pathway mutations in bladder carcinomas. NPJ Genom Med 2017; 2:29. [PMID: 29263839 PMCID: PMC5677944 DOI: 10.1038/s41525-017-0032-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 03/23/2017] [Revised: 08/08/2017] [Accepted: 08/11/2017] [Indexed: 01/02/2023] Open
Abstract
Cancer is caused by germline and somatic mutations, which can share biological features such as amino acid change. However, integrated germline and somatic analysis remains uncommon. We present a framework that uses machine learning to learn features of recurrent somatic mutations to (1) predict somatic variants from tumor-only samples and (2) identify somatic-like germline variants for integrated analysis of tumor-normal DNA. Using data from 1769 patients from seven cancer types (bladder, glioblastoma, low-grade glioma, lung, melanoma, stomach, and pediatric glioma), we show that "somatic-like" germline variants are enriched for autosomal-dominant cancer-predisposition genes (p < 4.35 × 10⁻¹⁵), including TP53. Our framework identifies germline and somatic nonsense variants in BRCA2 and other Fanconi anemia genes in 11% (11/100) of bladder cancer cases, suggesting a potential genetic predisposition in these patients. The bladder carcinoma patients with Fanconi anemia nonsense variants display a BRCA-deficiency somatic mutation signature, suggesting treatment targeted to DNA repair.
Affiliation(s)
- Chioma J Madubata
- Department of Systems Biology, Columbia University, New York, NY 10032 USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA
| | - Alireza Roshan-Ghias
- Department of Systems Biology, Columbia University, New York, NY 10032 USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA
| | - Timothy Chu
- Department of Systems Biology, Columbia University, New York, NY 10032 USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA
| | - Samuel Resnick
- Department of Systems Biology, Columbia University, New York, NY 10032 USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA
| | - Junfei Zhao
- Department of Systems Biology, Columbia University, New York, NY 10032 USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA
| | - Luis Arnes
- Department of Systems Biology, Columbia University, New York, NY 10032 USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA
| | - Jiguang Wang
- Department of Systems Biology, Columbia University, New York, NY 10032 USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA
- Division of Life Science and Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Raul Rabadan
- Department of Systems Biology, Columbia University, New York, NY 10032 USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA
| |
|
37
|
Kohmoto T, Masuda K, Naruto T, Tange S, Shoda K, Hamada J, Saito M, Ichikawa D, Tajima A, Otsuji E, Imoto I. Construction of a combinatorial pipeline using two somatic variant calling methods for whole exome sequence data of gastric cancer. J Med Invest 2017; 64:233-240. [PMID: 28954988 DOI: 10.2152/jmi.64.233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
High-throughput next-generation sequencing is a powerful tool to identify the genotypic landscapes of somatic variants and therapeutic targets in various cancers including gastric cancer, forming the basis for personalized medicine in the clinical setting. Although the advent of many computational algorithms leads to higher accuracy in somatic variant calling, no standard method exists due to the limitations of each method. Here, we constructed a new pipeline. We combined two different somatic variant callers with different algorithms, Strelka and VarScan 2, and evaluated performance using whole exome sequencing data obtained from 19 Japanese cases with gastric cancer (GC); then, we characterized these tumors based on identified driver molecular alterations. More single nucleotide variants (SNVs) and small insertions/deletions were detected by Strelka and VarScan 2, respectively. SNVs detected by both tools showed higher accuracy for estimating somatic variants compared with those detected by only one of the two tools and accurately showed the mutation signature and mutations of driver genes reported for GC. Our combinatorial pipeline may have an advantage in detection of somatic mutations in GC and may be useful for further genomic characterization of Japanese patients with GC to improve the efficacy of GC treatments. J. Med. Invest. 64: 233-240, August, 2017.
Affiliation(s)
- Tomohiro Kohmoto
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University
| | - Kiyoshi Masuda
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University
| | - Takuya Naruto
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University
| | - Shoichiro Tange
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University
| | - Katsutoshi Shoda
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University.,Division of Digestive Surgery, Department of Surgery, Kyoto Prefectural University of Medicine
| | - Junichi Hamada
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University.,Division of Digestive Surgery, Department of Surgery, Kyoto Prefectural University of Medicine
| | - Masako Saito
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University
| | - Daisuke Ichikawa
- Division of Digestive Surgery, Department of Surgery, Kyoto Prefectural University of Medicine
| | - Atsushi Tajima
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University.,Department of Bioinformatics and Genomics, Graduate School of Advanced Preventive Medical Sciences, Kanazawa University
| | - Eigo Otsuji
- Division of Digestive Surgery, Department of Surgery, Kyoto Prefectural University of Medicine
| | - Issei Imoto
- Department of Human Genetics, Graduate School of Biomedical Sciences, Tokushima University
| |
|
38
|
Callari M, Sammut SJ, De Mattos-Arruda L, Bruna A, Rueda OM, Chin SF, Caldas C. Intersect-then-combine approach: improving the performance of somatic variant calling in whole exome sequencing data using multiple aligners and callers. Genome Med 2017; 9:35. [PMID: 28420412 PMCID: PMC5394620 DOI: 10.1186/s13073-017-0425-1] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 10/05/2016] [Accepted: 03/24/2017] [Indexed: 02/02/2023] Open
Abstract
Bioinformatic analysis of genomic sequencing data to identify somatic mutations in cancer samples is far from achieving the required robustness and standardisation. In this study we generated a whole exome sequencing benchmark dataset using the platinum genome sample NA12878 and developed an intersect-then-combine (ITC) approach to increase the accuracy in calling single nucleotide variants (SNVs) and indels in tumour-normal pairs. We evaluated the effect of alignment, base quality recalibration, mutation caller and filtering on sensitivity and false positive rate. The ITC approach increased the sensitivity up to 17.1%, without increasing the false positive rate per megabase (FPR/Mb) and its validity was confirmed in a set of clinical samples.
Affiliation(s)
- Maurizio Callari
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
| | | | | | - Alejandra Bruna
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Oscar M. Rueda
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Suet-Feung Chin
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Carlos Caldas
- CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
| |
|
39
|
Zhang HQ, Li MH, Gao P, Lan PH, Fan B, Xiao X, Lu YJ, Chen GJ, Wang Z. Preliminary Application of Precision Genomic Medicine Detecting Gene Variation in Patients with Multifocal Osteosarcoma. Orthop Surg 2017; 8:129-38. [PMID: 27384721 DOI: 10.1111/os.12249] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/28/2015] [Accepted: 02/07/2016] [Indexed: 12/30/2022] Open
Abstract
OBJECTIVES The purpose of this study was to present our clinical experience of treating multifocal osteosarcoma (MFOS) in our center and gain more insight into the biology of this rare condition; in particular, to address with the help of precision genomic medicine the issue of whether the multiple osteosarcoma (OS) lesions in such patients are multi-centric or originate from one primary lesion and metastasize to other sites. Finally, we aimed to identify particular gene phenotypes and mutations that differentiate MFOS from OS with only one tumor. METHODS Clinical data of patients with MFOS treated at our center between June 2007 and October 2014 were collected and analyzed retrospectively. High throughput sequencing of the whole exome of normal tissue and multiple lesions had been performed on samples from two patients (HJF and JZ) diagnosed in 2014. To explore the particular gene phenotype and clinical significance of MFOS, these sequencing results were analyzed and compared with those from patients with osteosarcoma in a single site. Seven patients with MFOS (three male and four female; average age 19.71 ± 3.35 years) were enrolled in this study. Two of these patients declined treatment and died after 4 and 6 months, respectively. The remaining patients received standard treatment comprising neoadjuvant chemotherapy, surgery and chemotherapy. The chemotherapy regimen was lobaplatin (45 mg/m²), doxorubicin (60 mg/m²) and ifosfamide (12 g/m²). Patients were followed up every 3 months after completing treatment and evaluated by the Enneking and Response Evaluation Criteria in Solid Tumors scoring systems. RESULTS Up to the last follow-up on 1 December 2015, three patients were still alive. The event-free survival ranged from 4 to 144 weeks (median, 50.14 weeks), the mean (±SD) being 55.45 ± 45.47 weeks. Overall survival ranged from 16 to 388 weeks (median, 89 weeks; mean ± SD, 118.7 ± 147.7 weeks). The rates of mutation of the targeted drug-related genes were 133.5% ± 3.0% in the proximal tibia lesion and 113.1% ± 1.9% in the distal femur of patient HJF (P < 0.01) and 136.1% ± 10.8% in the proximal tibial lesion and 122.3% ± 5.5% in the proximal humerus of patient JZ (P = 0.0335). Furthermore, there were several anti-oncogenes in the somatic copy number variation lists analyzed from the two patients, especially TP53. However, no kataegis was found. CONCLUSIONS Early and radical surgery accompanied by appropriate chemotherapy is the optimal means of treating MFOS. These patients may benefit from precision genomic medicine.
Affiliation(s)
- Hao-Qiang Zhang
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Ming-Hui Li
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Peng Gao
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Ping-Heng Lan
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Bo Fan
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Xin Xiao
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Ya-Jie Lu
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Guo-Jing Chen
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Zhen Wang
- Department of Orthopaedics, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| |
|
40
|
Hofmann AL, Behr J, Singer J, Kuipers J, Beisel C, Schraml P, Moch H, Beerenwinkel N. Detailed simulation of cancer exome sequencing data reveals differences and common limitations of variant callers. BMC Bioinformatics 2017; 18:8. [PMID: 28049408 PMCID: PMC5209852 DOI: 10.1186/s12859-016-1417-7] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 07/13/2016] [Accepted: 12/10/2016] [Indexed: 12/30/2022] Open
Abstract
Background Next-generation sequencing of matched tumor and normal biopsy pairs has become a technology of paramount importance for precision cancer treatment. Sequencing costs have dropped tremendously, allowing the sequencing of the whole exome of tumors for just a fraction of the total treatment costs. However, clinicians and scientists cannot take full advantage of the generated data because the accuracy of analysis pipelines is limited. This particularly concerns the reliable identification of subclonal mutations in a cancer tissue sample with very low frequencies, which may be clinically relevant. Results Using simulations based on kidney tumor data, we compared the performance of nine state-of-the-art variant callers, namely deepSNV, GATK HaplotypeCaller, GATK UnifiedGenotyper, JointSNVMix2, MuTect, SAMtools, SiNVICT, SomaticSniper, and VarScan2. The comparison was done as a function of variant allele frequencies and coverage. Our analysis revealed that deepSNV and JointSNVMix2 perform very well, especially in the low-frequency range. We attributed false positive and false negative calls of the nine tools to specific error sources and assigned them to processing steps of the pipeline. All of these errors can be expected to occur in real data sets. We found that modifying certain steps of the pipeline or parameters of the tools can lead to substantial improvements in performance. Furthermore, a novel integration strategy that combines the ranks of the variants yielded the best performance. More precisely, the rank-combination of deepSNV, JointSNVMix2, MuTect, SiNVICT and VarScan2 reached a sensitivity of 78% when fixing the precision at 90%, and outperformed all individual tools, where the maximum sensitivity was 71% with the same precision. Conclusions The choice of well-performing tools for alignment and variant calling is crucial for the correct interpretation of exome sequencing data obtained from mixed samples, and common pipelines are suboptimal. We were able to relate observed substantial differences in performance to the underlying statistical models of the tools, and to pinpoint the error sources of false positive and false negative calls. These findings might inspire new software developments that improve exome sequencing pipelines and further the field of precision cancer treatment.
Affiliation(s)
- Ariane L Hofmann
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.,Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058 Basel, Switzerland
| | - Jonas Behr
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.,Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058 Basel, Switzerland
| | - Jochen Singer
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.,Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058 Basel, Switzerland
| | - Jack Kuipers
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.,Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058 Basel, Switzerland
| | - Christian Beisel
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland
| | - Peter Schraml
- Institute for Surgical Pathology, University Hospital Zurich, Schmelzbergstrasse 12, Zurich, 8091, Switzerland
| | - Holger Moch
- Institute for Surgical Pathology, University Hospital Zurich, Schmelzbergstrasse 12, Zurich, 8091, Switzerland
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.,Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058 Basel, Switzerland
| |
|
41
|
TruePrime is a novel method for whole-genome amplification from single cells based on TthPrimPol. Nat Commun 2016; 7:13296. [PMID: 27897270 PMCID: PMC5141293 DOI: 10.1038/ncomms13296] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 02/24/2016] [Accepted: 09/21/2016] [Indexed: 01/29/2023] Open
Abstract
Sequencing of a single-cell genome requires DNA amplification, a process prone to introducing bias and errors into the amplified genome. Here we introduce a novel multiple displacement amplification (MDA) method based on the unique DNA primase features of Thermus thermophilus (Tth) PrimPol. TthPrimPol displays a potent primase activity preferring dNTPs as substrates unlike conventional primases. A combination of TthPrimPol's unique ability to synthesize DNA primers with the highly processive Phi29 DNA polymerase (Φ29DNApol) enables near-complete whole genome amplification from single cells. This novel method demonstrates superior breadth and evenness of genome coverage, high reproducibility, excellent single-nucleotide variant (SNV) detection rates with low allelic dropout (ADO) and low chimera formation as exemplified by sequencing HEK293 cells. Moreover, copy number variant (CNV) calling yields superior results compared with random primer-based MDA methods. The advantages of this method, which we named TruePrime, promise to facilitate and improve single-cell genomic analysis. Single cell genomic analysis needs DNA amplification with high fidelity and accuracy. Here, the authors devise a novel multiple displacement amplification method called TruePrime that is based in Thermus thermophilus PrimPol and Phi29 DNA polymerase, and demonstrate its utility and accuracy.
|
42
|
Cai L, Yuan W, Zhang Z, He L, Chou KC. In-depth comparison of somatic point mutation callers based on different tumor next-generation sequencing depth data. Sci Rep 2016; 6:36540. [PMID: 27874022 PMCID: PMC5118795 DOI: 10.1038/srep36540] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Received: 04/06/2016] [Accepted: 10/17/2016] [Indexed: 12/26/2022] Open
Abstract
Four popular somatic single nucleotide variant (SNV) calling methods (Varscan, SomaticSniper, Strelka and MuTect2) were carefully evaluated on the real whole exome sequencing (WES, depth of ~50X) and ultra-deep targeted sequencing (UDT-Seq, depth of ~370X) data. The four tools returned poor consensus on candidates (only 20% of calls were with multiple hits by the callers). For both WES and UDT-Seq, MuTect2 and Strelka obtained the largest proportion of COSMIC entries as well as the lowest rate of dbSNP presence and high-alternative-alleles-in-control calls, demonstrating their superior sensitivity and accuracy. Combining different callers does increase reliability of candidates, but narrows the list down to very limited range of tumor read depth and variant allele frequency. Calling SNV on UDT-Seq data, which were of much higher read-depth, discovered additional true-positive variations, despite an even more tremendous growth in false positive predictions. Our findings not only provide valuable benchmark for state-of-the-art SNV calling methods, but also shed light on the access to more accurate SNV identification in the future.
Affiliation(s)
- Lei Cai
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Shanghai Key Laboratory of Psychotic Disorders (No.13dz2260500), Shanghai Jiao Tong University, Shanghai, 200030, China.,Gordon Life Science Institute, Boston, Massachusetts, 02478, USA
| | - Wei Yuan
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Shanghai Key Laboratory of Psychotic Disorders (No.13dz2260500), Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Zhou Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Shanghai Key Laboratory of Psychotic Disorders (No.13dz2260500), Shanghai Jiao Tong University, Shanghai, 200030, China.,Institute of Biliary Tract Disease, Xinhua Hospital, Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
| | - Lin He
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Shanghai Key Laboratory of Psychotic Disorders (No.13dz2260500), Shanghai Jiao Tong University, Shanghai, 200030, China.,Women's Hospital School Of Medicine Zhejiang University, Hangzhou, 310006, China
| | - Kuo-Chen Chou
- Gordon Life Science Institute, Boston, Massachusetts, 02478, USA.,Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, 21589, Saudi Arabia
| |
|
43
|
do Valle ÍF, Giampieri E, Simonetti G, Padella A, Manfrini M, Ferrari A, Papayannidis C, Zironi I, Garonzi M, Bernardi S, Delledonne M, Martinelli G, Remondini D, Castellani G. Optimized pipeline of MuTect and GATK tools to improve the detection of somatic single nucleotide polymorphisms in whole-exome sequencing data. BMC Bioinformatics 2016; 17:341. [PMID: 28185561 PMCID: PMC5123378 DOI: 10.1186/s12859-016-1190-7] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Detecting somatic mutations in whole exome sequencing data of cancer samples has become a popular approach for profiling cancer development, progression and chemotherapy resistance. Several studies have proposed software packages, filters and parametrizations. However, many research groups reported low concordance among different methods. We aimed to develop a pipeline which detects a wide range of single nucleotide mutations with high validation rates. We combined two standard tools - Genome Analysis Toolkit (GATK) and MuTect - to create the GATK-LODN method. As proof of principle, we applied our pipeline to exome sequencing data of hematological (Acute Myeloid and Acute Lymphoblastic Leukemias) and solid (Gastrointestinal Stromal Tumor and Lung Adenocarcinoma) tumors. We performed experiments on simulated data to test the sensitivity and specificity of our pipeline. RESULTS The software MuTect presented the highest validation rate (90 %) for mutation detection, but limited number of somatic mutations detected. The GATK detected a high number of mutations but with low specificity. The GATK-LODN increased the performance of the GATK variant detection (from 5 of 14 to 3 of 4 confirmed variants), while preserving mutations not detected by MuTect. However, GATK-LODN filtered more variants in the hematological samples than in the solid tumors. Experiments in simulated data demonstrated that GATK-LODN increased both specificity and sensitivity of GATK results. CONCLUSION We presented a pipeline that detects a wide range of somatic single nucleotide variants, with good validation rates, from exome sequencing data of cancer samples. We also showed the advantage of combining standard algorithms to create the GATK-LODN method, that increased specificity and sensitivity of GATK results. This pipeline can be helpful in discovery studies aimed to profile the somatic mutational landscape of cancer genomes.
Affiliation(s)
- Ítalo Faria do Valle
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy
- CAPES Foundation, Ministry of Education of Brazil, Brasília, DF, Brazil
| | - Enrico Giampieri
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy
| | - Giorgia Simonetti
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Antonella Padella
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Marco Manfrini
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Anna Ferrari
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Cristina Papayannidis
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Isabella Zironi
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy
| | - Marianna Garonzi
- Department of Biotechnology, University of Verona, Verona, Italy
| | - Simona Bernardi
- Unit of Blood Diseases and Stem Cell Transplantation, Department of Clinical and Experimental Sciences, University of Brescia, Brescia, Italy
| | - Massimo Delledonne
- Department of Biotechnology, University of Verona, Verona, Italy
- Personal Genomics, Verona, Italy
| | - Giovanni Martinelli
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Daniel Remondini
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy.
| | - Gastone Castellani
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy
|
44
|
Krøigård AB, Thomassen M, Lænkholm AV, Kruse TA, Larsen MJ. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data. PLoS One 2016; 11:e0151664. [PMID: 27002637 PMCID: PMC4803342 DOI: 10.1371/journal.pone.0151664] [Citation(s) in RCA: 115] [Impact Index Per Article: 12.8] [Received: 10/26/2015] [Accepted: 03/02/2016] [Indexed: 02/03/2023] Open
Abstract
Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths.
Affiliation(s)
- Anne Bruun Krøigård
- Department of Clinical Genetics, Odense University Hospital, Sdr. Boulevard 29, 5000, Odense, Denmark
- Human Genetics, Institute of Clinical Research, University of Southern Denmark, Winsløvparken 19, 5000, Odense, Denmark
| | - Mads Thomassen
- Department of Clinical Genetics, Odense University Hospital, Sdr. Boulevard 29, 5000, Odense, Denmark
- Human Genetics, Institute of Clinical Research, University of Southern Denmark, Winsløvparken 19, 5000, Odense, Denmark
| | - Anne-Vibeke Lænkholm
- Department of Pathology, Slagelse Hospital, Ingemannsvej 18, 4200, Slagelse, Denmark
| | - Torben A. Kruse
- Department of Clinical Genetics, Odense University Hospital, Sdr. Boulevard 29, 5000, Odense, Denmark
- Human Genetics, Institute of Clinical Research, University of Southern Denmark, Winsløvparken 19, 5000, Odense, Denmark
| | - Martin Jakob Larsen
- Department of Clinical Genetics, Odense University Hospital, Sdr. Boulevard 29, 5000, Odense, Denmark
- Human Genetics, Institute of Clinical Research, University of Southern Denmark, Winsløvparken 19, 5000, Odense, Denmark
|
45
|
Warr A, Robert C, Hume D, Archibald AL, Deeb N, Watson M. Identification of Low-Confidence Regions in the Pig Reference Genome (Sscrofa10.2). Front Genet 2015; 6:338. [PMID: 26640477 PMCID: PMC4662242 DOI: 10.3389/fgene.2015.00338] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 08/11/2015] [Accepted: 11/12/2015] [Indexed: 01/09/2023] Open
Abstract
Many applications of high throughput sequencing rely on the availability of an accurate reference genome. Variant calling often produces large data sets that cannot be realistically validated and which may contain large numbers of false-positives. Errors in the reference assembly increase the number of false-positives. While resources are available to aid in the filtering of variants from human data, for other species these do not yet exist and strict filtering techniques must be employed which are more likely to exclude true-positives. This work assesses the accuracy of the pig reference genome (Sscrofa10.2) using whole genome sequencing reads from the Duroc sow whose genome the assembly was based on. Indicators of structural variation including high regional coverage, unexpected insert sizes, improper pairing and homozygous variants were used to identify low quality (LQ) regions of the assembly. Low coverage (LC) regions were also identified and analyzed separately. The LQ regions covered 13.85% of the genome, the LC regions covered 26.6% of the genome and combined (LQLC) they covered 33.07% of the genome. Over half of dbSNP variants were located in the LQLC regions. Of copy number variable regions identified in a previous study, 86.3% were located in the LQLC regions. The regions were also enriched for gene predictions from RNA-seq data with 42.98% falling in the LQLC regions. Excluding variants in the LQ, LC, or LQLC from future analyses will help reduce the number of false-positive variant calls. Researchers using WGS data should be aware that the current pig reference genome does not give an accurate representation of the copy number of alleles in the original Duroc sow's genome.
Affiliation(s)
- Amanda Warr
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | - Christelle Robert
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | - David Hume
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | - Alan L. Archibald
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | | | - Mick Watson
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
|
46
|
Fang LT, Afshar PT, Chhibber A, Mohiyuddin M, Fan Y, Mu JC, Gibeling G, Barr S, Asadi NB, Gerstein MB, Koboldt DC, Wang W, Wong WH, Lam HYK. An ensemble approach to accurately detect somatic mutations using SomaticSeq. Genome Biol 2015; 16:197. [PMID: 26381235 PMCID: PMC4574535 DOI: 10.1186/s13059-015-0758-2] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Received: 06/29/2015] [Accepted: 08/20/2015] [Indexed: 12/03/2022] Open
Abstract
SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses. We validate our results with both synthetic and real data. We report that SomaticSeq is able to achieve better overall accuracy than any individual tool incorporated.
Affiliation(s)
- Li Tai Fang
- Bina Technologies, Roche Sequencing, Redwood City, 94065, CA, USA.
| | | | - Aparna Chhibber
- Bina Technologies, Roche Sequencing, Redwood City, 94065, CA, USA.
| | | | - Yu Fan
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, 77030, TX, USA.
| | - John C Mu
- Bina Technologies, Roche Sequencing, Redwood City, 94065, CA, USA.
| | - Greg Gibeling
- Bina Technologies, Roche Sequencing, Redwood City, 94065, CA, USA.
| | - Sharon Barr
- Bina Technologies, Roche Sequencing, Redwood City, 94065, CA, USA.
| | | | - Mark B Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, 06520, CT, USA.
| | - Daniel C Koboldt
- The Genome Institute, Washington University in St Louis, St Louis, 63108, MO, USA.
| | - Wenyi Wang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, 77030, TX, USA.
| | - Wing H Wong
- Department of Statistics, Stanford University, Stanford, 94305, CA, USA.
- Department of Health Research and Policy, Stanford University, Stanford, 94305, CA, USA.
| | - Hugo Y K Lam
- Bina Technologies, Roche Sequencing, Redwood City, 94065, CA, USA.
|
47
|
Bao R, Hernandez K, Huang L, Kang W, Bartom E, Onel K, Volchenboum S, Andrade J. ExScalibur: A High-Performance Cloud-Enabled Suite for Whole Exome Germline and Somatic Mutation Identification. PLoS One 2015; 10:e0135800. [PMID: 26271043 PMCID: PMC4535852 DOI: 10.1371/journal.pone.0135800] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/02/2015] [Accepted: 07/27/2015] [Indexed: 12/30/2022] Open
Abstract
Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly-available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modulated pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignment and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open-source and is also available as a public image on Amazon cloud.
Affiliation(s)
- Riyue Bao
- Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
| | - Kyle Hernandez
- Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
| | - Lei Huang
- Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
| | - Wenjun Kang
- Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
| | - Elizabeth Bartom
- Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
| | - Kenan Onel
- Department of Pediatrics, The University of Chicago, Chicago, Illinois, United States of America
| | - Samuel Volchenboum
- Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
- Department of Pediatrics, The University of Chicago, Chicago, Illinois, United States of America
- Computation Institute, The University of Chicago, Chicago, Illinois, United States of America
| | - Jorge Andrade
- Center for Research Informatics, The University of Chicago, Chicago, Illinois, United States of America
|
48
|
Samorodnitsky E, Jewell BM, Hagopian R, Miya J, Wing MR, Lyon E, Damodaran S, Bhatt D, Reeser JW, Datta J, Roychowdhury S. Evaluation of Hybridization Capture Versus Amplicon-Based Methods for Whole-Exome Sequencing. Hum Mutat 2015; 36:903-14. [PMID: 26110913 PMCID: PMC4832303 DOI: 10.1002/humu.22825] [Citation(s) in RCA: 175] [Impact Index Per Article: 17.5] [Received: 02/03/2015] [Accepted: 06/11/2015] [Indexed: 11/26/2022]
Abstract
Next‐generation sequencing has aided characterization of genomic variation. While whole‐genome sequencing may capture all possible mutations, whole‐exome sequencing remains cost‐effective and captures most phenotype‐altering mutations. Initial strategies for exome enrichment utilized a hybridization‐based capture approach. Recently, amplicon‐based methods were designed to simplify preparation and utilize smaller DNA inputs. We evaluated two hybridization capture‐based and two amplicon‐based whole‐exome sequencing approaches, utilizing both Illumina and Ion Torrent sequencers, comparing on‐target alignment, uniformity, and variant calling. While the amplicon methods had higher on‐target rates, the hybridization capture‐based approaches demonstrated better uniformity. All methods identified many of the same single‐nucleotide variants, but each amplicon‐based method missed variants detected by the other three methods and reported additional variants discordant with all three other technologies. Many of these potential false positives or negatives appear to result from limited coverage, low variant frequency, vicinity to read starts/ends, or the need for platform‐specific variant calling algorithms. All methods demonstrated effective copy‐number variant calling when evaluated against a single‐nucleotide polymorphism array. This study illustrates some differences between whole‐exome sequencing approaches, highlights the need for selecting appropriate variant calling based on capture method, and will aid laboratories in selecting their preferred approach.
Affiliation(s)
- Eric Samorodnitsky
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Benjamin M Jewell
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Raffi Hagopian
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Jharna Miya
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Michele R Wing
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Ezra Lyon
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Senthilkumar Damodaran
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210; Division of Medical Oncology, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, 43210
| | - Darshna Bhatt
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Julie W Reeser
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Jharna Datta
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210
| | - Sameek Roychowdhury
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, 43210; Division of Medical Oncology, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, 43210; Department of Pharmacology, The Ohio State University, Columbus, Ohio, 43210
|
49
|
Abstract
In this review, we describe key components of a computational infrastructure for a precision medicine program that is based on clinical-grade genomic sequencing. Specific aspects covered in this review include software components and hardware infrastructure, reporting, integration into Electronic Health Records for routine clinical use and regulatory aspects. We emphasize informatics components related to reproducibility and reliability in genomic testing, regulatory compliance, traceability and documentation of processes, integration into clinical workflows, privacy requirements, prioritization and interpretation of results to report based on clinical needs, rapidly evolving knowledge base of genomic alterations and clinical treatments and return of results in a timely and predictable fashion. We also seek to differentiate between the use of precision medicine in germline and cancer.
|
50
|
Betge J, Kerr G, Miersch T, Leible S, Erdmann G, Galata CL, Zhan T, Gaiser T, Post S, Ebert MP, Horisberger K, Boutros M. Amplicon sequencing of colorectal cancer: variant calling in frozen and formalin-fixed samples. PLoS One 2015; 10:e0127146. [PMID: 26010451 PMCID: PMC4444292 DOI: 10.1371/journal.pone.0127146] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 01/10/2015] [Accepted: 04/13/2015] [Indexed: 12/21/2022] Open
Abstract
Next generation sequencing (NGS) is an emerging technology becoming relevant for genotyping of clinical samples. Here, we assessed the stability of amplicon sequencing from formalin-fixed paraffin-embedded (FFPE) and paired frozen samples from colorectal cancer metastases with different analysis pipelines. 212 amplicon regions in 48 cancer related genes were sequenced with Illumina MiSeq using DNA isolated from resection specimens from 17 patients with colorectal cancer liver metastases. From ten of these patients, paired fresh frozen and routinely processed FFPE tissue was available for comparative study. Sample quality of FFPE tissues was determined by the amount of amplifiable DNA using qPCR, sequencing libraries were evaluated using Bioanalyzer. Three bioinformatic pipelines were compared for analysis of amplicon sequencing data. Selected hot spot mutations were reviewed using Sanger sequencing. In the sequenced samples from 16 patients, 29 non-synonymous coding mutations were identified in eleven genes. Most frequent were mutations in TP53 (10), APC (7), PIK3CA (3) and KRAS (2). A high concordance of FFPE and paired frozen tissue samples was observed in ten matched samples, revealing 21 identical mutation calls and only two mutations differing. Comparison of these results with two other commonly used variant calling tools, however, showed high discrepancies. Hence, amplicon sequencing can potentially be used to identify hot spot mutations in colorectal cancer metastases in frozen and FFPE tissue. However, remarkable differences exist among results of different variant calling tools, which are not only related to DNA sample quality. Our study highlights the need for standardization and benchmarking of variant calling pipelines, which will be required for translational and clinical applications.
Affiliation(s)
- Johannes Betge
- Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Medical Faculty Mannheim, Heidelberg University, Heidelberg, Germany
- Department of Medicine II, University Hospital Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Grainne Kerr
- Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Medical Faculty Mannheim, Heidelberg University, Heidelberg, Germany
| | - Thilo Miersch
- Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Medical Faculty Mannheim, Heidelberg University, Heidelberg, Germany
| | - Svenja Leible
- Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Medical Faculty Mannheim, Heidelberg University, Heidelberg, Germany
| | - Gerrit Erdmann
- Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Medical Faculty Mannheim, Heidelberg University, Heidelberg, Germany
| | - Christian L. Galata
- Department of Surgery, University Hospital Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Tianzuo Zhan
- Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Medical Faculty Mannheim, Heidelberg University, Heidelberg, Germany
- Department of Medicine II, University Hospital Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Timo Gaiser
- Institute of Pathology, University Hospital Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Stefan Post
- Department of Surgery, University Hospital Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Matthias P. Ebert
- Department of Medicine II, University Hospital Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Karoline Horisberger
- Department of Surgery, University Hospital Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Michael Boutros
- Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Medical Faculty Mannheim, Heidelberg University, Heidelberg, Germany
|