1. Xie B, Mo M, Cui H, Dong Y, Yin H, Lu Z. Integration of Nuclear, Clinical, and Genetic Features for Lung Cancer Subtype Classification and Survival Prediction Based on Machine- and Deep-Learning Models. Diagnostics (Basel) 2025; 15:872. [PMID: 40218222 PMCID: PMC11988547 DOI: 10.3390/diagnostics15070872] [Received: 02/20/2025] [Revised: 03/20/2025] [Accepted: 03/25/2025] [Indexed: 04/14/2025]
Abstract
Objectives: Lung cancer is one of the most prevalent cancers worldwide. Accurately determining lung cancer subtypes and identifying high-risk patients supports individualized treatment and follow-up. Our study aimed to establish an effective model for subtype classification and overall survival (OS) prediction in patients with lung cancer. Methods: Histopathological images, clinical data, and genetic information of lung adenocarcinoma and lung squamous cell carcinoma cases were downloaded from The Cancer Genome Atlas. An influencing factor system was optimized based on the nuclear, clinical, and genetic features. Four machine-learning models (light gradient boosting machine [LightGBM], extreme gradient boosting [XGBoost], random forest [RF], and adaptive boosting [AdaBoost]) and three deep-learning models (multilayer perceptron [MLP], TabNet, and convolutional neural network [CNN]) were employed for subtype classification and OS prediction. The performance of the models was comprehensively evaluated. Results: XGBoost exhibited the highest area under the curve (AUC) value of 0.9821 in subtype classification, whereas RF exhibited the highest AUC values of 0.9134, 0.8706, and 0.8765 in predicting OS at 1, 2, and 3 years, respectively. Conclusions: Our study was the first to incorporate the characteristics of nuclei and the genetic information of patients to predict the subtypes and OS of patients with lung cancer. Combining these factors with artificial intelligence methods yielded modestly higher AUC values than those reported in previous studies.
Affiliation(s)
- Bin Xie
  - School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China
- Mingda Mo
  - School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China
- Haidong Cui
  - Department of Breast Surgery, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 311121, China
- Yijie Dong
  - School of Software Technology, Zhejiang University, Ningbo 315048, China
- Hongping Yin
  - School of Basic Medical Sciences, Hangzhou Normal University, Hangzhou 311121, China
  - Zhejiang Key Laboratory of Medical Epigenetics, Hangzhou Normal University, Hangzhou 311121, China
- Zhe Lu
  - School of Basic Medical Sciences, Hangzhou Normal University, Hangzhou 311121, China
  - Zhejiang Key Laboratory of Medical Epigenetics, Hangzhou Normal University, Hangzhou 311121, China
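The abstract of entry 1 compares models by their AUC values. As a reminder of what that number measures, here is a minimal sketch of the pairwise (rank-based) definition of ROC AUC. It is not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    AUC equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = one subtype, 0 = the other subtype.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.4f}")
```

The quadratic loop keeps the definition visible; for real datasets a sorted, rank-based computation or an established library routine would be the more practical choice.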
2. Todd C, Jin L, McQuillan I. SV-JIM, detailed pairwise structural variant calling using long-reads and genome assemblies. Methods 2025; 234:305-313. [PMID: 39826659 DOI: 10.1016/j.ymeth.2024.12.015] [Received: 10/11/2024] [Revised: 12/21/2024] [Accepted: 12/30/2024] [Indexed: 01/22/2025]
Abstract
This paper proposes a detailed process for structural variant (SV) calling that permits a data-driven assessment of multiple SV callers using both genome assemblies and long reads. The process is implemented as a software pipeline named Structural Variant - Jaccard Index Measure, or SV-JIM, using the Snakemake [20] workflow management system. Like most state-of-the-art SV callers, SV-JIM detects variations between pairs of genomes, but it streamlines the numerous SV calling stages into a single process for user convenience and evaluates the multiple SV sets produced using the Jaccard index to identify those with the highest consistency among the included SV callers. SV-JIM then produces aggregated SV results based on how many callers supported the reported SVs. For validation, SV-JIM was assessed through three case studies on the Homo sapiens genome and two plant genomes, Brassica nigra and Arabidopsis thaliana. Executing SV-JIM revealed substantial inter-caller variance, with result counts differing by tens of thousands on the larger Brassica nigra and Homo sapiens genomes. Further, aggregating the SV sets made it easier to retain the less frequently occurring SV types by requiring a minimum level of support rather than a specific SV caller combination. Finally, these case studies identified a potential for inflated precision reporting that can occur during evaluation. SV-JIM is available publicly under the MIT license at https://github.com/USask-BINFO/SV-JIM.
Affiliation(s)
- Clarence Todd
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
- Lingling Jin
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
- Ian McQuillan
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
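The abstract of entry 2 describes two ideas: scoring the agreement between SV call sets with the Jaccard index, and aggregating calls by how many callers support them. The sketch below illustrates both under a strong simplifying assumption: two calls count as the same SV only when chromosome, type, start, and end match exactly. It is not SV-JIM's implementation; the caller names, the `SV` tuple layout, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple

# Assumed representation of one call: (chromosome, SV type, start, end).
SV = Tuple[str, str, int, int]


def jaccard(a: Set[SV], b: Set[SV]) -> float:
    """Jaccard index |A intersect B| / |A union B| of two call sets."""
    union = a | b
    if not union:
        return 0.0  # two empty sets: treated as 0.0 in this sketch
    return len(a & b) / len(union)


def pairwise_jaccard(calls: Dict[str, Set[SV]]) -> Dict[Tuple[str, str], float]:
    """Jaccard index for every unordered pair of callers."""
    return {
        (x, y): jaccard(calls[x], calls[y])
        for x, y in combinations(sorted(calls), 2)
    }


def aggregate_by_support(calls: Dict[str, Set[SV]], min_support: int) -> Set[SV]:
    """Keep SVs reported by at least `min_support` callers, whichever they are."""
    support = Counter(sv for sv_set in calls.values() for sv in sv_set)
    return {sv for sv, count in support.items() if count >= min_support}


# Hypothetical toy call sets from three unnamed callers.
toy_calls: Dict[str, Set[SV]] = {
    "caller_a": {("chr1", "DEL", 100, 200), ("chr1", "INS", 500, 500), ("chr2", "DEL", 50, 90)},
    "caller_b": {("chr1", "DEL", 100, 200), ("chr2", "DEL", 50, 90)},
    "caller_c": {("chr1", "DEL", 100, 200), ("chr3", "INV", 10, 400)},
}

scores = pairwise_jaccard(toy_calls)
for pair, value in scores.items():
    print(pair, round(value, 3))
print(sorted(aggregate_by_support(toy_calls, min_support=2)))
```

Filtering on a support count rather than on a fixed caller combination means a rare SV type survives as long as enough callers, whichever ones, report it. Real call sets would need tolerance-based breakpoint matching instead of exact equality.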
3. Zhai H, Dong C, Wang T, Luo J. HiSVision: A Method for Detecting Large-Scale Structural Variations Based on Hi-C Data and Detection Transformer. Interdiscip Sci 2024:10.1007/s12539-024-00677-0. [PMID: 39714580 DOI: 10.1007/s12539-024-00677-0] [Received: 09/15/2024] [Accepted: 11/17/2024] [Indexed: 12/24/2024]
Abstract
Structural variation (SV) is an important component of the diversity of the human genome. Many studies have shown that SV has a significant impact on human disease and is strongly associated with the development of cancer. In recent years, the Hi-C sequencing technique has been shown to be useful for detecting large-scale SVs, and several methods have been proposed for identifying SVs from Hi-C data. However, due to the complexity of the 3D genome structure, accurately identifying SVs from the Hi-C contact matrix remains a challenging task. Here, we present HiSVision, a method for identifying large-scale SVs from Hi-C data using a detection transformer framework. Inspired by object detection networks, we transform the Hi-C contact matrix into images, then identify candidate SV regions on the images with a detection transformer, and finally filter SVs based on features around the breakpoints. Experimental results show that HiSVision outperforms existing methods in terms of precision and F1 score on cancer cell lines and simulated datasets. The source code and data are available from https://github.com/dcy99/HiSVision.
Affiliation(s)
- Haixia Zhai
  - School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
- Chengyao Dong
  - School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
- Tao Wang
  - School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
- Junwei Luo
  - School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
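The first step named in the abstract of entry 3 is turning a Hi-C contact matrix into an image. The abstract does not say how HiSVision does this, so the following is only one plausible sketch: log-scale the non-negative contact counts to compress their dynamic range, then map them to 8-bit grayscale. The function name and the toy matrix are hypothetical.

```python
import math
from typing import List


def contact_matrix_to_image(matrix: List[List[float]]) -> List[List[int]]:
    """Map non-negative contact counts to 0..255 grayscale pixel values.

    Counts are log-transformed with log1p so that the few very large
    near-diagonal values do not wash out weaker long-range contacts,
    then scaled so the largest value becomes 255.
    """
    logged = [[math.log1p(value) for value in row] for row in matrix]
    peak = max((value for row in logged for value in row), default=0.0)
    if peak == 0.0:
        # Empty or all-zero matrix: return an all-black image.
        return [[0 for _ in row] for row in logged]
    return [[round(255 * value / peak) for value in row] for row in logged]


# Hypothetical 3-bin contact matrix: strong diagonal, weak off-diagonal contacts.
toy_matrix = [
    [100.0, 10.0, 0.0],
    [10.0, 100.0, 10.0],
    [0.0, 10.0, 100.0],
]

image = contact_matrix_to_image(toy_matrix)
for row in image:
    print(row)
```

An off-diagonal block that is unexpectedly bright in such an image is the kind of pattern an object detector could be trained to box as a candidate SV region.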
4. Zhang W, Xiao Y, Zhou Q, Zhu X, Zhang Y, Xiang Q, Wu S, Song X, Zhao J, Yuan R, Xiao B, Li L. KNSTRN Is a Prognostic Biomarker That Is Correlated with Immune Infiltration in Breast Cancer and Promotes Cell Cycle and Proliferation. Biochem Genet 2024; 62:3709-3739. [PMID: 38198023 PMCID: PMC11427568 DOI: 10.1007/s10528-023-10615-2] [Received: 09/25/2023] [Accepted: 11/29/2023] [Indexed: 01/11/2024]
Abstract
Kinetochore-localized astrin/SPAG5-binding protein (KNSTRN) promotes the progression of bladder cancer and lung adenocarcinoma. However, its expression and biological function in breast cancer remain largely unknown. Therefore, this study aimed to analyze KNSTRN expression, prognoses, correlation with immune infiltration, expression-associated genes, and regulated signaling pathways to characterize its role in regulating the cell cycle using both bioinformatics and in vitro functional experiments. Analyses of The Cancer Genome Atlas, Gene Expression Omnibus, TIMER, and The Human Protein Atlas databases revealed a significant upregulation of KNSTRN transcript and protein levels in breast cancer. Kaplan-Meier survival analyses demonstrated a significant association between high expression of KNSTRN and poor overall survival, relapse-free survival, post-progression survival, and distant metastases-free survival in patients with breast cancer. Furthermore, multivariate Cox regression analyses confirmed that KNSTRN is an independent prognostic factor for breast cancer. Immune infiltration analysis indicated a positive correlation between KNSTRN expression and T regulatory cell infiltration while showing a negative correlation with Tgd and natural killer cell infiltration. Gene set enrichment analysis along with single-cell transcriptome data analysis suggested that KNSTRN promoted cell cycle progression by regulating the expression of key cell cycle proteins. The overexpression and silencing of KNSTRN in vitro, respectively, promoted and inhibited the proliferation of breast cancer cells. The overexpression of KNSTRN enhanced the expression of key cell cycle regulators, including CDK4, CDK6, and cyclin D3, thereby accelerating the G1/S phase transition and leading to aberrant proliferation of breast cancer cells. 
In conclusion, our study demonstrates that KNSTRN functions as an oncogene in breast cancer by regulating immune response, promoting G1/S transition, and facilitating breast cancer cell proliferation. Moreover, KNSTRN has potential as a molecular biomarker for diagnostic and prognostic prediction in breast cancer.
Affiliation(s)
- Wenwu Zhang
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
  - Department of Laboratory Medicine, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, 215008, China
- Yuhan Xiao
  - School of Public Health, Dali University, Dali, 671000, China
- Quan Zhou
  - Department of Laboratory Medicine, General Hospital of Southern Theater Command of People's Liberation Army (PLA), Guangzhou, 510010, China
- Xin Zhu
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
- Yanxia Zhang
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
- Qin Xiang
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
- Shunhong Wu
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
- Xiaoyu Song
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
- Junxiu Zhao
  - School of Public Health, Dali University, Dali, 671000, China
- Ruanfei Yuan
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
- Bin Xiao
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
- Linhai Li
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, 511518, China
  - Department of Laboratory Medicine, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, 215008, China
5. Zheng Y, Shang X. FindCSV: a long-read based method for detecting complex structural variations. BMC Bioinformatics 2024; 25:315. [PMID: 39342151 PMCID: PMC11439270 DOI: 10.1186/s12859-024-05937-w] [Received: 03/24/2024] [Accepted: 09/18/2024] [Indexed: 10/01/2024]
Abstract
Background: Structural variations play a significant role in genetic diseases and evolutionary mechanisms. Extensive research has been conducted over the past decade to detect simple structural variations, leading to the development of well-established detection methods. However, recent studies have highlighted the potentially greater impact of complex structural variations on individuals compared to simple structural variations. Despite this, the field still lacks precise detection methods specifically designed for complex structural variations. Therefore, the development of a highly efficient and accurate detection method is of utmost importance. Results: In response to this need, we propose a novel method called FindCSV, which leverages deep learning techniques and consensus sequences to enhance the detection of structural variations using long-read sequencing data. Compared to current methods, FindCSV performs better in detecting complex and simple structural variations. Conclusions: FindCSV is a new method to detect complex and simple structural variations with reasonable accuracy in real and simulated data. The source code for the program is available at https://github.com/nwpuzhengyan/FindCSV.
Affiliation(s)
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
6. Junjun R, Zhengqian Z, Ying W, Jialiang W, Yongzhuang L. A comprehensive review of deep learning-based variant calling methods. Brief Funct Genomics 2024; 23:303-313. [PMID: 38366908 DOI: 10.1093/bfgp/elae003] [Received: 12/01/2023] [Revised: 01/14/2024] [Accepted: 01/18/2023] [Indexed: 02/18/2024]
Abstract
Genome sequencing data have become increasingly important in the field of personalized medicine and diagnosis. However, accurately detecting genomic variations remains a challenging task. Traditional variation detection methods rely on manual inspection or predefined rules, which can be time-consuming and prone to errors. Consequently, deep learning-based approaches for variation detection have gained attention due to their ability to automatically learn genomic features that distinguish between variants. In our review, we discuss the recent advancements in deep learning-based algorithms for detecting small variations and structural variations in genomic data, as well as their advantages and limitations.
Affiliation(s)
- Ren Junjun
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
- Zhang Zhengqian
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
- Wu Ying
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
- Wang Jialiang
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
- Liu Yongzhuang
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
7. Zhang Z, Liu Y, Li X, Liu Y, Wang Y, Jiang T. HapKled: a haplotype-aware structural variant calling approach for Oxford nanopore sequencing data. Front Genet 2024; 15:1435087. [PMID: 39045321 PMCID: PMC11263161 DOI: 10.3389/fgene.2024.1435087] [Received: 05/24/2024] [Accepted: 06/13/2024] [Indexed: 07/25/2024]
Abstract
Introduction: Structural Variants (SVs) are a type of variation that can significantly influence phenotypes and cause diseases. Thus, the accurate detection of SVs is a vital part of modern genetic analysis. The advent of long-read sequencing technology ushers in a new era of more accurate and comprehensive SV calling, and many tools have been developed to call SVs using long-read data. Haplotype-tagging is a procedure that tags reads with haplotype information and can thus potentially improve SV detection; nevertheless, few methods make use of this information. In this article, we introduce HapKled, a new SV detection tool that can accurately detect SVs from Oxford Nanopore Technologies (ONT) long-read alignment data. Methods: HapKled utilizes the haplotype information underlying alignment data by haplotype-tagging the reads with WhatsHap to improve detection performance, with three unique calling mechanics: altering clustering conditions according to the haplotype information of signatures, determining similar SVs based on haplotype information, and relaxing filtering conditions based on haplotype quality. Results: In our evaluations, HapKled outperformed state-of-the-art tools and delivered better SV detection results on both simulated and real sequencing data. The code and experiments of HapKled can be obtained from https://github.com/CoREse/HapKled. Discussion: With the superb SV detection performance that HapKled can deliver, HapKled could be useful in bioinformatics research, clinical diagnosis, and medical research and development.
Affiliation(s)
- Zhendong Zhang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yue Liu
  - Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Xin Li
  - Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yadong Liu
  - Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, China
- Yadong Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, China
- Tao Jiang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, China
8. Eldfors S, Saad J, Ikonen N, Malani D, Vähä-Koskela M, Gjertsen BT, Kontro M, Porkka K, Heckman CA. Monosomy 7/del(7q) cause sensitivity to inhibitors of nicotinamide phosphoribosyltransferase in acute myeloid leukemia. Blood Adv 2024; 8:1621-1633. [PMID: 38197948 PMCID: PMC10987804 DOI: 10.1182/bloodadvances.2023010435] [Received: 04/11/2023] [Revised: 12/11/2023] [Accepted: 12/30/2023] [Indexed: 01/11/2024]
Abstract
Monosomy 7 and del(7q) (-7/-7q) are frequent chromosomal abnormalities detected in up to 10% of patients with acute myeloid leukemia (AML). Despite unfavorable treatment outcomes, no approved targeted therapies exist for patients with -7/-7q. Therefore, we aimed to identify novel vulnerabilities. Through an analysis of data from ex vivo drug screens of 114 primary AML samples, we discovered that -7/-7q AML cells are highly sensitive to the inhibition of nicotinamide phosphoribosyltransferase (NAMPT). NAMPT is the rate-limiting enzyme in the nicotinamide adenine dinucleotide salvage pathway. Mechanistically, the NAMPT gene is located at 7q22.3, and deletion of 1 copy due to -7/-7q results in NAMPT haploinsufficiency, leading to reduced expression and a therapeutically targetable vulnerability to the inhibition of NAMPT. Our results show that in -7/-7q AML, differentiated CD34+CD38+ myeloblasts are more sensitive to the inhibition of NAMPT than less differentiated CD34+CD38- myeloblasts. Furthermore, the combination of the BCL2 inhibitor venetoclax and the NAMPT inhibitor KPT-9274 resulted in the death of significantly more leukemic blasts in AML samples with -7/-7q than the NAMPT inhibitor alone. In conclusion, our findings demonstrate that AML with -7/-7q is highly sensitive to NAMPT inhibition, suggesting that NAMPT inhibitors have the potential to be an effective targeted therapy for patients with monosomy 7 or del(7q).
Affiliation(s)
- Samuli Eldfors
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
  - Department of Internal Medicine, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - Krantz Family Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA
  - Department of Medicine, Harvard Medical School, Boston, MA
- Joseph Saad
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
- Nemo Ikonen
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
- Disha Malani
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
  - Department of Medicine, Harvard Medical School, Boston, MA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA
- Markus Vähä-Koskela
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
- Bjørn T. Gjertsen
  - Department of Medicine, Hematology Section, Haukeland University Hospital, Bergen, Norway
  - Department of Clinical Science, Center for Cancer Biomarkers, University of Bergen, Bergen, Norway
- Mika Kontro
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
  - Department of Hematology, Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
  - Foundation for the Finnish Cancer Institute, Helsinki, Finland
- Kimmo Porkka
  - Department of Internal Medicine, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
  - Department of Hematology, Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
- Caroline A. Heckman
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
9. Wang Y, Chen Y, Gao J, Xie H, Guo Y, Yang J, Liu J, Chen Z, Li Q, Li M, Ren J, Wen L, Tang F. Mapping crossover events of mouse meiotic recombination by restriction fragment ligation-based Refresh-seq. Cell Discov 2024; 10:26. [PMID: 38443370 PMCID: PMC10915157 DOI: 10.1038/s41421-023-00638-9] [Received: 06/02/2023] [Accepted: 12/11/2023] [Indexed: 03/07/2024]
Abstract
Single-cell whole-genome sequencing methods have undergone great improvements over the past decade. However, allele dropout, the inability to detect both alleles simultaneously in an individual diploid cell, largely restricts the use of these methods, particularly in medical applications. Here, we develop Refresh-seq (restriction fragment ligation-based genome amplification and TGS), a new single-cell whole-genome sequencing method based on a third-generation sequencing (TGS) platform. It relies on a restriction endonuclease cutting and ligation strategy in which the two alleles in an individual cell are cut into equal fragments and tend to be amplified simultaneously. As a new single-cell long-read genome sequencing method, Refresh-seq features a much lower allele dropout rate than SMOOTH-seq. Furthermore, we apply Refresh-seq to 688 sperm cells and 272 female haploid cells (secondary polar bodies and parthenogenetic oocytes) from F1 hybrid mice. We acquire a high-resolution genetic map of mouse meiotic recombination at low sequencing depth and reveal the sexual dimorphism in meiotic crossovers. We also phase the structural variations (deletions and insertions) in sperm cells and female haploid cells with high precision. Refresh-seq performs well in screening aneuploid sperm cells and oocytes owing to its low allele dropout rate and has great potential for medical applications such as preimplantation genetic diagnosis.
Affiliation(s)
- Yan Wang
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Yijun Chen
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Junpeng Gao
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
  - Emergency Center, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
- Haoling Xie
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Yuqing Guo
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Jingwei Yang
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Jun'e Liu
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Zonggui Chen
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
  - Changping Laboratory, Beijing, China
- Qingqing Li
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Mengyao Li
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Jie Ren
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Lu Wen
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Fuchou Tang
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - Changping Laboratory, Beijing, China
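The abstract of entry 9 defines allele dropout as the failure to detect both alleles in an individual diploid cell. One common way to quantify it, assumed here and not taken from the paper, is the fraction of covered known-heterozygous sites at which only one of the two expected alleles is observed. The site keys, allele sets, and function name below are hypothetical.

```python
from typing import Dict, Set


def allele_dropout_rate(
    expected_het: Dict[str, Set[str]],
    observed: Dict[str, Set[str]],
) -> float:
    """Fraction of covered heterozygous sites where an expected allele is missing.

    `expected_het` maps a site to its two expected alleles (for example,
    from bulk sequencing of the same individual); `observed` maps a site
    to the alleles detected in one single cell.
    """
    covered = 0
    dropped = 0
    for site, alleles in expected_het.items():
        seen = observed.get(site, set()) & alleles
        if not seen:
            continue  # site not covered at all: excluded from the rate in this sketch
        covered += 1
        if len(seen) < len(alleles):
            dropped += 1
    if covered == 0:
        raise ValueError("no heterozygous site was covered")
    return dropped / covered


# Hypothetical heterozygous sites and single-cell observations.
toy_expected = {
    "chr1:1000": {"A", "G"},
    "chr1:2000": {"C", "T"},
    "chr2:500": {"G", "T"},
    "chr2:900": {"A", "C"},
}
toy_observed = {
    "chr1:1000": {"A", "G"},  # both alleles detected
    "chr1:2000": {"C"},       # one allele lost
    "chr2:500": {"T", "G"},   # both alleles detected
    # chr2:900 was not covered in this cell
}

rate = allele_dropout_rate(toy_expected, toy_observed)
print(f"allele dropout rate = {rate:.3f}")
```

Whether uncovered sites are excluded, as here, or counted separately as locus dropout is a reporting choice and changes the resulting number.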
10. Zhang Z, Jiang T, Li G, Cao S, Liu Y, Liu B, Wang Y. Kled: an ultra-fast and sensitive structural variant detection tool for long-read sequencing data. Brief Bioinform 2024; 25:bbae049. [PMID: 38385878 PMCID: PMC10883419 DOI: 10.1093/bib/bbae049] [Received: 09/15/2023] [Revised: 01/12/2024] [Accepted: 01/26/2024] [Indexed: 02/23/2024]
Abstract
Structural Variants (SVs) are a crucial type of genetic variant that can significantly impact phenotypes. Therefore, the identification of SVs is an essential part of modern genomic analysis. In this article, we present kled, an ultra-fast and sensitive SV caller for long-read sequencing data, built on a specially designed approach with a novel signature-merging algorithm, custom refinement strategies, and a high-performance program structure. The evaluation results demonstrate that kled can achieve optimal SV calling results compared with several state-of-the-art methods on simulated and real long-read data across different platforms and sequencing depths. Furthermore, kled excels at rapid SV calling and can efficiently utilize multiple Central Processing Unit (CPU) cores while maintaining low memory usage. The source code for kled can be obtained from https://github.com/CoREse/kled.
Affiliation(s)
- Zhendong Zhang
  - Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  - Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Tao Jiang
  - Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, 450000, China
  - Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Gaoyang Li
  - Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  - Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Shuqi Cao
  - Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  - Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Yadong Liu
  - Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, 450000, China
  - Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Bo Liu
  - Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, 450000, China
  - Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Yadong Wang
  - Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, 450000, China
  - Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
11
Zheng Y, Shang X. SVvalidation: A long-read-based validation method for genomic structural variation. PLoS One 2024; 19:e0291741. [PMID: 38181020 PMCID: PMC10769053 DOI: 10.1371/journal.pone.0291741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 09/05/2023] [Indexed: 01/07/2024] Open
Abstract
Although various methods have been developed to detect structural variations (SVs) in genomic sequences, few are used to validate the detected SVs. Several commonly used SV callers produce many false-positive SVs, and existing validation methods are not accurate enough. A highly efficient and accurate validation method is therefore essential. In response, we propose SVvalidation, a new method that uses long-read sequencing data to validate SVs with higher accuracy and efficiency. Compared with existing methods, SVvalidation performs better in validating SVs in repeat regions and can determine whether an SV is homozygous or heterozygous. Additionally, SVvalidation offers the highest recall, precision, and F1-score (an improvement of 7-16%) across all datasets. Moreover, SVvalidation is suitable for different types of SVs. The program is available at https://github.com/nwpuzhengyan/SVvalidation.
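The recall, precision, and F1-score named in this abstract are the standard confusion-count metrics. A minimal Python sketch of how they relate follows; the function name and the counts are hypothetical and chosen only for illustration, not taken from the paper.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Zero denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 90 SV calls confirmed, 10 spurious, 30 true SVs missed.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it is pulled toward the lower of the two, which is why it is commonly used to compare SV callers and validators.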
Affiliation(s)
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, China
12
Klever MK, Sträng E, Hetzel S, Jungnitsch J, Dolnik A, Schöpflin R, Schrezenmeier JF, Schick F, Blau O, Westermann J, Rücker FG, Xia Z, Döhner K, Schrezenmeier H, Spielmann M, Meissner A, Melo US, Mundlos S, Bullinger L. AML with complex karyotype: extreme genomic complexity revealed by combined long-read sequencing and Hi-C technology. Blood Adv 2023; 7:6520-6531. [PMID: 37582288 PMCID: PMC10632680 DOI: 10.1182/bloodadvances.2023010887] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/05/2023] [Revised: 07/17/2023] [Accepted: 07/30/2023] [Indexed: 08/17/2023] Open
Abstract
Acute myeloid leukemia with complex karyotype (CK-AML) is associated with poor prognosis, which is only in part explained by underlying TP53 mutations. Especially in the presence of complex chromosomal rearrangements, such as chromothripsis, the outcome of CK-AML is dismal. However, how this degree of complexity of genomic rearrangements contributes to the leukemogenic phenotype and treatment resistance of CK-AML remains largely unknown. Applying an integrative workflow for the detection of structural variants (SVs) based on Oxford Nanopore (ONT) genomic DNA long-read sequencing (gDNA-LRS) and high-throughput chromosome conformation capture (Hi-C) in a well-defined cohort of CK-AML, we identified regions with an extreme density of SVs. These rearrangements consisted to a large degree of focal amplifications enriched in the proximity of mammalian-wide interspersed repeat elements, which often result in oncogenic fusion transcripts, such as USP7::MVD, or the deregulation of oncogenic driver genes, as confirmed by RNA-seq and ONT direct complementary DNA sequencing. We termed this novel phenomenon chromocataclysm. Thus, our integrative SV detection workflow combining gDNA-LRS and Hi-C makes it possible to unravel complex genomic rearrangements at very high resolution in regions that are hard to analyze with conventional sequencing technology, thereby providing an important tool for identifying novel drivers underlying cancer with complex karyotypic changes.
Affiliation(s)
- Marius-Konstantin Klever
  - Division of Hematology, Oncology, and Cancer Immunology, Medical Department, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
  - RG Development and Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical Genetics and Human Genetics, Charité University Medicine Berlin, Berlin, Germany
- Eric Sträng
  - Division of Hematology, Oncology, and Cancer Immunology, Medical Department, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Sara Hetzel
  - Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Julius Jungnitsch
  - Institute for Medical Genetics and Human Genetics, Charité University Medicine Berlin, Berlin, Germany
  - Human Molecular Genomics Group, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Anna Dolnik
  - Division of Hematology, Oncology, and Cancer Immunology, Medical Department, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Robert Schöpflin
  - RG Development and Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical Genetics and Human Genetics, Charité University Medicine Berlin, Berlin, Germany
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Jens-Florian Schrezenmeier
  - Division of Hematology, Oncology, and Cancer Immunology, Medical Department, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Felix Schick
  - Division of Hematology, Oncology, and Cancer Immunology, Medical Department, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Olga Blau
  - Division of Hematology, Oncology, and Cancer Immunology, Medical Department, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
  - Labor Berlin – Charité Vivantes GmbH, Berlin, Germany
- Jörg Westermann
  - Division of Hematology, Oncology, and Cancer Immunology, Medical Department, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
  - Labor Berlin – Charité Vivantes GmbH, Berlin, Germany
- Frank G. Rücker
  - Department of Internal Medicine III, University Hospital of Ulm, Ulm, Germany
- Zuyao Xia
  - Department of Internal Medicine III, University Hospital of Ulm, Ulm, Germany
- Konstanze Döhner
  - Department of Internal Medicine III, University Hospital of Ulm, Ulm, Germany
- Hubert Schrezenmeier
  - Institute of Transfusion Medicine, University of Ulm, Ulm, Germany
  - Institute for Clinical Transfusion Medicine and Immunogenetics, German Red Cross Blood Transfusion Service Baden-Württemberg-Hessen and University Hospital Ulm, Ulm, Germany
- Malte Spielmann
  - Human Molecular Genomics Group, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institut für Humangenetik Lübeck, Universität zu Lübeck, Lübeck, Germany
- Alexander Meissner
  - Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Uirá Souto Melo
  - RG Development and Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical Genetics and Human Genetics, Charité University Medicine Berlin, Berlin, Germany
- Stefan Mundlos
  - RG Development and Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical Genetics and Human Genetics, Charité University Medicine Berlin, Berlin, Germany
  - Labor Berlin – Charité Vivantes GmbH, Berlin, Germany
- Lars Bullinger
  - Division of Hematology, Oncology, and Cancer Immunology, Medical Department, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
  - Labor Berlin – Charité Vivantes GmbH, Berlin, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
13
Sohn JI, Choi MH, Yi D, Menon VA, Kim YJ, Lee J, Park JW, Kyung S, Shin SH, Na B, Joung JG, Ju YS, Yeom MS, Koh Y, Yoon SS, Baek D, Kim TM, Nam JW. Ultrafast prediction of somatic structural variations by filtering out reads matched to pan-genome k-mer sets. Nat Biomed Eng 2023; 7:853-866. [PMID: 36536253 DOI: 10.1038/s41551-022-00980-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/26/2021] [Accepted: 11/01/2022] [Indexed: 12/24/2022]
Abstract
Variant callers typically produce massive numbers of false positives for structural variations, such as cancer-relevant copy-number alterations and fusion genes resulting from genome rearrangements. Here we describe an ultrafast and accurate detector of somatic structural variations that reduces read-mapping costs by filtering out reads matched to pan-genome k-mer sets. The detector, which we named ETCHING (for efficient detection of chromosomal rearrangements and fusion genes), reduces the number of false positives by leveraging machine-learning classifiers trained with six breakend-related features (clipped-read count, split-reads count, supporting paired-end read count, average mapping quality, depth difference and total length of clipped bases). When benchmarked against six callers on reference cell-free DNA, validated biomarkers of structural variants, matched tumour and normal whole genomes, and tumour-only targeted sequencing datasets, ETCHING was 11-fold faster than the second-fastest structural-variant caller at comparable performance and memory use. The speed and accuracy of ETCHING may aid large-scale genome projects and facilitate practical implementations in precision medicine.
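The read-filtering idea in this abstract can be illustrated with a toy k-mer filter. The sketch below is an assumption-laden simplification, not ETCHING's implementation: the function names, the rule "drop a read when every one of its k-mers is in the reference set", the value of k and the sequences are all hypothetical.

```python
def kmers(seq: str, k: int) -> set[str]:
    """All overlapping k-mers of seq (empty set if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def filter_reads(reads: list[str], reference_kmers: set[str], k: int) -> list[str]:
    """Keep only reads with at least one k-mer absent from the reference set.

    Reads fully explained by the reference k-mers are discarded before any
    mapping; reads shorter than k yield no k-mers and are discarded as well.
    """
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if read_kmers and not read_kmers <= reference_kmers:
            kept.append(read)
    return kept


K = 4  # toy value; real tools use much longer k-mers
reference_kmers = kmers("ACGTACGTTGCA", K)   # stand-in for a pan-genome k-mer set
reads = ["ACGTACGT", "CGTTGCA", "ACGTGGCA"]  # the last read carries a novel junction
candidates = filter_reads(reads, reference_kmers, K)
print(candidates)
```

Only the surviving reads would need to be mapped, which is where the reduction in read-mapping cost described in the abstract would come from.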
Affiliation(s)
- Jang-Il Sohn
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
  - Research Institute for Convergence of Basic Sciences, Hanyang University, Seoul, Republic of Korea
- Min-Hak Choi
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
- Dohun Yi
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
- Vipin A Menon
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
- Yeon Jeong Kim
  - Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea
- Junehawk Lee
  - Center for Supercomputing Applications, Division of National Supercomputing, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Jung Woo Park
  - Center for Supercomputing Applications, Division of National Supercomputing, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Byunggook Na
  - Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
- Je-Gun Joung
  - Department of Biomedical Science, College of Life Science, CHA University, Seongnam, Republic of Korea
- Young Seok Ju
  - Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  - Biomedical Science and Engineering Interdisciplinary Program, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Min Sun Yeom
  - Center for Supercomputing Applications, Division of National Supercomputing, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Youngil Koh
  - College of Medicine, Seoul National University, Seoul, Republic of Korea
- Sung-Soo Yoon
  - College of Medicine, Seoul National University, Seoul, Republic of Korea
- Daehyun Baek
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Tae-Min Kim
  - Department of Medical Informatics and Cancer Research Institute, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Jin-Wu Nam
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea.
  - Research Institute for Convergence of Basic Sciences, Hanyang University, Seoul, Republic of Korea.
  - Bio-BigData Center, Hanyang Institute of Bioscience and Biotechnology, Hanyang University, Seoul, Republic of Korea.
14
Lu B, Curtius K, Graham TA, Yang Z, Barnes CP. CNETML: maximum likelihood inference of phylogeny from copy number profiles of multiple samples. Genome Biol 2023; 24:144. [PMID: 37340508 PMCID: PMC10283241 DOI: 10.1186/s13059-023-02983-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Accepted: 06/08/2023] [Indexed: 06/22/2023] Open
Abstract
Phylogenetic trees based on copy number profiles from multiple samples of a patient are helpful to understand cancer evolution. Here, we develop a new maximum likelihood method, CNETML, to infer phylogenies from such data. CNETML is the first program to jointly infer the tree topology, node ages, and mutation rates from total copy numbers of longitudinal samples. Our extensive simulations suggest CNETML performs well on copy numbers relative to ploidy and under slight violation of model assumptions. The application of CNETML to real data generates results consistent with previous discoveries and provides novel early copy number events for further investigation.
Affiliation(s)
- Bingxin Lu
  - Department of Cell and Developmental Biology, University College London, London, UK.
  - UCL Genetics Institute, University College London, London, UK.
- Kit Curtius
  - Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
  - Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Trevor A Graham
  - Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
  - Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
- Ziheng Yang
  - Department of Genetics, Evolution and Environment, University College London, London, UK
- Chris P Barnes
  - Department of Cell and Developmental Biology, University College London, London, UK.
  - UCL Genetics Institute, University College London, London, UK.
15
Han R, Han L, Zhao X, Wang Q, Xia Y, Li H. Haplotype-resolved Genome of Sika Deer Reveals Allele-specific Gene Expression and Chromosome Evolution. Genomics, Proteomics & Bioinformatics 2023; 21:470-482. [PMID: 36395998 PMCID: PMC10787017 DOI: 10.1016/j.gpb.2022.11.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/23/2022] [Revised: 10/24/2022] [Accepted: 11/07/2022] [Indexed: 11/16/2022]
Abstract
Despite the scientific and medicinal importance of diploid sika deer (Cervus nippon), its genome resources are limited and haplotype-resolved chromosome-scale assembly is urgently needed. To explore mechanisms underlying the expression patterns of the allele-specific genes in antlers and the chromosome evolution in Cervidae, we report, for the first time, a high-quality haplotype-resolved chromosome-scale genome of sika deer by integrating multiple sequencing strategies, which was anchored to 32 homologous groups with a pair of sex chromosomes (XY). Several expanded genes (RET, PPP2R1A, PPP2R1B, YWHAB, YWHAZ, and RPS6) and positively selected genes (eIF4E, Wnt8A, Wnt9B, BMP4, and TP53) were identified, which could contribute to rapid antler growth without carcinogenesis. A comprehensive and systematic genome-wide analysis of allele expression patterns revealed that most alleles were functionally equivalent in regulating rapid antler growth and inhibiting oncogenesis. Comparative genomic analysis revealed that chromosome fission might occur during the divergence of sika deer and red deer (Cervus elaphus), and the olfactory sensation of sika deer might be more powerful than that of red deer. Obvious inversion regions containing olfactory receptor genes were also identified, which arose since the divergence. In conclusion, the high-quality allele-aware reference genome provides valuable resources for further illustration of the unique biological characteristics of antler, chromosome evolution, and multi-omics research of cervid animals.
Affiliation(s)
- Ruobing Han
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, China
- Lei Han
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, China
- Xunwu Zhao
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, China
- Qianghui Wang
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, China
- Yanling Xia
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, China
- Heping Li
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, China.
16
Ding Y, Liao Y, He J, Ma J, Wei X, Liu X, Zhang G, Wang J. Enhancing genomic mutation data storage optimization based on the compression of asymmetry of sparsity. Front Genet 2023; 14:1213907. [PMID: 37323665 PMCID: PMC10267386 DOI: 10.3389/fgene.2023.1213907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 05/24/2023] [Indexed: 06/17/2023] Open
Abstract
Background: With the rapid development of high-throughput sequencing technology and the explosive growth of genomic data, storing, transmitting and processing massive amounts of data has become a new challenge. Achieving fast lossless compression and decompression that exploit the characteristics of the data, and thereby speed up data transmission and processing, requires research on suitable compression algorithms. Methods: In this paper, a compression algorithm for sparse asymmetric gene mutations (CA_SAGM), based on the characteristics of sparse genomic mutation data, was proposed. The data were first sorted on a row-first basis so that neighboring non-zero elements were as close as possible to each other. The data were then renumbered using the reverse Cuthill-McKee sorting technique. Finally, the data were compressed into compressed sparse row (CSR) format and stored. We analyzed and compared the results of the CA_SAGM, coordinate format (COO) and compressed sparse column format (CSC) algorithms for sparse asymmetric genomic data. Nine types of single-nucleotide variation (SNV) data and six types of copy number variation (CNV) data from the TCGA database were used as the subjects of this study. Compression and decompression time, compression and decompression rate, compression memory and compression ratio were used as evaluation metrics. The correlation between each metric and the basic characteristics of the original data was further investigated. Results: The experimental results showed that the COO method had the shortest compression time, the fastest compression rate and the largest compression ratio, and thus the best compression performance. CSC compression performance was the worst, and CA_SAGM compression performance was between the two. When decompressing the data, CA_SAGM performed the best, with the shortest decompression time and the fastest decompression rate. COO decompression performance was the worst. With increasing sparsity, the COO, CSC and CA_SAGM algorithms all exhibited longer compression and decompression times, lower compression and decompression rates, larger compression memory and lower compression ratios. When the sparsity was large, the compression memory and compression ratio of the three algorithms showed no differences, but the other metrics still differed. Conclusion: CA_SAGM was an efficient compression algorithm that combines compression and decompression performance for sparse genomic mutation data.
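To make the storage formats in this abstract concrete, the following minimal pure-Python sketch converts coordinate (COO) triples into the three compressed sparse row (CSR) arrays after a row-first sort. It does not implement CA_SAGM: the reverse Cuthill-McKee renumbering step is omitted, and the function names and the toy matrix are hypothetical.

```python
def coo_to_csr(n_rows: int, entries: list[tuple[int, int, int]]):
    """Convert COO triples (row, col, value) into CSR arrays.

    Returns (data, indices, indptr): non-zero values, their column indices,
    and for each row the offset where its entries start in data/indices.
    """
    entries = sorted(entries)            # row-first ordering: by row, then column
    data = [value for _, _, value in entries]
    indices = [col for _, col, _ in entries]
    indptr = [0] * (n_rows + 1)
    for row, _, _ in entries:            # count non-zeros per row
        indptr[row + 1] += 1
    for i in range(n_rows):              # prefix sum turns counts into offsets
        indptr[i + 1] += indptr[i]
    return data, indices, indptr


def csr_row(data, indices, indptr, row: int) -> dict[int, int]:
    """Recover one row as {column: value} without touching other rows."""
    start, end = indptr[row], indptr[row + 1]
    return dict(zip(indices[start:end], data[start:end]))


# Toy mutation matrix: 4 samples x 6 sites, 4 non-zero entries (hypothetical).
coo = [(2, 5, 1), (0, 1, 1), (0, 4, 2), (3, 0, 1)]
data, indices, indptr = coo_to_csr(4, coo)
print(data, indices, indptr)
print(csr_row(data, indices, indptr, 0))
```

COO stores three numbers per non-zero entry, whereas CSR replaces the per-entry row index with one offset per row, which allows a row to be read back with a single slice.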
Affiliation(s)
- Youde Ding
  - The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People’s Hospital, Qingyuan, China
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Yuan Liao
  - The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People’s Hospital, Qingyuan, China
- Ji He
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Jianfeng Ma
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Xu Wei
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Xuemei Liu
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Guiying Zhang
  - The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People’s Hospital, Qingyuan, China
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Jing Wang
  - The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People’s Hospital, Qingyuan, China
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
17
Zheng Y, Shang X. SVcnn: an accurate deep learning-based method for detecting structural variation based on long-read data. BMC Bioinformatics 2023; 24:213. [PMID: 37221476 DOI: 10.1186/s12859-023-05324-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/22/2023] [Accepted: 05/06/2023] [Indexed: 05/25/2023] Open
Abstract
BACKGROUND: Structural variations (SVs) refer to variations in an organism's chromosome structure that exceed a length of 50 base pairs. They play a significant role in genetic diseases and evolutionary mechanisms. While long-read sequencing technology has led to the development of numerous SV callers, their performance has been suboptimal. Researchers have observed that current SV callers often miss true SVs and generate many false SVs, especially in repetitive regions and areas with multi-allelic SVs. These errors are due to the messy alignments of long-read data, which are affected by their high error rate. Therefore, there is a need for a more accurate SV caller. RESULT: We propose SVcnn, a more accurate deep learning-based method for detecting SVs using long-read sequencing data. We ran SVcnn and other SV callers on three real datasets and found that SVcnn improves the F1-score by 2-8% compared with the second-best method when the read depth is greater than 5×. More importantly, SVcnn performs better at detecting multi-allelic SVs. CONCLUSIONS: SVcnn is an accurate deep learning-based method to detect SVs. The program is available at https://github.com/nwpuzhengyan/SVcnn.
Affiliation(s)
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
18
Gao R, Luo J, Ding H, Zhai H. INSnet: a method for detecting insertions based on deep learning network. BMC Bioinformatics 2023; 24:80. [PMID: 36879189 PMCID: PMC9990265 DOI: 10.1186/s12859-023-05216-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/27/2022] [Accepted: 03/01/2023] [Indexed: 03/08/2023] Open
Abstract
BACKGROUND: Many studies have shown that structural variations (SVs) strongly impact human disease. As a common type of SV, insertions are usually associated with genetic diseases. Therefore, accurately detecting insertions is of great significance. Although many methods for detecting insertions have been proposed, these methods often generate some errors and miss some variants. Hence, accurately detecting insertions remains a challenging task. RESULTS: In this paper, we propose a method named INSnet to detect insertions using a deep learning network. First, INSnet divides the reference genome into continuous sub-regions and takes five features for each locus through alignments between long reads and the reference genome. Next, INSnet uses a depthwise separable convolutional network. The convolution operation extracts informative features through spatial information and channel information. INSnet uses two attention mechanisms, the convolutional block attention module (CBAM) and efficient channel attention (ECA), to extract key alignment features in each sub-region. In order to capture the relationship between adjacent sub-regions, INSnet uses a gated recurrent unit (GRU) network to further extract more important SV signatures. After predicting whether a sub-region contains an insertion through the previous steps, INSnet determines the precise site and length of the insertion. The source code is available from GitHub at https://github.com/eioyuou/INSnet. CONCLUSION: Experimental results show that INSnet can achieve better performance than other methods in terms of F1 score on real datasets.
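The first step described in this abstract, dividing the reference into continuous sub-regions, can be sketched as a simple tiling. This is only an illustration: the function names, the window size and the sequence length are hypothetical, and the abstract does not state the sub-region size that INSnet uses.

```python
def split_into_subregions(length: int, window: int) -> list[tuple[int, int]]:
    """Tile [0, length) with half-open (start, end) windows of a fixed size.

    The last window is shorter when length is not a multiple of window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [(start, min(start + window, length)) for start in range(0, length, window)]


def subregion_index(position: int, window: int) -> int:
    """Index of the sub-region that contains a 0-based position."""
    return position // window


# Hypothetical 1050-bp contig split into 200-bp sub-regions.
regions = split_into_subregions(length=1050, window=200)
print(regions)
print(subregion_index(position=415, window=200))
```

Per-locus alignment features would then be grouped by sub-region index before being passed to a model, so that each sub-region forms one input example.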
Affiliation(s)
- Runtian Gao
  - School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
- Junwei Luo
  - School of Software, Henan Polytechnic University, Jiaozuo, 454003, China.
- Hongyu Ding
  - School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
- Haixia Zhai
  - School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
19
Zheng Y, Shang X, Sung WK. SVsearcher: A more accurate structural variation detection method in long read data. Comput Biol Med 2023; 158:106843. [PMID: 37019014 DOI: 10.1016/j.compbiomed.2023.106843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/24/2023] [Revised: 03/03/2023] [Accepted: 03/30/2023] [Indexed: 04/03/2023]
Abstract
Structural variations (SVs) are genomic rearrangements (such as deletions, insertions, and inversions) larger than 50 bp. They play important roles in genetic diseases and evolutionary mechanisms. Thanks to advances in long-read sequencing (i.e., PacBio long-read sequencing and Oxford Nanopore (ONT) long-read sequencing), SVs can be called accurately. However, for ONT long reads, we observe that existing long-read SV callers miss many true SVs and call many false SVs in repetitive regions and in regions with multi-allelic SVs. Those errors are caused by messy alignments of ONT reads due to their high error rate. Hence, we propose a novel method, SVsearcher, to solve these issues. We ran SVsearcher and other callers on three real datasets and found that SVsearcher improves the F1 score by approximately 10% for high-coverage (50×) datasets and by more than 25% for low-coverage (10×) datasets. More importantly, SVsearcher can identify 81.7%-91.8% of multi-allelic SVs, while existing methods identify only 13.2% (Sniffles) to 54.0% (nanoSV) of them. SVsearcher is available at https://github.com/kensung-lab/SVsearcher.
Affiliation(s)
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China.
- Wing-Kin Sung
  - Department of Chemical Pathology, The Chinese University of Hong Kong, Hong Kong, China; Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China; Laboratory of Computational Genomics, Li Ka Shing Institute of Health Science, The Chinese University of Hong Kong, Hong Kong, China.
20
Long-read sequencing identifies novel structural variations in colorectal cancer. PLoS Genet 2023; 19:e1010514. [PMID: 36812239 PMCID: PMC10013895 DOI: 10.1371/journal.pgen.1010514] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 12/12/2021] [Revised: 03/14/2023] [Accepted: 11/08/2022] [Indexed: 02/24/2023] Open
Abstract
Structural variations (SVs) are a key type of cancer genomic alteration, contributing to the oncogenesis and progression of many cancers, including colorectal cancer (CRC). However, SVs in CRC remain difficult to detect reliably because of the limited SV-detection capacity of commonly used short-read sequencing. This study investigated the somatic SVs in 21 pairs of CRC samples by Nanopore whole-genome long-read sequencing. In total, 5200 novel somatic SVs from 21 CRC patients (494 SVs / patient) were identified. A 4.9-Mbp inversion that silences APC expression (confirmed by RNA-seq) and an 11.2-kbp inversion that structurally alters CFTR were identified. Two novel gene fusions that might functionally impact the oncogene RNF38 and the tumor suppressor SMAD3 were detected. The RNF38 fusion has metastasis-promoting ability, as confirmed by in vitro migration and invasion assays and in vivo metastasis experiments. This work highlights the various applications of long-read sequencing in cancer genome analysis and sheds new light on how somatic SVs structurally alter critical genes in CRC. The investigation of somatic SVs via nanopore sequencing revealed the potential of this genomic approach to facilitate precise diagnosis and personalized treatment of CRC.
|
21
|
Cancer classification based on multiple dimensions: SNV patterns. Comput Biol Med 2022; 151:106270. [PMID: 36395594 DOI: 10.1016/j.compbiomed.2022.106270] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/10/2022] [Revised: 10/09/2022] [Accepted: 10/30/2022] [Indexed: 11/13/2022]
Abstract
BACKGROUND The occurrence of cancer is closely related to single nucleotide variants (SNVs). However, in DNA samples collected from patients with distinct cancers, SNVs are detected in different patterns. Selecting an appropriate method for classifying cancers by their SNV patterns is therefore an important task that will aid cancer diagnosis and treatment. In traditional studies, researchers combined each SNV with its neighboring nucleotides to form a trinucleotide. Mutation signatures for cancer classification were extracted from the patterns of the trinucleotides, but extracting SNV features in a single dimension may result in partial information loss and poor model performance. RESULTS In this study, we defined multidimensional SNV (M-SNV) features to classify cancer. M-SNV features considered first- and second-order neighboring nucleotides of one-dimensional SNVs and included six types of features. We validated the feasibility of M-SNV features using a dataset obtained from The Cancer Genome Atlas (TCGA) consisting of 2761 samples from 12 cancers. We performed preliminary screening of 562,321 DNA mutation sites in these samples. The remaining mutation sites were characterized by cancer type in six signatures. We found that the extracted features showed a similar distribution in the cluster center of the cancer type of the samples. After the preprocessing of raw data, samples were more focused on the cancer subtype distributions at the SNV level. We used k-nearest neighbors (KNN) to classify the extracted features and leave-one-out cross-validation to verify the results. Classification accuracy was stable at approximately 97% and reached 97.43% in the best case. Furthermore, we found that the validated oncogenes in the loci of the features had the highest importance among the 8 cancers. CONCLUSIONS It is feasible to classify cancers by the distribution of the features we defined. Moreover, our methodology has potential implications for the discovery of oncogenes.
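The evaluation scheme named in this abstract, a k-nearest-neighbors classifier scored by leave-one-out cross-validation, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the toy two-dimensional feature vectors, the labels, the Euclidean distance, and `k=3` are all hypothetical stand-ins for the M-SNV features and settings, which the abstract does not specify.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(features, labels, k=3):
    """Hold out each sample once, train on the rest, and report the hit rate."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, features[i], k) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical toy data: two well-separated groups standing in for cancer types.
features = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (1.0, 1.1), (1.1, 0.9), (0.9, 1.0)]
labels = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(features, labels, k=3)
```

Leave-one-out uses every sample as a test case exactly once, which is why it is a common choice when the number of samples per class is modest.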
|
22
|
Li X, Zhang X, Luo Y, Liu R, Sun Y, Zhao S, Yu M, Cao J. Large Fragment InDels Reshape Genome Structure of Porcine Alveolar Macrophage 3D4/21 Cells. Genes (Basel) 2022; 13:genes13091515. [PMID: 36140681 PMCID: PMC9498719 DOI: 10.3390/genes13091515] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/15/2022] [Revised: 08/17/2022] [Accepted: 08/20/2022] [Indexed: 11/25/2022] Open
Abstract
The porcine monomyeloid cell line, or 3D4/21 cells, is an effective tool to study the immune characteristics and virus infection mechanism of pigs. Due to the introduction of the neomycin resistance gene and the SV40 large T antigen gene, its genome has undergone essential changes, which are still unknown. Studying the variation in genome structure, especially the large fragments of insertions and deletions (InDels), is one of the proper ways to reveal these issues. In this study, an All-seq method was established by combining Mate-pair and Shotgun sequencing methods, and the detection and verification of large fragments of InDels were performed on 3D4/21 cells. The results showed that there were 844 InDels with a length of more than 1 kb, of which 12 regions were deletions of more than 100 kb in the 3D4/21 cell genome. In addition, compared with porcine primary alveolar macrophages, 82 genes including the CD163 had lost transcription in 3D4/21 cells, and 72 genes gained transcription as well. Further referring to the Hi-C structure, it was found that the fusion of the topologically associated domains (TADs) caused by the deletion may lead to abnormal gene function. The results of this study provide a basis for elaborating the genome structure and functional variation in 3D4/21 cells, provide a method for rapid and convenient detection of large-scale InDels, and provide useful clues for the study of the porcine immune function genome and the molecular mechanism of virus infection.
Affiliation(s)
- Xiaolong Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Xiaoqian Zhang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yandong Luo
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Ru Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yan Sun
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- The Cooperative Innovation Center for Sustainable Pig Production, Swine Breeding and Reproduction Innovation Platform, Huazhong Agricultural University, Wuhan 430070, China
- Mei Yu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- The Cooperative Innovation Center for Sustainable Pig Production, Swine Breeding and Reproduction Innovation Platform, Huazhong Agricultural University, Wuhan 430070, China
- Jianhua Cao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- The Cooperative Innovation Center for Sustainable Pig Production, Swine Breeding and Reproduction Innovation Platform, Huazhong Agricultural University, Wuhan 430070, China
- 3D Genomics Research Center, Huazhong Agricultural University, Wuhan 430070, China
- Correspondence:
|
23
|
Abujudeh S, Zeki SS, van Lanschot MCJ, Pusung M, Weaver JMJ, Li X, Noorani A, Metz AJ, Bornschein J, Bower L, Miremadi A, Fitzgerald RC, Morrissey ER, Lynch AG. Low-cost and clinically applicable copy number profiling using repeat DNA. BMC Genomics 2022; 23:599. [PMID: 35978291 PMCID: PMC9386984 DOI: 10.1186/s12864-022-08681-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2021] [Accepted: 06/10/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Somatic copy number alterations (SCNAs) are an important class of genomic alteration in cancer. They are frequently observed in cancer samples, with studies showing that, on average, SCNAs affect 34% of a cancer cell's genome. Furthermore, SCNAs have been shown to be major drivers of tumour development and have been associated with response to therapy and prognosis. Large-scale cancer genome studies suggest that tumours are driven by somatic copy number alterations (SCNAs) or single-nucleotide variants (SNVs). Despite the frequency of SCNAs and their clinical relevance, the use of genomics assays in the clinic is biased towards targeted gene panels, which identify SNVs but provide limited scope to detect SCNAs throughout the genome. There is a need for a comparably low-cost and simple method for high-resolution SCNA profiling. RESULTS We present conliga, a fully probabilistic method that infers SCNA profiles from a low-cost, simple, and clinically-relevant assay (FAST-SeqS). When applied to 11 high-purity oesophageal adenocarcinoma samples, we obtain good agreement (Spearman's rank correlation coefficient, rs=0.94) between conliga's inferred SCNA profiles using FAST-SeqS data (approximately £14 per sample) and those inferred by ASCAT using high-coverage WGS (gold-standard). We find that conliga outperforms CNVkit (rs=0.89), also applied to FAST-SeqS data, and is comparable to QDNAseq (rs=0.96) applied to low-coverage WGS, which is approximately four-fold more expensive, more laborious and less clinically-relevant. By performing an in silico dilution series experiment, we find that conliga is particularly suited to detecting SCNAs in low tumour purity samples. At two million reads per sample, conliga is able to detect SCNAs in all nine samples at 3% tumour purity and as low as 0.5% purity in one sample. 
Crucially, we show that conliga's hidden state information can be used to decide when a sample is abnormal or normal, whereas CNVkit and QDNAseq cannot provide this critical information. CONCLUSIONS We show that conliga provides high-resolution SCNA profiles using a convenient, low-cost assay. We believe conliga makes FAST-SeqS a more clinically valuable assay as well as a useful research tool, enabling inexpensive and fast copy number profiling of pre-malignant and cancer samples.
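The agreement figures quoted in this abstract are Spearman rank correlation coefficients between two inferred copy number profiles. As a rough illustration of how such a coefficient is computed, the sketch below ranks both profiles (averaging ranks over ties) and takes the Pearson correlation of the ranks. The two short profiles are made-up values, not data from the study, and the helper names are hypothetical.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-region copy number estimates from two methods.
profile_a = [2.0, 2.1, 3.0, 1.0, 4.2]
profile_b = [1.9, 2.3, 2.8, 1.2, 3.9]
rho = spearman(profile_a, profile_b)
```

Because only the ordering of regions matters, the coefficient rewards two methods that agree on which regions are gained or lost even when their absolute copy number scales differ.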
Affiliation(s)
- Sam Abujudeh
- Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
- Sebastian S Zeki
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK; Department of Gastroenterology, Guy's and St Thomas' NHS Trust, London, SE1 7EH, UK
- Mark Pusung
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK
- Jamie M J Weaver
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK; Department of Medical Oncology, The Christie NHS Foundation Trust, Manchester, M20 4TX, UK
- Xiaodun Li
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK
- Ayesha Noorani
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK
- Andrew J Metz
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK
- Jan Bornschein
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK
- Lawrence Bower
- Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
- Ahmad Miremadi
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK
- Rebecca C Fitzgerald
- Medical Research Council (MRC) Cancer Unit, University of Cambridge, Cambridge, UK
- Edward R Morrissey
- Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK; Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
- Andy G Lynch
- Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK; School of Mathematics and Statistics/School of Medicine, University of St Andrews, St Andrews, UK
|
24
|
Hamdan A, Ewing A. Unravelling the tumour genome: The evolutionary and clinical impacts of structural variants in tumourigenesis. J Pathol 2022; 257:479-493. [PMID: 35355264 PMCID: PMC9321913 DOI: 10.1002/path.5901] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/21/2022] [Revised: 03/16/2022] [Accepted: 03/28/2022] [Indexed: 11/15/2022]
Abstract
Structural variants (SVs) represent a major source of aberration in tumour genomes. Given the diversity in the size and type of SVs present in tumours, the accurate detection and interpretation of SVs in tumours is challenging. New classes of complex structural events in tumours are discovered frequently, and the definitions of the genomic consequences of complex events are constantly being refined. Detailed analyses of short-read whole-genome sequencing (WGS) data from large tumour cohorts facilitate the interrogation of SVs at orders of magnitude greater scale and depth. However, the inherent technical limitations of short-read WGS prevent us from accurately detecting and investigating the impact of all the SVs present in tumours. The expanded use of long-read WGS will be critical for improving the accuracy of SV detection, and in fully resolving complex SV events, both of which are crucial for determining the impact of SVs on tumour progression and clinical outcome. Despite the present limitations, we demonstrate that SVs play an important role in tumourigenesis. In particular, SVs contribute significantly to late-stage tumour development and to intratumoural heterogeneity. The evolutionary trajectories of SVs represent a window into the clonal dynamics in tumours, a comprehensive understanding of which will be vital for influencing patient outcomes in the future. Recent findings have highlighted many clinical applications of SVs in cancer, from early detection to biomarkers for treatment response and prognosis. As the methods to detect and interpret SVs improve, elucidating the full breadth of the complex SV landscape and determining how these events modulate tumour evolution will improve our understanding of cancer biology and our ability to capitalise on the utility of SVs in the clinical management of cancer patients. © 2022 The Authors. 
The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Affiliation(s)
- Alhafidz Hamdan
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Cancer Research UK Edinburgh Centre, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Ailith Ewing
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Cancer Research UK Edinburgh Centre, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
|
25
|
Espejo Valle-Inclan J, Besselink NJ, de Bruijn E, Cameron DL, Ebler J, Kutzera J, van Lieshout S, Marschall T, Nelen M, Priestley P, Renkens I, Roemer MG, van Roosmalen MJ, Wenger AM, Ylstra B, Fijneman RJ, Kloosterman WP, Cuppen E. A multi-platform reference for somatic structural variation detection. Cell Genomics 2022; 2:100139. [PMID: 36778136 PMCID: PMC9903816 DOI: 10.1016/j.xgen.2022.100139] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/10/2020] [Revised: 05/06/2021] [Accepted: 05/06/2022] [Indexed: 10/18/2022]
Abstract
Accurate detection of somatic structural variation (SV) in cancer genomes remains a challenging problem. This is in part due to the lack of high-quality, gold-standard datasets that enable the benchmarking of experimental approaches and bioinformatic analysis pipelines. Here, we performed somatic SV analysis of the paired melanoma and normal lymphoblastoid COLO829 cell lines using four different sequencing technologies. Based on the evidence from multiple technologies combined with extensive experimental validation, we compiled a comprehensive set of carefully curated and validated somatic SVs, comprising all SV types. We demonstrate the utility of this resource by determining the SV detection performance as a function of tumor purity and sequence depth, highlighting the importance of assessing these parameters in cancer genomics projects. The truth somatic SV dataset as well as the underlying raw multi-platform sequencing data are freely available and are an important resource for community somatic benchmarking efforts.
Affiliation(s)
- Nicolle J.M. Besselink
- Center for Molecular Medicine and Oncode Institute, UMC Utrecht, Utrecht, the Netherlands
- Daniel L. Cameron
- Hartwig Medical Foundation, Amsterdam, the Netherlands; Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- Jana Ebler
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Joachim Kutzera
- Center for Molecular Medicine and Oncode Institute, UMC Utrecht, Utrecht, the Netherlands
- Tobias Marschall
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Marcel Nelen
- Department of Human Genetics, Radboud UMC, Nijmegen, the Netherlands
- Ivo Renkens
- Center for Molecular Medicine and Oncode Institute, UMC Utrecht, Utrecht, the Netherlands
- Margaretha G.M. Roemer
- Department of Pathology, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, Amsterdam, the Netherlands
- Bauke Ylstra
- Department of Pathology, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, Amsterdam, the Netherlands
- Remond J.A. Fijneman
- Department of Pathology, Netherlands Cancer Institute, Amsterdam, the Netherlands
- Wigard P. Kloosterman
- Center for Molecular Medicine and Oncode Institute, UMC Utrecht, Utrecht, the Netherlands; corresponding author
- Edwin Cuppen
- Center for Molecular Medicine and Oncode Institute, UMC Utrecht, Utrecht, the Netherlands; Hartwig Medical Foundation, Amsterdam, the Netherlands; corresponding author
|
26
|
Du Y, Gu Z, Li Z, Yuan Z, Zhao Y, Zheng X, Bo X, Chen H, Wang C. Dynamic Interplay between Structural Variations and 3D Genome Organization in Pancreatic Cancer. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2022; 9:e2200818. [PMID: 35570408 PMCID: PMC9218654 DOI: 10.1002/advs.202200818] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/10/2022] [Revised: 04/04/2022] [Indexed: 06/05/2023]
Abstract
Structural variations (SVs) are the greatest source of variations in the genome and can lead to oncogenesis. However, the identification and interpretation of SVs in human cancer remain technologically challenging. Here, long-read sequencing is first employed to depict the signatures of structural variations in carcinogenesis of human pancreatic ductal epithelium. Then widespread reprogramming of the 3D chromatin architecture is revealed by an in situ Hi-C technique. Integrative analyses indicate that the distribution pattern of SVs among the 3D genome is highly cell-type specific and the bulk remodeling effects of SVs in the chromatin organization partly depend on intercellular genomic heterogeneity. Meanwhile, contact domains tend to minimize these disrupting effects of SVs within local adjacent genomic regions to maintain overall stability. Notably, complex genomic rearrangements involving two key driver genes CDKN2A and SMAD4 are identified, and their influence on the expression of oncogenes MIR31HG, MYO5B, etc., are further elucidated from both a linear view and 3D perspective. Overall, this work provides a genome-wide resource and highlights the impact, complexity, and dynamicity of the interplay between structural variations and high-order chromatin organization, which expands the current understanding of the pathogenesis of SVs in human cancer.
Affiliation(s)
- Yongxing Du
- Department of Pancreatic and Gastric Surgery, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P. R. China
- Zongting Gu
- Department of Pancreatic and Gastric Surgery, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P. R. China
- Zongze Li
- Department of Pancreatic and Gastric Surgery, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P. R. China
- Zan Yuan
- Annoroad Gene Technology Co. Ltd, Beijing 100176, P. R. China
- Yue Zhao
- Annoroad Gene Technology Co. Ltd, Beijing 100176, P. R. China
- Xiaohao Zheng
- Department of Pancreatic and Gastric Surgery, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P. R. China
- Xiaochen Bo
- Department of Biotechnology, Institute of Health Service and Transfusion Medicine, Beijing 100850, P. R. China
- Hebing Chen
- Department of Biotechnology, Institute of Health Service and Transfusion Medicine, Beijing 100850, P. R. China
- Chengfeng Wang
- Department of Pancreatic and Gastric Surgery, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P. R. China
|
27
|
Yang J, Chaisson MJP. TT-Mars: structural variants assessment based on haplotype-resolved assemblies. Genome Biol 2022; 23:110. [PMID: 35524317 PMCID: PMC9077962 DOI: 10.1186/s13059-022-02666-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/29/2021] [Accepted: 03/30/2022] [Indexed: 01/30/2023] Open
Abstract
Variant benchmarking is often performed by comparing a test callset to a gold standard set of variants. In repetitive regions of the genome, it may be difficult to establish what is the truth for a call, for example, when different alignment scoring metrics provide equally supported but different variant calls on the same data. Here, we provide an alternative approach, TT-Mars, that takes advantage of the recent production of high-quality haplotype-resolved genome assemblies by providing false discovery rates for variant calls based on how well their call reflects the content of the assembly, rather than comparing calls themselves.
Affiliation(s)
- Jianzhi Yang
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA.
|
28
|
Uppuluri L, Wang Y, Young E, Wong JS, Abid HZ, Xiao M. Multiplex structural variant detection by whole-genome mapping and nanopore sequencing. Sci Rep 2022; 12:6512. [PMID: 35444207 PMCID: PMC9021263 DOI: 10.1038/s41598-022-10483-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/25/2021] [Accepted: 04/08/2022] [Indexed: 11/26/2022] Open
Abstract
Identification of structural variant (SV) breakpoints is important in studying mutations, mutagenic causes, and functional impacts. Next-generation sequencing and whole-genome optical mapping are extensively used in SV discovery and characterization. However, multiple platforms and computational approaches are needed for comprehensive analysis, making the analysis resource-intensive and expensive. Here, we propose a strategy combining optical mapping and Cas9-assisted targeted nanopore sequencing to analyze SVs. Optical mapping can economically and quickly detect SVs across a whole genome but does not provide sequence-level information or precisely resolve breakpoints. Furthermore, since only a subset of all SVs is known to affect biology, we attempted to type a subset of all SVs using targeted nanopore sequencing. Using our approach, we resolved the breakpoints of five deletions, five insertions, and an inversion in a single experiment.
Affiliation(s)
- Lahari Uppuluri
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA; Department of Mechanical Engineering and Mechanics, Drexel University, Philadelphia, PA, USA
- Yilin Wang
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Eleanor Young
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Jessica S Wong
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Heba Z Abid
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Ming Xiao
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA; Center for Genomic Sciences, Institute of Molecular Medicine and Infectious Disease, Drexel University, Philadelphia, PA, USA
|
29
|
Lei Y, Meng Y, Guo X, Ning K, Bian Y, Li L, Hu Z, Anashkina AA, Jiang Q, Dong Y, Zhu X. Overview of structural variation calling: Simulation, identification, and visualization. Comput Biol Med 2022; 145:105534. [DOI: 10.1016/j.compbiomed.2022.105534] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/20/2022] [Revised: 04/09/2022] [Accepted: 04/14/2022] [Indexed: 12/11/2022]
|
30
|
Liu Z, Roberts R, Mercer TR, Xu J, Sedlazeck FJ, Tong W. Towards accurate and reliable resolution of structural variants for clinical diagnosis. Genome Biol 2022; 23:68. [PMID: 35241127 PMCID: PMC8892125 DOI: 10.1186/s13059-022-02636-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 08/27/2021] [Accepted: 02/15/2022] [Indexed: 12/17/2022] Open
Abstract
Structural variants (SVs) are a major source of human genetic diversity and have been associated with different diseases and phenotypes. The detection of SVs is difficult, and a diverse range of detection methods and data analysis protocols has been developed. This difficulty and diversity make the detection of SVs for clinical applications challenging and require a framework to ensure accuracy and reproducibility. Here, we discuss current developments in the diagnosis of SVs and propose a roadmap for the accurate and reproducible detection of SVs that includes case studies provided from the FDA-led SEquencing Quality Control Phase II (SEQC-II) and other consortium efforts.
Affiliation(s)
- Zhichao Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Ruth Roberts
- ApconiX, BioHub at Alderley Park, Alderley Edge, SK10 4TG, UK
- University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
- Timothy R Mercer
- Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, QLD, Australia
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
- Joshua Xu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Weida Tong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
|
31
|
Jiang T, Liu S, Cao S, Wang Y. Structural Variant Detection from Long-Read Sequencing Data with cuteSV. Methods Mol Biol 2022; 2493:137-151. [PMID: 35751813 DOI: 10.1007/978-1-0716-2293-3_9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/15/2023]
Abstract
Structural variation (SV) represents genomic rearrangements and is strongly associated with human health and disease. Long-read sequencing technologies now make it possible to identify SVs more comprehensively and at ever higher resolution. However, given the high sequencing error rates and the complexity of SVs, many technical issues remain to be settled. Hence, we propose cuteSV, a sensitive, fast, and scalable alignment-based SV detection approach for the comprehensive discovery of diverse SVs. The benchmarking results indicate that cuteSV is suitable for large-scale genome projects because of its excellent SV yields and ultra-fast speed. Here, we explain the overall framework and provide a detailed outline for users to apply cuteSV correctly and comprehensively. More details are available at https://github.com/tjiangHIT/cuteSV.
Affiliation(s)
- Tao Jiang
- Harbin Institute of Technology, Harbin, Heilongjiang, China
- Shiqi Liu
- Harbin Institute of Technology, Harbin, Heilongjiang, China
- Shuqi Cao
- Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yadong Wang
- Harbin Institute of Technology, Harbin, Heilongjiang, China
|
32
|
Khayat MM, Sahraeian SME, Zarate S, Carroll A, Hong H, Pan B, Shi L, Gibbs RA, Mohiyuddin M, Zheng Y, Sedlazeck FJ. Hidden biases in germline structural variant detection. Genome Biol 2021; 22:347. [PMID: 34930391 PMCID: PMC8686633 DOI: 10.1186/s13059-021-02558-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/14/2020] [Accepted: 11/24/2021] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND Genomic structural variations (SV) are important determinants of genotypic and phenotypic changes in many organisms. However, the detection of SV from next-generation sequencing data remains challenging. RESULTS In this study, DNA from a Chinese family quartet is sequenced at three different sequencing centers in triplicate. A total of 288 derivative data sets are generated utilizing different analysis pipelines and compared to identify sources of analytical variability. Mapping methods provide the major contribution to variability, followed by sequencing centers and replicates. Interestingly, SV supported by only one center or replicate often represent true positives with 47.02% and 45.44% overlapping the long-read SV call set, respectively. This is consistent with an overall higher false negative rate for SV calling in centers and replicates compared to mappers (15.72%). Finally, we observe that the SV calling variability also persists in a genotyping approach, indicating the impact of the underlying sequencing and preparation approaches. CONCLUSIONS This study provides the first detailed insights into the sources of variability in SV identification from next-generation sequencing and highlights remaining challenges in SV calling for large cohorts. We further give recommendations on how to reduce SV calling variability and the choice of alignment methodology.
Affiliation(s)
- Michael M Khayat
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Huixiao Hong
  - National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, USA
- Bohu Pan
  - National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, USA
- Leming Shi
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
  - Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Richard A Gibbs
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
  - Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
33
Beagan JJ, Drees EEE, Stathi P, Eijk PP, Meulenbroeks L, Kessler F, Middeldorp JM, Pegtel DM, Zijlstra JM, Sie D, Heideman DAM, Thunnissen E, Smit L, de Jong D, Mouliere F, Ylstra B, Roemer MGM, van Dijk E. PCR-Free Shallow Whole Genome Sequencing for Chromosomal Copy Number Detection from Plasma of Cancer Patients Is an Efficient Alternative to the Conventional PCR-Based Approach. J Mol Diagn 2021; 23:1553-1563. [PMID: 34454114 DOI: 10.1016/j.jmoldx.2021.08.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/28/2020] [Revised: 07/27/2021] [Accepted: 08/09/2021] [Indexed: 11/19/2022] Open
Abstract
Somatic copy number alterations can be detected in cell-free DNA (cfDNA) by shallow whole genome sequencing (sWGS). PCR is typically included in library preparations, but a PCR-free method could serve as a high-throughput alternative. To evaluate a PCR-free method for research and diagnostics, archival peripheral blood or bone marrow plasma samples, collected in EDTA- or lithium-heparin-containing tubes, were obtained from patients with non-small-cell lung cancer (n = 10 longitudinal samples; 4 patients), B-cell lymphoma (n = 31), and acute myeloid leukemia (n = 15), or from healthy donors (n = 14). sWGS was performed on PCR-free and PCR library preparations, and the mapping quality, percentage of unique reads, genome coverage, fragment lengths, and copy number profiles were compared. The percentage of unique reads was significantly higher for the PCR-free method than for the PCR method, independent of the type of collection tube: EDTA PCR-free method, 96.4% (n = 35); EDTA PCR method, 85.1% (n = 32); heparin PCR-free method, 94.5% (n = 25); and heparin PCR method, 89.4% (n = 10). All other evaluated metrics were highly comparable for PCR-free and PCR library preparations. These results demonstrate the feasibility of somatic copy number alteration detection by PCR-free sWGS using cfDNA from plasma collected in EDTA- or lithium-heparin-containing tubes and pave the way for an automated cfDNA analysis workflow for samples from cancer patients.
MESH Headings
- Biomarkers, Tumor/blood
- Biomarkers, Tumor/genetics
- Blood Specimen Collection/methods
- Carcinoma, Non-Small-Cell Lung/blood
- Carcinoma, Non-Small-Cell Lung/diagnosis
- Carcinoma, Non-Small-Cell Lung/genetics
- Case-Control Studies
- Circulating Tumor DNA/blood
- Circulating Tumor DNA/genetics
- DNA Copy Number Variations
- Feasibility Studies
- Humans
- Leukemia, Myeloid, Acute/blood
- Leukemia, Myeloid, Acute/diagnosis
- Leukemia, Myeloid, Acute/genetics
- Limit of Detection
- Liquid Biopsy
- Longitudinal Studies
- Lung Neoplasms/blood
- Lung Neoplasms/diagnosis
- Lung Neoplasms/genetics
- Lymphoma, B-Cell/blood
- Lymphoma, B-Cell/diagnosis
- Lymphoma, B-Cell/genetics
- Polymerase Chain Reaction/methods
- Whole Genome Sequencing/methods
Affiliation(s)
- Jamie J Beagan
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Esther E E Drees
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Phylicia Stathi
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Paul P Eijk
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Laura Meulenbroeks
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Floortje Kessler
  - Department of Hematology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Jaap M Middeldorp
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- D Michiel Pegtel
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Josée M Zijlstra
  - Department of Hematology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Daoud Sie
  - Department of Clinical Genetics, Core Facility Genomics, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Daniëlle A M Heideman
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Erik Thunnissen
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Linda Smit
  - Department of Hematology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Daphne de Jong
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Florent Mouliere
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Bauke Ylstra
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Margaretha G M Roemer
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
- Erik van Dijk
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam University Medical Center, Location Vrije Universiteit Medical Center Amsterdam, Amsterdam, the Netherlands
34
Uppuluri L, Jadhav T, Wang Y, Xiao M. Multicolor Whole-Genome Mapping in Nanochannels for Genetic Analysis. Anal Chem 2021; 93:9808-9816. [PMID: 34232611 PMCID: PMC9705121 DOI: 10.1021/acs.analchem.1c01373] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/29/2022]
Abstract
Analysis of structural variations (SVs) is important to understand mutations underlying genetic disorders and pathogenic conditions. However, characterizing SVs using short-read, high-throughput sequencing technology is difficult. Although long-read sequencing technologies are being increasingly employed in characterizing SVs, their low throughput and high costs discourage widespread adoption. Sequence motif-based optical mapping in nanochannels is useful in whole-genome mapping and SV detection, but it is not possible to precisely locate the breakpoints or estimate the copy numbers. We present here a universal multicolor mapping strategy in nanochannels combining a conventional sequence-motif labeling system with Cas9-mediated target-specific labeling of any 20-base sequences (20mers) to create custom labels and detect new features. The sequence motifs are labeled with green fluorophores and the 20mers are labeled with red fluorophores. Using this strategy, it is possible to not only detect the SVs but also utilize custom labels to interrogate the features not accessible to motif-labeling, locate breakpoints, and precisely estimate copy numbers of genomic repeats. We validated our approach by quantifying the D4Z4 copy numbers, a known biomarker for facioscapulohumeral muscular dystrophy (FSHD), and estimating the telomere length, a clinical biomarker for assessing disease risk factors in aging-related diseases and malignant cancers. We also demonstrate the application of our methodology in discovering transposable long interspersed nuclear element 1 (LINE-1) insertions across the whole genome.
Affiliation(s)
- Lahari Uppuluri
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States
- Tanaya Jadhav
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States
- Yilin Wang
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States
- Ming Xiao
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States
  - Center for Genomic Sciences, Institute of Molecular Medicine and Infectious Disease, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States
35
Eleveld TF, Bakali C, Eijk PP, Stathi P, Vriend LE, Poddighe PJ, Ylstra B. Engineering large-scale chromosomal deletions by CRISPR-Cas9. Nucleic Acids Res 2021; 49:12007-12016. [PMID: 34230973 PMCID: PMC8643637 DOI: 10.1093/nar/gkab557] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 03/29/2021] [Revised: 06/07/2021] [Accepted: 06/14/2021] [Indexed: 01/06/2023] Open
Abstract
Large-scale chromosomal deletions are a prevalent and defining feature of cancer. A high degree of tumor-type and subtype-specific recurrence suggests a selective oncogenic advantage. However, due to their large size it has been difficult to pinpoint the oncogenic drivers that confer this advantage. Suitable functional genomics approaches to study the oncogenic driving capacity of large-scale deletions are limited. Here, we present an effective technique to engineer large-scale deletions by CRISPR-Cas9 and create isogenic cell line models. We simultaneously induce double-strand breaks (DSBs) at two ends of a chromosomal arm and select the cells that have lost the intervening region. Using this technique, we induced large-scale deletions on chromosome 11q (65 Mb) and chromosome 6q (53 Mb) in neuroblastoma cell lines. A high frequency of successful deletions (up to 30% of selected clones) and increased colony forming capacity in the 11q deleted lines suggest an oncogenic advantage of these deletions. Such isogenic models enable further research on the role of large-scale deletions in tumor development and growth, and their possible therapeutic potential.
Affiliation(s)
- Thomas F Eleveld
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands
- Chaimaa Bakali
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands
- Paul P Eijk
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands
- Phylicia Stathi
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands
- Lianne E Vriend
  - Department of Clinical Genetics, Amsterdam UMC, Vrije Universiteit Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands
- Pino J Poddighe
  - Department of Clinical Genetics, Amsterdam UMC, Vrije Universiteit Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands
- Bauke Ylstra
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands
36
Jurcă MC, Ivaşcu ME, Jurcă AA, Kozma K, Magyar I, Şandor MI, Jurcă AD, Zaha DC, Albu CC, Pantiş C, Bembea M, Petcheşi CD. Genetics of congenital solid tumors. Romanian Journal of Morphology and Embryology 2021; 61:1039-1049. [PMID: 34171053 PMCID: PMC8343493 DOI: 10.47162/rjme.61.4.06] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
When we discuss the genetics of tumors, we cannot fail to remember that in 1914 Theodor Boveri defined for the first time the chromosomal basis of cancer. In the last 30 years, progress in genetics has only confirmed Boveri's remarkable predictions made more than 80 years ago. Before the cloning of the retinoblastoma 1 (RB1) gene, the existence of a genetic component in most, if not all, solid childhood tumors was well known. Familial tumor aggregations have been found much more frequently than would be expected by chance. Sometimes, demonstrating this familial predisposition was very difficult, because children diagnosed with a given tumor very rarely survived to reproductive age. In recent years, advances in the diagnosis and treatment of these diseases have made it possible for these children to survive to an age at which they can start their own families. Four distinct groups of so-called cancer genes have been identified: oncogenes, which promote tumor cell proliferation; tumor suppressor genes, which inhibit this growth/proliferation; anti-mutational genes, with a role in deoxyribonucleic acid (DNA) stability; and micro-ribonucleic acid (miRNA) genes, with a role in posttranscriptional processes.
Affiliation(s)
- Maria Claudia Jurcă
  - Department of Preclinical Disciplines, Faculty of Medicine and Pharmacy, University of Oradea, Romania
37
Allahyar A, Pieterse M, Swennenhuis J, Los-de Vries GT, Yilmaz M, Leguit R, Meijers RWJ, van der Geize R, Vermaat J, Cleven A, van Wezel T, Diepstra A, van Kempen LC, Hijmering NJ, Stathi P, Sharma M, Melquiond ASJ, de Vree PJP, Verstegen MJAM, Krijger PHL, Hajo K, Simonis M, Rakszewska A, van Min M, de Jong D, Ylstra B, Feitsma H, Splinter E, de Laat W. Robust detection of translocations in lymphoma FFPE samples using targeted locus capture-based sequencing. Nat Commun 2021; 12:3361. [PMID: 34099699 PMCID: PMC8184748 DOI: 10.1038/s41467-021-23695-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 12/04/2020] [Accepted: 05/10/2021] [Indexed: 12/03/2022] Open
Abstract
In routine diagnostic pathology, cancer biopsies are preserved by formalin-fixed, paraffin-embedding (FFPE) procedures for examination of (intra-) cellular morphology. Such procedures inadvertently induce DNA fragmentation, which compromises sequencing-based analyses of chromosomal rearrangements. Yet, rearrangements drive many types of hematolymphoid malignancies and solid tumors, and their manifestation is instructive for diagnosis, prognosis, and treatment. Here, we present FFPE-targeted locus capture (FFPE-TLC) for targeted sequencing of proximity-ligation products formed in FFPE tissue blocks, and PLIER, a computational framework that allows automated identification and characterization of rearrangements involving selected, clinically relevant, loci. FFPE-TLC, blindly applied to 149 lymphoma and control FFPE samples, identifies the known and previously uncharacterized rearrangement partners. It outperforms fluorescence in situ hybridization (FISH) in sensitivity and specificity, and shows clear advantages over standard capture-NGS methods, finding rearrangements involving repetitive sequences which they typically miss. FFPE-TLC is therefore a powerful clinical diagnostics tool for accurate targeted rearrangement detection in FFPE specimens.
Affiliation(s)
- Amin Allahyar
  - Oncode Institute & Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Mark Pieterse
  - Oncode Institute & Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- G Tjitske Los-de Vries
  - Amsterdam UMC-Vrije Universiteit Amsterdam, Department of Pathology and Cancer Center Amsterdam, Amsterdam, the Netherlands
- Roos Leguit
  - University Medical Centre Utrecht, Department of Pathology, Utrecht, the Netherlands
- Ruud W J Meijers
  - University Medical Centre Utrecht, Department of Pathology, Utrecht, the Netherlands
- Joost Vermaat
  - Leiden University Medical Centre, Department of Hematology, Leiden, the Netherlands
- Arjen Cleven
  - Leiden University Medical Center, Department of Pathology, Leiden, the Netherlands
- Tom van Wezel
  - Leiden University Medical Center, Department of Pathology, Leiden, the Netherlands
- Arjan Diepstra
  - University of Groningen, University Medical Centre Groningen, Department of Pathology & Medical Biology, Groningen, the Netherlands
- Léon C van Kempen
  - University of Groningen, University Medical Centre Groningen, Department of Pathology & Medical Biology, Groningen, the Netherlands
- Nathalie J Hijmering
  - Amsterdam UMC-Vrije Universiteit Amsterdam, Department of Pathology and Cancer Center Amsterdam, Amsterdam, the Netherlands
- Phylicia Stathi
  - Amsterdam UMC-Vrije Universiteit Amsterdam, Department of Pathology and Cancer Center Amsterdam, Amsterdam, the Netherlands
- Milan Sharma
  - Oncode Institute & Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Adrien S J Melquiond
  - Oncode Institute & Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Paula J P de Vree
  - Oncode Institute & Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Marjon J A M Verstegen
  - Oncode Institute & Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Peter H L Krijger
  - Oncode Institute & Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Daphne de Jong
  - Amsterdam UMC-Vrije Universiteit Amsterdam, Department of Pathology and Cancer Center Amsterdam, Amsterdam, the Netherlands
- Bauke Ylstra
  - Amsterdam UMC-Vrije Universiteit Amsterdam, Department of Pathology and Cancer Center Amsterdam, Amsterdam, the Netherlands
- Wouter de Laat
  - Oncode Institute & Hubrecht Institute-KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
38
Valle-Inclan JE, Stangl C, de Jong AC, van Dessel LF, van Roosmalen MJ, Helmijr JCA, Renkens I, Janssen R, de Blank S, de Witte CJ, Martens JWM, Jansen MPHM, Lolkema MP, Kloosterman WP. Optimizing Nanopore sequencing-based detection of structural variants enables individualized circulating tumor DNA-based disease monitoring in cancer patients. Genome Med 2021; 13:86. [PMID: 34006333 PMCID: PMC8130429 DOI: 10.1186/s13073-021-00899-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/27/2020] [Accepted: 04/27/2021] [Indexed: 12/18/2022] Open
Abstract
Here, we describe a novel approach for rapid discovery of a set of tumor-specific genomic structural variants (SVs), based on a combination of low coverage cancer genome sequencing using Oxford Nanopore with an SV calling and filtering pipeline. We applied the method to tumor samples of high-grade ovarian and prostate cancer patients and validated on average ten somatic SVs per patient with breakpoint-spanning PCR mini-amplicons. These SVs could be quantified in ctDNA samples of patients with metastatic prostate cancer using a digital PCR assay. The results suggest that SV dynamics correlate with and may improve existing treatment-response biomarkers such as PSA. https://github.com/UMCUGenetics/SHARC
Affiliation(s)
- Jose Espejo Valle-Inclan
  - Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Christina Stangl
  - Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
  - Division of Molecular Oncology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Anouk C de Jong
  - Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Lisanne F van Dessel
  - Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Markus J van Roosmalen
  - Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Jean C A Helmijr
  - Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Ivo Renkens
  - Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Roel Janssen
  - Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Sam de Blank
  - Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Chris J de Witte
  - Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- John W M Martens
  - Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Maurice P H M Jansen
  - Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Martijn P Lolkema
  - Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Wigard P Kloosterman
  - Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Cyclomics, Utrecht, The Netherlands
  - Frame Cancer Therapeutics, Amsterdam, The Netherlands
39
Gao B, Baudis M. Signatures of Discriminative Copy Number Aberrations in 31 Cancer Subtypes. Front Genet 2021; 12:654887. [PMID: 34054918 PMCID: PMC8155688 DOI: 10.3389/fgene.2021.654887] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/17/2021] [Accepted: 04/15/2021] [Indexed: 12/13/2022] Open
Abstract
Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles poses great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have focused on the association of CNA with pre-selected "driver" genes, with limited application to rare drivers and other genomic elements. In this study, we developed a bioinformatics pipeline to integrate a collection of 44,988 high-quality CNA profiles of high diversity. Using a hybrid model of neural networks and an attention algorithm, we generated the CNA signatures of 31 cancer subtypes, depicting the uniqueness of their respective CNA landscapes. Finally, we constructed a multi-label classifier to identify the cancer type and the organ of origin from copy number profiling data. The investigation of the signatures suggested common patterns, not only of physiologically related cancer types but also of clinico-pathologically distant cancer types such as different cancers originating from the neural crest. Further experiments with classification models confirmed the effectiveness of the signatures in distinguishing different cancer types and demonstrated their potential in tumor classification.
Affiliation(s)
- Bo Gao
  - Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Zurich, Switzerland
- Michael Baudis
  - Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Zurich, Switzerland
40
Gu W, Zhou A, Wang L, Sun S, Cui X, Zhu D. SVLR: Genome Structural Variant Detection Using Long-Read Sequencing Data. J Comput Biol 2021; 28:774-788. [PMID: 33973820 DOI: 10.1089/cmb.2021.0048] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/11/2023] Open
Abstract
Genome structural variants (SVs) have great impacts on human phenotype and diversity, and have been linked to numerous diseases. Long-read sequencing technologies make it possible to find SVs as long as 10,000 nucleotides. Long read-based SV detection has therefore attracted the attention of many recent research projects, and many tools have recently been developed to detect SVs from long reads. In this article, we present a new method, called SVLR, to detect SVs based on long-read sequencing data. Compared with existing methods, SVLR can detect three new kinds of SVs: block replacements, block interchanges, and translocations. Although these new SVs are structurally more complicated, SVLR achieves accuracies that are comparable with those of the classic SVs. Moreover, for the classic SVs that can be detected by state-of-the-art methods (e.g., SVIM and Sniffles), our experiments demonstrate recall improvements of up to 38% without harming precision (i.e., >78%). We also point out three directions to further improve SV detection in the future. Source code: https://github.com/GWYSDU/SVLR.
Affiliation(s)
- Wenyan Gu
  - School of Computer Science and Technology, Shandong University, Qingdao, China
- Aizhong Zhou
  - School of Computer Science and Technology, Shandong University, Qingdao, China
- Lusheng Wang
  - Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Shiwei Sun
  - Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Xuefeng Cui
  - School of Computer Science and Technology, Shandong University, Qingdao, China
- Daming Zhu
  - School of Computer Science and Technology, Shandong University, Qingdao, China
41
Voulaz E, Novellis P, Rossetti F, Solinas M, Rossi S, Alloisio M, Pelosi G, Veronesi G. Distinguishing multiple lung primaries from intra-pulmonary metastases and treatment implications. Expert Rev Anticancer Ther 2020; 20:985-995. [PMID: 32915097 DOI: 10.1080/14737140.2020.1823223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
Introduction: The distinction between multiple primary lung cancers and intra-pulmonary metastases has been extensively investigated because of its important clinical and therapeutic implications. Areas covered: Rapidly improving imaging technology and genomic analysis have led to a finer discrimination between multiple primary lung tumors and pulmonary metastases. However, over the past few decades, standardized criteria for the identification of multiple lung tumors have been lacking. Therefore, in 2017 a multidisciplinary international committee composed of the Union for International Cancer Control (UICC), American Joint Committee on Cancer (AJCC) and International Association for the Study of Lung Cancer (IASLC) addressed this problem when drawing up the 8th edition of the TNM stage classification, which now represents a specific consensus on this topic. The most advanced diagnostic strategies associated with screening allow for the detection of early stage synchronous lung cancers. Expert opinion: Although diagnostic confirmation relies on pathologic and clinical examination, new molecular analyses help in the discrimination between primary and secondary tumors. The treatment of multiple primary lung tumors remains, whenever possible, a local treatment based on surgical resection, provided that distant or local (lymph node) metastases are absent.
Collapse
Affiliation(s)
- Emanuele Voulaz
- Division of Thoracic Surgery, Humanitas Clinical and Research Center - IRCCS, Milan, Italy; Department of Biomedical Sciences, Humanitas University, Milan, Italy
| | - Pierluigi Novellis
- Division of Thoracic Surgery, San Raffaele Scientific Institute - IRCCS, Milan, Italy
| | - Francesca Rossetti
- Division of Thoracic Surgery, San Raffaele Scientific Institute - IRCCS, Milan, Italy
| | - Michela Solinas
- Division of General and Thoracic Surgery of New Hospital of Legnano, Milan, Italy
| | - Sabrina Rossi
- Department of Oncology and Hematology, Humanitas Clinical and Research Center - IRCCS, Milan, Italy
| | - Marco Alloisio
- Division of Thoracic Surgery, Humanitas Clinical and Research Center - IRCCS, Milan, Italy; Department of Biomedical Sciences, Humanitas University, Milan, Italy
| | - Giuseppe Pelosi
- Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy; Inter-Hospital Pathology Division, IRCCS MultiMedica, Milan, Italy
| | - Giulia Veronesi
- Division of Thoracic Surgery, San Raffaele Scientific Institute - IRCCS, Milan, Italy; School of Medicine, Vita-Salute San Raffaele University, Milan, Italy
| |
|
42
|
Lopez G, Conkrite KL, Doepner M, Rathi KS, Modi A, Vaksman Z, Farra LM, Hyson E, Noureddine M, Wei JS, Smith MA, Asgharzadeh S, Seeger RC, Khan J, Auvil JG, Gerhard DS, Maris JM, Diskin SJ. Somatic structural variation targets neurodevelopmental genes and identifies SHANK2 as a tumor suppressor in neuroblastoma. Genome Res 2020; 30:1228-1242. [PMID: 32796005 PMCID: PMC7545140 DOI: 10.1101/gr.252106.119] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/25/2019] [Accepted: 08/07/2020] [Indexed: 12/18/2022]
Abstract
Neuroblastoma is a malignancy of the developing sympathetic nervous system that accounts for 12% of childhood cancer deaths. Like many childhood cancers, neuroblastoma shows a relative paucity of somatic single-nucleotide variants (SNVs) and small insertions and deletions (indels) compared to adult cancers. Here, we assessed the contribution of somatic structural variation (SV) in neuroblastoma using a combination of whole-genome sequencing (WGS) of tumor-normal pairs (n = 135) and single-nucleotide polymorphism (SNP) genotyping of primary tumors (n = 914). Our study design allowed for orthogonal validation and replication across platforms. SV frequency, type, and localization varied significantly among high-risk tumors. MYCN nonamplified high-risk tumors harbored an increased SV burden overall, including a significant excess of tandem duplication events across the genome. Genes disrupted by SV breakpoints were enriched in neuronal lineages and associated with phenotypes such as autism spectrum disorder (ASD). The postsynaptic adapter protein-coding gene, SHANK2, located on Chromosome 11q13, was disrupted by SVs in 14% of MYCN nonamplified high-risk tumors based on WGS and 10% in the SNP array cohort. Expression of SHANK2 was low across human-derived neuroblastoma cell lines and high-risk neuroblastoma tumors. Forced expression of SHANK2 in neuroblastoma cells resulted in significant growth inhibition (P = 2.6 × 10⁻² to 3.4 × 10⁻⁵) and accelerated neuronal differentiation following treatment with all-trans retinoic acid (P = 3.1 × 10⁻¹³ to 2.4 × 10⁻³⁰). These data further define the complex landscape of somatic structural variation in neuroblastoma and suggest that events leading to deregulation of neurodevelopmental processes, such as inactivation of SHANK2, are key mediators of tumorigenesis in this childhood cancer.
Affiliation(s)
- Gonzalo Lopez
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Karina L Conkrite
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
| | - Miriam Doepner
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
| | - Komal S Rathi
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
| | - Apexa Modi
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Genomics and Computational Biology, Biomedical Graduate Studies, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Zalman Vaksman
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
| | - Lance M Farra
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
| | - Eric Hyson
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
| | - Moataz Noureddine
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
| | - Jun S Wei
- Oncogenomics Section, Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA
| | - Malcolm A Smith
- Cancer Therapy Evaluation Program, National Cancer Institute, Bethesda, Maryland 20892, USA
| | - Shahab Asgharzadeh
- Division of Hematology, Oncology and Blood and Marrow Transplantation, Keck School of Medicine of the University of Southern California, Los Angeles, California 90033, USA
- The Saban Research Institute, Children's Hospital of Los Angeles, Los Angeles, California 90027, USA
| | - Robert C Seeger
- Division of Hematology, Oncology and Blood and Marrow Transplantation, Keck School of Medicine of the University of Southern California, Los Angeles, California 90033, USA
- The Saban Research Institute, Children's Hospital of Los Angeles, Los Angeles, California 90027, USA
| | - Javed Khan
- Oncogenomics Section, Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA
| | - Jaime Guidry Auvil
- Office of Cancer Genomics, National Cancer Institute, Bethesda, Maryland 20892, USA
| | - Daniela S Gerhard
- Office of Cancer Genomics, National Cancer Institute, Bethesda, Maryland 20892, USA
| | - John M Maris
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Abramson Family Cancer Research Institute, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Sharon J Diskin
- Division of Oncology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Abramson Family Cancer Research Institute, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| |
|
43
|
Jiang T, Liu Y, Jiang Y, Li J, Gao Y, Cui Z, Liu Y, Liu B, Wang Y. Long-read-based human genomic structural variation detection with cuteSV. Genome Biol 2020; 21:189. [PMID: 32746918 PMCID: PMC7477834 DOI: 10.1186/s13059-020-02107-y] [Citation(s) in RCA: 208] [Impact Index Per Article: 41.6] [Received: 12/11/2019] [Accepted: 07/14/2020] [Indexed: 01/01/2023] Open
Abstract
Long-read sequencing is promising for the comprehensive discovery of structural variations (SVs). However, it is still non-trivial to achieve high yields and performance simultaneously due to the complex SV signatures implied by noisy long reads. We propose cuteSV, a sensitive, fast, and scalable long-read-based SV detection approach. cuteSV uses tailored methods to collect the signatures of various types of SVs and employs a clustering-and-refinement method to implement sensitive SV detection. Benchmarks on simulated and real long-read sequencing datasets demonstrate that cuteSV has higher yields and scaling performance than state-of-the-art tools. cuteSV is available at https://github.com/tjiangHIT/cuteSV.
Affiliation(s)
- Tao Jiang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China
| | - Yongzhuang Liu
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China
| | - Yue Jiang
- Nebula Genomics, Harbin, 150030, Heilongjiang, China
| | - Junyi Li
- School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, Guangdong, China
| | - Yan Gao
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China
| | - Zhe Cui
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China
| | - Yadong Liu
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China
| | - Bo Liu
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China.
| | - Yadong Wang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China.
| |
|
44
|
Beagan JJ, Sluiter NR, Bach S, Eijk PP, Vlek SL, Heideman DAM, Kusters M, Pegtel DM, Kazemier G, van Grieken NCT, Ylstra B, Tuynman JB. Circulating Tumor DNA as a Preoperative Marker of Recurrence in Patients with Peritoneal Metastases of Colorectal Cancer: A Clinical Feasibility Study. J Clin Med 2020; 9:jcm9061738. [PMID: 32512811 PMCID: PMC7357031 DOI: 10.3390/jcm9061738] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/01/2020] [Revised: 05/29/2020] [Accepted: 06/01/2020] [Indexed: 12/13/2022] Open
Abstract
Cytoreductive Surgery and Hyperthermic Intraperitoneal Chemotherapy (CRS-HIPEC) may be curative for colorectal cancer patients with peritoneal metastases (PMs) but it has a high rate of morbidity. Accurate preoperative patient selection is therefore imperative, but is constrained by the limitations of current imaging techniques. In this pilot study, we explored the feasibility of circulating tumor (ct) DNA analysis to select patients for CRS-HIPEC. Thirty patients eligible for CRS-HIPEC provided blood samples preoperatively and during follow-up if the procedure was completed. Targeted Next-Generation Sequencing (NGS) of DNA from PMs was used to identify bespoke mutations that were subsequently tested in corresponding plasma cell-free (cf) DNA samples using droplet digital (dd) PCR. CtDNA was detected preoperatively in cfDNA samples from 33% of patients and was associated with a reduced disease-free survival (DFS) after CRS-HIPEC (median 6.0 months vs median not reached, p = 0.016). This association could indicate the presence of undiagnosed systemic metastases or an increased metastatic potential of the tumors. We demonstrate the feasibility of ctDNA to serve as a preoperative marker of recurrence in patients with PMs of colorectal cancer using a highly sensitive technique. A more appropriate treatment for patients with preoperative ctDNA detection may be systemic chemotherapy in addition to, or instead of, CRS-HIPEC.
Affiliation(s)
- Jamie J. Beagan
- Department of Pathology, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (J.J.B.); (P.P.E.); (D.A.M.H.); (D.M.P.); (N.C.T.v.G.)
| | - Nina R. Sluiter
- Department of Surgery, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (N.R.S.); (S.B.); (S.L.V.); (M.K.); (G.K.); (J.B.T.)
| | - Sander Bach
- Department of Surgery, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (N.R.S.); (S.B.); (S.L.V.); (M.K.); (G.K.); (J.B.T.)
| | - Paul P. Eijk
- Department of Pathology, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (J.J.B.); (P.P.E.); (D.A.M.H.); (D.M.P.); (N.C.T.v.G.)
| | - Stijn L. Vlek
- Department of Surgery, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (N.R.S.); (S.B.); (S.L.V.); (M.K.); (G.K.); (J.B.T.)
| | - Daniëlle A. M. Heideman
- Department of Pathology, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (J.J.B.); (P.P.E.); (D.A.M.H.); (D.M.P.); (N.C.T.v.G.)
| | - Miranda Kusters
- Department of Surgery, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (N.R.S.); (S.B.); (S.L.V.); (M.K.); (G.K.); (J.B.T.)
| | - D. Michiel Pegtel
- Department of Pathology, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (J.J.B.); (P.P.E.); (D.A.M.H.); (D.M.P.); (N.C.T.v.G.)
| | - Geert Kazemier
- Department of Surgery, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (N.R.S.); (S.B.); (S.L.V.); (M.K.); (G.K.); (J.B.T.)
| | - Nicole C. T. van Grieken
- Department of Pathology, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (J.J.B.); (P.P.E.); (D.A.M.H.); (D.M.P.); (N.C.T.v.G.)
| | - Bauke Ylstra
- Department of Pathology, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (J.J.B.); (P.P.E.); (D.A.M.H.); (D.M.P.); (N.C.T.v.G.)
- Correspondence: Tel.: +31-(0)20-4442-495
| | - Jurriaan B. Tuynman
- Department of Surgery, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands; (N.R.S.); (S.B.); (S.L.V.); (M.K.); (G.K.); (J.B.T.)
| |
|
45
|
Mohd Yunos RI, Ab Mutalib NS, Tieng FYF, Abu N, Jamal R. Actionable Potentials of Less Frequently Mutated Genes in Colorectal Cancer and Their Roles in Precision Medicine. Biomolecules 2020; 10:biom10030476. [PMID: 32245111 PMCID: PMC7175115 DOI: 10.3390/biom10030476] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/18/2020] [Revised: 03/11/2020] [Accepted: 03/13/2020] [Indexed: 02/06/2023] Open
Abstract
Global statistics rank colorectal cancer (CRC) as the third most frequently diagnosed cancer and the fourth leading cause of cancer-related deaths worldwide. Improving survival for CRC is as important as early detection. Personalized medicine is important in maximizing an individual's treatment success and minimizing the risk of adverse reactions. Approaches to achieving personalized therapy in CRC have included analyses of specific genes and their clinical implications. Tumor genotyping via next-generation sequencing has become standard practice to guide clinicians in predicting tumor behavior, disease prognosis, and treatment response. Nevertheless, better prognostic markers are necessary to further stratify patients for personalized treatment plans. The discovery of new markers remains indispensable in providing the most effective chemotherapy in order to improve the outcomes of treatment and survival in CRC patients. This review aims to compile and discuss newly discovered, less frequently mutated genes in CRC. We also discuss how these mutations are being used to assist therapeutic decisions and their potential prospective clinical utility. In addition, we summarize the importance of profiling large genomic rearrangements, gene amplifications, and large deletions and how these alterations may assist in determining the best treatment option for CRC patients.
Affiliation(s)
| | | | | | | | - Rahman Jamal
- Correspondence: (N.S.A.M.); (R.J.); Tel.: +60-3-91459073 (N.S.A.M.); +60-3-91459000 (R.J.)
| |
|
46
|
Chander V, Gibbs RA, Sedlazeck FJ. Evaluation of computational genotyping of structural variation for clinical diagnoses. Gigascience 2020; 8:5565134. [PMID: 31494671 PMCID: PMC6732172 DOI: 10.1093/gigascience/giz110] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/01/2019] [Revised: 06/27/2019] [Accepted: 08/13/2019] [Indexed: 01/08/2023] Open
Abstract
Background: Structural variation (SV) plays a pivotal role in genetic disease. The discovery of SVs based on short DNA sequence reads from next-generation DNA sequence methods is error-prone, with low sensitivity and high false discovery rates. These shortcomings can be partially overcome with extensive orthogonal validation methods or use of long reads, but the current cost precludes their application for routine clinical diagnostics. In contrast, SV genotyping of known sites of SV occurrence is relatively robust and therefore offers a cost-effective clinical diagnostic tool with potentially few false-positive and false-negative results, even when applied to short-read DNA sequence data. Results: We assess 5 state-of-the-art SV genotyping software methods, applied to short-read sequence data. The methods are characterized on the basis of their ability to genotype different SV types, spanning different size ranges. Furthermore, we analyze their ability to parse different VCF file subformats and assess their reliance on specific metadata. We compare the SV genotyping methods across a range of simulated and real data, including SVs that were not found with Illumina data alone. We assess sensitivity and the ability to filter initial false discovery calls. We determined the impact of SV type and size on the performance of each SV genotyper. Overall, STIX performed the best on both simulated and GiaB-based SV calls, demonstrating a good balance between sensitivity and specificity. Conclusion: Our results indicate that, although SV genotyping software methods have superior performance to SV callers, there are limitations that suggest the need for further innovation.
Affiliation(s)
- Varuna Chander
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
| | - Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
| |
|
47
|
Tham CY, Tirado-Magallanes R, Goh Y, Fullwood MJ, Koh BTH, Wang W, Ng CH, Chng WJ, Thiery A, Tenen DG, Benoukraf T. NanoVar: accurate characterization of patients' genomic structural variants using low-depth nanopore sequencing. Genome Biol 2020; 21:56. [PMID: 32127024 PMCID: PMC7055087 DOI: 10.1186/s13059-020-01968-7] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Received: 07/03/2019] [Accepted: 02/21/2020] [Indexed: 12/19/2022] Open
Abstract
The recent advent of third-generation sequencing technologies brings promise for better characterization of genomic structural variants by virtue of having longer reads. However, long-read applications are still constrained by their high sequencing error rates and low sequencing throughput. Here, we present NanoVar, an optimized structural variant caller utilizing low-depth (8X) whole-genome sequencing data generated by Oxford Nanopore Technologies. NanoVar exhibits higher structural variant calling accuracy when benchmarked against current tools using low-depth simulated datasets. In patient samples, we successfully validate structural variants characterized by NanoVar and uncover normal alternative sequences or alleles which are present in healthy individuals.
Affiliation(s)
- Cheng Yong Tham
- Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
| | - Roberto Tirado-Magallanes
- Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
| | - Yufen Goh
- Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
| | - Melissa J Fullwood
- Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore; School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
| | - Bryan T H Koh
- Department of Orthopedic Surgery, National University Health Systems, Singapore, 119228, Singapore
| | - Wilson Wang
- Department of Orthopedic Surgery, National University Health Systems, Singapore, 119228, Singapore; Department of Orthopaedic Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119228, Singapore
| | - Chin Hin Ng
- Department of Hematology-Oncology, National University Cancer Institute of Singapore, National University Health System, Singapore, 119228, Singapore
| | - Wee Joo Chng
- Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore; Department of Hematology-Oncology, National University Cancer Institute of Singapore, National University Health System, Singapore, 119228, Singapore; Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119228, Singapore
| | - Alexandre Thiery
- Department of Statistics and Applied Probability, National University of Singapore, Singapore, 117546, Singapore
| | - Daniel G Tenen
- Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore; Harvard Stem Cell Institute, Harvard Medical School, Boston, MA, 02115, USA
| | - Touati Benoukraf
- Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore; Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL, A1B 3V6, Canada
| |
|
48
|
Abstract
Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult due to limitations inherent to available genome technologies. Detection methods that use ensemble algorithms and emerging sequencing technologies have enabled the discovery of thousands of SVs, uncovering information about their ubiquity, relationship to disease and possible effects on biological mechanisms. Given the variability in SV type and size, along with unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. Here, we review modern approaches for investigating SVs and proffer that, moving forwards, studies integrating biological information with detection will be necessary to comprehensively understand the impact of SV in the human genome.
Affiliation(s)
- Steve S Ho
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Alexander E Urban
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Ryan E Mills
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA.
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
| |
|
49
|
Franceschini N, Lam SW, Cleton-Jansen AM, Bovée JVMG. What's new in bone forming tumours of the skeleton? Virchows Arch 2020; 476:147-157. [PMID: 31741049 PMCID: PMC6969005 DOI: 10.1007/s00428-019-02683-w] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 07/04/2019] [Revised: 09/12/2019] [Accepted: 09/30/2019] [Indexed: 12/15/2022]
Abstract
Bone tumours are difficult to diagnose and treat, as they are rare and over 60 different subtypes are recognised. The emergence of next-generation sequencing has partly elucidated the molecular mechanisms behind these tumours, including the group of bone forming tumours (osteoma, osteoid osteoma, osteoblastoma and osteosarcoma). Increased knowledge of the molecular mechanisms could help to identify novel diagnostic markers and/or treatment options. Osteoid osteoma and osteoblastoma are bone forming tumours without malignant potential that have overlapping morphology. They were recently shown to carry FOS and, to a lesser extent, FOSB rearrangements, suggesting that these tumours are closely related. The presence of these rearrangements could help discriminate these entities from other lesions with woven bone deposition. Osteosarcoma is a malignant bone forming tumour for which different histological subtypes are recognised. High-grade osteosarcoma is the prototype of a complex karyotype tumour, and extensive research exploring its molecular background has identified phenomena like chromothripsis and kataegis and some recurrent alterations. Owing to a lack of specificity, this has not yet led to a valuable novel diagnostic marker. Nevertheless, these studies have also pointed towards potential targetable drivers, the therapeutic merit of which remains to be further explored.
Affiliation(s)
- Natasja Franceschini
- Department of Pathology, Leiden University Medical Center, P.O. Box 9600, L1-Q, 2300 RC, Leiden, Netherlands
| | - Suk Wai Lam
- Department of Pathology, Leiden University Medical Center, P.O. Box 9600, L1-Q, 2300 RC, Leiden, Netherlands
| | - Anne-Marie Cleton-Jansen
- Department of Pathology, Leiden University Medical Center, P.O. Box 9600, L1-Q, 2300 RC, Leiden, Netherlands
| | - Judith V M G Bovée
- Department of Pathology, Leiden University Medical Center, P.O. Box 9600, L1-Q, 2300 RC, Leiden, Netherlands.
| |
|
50
|
Vincenten JPL, van Essen HF, Lissenberg-Witte BI, Bulkmans NWJ, Krijgsman O, Sie D, Eijk PP, Smit EF, Ylstra B, Thunnissen E. Clonality analysis of pulmonary tumors by genome-wide copy number profiling. PLoS One 2019; 14:e0223827. [PMID: 31618260 PMCID: PMC6795528 DOI: 10.1371/journal.pone.0223827] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/14/2019] [Accepted: 09/30/2019] [Indexed: 01/15/2023] Open
Abstract
Multiple tumors, either synchronous or metachronous, are frequently diagnosed in patients. The distinction between a second primary and a metastasis is important for treatment. Chromosomal DNA copy number aberration (CNA) patterns are highly specific to individual tumors. The aim of this study was to assess genome-wide CNA patterns as a method to identify clonally related tumors in a prospective cohort of patients with synchronous or metachronous tumors, with at least one intrapulmonary tumor. In total, 139 tumor pairs from 90 patients were examined: 35 synchronous and 104 metachronous pairs. CNA results were compared to histological type, clinicopathological methods (Martini-Melamed classification (MM) and ACCP-2013 criteria), and, if available, EGFR and KRAS mutation analysis. CNA results were clonal in 74 pairs (53%), non-clonal in 33 pairs (24%), and inconclusive in 32 pairs (23%). Histological similarity was found in 130 pairs (94%). Concordance between histology and conclusive CNA results was 69% (74 of 107 pairs: 72 clonal and two non-clonal). In 31 of 103 pairs with similar histology, genetics revealed non-clonality. In two out of four pairs with non-matching histology, genetics revealed clonality. The subgroups of synchronous and metachronous pairs showed similar outcomes for the comparison of histological versus CNA results. The MM classification and ACCP-2013 criteria, applicable to 34 pairs, were concordant with CNA results in 50% and 62% of pairs, respectively. Concordance between mutation matching and conclusive CNA results was 89% (8 of 9 pairs: six clonal and two non-clonal). Interestingly, in one patient both tumors had the same KRAS mutation, but the CNA result was non-clonal. In conclusion, although there is some concordance between histological comparison and CNA profiling, there are arguments for preferring extensive molecular testing to determine whether a second tumor is a metastasis or a second primary.
Affiliation(s)
- Julien P. L. Vincenten
- Amsterdam UMC, location VUmc, Department of Pulmonary Diseases, Amsterdam, The Netherlands
- Albert Schweitzer Hospital, Department of Pulmonary Diseases, Dordrecht, The Netherlands
| | - Hendrik F. van Essen
- Amsterdam UMC, location VUmc, Tumor Genome Analysis Core, Cancer Center Amsterdam, The Netherlands
| | | | | | - Oscar Krijgsman
- Netherlands Cancer Institute - Antoni van Leeuwenhoek, Department of Molecular Oncology & Immunology, Amsterdam, The Netherlands
| | - Daoud Sie
- Amsterdam UMC, location VUmc, Tumor Genome Analysis Core, Cancer Center Amsterdam, The Netherlands
| | - Paul P. Eijk
- Amsterdam UMC, location VUmc, Tumor Genome Analysis Core, Cancer Center Amsterdam, The Netherlands
| | - Egbert F. Smit
- Amsterdam UMC, location VUmc, Department of Pulmonary Diseases, Amsterdam, The Netherlands
- Netherlands Cancer Institute - Antoni van Leeuwenhoek, Department of Thoracic Oncology, Amsterdam, The Netherlands
| | - Bauke Ylstra
- Amsterdam UMC, location VUmc, Tumor Genome Analysis Core, Cancer Center Amsterdam, The Netherlands
| | - Erik Thunnissen
- Amsterdam UMC, location VUmc, Department of Pathology, Amsterdam, The Netherlands
| |
|