1
Liu D, Liu L, Zhang X, Zhao X, Li X, Che X, Wu G. Decoding driver and phenotypic genes in cancer: Unveiling the essence behind the phenomenon. Mol Aspects Med 2025; 103:101358. [PMID: 40037122 DOI: 10.1016/j.mam.2025.101358] [Received: 10/27/2024] [Revised: 01/25/2025] [Accepted: 02/26/2025] [Indexed: 03/06/2025]
Abstract
Gray hair is widely regarded as a hallmark of aging. Although gray hair is associated with aging, reversing this trait through gene targeting does not alter the fundamental biological processes of aging. Similarly, certain oncogenes (such as CXCR4 and MMP-related genes) can serve as markers of tumor behavior, such as malignancy or prognosis, but targeting these genes alone may not lead to tumor regression. We introduce the term "phenotypic genes" for this class of genes. Historically, cancer genetics research has focused on tumor driver genes, while genes influencing cancer phenotypes have been relatively overlooked. This review explores the critical distinction between driver genes and phenotypic genes in cancer, using the MAPK and PI3K/AKT/mTOR pathways as key examples. We also discuss current research techniques for identifying driver and phenotypic genes, such as whole-genome sequencing (WGS), RNA sequencing (RNA-seq), RNA interference (RNAi), CRISPR-Cas9, and other genomic screening methods, alongside the concept of synthetic lethality in driver genes. Advances in these technologies will support personalized treatment strategies and precision medicine based on the characteristics of the relevant genes. By addressing the gap in discussions on phenotypic genes, this review helps clarify the roles of driver and phenotypic genes, with the aim of advancing the field of targeted cancer therapy.
Affiliation(s)
- Dequan Liu
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian, 116011, China
- Lei Liu
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian, 116011, China
- Xiaoman Zhang
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian, 116011, China
- Xinming Zhao
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian, 116011, China
- Xiaorui Li
- Department of Oncology, Cancer Hospital of Dalian University of Technology, Shenyang, 110042, China.
- Xiangyu Che
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian, 116011, China.
- Guangzhen Wu
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian, 116011, China.
2
De La Vega FM, Irvine SA, Anur P, Potts K, Kraft L, Torres R, Kang P, Truong S, Lee Y, Han S, Onuchic V, Han J. Benchmarking of germline copy number variant callers from whole genome sequencing data for clinical applications. Bioinformatics Advances 2025; 5:vbaf071. [PMID: 40248358 PMCID: PMC12005901 DOI: 10.1093/bioadv/vbaf071] [Received: 07/20/2024] [Revised: 03/11/2025] [Accepted: 04/08/2025] [Indexed: 04/19/2025]
Abstract
MOTIVATION Whole-genome sequencing (WGS) is increasingly preferred for clinical applications due to its comprehensive coverage, effectiveness in detecting copy number variants (CNVs), and declining costs. However, systematic evaluations of WGS CNV callers tailored to germline clinical testing, where high sensitivity and confirmation of reported CNVs are essential, remain necessary. Clinical reporting typically emphasizes CNVs affecting coding regions over precise breakpoint detection. This study benchmarks several short-read WGS CNV detection tools using reference cell lines to inform their clinical use. RESULTS While tools vary in sensitivity (7%-83%) and precision (1%-76%), few meet the sensitivity needed for clinical testing. Callers generally perform better for deletions (up to 88% sensitivity) than duplications (up to 47% sensitivity), with poor detection of duplications under 5 kb. Notably, for CNVs in genes commonly included in clinical panels, significantly improved sensitivity and precision were observed when benchmarking against 25 cell lines with known CNVs. DRAGEN v4.2 high-sensitivity CNV calls, post-processed with custom filters, achieved 100% sensitivity and 77% precision on the optimized gene panel after excluding recurring artifacts. This level of performance may support clinical use with orthogonal confirmation of reportable CNVs, pending validation on laboratory-specific samples. AVAILABILITY AND IMPLEMENTATION The data underlying this article are available in the European Nucleotide Archive under project accession PRJEB87628.
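The sensitivity and precision figures quoted in this abstract can be illustrated with a minimal sketch. The matching rule used here (same chromosome, same type, at least 50% reciprocal overlap) is an assumption chosen for illustration and is not stated in the abstract; the tuple layout and the function names `overlaps` and `sensitivity_precision` are likewise hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end, type); half-open coordinates, type is "DEL" or "DUP".
Cnv = Tuple[str, int, int, str]


def overlaps(a: Cnv, b: Cnv, min_frac: float = 0.5) -> bool:
    """True if a and b share chromosome and type and overlap reciprocally.

    The 50% reciprocal-overlap default is an illustrative assumption.
    """
    if a[0] != b[0] or a[3] != b[3]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return False
    return (shared / (a[2] - a[1]) >= min_frac
            and shared / (b[2] - b[1]) >= min_frac)


def sensitivity_precision(truth: List[Cnv], calls: List[Cnv],
                          min_frac: float = 0.5) -> Tuple[float, float]:
    """Sensitivity = matched truth CNVs / all truth CNVs;
    precision = matched calls / all calls."""
    found = sum(any(overlaps(t, c, min_frac) for c in calls) for t in truth)
    correct = sum(any(overlaps(c, t, min_frac) for t in truth) for c in calls)
    sens = found / len(truth) if truth else 0.0
    prec = correct / len(calls) if calls else 0.0
    return sens, prec


truth = [("chr1", 100, 200, "DEL"), ("chr1", 500, 600, "DUP")]
calls = [("chr1", 110, 210, "DEL"), ("chr2", 0, 50, "DEL"),
         ("chr1", 500, 520, "DUP")]
sens, prec = sensitivity_precision(truth, calls)
```

In this toy input one of two truth CNVs is recovered and one of three calls is correct; the short duplication call fails the reciprocal-overlap rule, which loosely mirrors the abstract's point that duplications are harder to detect.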
Affiliation(s)
- Francisco M De La Vega
- Tempus AI, Inc., Chicago, IL 60654, United States
- Department of Biomedical Data Sciences, Stanford University School of Medicine, Palo Alto, CA 94304, United States
- Sean A Irvine
- Real Time Genomics, Ltd., Hamilton 3204, New Zealand
- Pavana Anur
- Tempus AI, Inc., Chicago, IL 60654, United States
- Kelly Potts
- Tempus AI, Inc., Chicago, IL 60654, United States
- Lewis Kraft
- Tempus AI, Inc., Chicago, IL 60654, United States
- Raul Torres
- Tempus AI, Inc., Chicago, IL 60654, United States
- Peter Kang
- Tempus AI, Inc., Chicago, IL 60654, United States
- Sean Truong
- Illumina, Inc., San Diego, CA 92122, United States
- Yeonghun Lee
- Illumina, Inc., San Diego, CA 92122, United States
- Shunhua Han
- Illumina, Inc., San Diego, CA 92122, United States
- James Han
- Illumina, Inc., San Diego, CA 92122, United States
3
Berghöfer J, Khaveh N, Mundlos S, Metzger J. Multi-tool copy number detection highlights common body size-associated variants in miniature pig breeds from different geographical regions. BMC Genomics 2025; 26:285. [PMID: 40121435 PMCID: PMC11929999 DOI: 10.1186/s12864-025-11446-8] [Received: 10/30/2024] [Accepted: 03/05/2025] [Indexed: 03/25/2025] Open
Abstract
BACKGROUND Copy number variations (CNVs) represent a common and highly specific type of variation in the genome, potentially influencing genetic diversity and mammalian phenotypic development. Structural variants, such as deletions, duplications, and insertions, have frequently been highlighted as key factors influencing traits in high-production pigs. However, comprehensive CNV analyses in miniature pig breeds are limited despite their value in biomedical research. RESULTS This study performed whole-genome sequencing in 36 miniature pigs from nine breeds from America, Asia and Oceania, and Europe. By employing a multi-tool approach (CNVpytor, Delly, GATK gCNV, Smoove), the accuracy of CNV identification was improved. In total, 34 homozygous CNVs overlapped with exonic regions in all samples, suggesting a role in expressing specific phenotypes such as uniform growth patterns, fertility, or metabolic function. In addition, 386 copy number variation regions (CNVRs) shared by all breeds were detected, covering 33.6 Mb (1.48% of the autosomal genome). Further, 132 exclusive CNVRs were identified for American breeds, 47 for Asian and Oceanian breeds, and 114 for European breeds. Functional enrichment analysis revealed genes within the common CNVRs involved in body height determination and other growth-related parameters. Exclusive CNVRs were located in the region of genes enriched for lipid metabolism in American minipigs, reproductive traits in Asian and Oceanian breeds, and cardiovascular features and body height in European breeds. In the selected groups, quantitative trait loci associated with body size, meat quality, reproduction, and disease susceptibility were highlighted. CONCLUSION This investigation of the CNV landscape of minipigs underlines the impact of selective breeding on structural variants and its role in the development of specific breed phenotypes across geographical areas. 
The multi-tool approach provides a valuable resource for future studies on the effects of artificial selection on livestock genomes.
Affiliation(s)
- Jan Berghöfer
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute of Chemistry and Biochemistry, Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Berlin, Germany
- Institute of Animal Genomics, University of Veterinary Medicine Hanover, Hanover, Germany
- Nadia Khaveh
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute of Animal Genomics, University of Veterinary Medicine Hanover, Hanover, Germany
- Stefan Mundlos
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute for Medical and Human Genetics, Charité Universitätsmedizin Berlin, Berlin, Germany
- Charité - Universitätsmedizin Berlin, BCRT - Berlin Institute of Health Centre for Regenerative Therapies, Berlin, Germany
- Julia Metzger
- Max Planck Institute for Molecular Genetics, Berlin, Germany.
- Institute of Animal Genomics, University of Veterinary Medicine Hanover, Hanover, Germany.
4
Zhao S, Xu D, Cai J, Shen Q, He M, Pan X, Gao Y, Li J, Yuan X. Benchmarking strategies for CNV calling from whole genome bisulfite data in humans. Comput Struct Biotechnol J 2025; 27:912-919. [PMID: 40123798 PMCID: PMC11929052 DOI: 10.1016/j.csbj.2025.02.040] [Indexed: 03/25/2025] Open
Abstract
It is important to dissect the relationship between copy number variations (CNVs) and DNA methylation, because both greatly change gene dosage and are responsible for diverse human cancers. Although whole genome bisulfite sequencing (WGBS) informs on both CNVs and DNA methylation, no study has provided a systematic benchmark for detecting CNVs from WGBS data. Herein, based on simulated and real WGBS datasets of 84.62 billion reads, we undertook 714 CNV detections to comprehensively benchmark the performance of 35 strategies, formed by pairing 5 alignment algorithms (bismarkbt2, bsbolt, bsmap, bwameth, and walt) with 7 CNV detection applications (BreakDancer, cn.mops, CNVkit, CNVnator, DELLY, GASV, and Pindel). The results highlighted a subset of strategies that accurately called CNVs, as judged by the numbers, lengths, precision, recall, and F1 scores of the CNV detections. We found that bwameth-DELLY and bwameth-BreakDancer were the best strategies for calling deletions, and walt-CNVnator and bismarkbt2-CNVnator were the best strategies for calling duplications. This work provides investigators with useful information for accurately exploring CNVs from WGBS data in humans.
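The count of 35 strategies follows from pairing each of the 5 aligners with each of the 7 CNV callers. The sketch below enumerates those pairs and shows the standard F1 formula (the harmonic mean of precision and recall); the function names `strategies` and `f1_score` are hypothetical, and the code does not reproduce the study's actual pipeline.

```python
from itertools import product
from typing import List

# Tool names as listed in the abstract.
ALIGNERS = ["bismarkbt2", "bsbolt", "bsmap", "bwameth", "walt"]
CALLERS = ["BreakDancer", "cn.mops", "CNVkit", "CNVnator",
           "DELLY", "GASV", "Pindel"]


def strategies() -> List[str]:
    """Every aligner-caller pairing, named in the abstract's 'aligner-caller' style."""
    return [f"{aligner}-{caller}" for aligner, caller in product(ALIGNERS, CALLERS)]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


all_strategies = strategies()
```

Ranking the entries of `all_strategies` by `f1_score` separately for deletions and duplications would be one way to arrive at per-type recommendations like those reported above.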
Affiliation(s)
- Shanghui Zhao
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- Dantong Xu
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- Jiali Cai
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- Qingpeng Shen
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- Mingran He
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- Xiangchun Pan
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- Yahui Gao
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- Jiaqi Li
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
- Xiaolong Yuan
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong 510642, China
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
- Centre for Healthy Ageing, Health Futures Institute, Murdoch University, Murdoch, WA 6150, Australia
5
Celus CS, Ahmad SF, Gangwar M, Kumar S, Kumar A. Deciphering new insights into copy number variations as drivers of genomic diversity and adaptation in farm animal species. Gene 2025; 939:149159. [PMID: 39672215 DOI: 10.1016/j.gene.2024.149159] [Received: 08/24/2024] [Revised: 11/15/2024] [Accepted: 12/09/2024] [Indexed: 12/15/2024]
Abstract
The basis of all improvement in (re)production performance of animals and plants lies in the genetic variation. The underlying genetic variation can be further explored through investigations using molecular markers including single nucleotide polymorphism (SNP) and microsatellite, and more recently structural variants like copy number variations (CNVs). Unlike SNPs, CNVs affect a larger proportion of the genome, making them more impactful vis-à-vis variation at the phenotype level. They significantly contribute to genetic variation and provide raw material for natural and artificial selection for improved performance. CNVs are characterized as unbalanced structural variations that arise from four major mechanisms viz., non-homologous end joining (NHEJ), non-allelic homologous recombination (NAHR), fork stalling and template switching (FoSTeS), and retrotransposition. Various detection methods have been developed to identify CNVs, including molecular techniques and massively parallel sequencing. Next-generation sequencing (NGS)/high-throughput sequencing offers higher resolution and sensitivity, but challenges remain in delineating CNVs in regions with repetitive sequences or high GC content. High-throughput sequencing technologies utilize different methods based on read-pair, split-read, read depth, and assembly approaches (or their combination) to detect CNVs. Read-pair based methods work by mapping discordant reads, while the read-depth approach works on detecting the correlation between read depth and copy number of genetic segments or a gene. Split-read methods involve mapping segments of reads to different locations on the genome, while assembly methods involve comparing contigs to a reference or de novo sequencing. Similar to other marker-trait association studies, CNV-association studies are not uncommon in humans and farm animals. 
In the near future, extensive studies will be needed to deduce the unique evolutionary trajectories and underlying molecular mechanisms for targeted genetic improvements in different farm animal species. The present review delineates the importance of CNVs in genetic studies, their generation, and the programs and principles used to identify them efficiently, and finally throws light on the existing literature on CNVs in farm animal species.
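The read-depth principle described in this abstract, that depth in a genomic window scales with its copy number, can be sketched as follows. The sketch assumes a diploid genome, uses the median window depth as the two-copy baseline, and skips the GC-content and mappability corrections that real callers apply; the function name `estimate_copy_number` is hypothetical.

```python
from statistics import median
from typing import List


def estimate_copy_number(window_depths: List[float], ploidy: int = 2) -> List[int]:
    """Estimate an integer copy number per window from mean read depth.

    Assumes most windows are at the normal ploidy, so the median depth
    represents `ploidy` copies (an illustrative simplification).
    """
    baseline = median(window_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    return [round(ploidy * depth / baseline) for depth in window_depths]


# Toy depths: the fourth window looks like a heterozygous deletion,
# the sixth like a duplication to four copies.
depths = [30, 31, 29, 15, 30, 60, 30]
copy_numbers = estimate_copy_number(depths)
```

Windows whose estimate differs from the ploidy are candidate CNVs; adjacent candidate windows would then be merged into segments before reporting.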
Affiliation(s)
- C S Celus
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
- Sheikh Firdous Ahmad
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India; Livestock Production and Management Section, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India.
- Munish Gangwar
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
- Subodh Kumar
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
- Amit Kumar
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
6
Bozzi F, Conca E, Silvestri M, Dagrada G, Ardore A, Penso D, Lorenzini D, Volpi CC, Trupia DV, Busico A, Capone I, Perrone F, Tamborini E, Vingiani A, Agnelli L, Pruneri G. Detecting gene copy number alterations by Oncomine Comprehensive genomic profiling in a comparative study on FFPE tumor samples. Sci Rep 2025; 15:4314. [PMID: 39910096 PMCID: PMC11799426 DOI: 10.1038/s41598-025-88494-3] [Received: 09/13/2024] [Accepted: 01/28/2025] [Indexed: 02/07/2025] Open
Abstract
Copy number alterations (CNAs) play a fundamental role in cancer development and constitute a potential tool for tailored treatments. For years, CNA recognition in formalin-fixed paraffin-embedded (FFPE) material for diagnostic purposes has relied mainly on fluorescence in situ hybridization (FISH). The introduction of other procedures, such as Next-Generation Sequencing (NGS), has dramatically improved CNA discovery at the genome-wide level. The detection of CNAs by NGS in FFPE material is, nonetheless, a complex issue that still requires validation studies. Herein, CNA detection by a widely used NGS assay (Oncomine Comprehensive Assay plus®, OCA+) was evaluated in 14 FFPE samples mirroring daily diagnostic practice and compared to a whole-genome assay. OCA+, a targeted DNA panel, showed lower CNA detection sensitivity and equal specificity for gains and losses. According to the proprietary software pipeline, OCA+ accurately identified gains characterized by CN ≥ 5.2. No significant threshold maximizing the difference between true and false positive losses was found. Orthogonal FISH tests validated seven CNAs characterized by CN gain ≥ 6 or complete loss. Considering the growing significance of CNAs in precision medicine, our findings further call for robust validation of NGS detection in FFPE materials.
Affiliation(s)
- Fabio Bozzi
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Elena Conca
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Marco Silvestri
- Department of Research, Nutrition and Metabolomics, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Isinnova srl, via Enrico Berlinguer 2, Brescia, 25124, Italy
- Gianpaolo Dagrada
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Alice Ardore
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Donata Penso
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Daniele Lorenzini
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Chiara Costanza Volpi
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Desirè Viola Trupia
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Adele Busico
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Iolanda Capone
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Federica Perrone
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Elena Tamborini
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Andrea Vingiani
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Department of Oncology and Haemato-Oncology, University of Milan, Milano, 20122, Italy
- Luca Agnelli
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy.
- Giancarlo Pruneri
- Department of Diagnostic Innovation, Pathology Unit 2, Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, 20133, Italy
- Department of Oncology and Haemato-Oncology, University of Milan, Milano, 20122, Italy
7
Chen X, Wei S, Sun C, Yi Z, Wang Z, Wu Y, Xu J, Tao J, Chen H, Zhang M, Jiang Y, Lv H, Huang C. Computational Tools for Studying Genome Structural Variation. OMICS: A Journal of Integrative Biology 2025; 29:36-48. [PMID: 39905890 DOI: 10.1089/omi.2024.0200] [Indexed: 02/06/2025]
Abstract
Structural variation (SV) typically refers to alterations in DNA fragments at least 50 base pairs long in the human genome. It can alter thousands of DNA nucleotides and thus significantly influence human health, disease, and clinical phenotypes. There is a shared and growing recognition that the emergence of effective computational tools and high-throughput technologies such as short-read sequencing and long-read sequencing offers novel insight into SV and, by extension, diseases affecting planetary health. However, numerous available SV tools exist with varying strengths and weaknesses. This is currently hampering the abilities of scholars to select the optimal tools to study SVs. Here, we reviewed 175 tools developed in the past two decades for SV detection, annotation, visualization, and downstream analysis of human genomics. In this expert review, we provide a comprehensive catalog of SV-related tools across different technology platforms and summarize their features, strengths, and limitations with an eye to accelerate systems science and planetary health innovations.
Affiliation(s)
- Xingyu Chen
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine & Faculty of Chinese Medicine, Macau University of Science and Technology, Taipa, China
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Siyu Wei
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Chen Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Zelin Yi
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine & Faculty of Chinese Medicine, Macau University of Science and Technology, Taipa, China
- Zihan Wang
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine & Faculty of Chinese Medicine, Macau University of Science and Technology, Taipa, China
- Yingyi Wu
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine & Faculty of Chinese Medicine, Macau University of Science and Technology, Taipa, China
- Jing Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Junxian Tao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Haiyan Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Mingming Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yongshuai Jiang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hongchao Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Chen Huang
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine & Faculty of Chinese Medicine, Macau University of Science and Technology, Taipa, China
8
Landeros N, Vargas-Roig L, Denita S, Mampel A, Hasbún R, Araya H, Castillo I, Valdes C, Flores M, Salter JS, Vasquez K, Romero J, Pérez-Castro R. Regional Hereditary Cancer Program in Chile: A scalable model of genetic counseling and molecular diagnosis to improve clinical outcomes for patients with hereditary cancer across Latin America. Biol Res 2024; 57:99. [PMID: 39710803 DOI: 10.1186/s40659-024-00579-x] [Received: 08/12/2024] [Accepted: 12/09/2024] [Indexed: 12/24/2024] Open
Abstract
BACKGROUND Breast cancer is a leading cause of cancer-related mortality worldwide, with hereditary forms accounting for approximately 10% of cases. In Chile, significant gaps exist in genetic counseling and testing, particularly within the public health system. This study presents the implementation and outcomes of the first regional hereditary cancer program in the Maule region of Chile, aimed at improving detection and management of hereditary breast cancer. METHODS A cohort of 48 high-risk breast cancer patients from the Hospital Regional de Talca received genetic counseling and underwent Next-Generation Sequencing multigene panel testing. The program was established through collaboration between multiple institutions, leveraging telemedicine and outsourcing sequencing analysis to address regional gaps. RESULTS Pathogenic or likely pathogenic variants were identified in 12% of patients, including in BRCA1, BRCA2, TP53, and PALB2. Notably, novel pathogenic variants in BRCA1 (rs80357505) and TP53 (rs1131691022) were discovered, highlighting the unique genetic landscape of the Chilean population. Additionally, 70 variants of uncertain significance were found across 42 genes, particularly in FAN1, MSH6, and FANCI, underscoring the need for further research. The program's collaborative approach effectively bridged critical gaps in genetic services, providing high-quality care within the public health system despite limited resources. CONCLUSIONS The Regional Hereditary Cancer Program addresses significant gaps in genetic counseling and testing in Chile's public health system. This scalable model enhances early detection and personalized treatment for hereditary cancer patients and could be adapted to other regions across Latin America.
Affiliation(s)
- Natalia Landeros
- Unidad de Innovación en Prevención y Oncología de Precisión, Centro Oncológico, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile
- In Vivo Tumor Biology Research Facility, Centro Oncológico, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile
- Biomedical Research Labs, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile
- Laura Vargas-Roig
- Tumor Biology Laboratory, Institute of Medicine and Experimental Biology of Cuyo, National Research Council of Argentine, Mendoza, Argentina
- Medical School, National University of Cuyo, Mendoza, Argentina
- Alejandra Mampel
- Tumor Biology Laboratory, Institute of Medicine and Experimental Biology of Cuyo, National Research Council of Argentine, Mendoza, Argentina
- Medical School, National University of Cuyo, Mendoza, Argentina
- University Hospital, Mendoza, Argentina
- Rafael Hasbún
- Hospital Regional de Talca (HRT), Talca, 3480094, Chile
- Hernán Araya
- Hospital Regional de Talca (HRT), Talca, 3480094, Chile
- Iván Castillo
- Unidad de Innovación en Prevención y Oncología de Precisión, Centro Oncológico, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile
- Hospital Regional de Talca (HRT), Talca, 3480094, Chile
- Camila Valdes
- Hospital Regional de Talca (HRT), Talca, 3480094, Chile
- Katherin Vasquez
- Biomedical Research Labs, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile
- Jacqueline Romero
- Biomedical Research Labs, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile
- Ramón Pérez-Castro
- Unidad de Innovación en Prevención y Oncología de Precisión, Centro Oncológico, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile.
- In Vivo Tumor Biology Research Facility, Centro Oncológico, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile.
- Biomedical Research Labs, Facultad de Medicina, Universidad Católica del Maule, Talca, 3480094, Chile.
9
Bahbahani H, Mohammad Z, Al-Ateeqi A, Almathen F. A comprehensive map of copy number variations in dromedary camels based on whole genome sequence data. Sci Rep 2024; 14:25573. [PMID: 39462079 PMCID: PMC11513024 DOI: 10.1038/s41598-024-77773-0] [Received: 07/13/2024] [Accepted: 10/25/2024] [Indexed: 10/28/2024] Open
Abstract
Copy number variants (CNVs) are structural variants within the eukaryotic genome that vary among individuals of a species. These variants have been associated with different phenotypic traits, making them a valuable consideration as markers for designing breeding programmes. In this study, whole genome sequence data of 60 dromedary camel samples originating from the Arabian Peninsula were analyzed to construct a comprehensive dromedary CNV map. Utilizing four CNV callers employing read-depth, split-read and paired-end mapping approaches, a total of 37,519 CNV events (17,847 deletions and 19,672 duplications) were called on the dromedary autosomes. These CNV events were merged into 2,557 regions, categorized as 1,322 losses, 122 gains, and 1,113 "mixed regions" comprising both types. The cumulative size of the CNV regions amounted to 22.5 Mb, covering roughly 1.16% of the dromedary autosomes. Approximately 32% of the defined CNV regions (comprising 60% losses, 18% gains, and 0.27% mixed regions) were found in ≥ 90% of the dromedary samples, classifying them as prevalent regions. Genes with biological functions related to the different adaptive physiologies of dromedary camels, such as fertility, heat stress, musculoskeletal development, and fat metabolism, were overlapping with or in close proximity to ~ 68% of the defined CNV regions, demonstrating their potential role in dromedaries' physiology. This study presents the first comprehensive CNV map of dromedary camels and builds on the present knowledge in understanding the genetic structure of this species.
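The step of merging individual CNV events into CNV regions and labelling each region as a loss, gain, or mixed region can be sketched as an interval merge. The rule used here (merge events on the same chromosome that overlap or touch) and the function name `merge_cnv_events` are assumptions for illustration; the abstract does not specify the study's merging criteria.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end, kind); half-open coordinates, kind is "DEL" or "DUP".
Event = Tuple[str, int, int, str]


def merge_cnv_events(events: List[Event]) -> List[Dict]:
    """Merge overlapping or touching events per chromosome into CNV regions.

    A region is 'loss' if it holds only deletions, 'gain' if only
    duplications, and 'mixed' if it holds both.
    """
    regions: List[Dict] = []
    for chrom, start, end, kind in sorted(events):
        last = regions[-1] if regions else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            last["end"] = max(last["end"], end)
            last["kinds"].add(kind)
        else:
            regions.append({"chrom": chrom, "start": start,
                            "end": end, "kinds": {kind}})
    for region in regions:
        if len(region["kinds"]) == 2:
            region["label"] = "mixed"
        else:
            region["label"] = "loss" if "DEL" in region["kinds"] else "gain"
    return regions


events = [("chr1", 100, 200, "DEL"), ("chr1", 150, 300, "DUP"),
          ("chr1", 400, 500, "DEL"), ("chr2", 100, 200, "DUP")]
regions = merge_cnv_events(events)
total_bp = sum(r["end"] - r["start"] for r in regions)
```

Dividing `total_bp` by the autosomal genome length gives a coverage fraction of the kind reported above (22.5 Mb, roughly 1.16% of the dromedary autosomes).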
Affiliation(s)
- Hussain Bahbahani
- Department of Biological Sciences, Faculty of Science, Kuwait University, Sh. Sabah Al-Salem campus, Kuwait City, Kuwait.
| | - Zainab Mohammad
- Department of Biological Sciences, Faculty of Science, Kuwait University, Sh. Sabah Al-Salem campus, Kuwait City, Kuwait
| | - Abdulaziz Al-Ateeqi
- Environment and Life Sciences Research Center, Kuwait Institute for Scientific Research, Kuwait City, Kuwait
| | - Faisal Almathen
- Department of Veterinary Public Health and Animal Husbandry, College of Veterinary Medicine, King Faisal University, 400, Al-Ahsa, Kingdom of Saudi Arabia
- Camel Research Center, King Faisal University, 400, Al-Ahsa, Saudi Arabia
| |
|
10
|
Yang M, Kim JA, Jo HS, Park JH, Ahn SY, Sung SI, Park WS, Cho HW, Kim JM, Park MH, Park HY, Jang JH, Chang YS. Diagnostic Utility of Whole Genome Sequencing After Negative Karyotyping/Chromosomal Microarray in Infants Born With Multiple Congenital Anomalies. J Korean Med Sci 2024; 39:e250. [PMID: 39315442 PMCID: PMC11419962 DOI: 10.3346/jkms.2024.39.e250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 07/14/2024] [Indexed: 09/25/2024]
Abstract
BACKGROUND: Achieving a definitive genetic diagnosis of unexplained multiple congenital anomalies (MCAs) in neonatal intensive care unit (NICU) infants is challenging because of the limited diagnostic capabilities of conventional genetic tests. Although the implementation of whole genome sequencing (WGS) has commenced for diagnosing MCAs, due to constraints in resources and faculty, many NICUs continue to utilize chromosomal microarray (CMA) and/or karyotyping as the initial diagnostic approach. We aimed to evaluate the diagnostic efficacy of WGS in infants with MCAs who have received negative results from karyotyping and/or CMA. METHODS: In this prospective study, we enrolled 80 infants with MCAs who were admitted to a NICU at a single center and had received negative results from CMA and/or karyotyping. The phenotypic characteristics were classified according to the International Classification of Diseases and the Human Phenotype Ontology. We assessed the diagnostic yield of trio-WGS in infants with normal chromosomal results and explored the diagnostic process by analyzing both phenotype and genotype. We also compared the phenotype and clinical outcomes between the WGS-diagnosed group and the undiagnosed group. RESULTS: The diagnostic yield of WGS was 26% (21/80), of which 76% were novel variants. There was a higher diagnostic yield in cases of craniofacial abnormalities, including those of the eye and ear, and a lower diagnostic yield in cases of gastrointestinal and genitourinary abnormalities. In addition, higher rates of rehabilitation therapy and gastrostomy were observed in WGS-diagnosed infants than in undiagnosed infants. CONCLUSION: This prospective cohort study assessed the usefulness of trio-WGS following chromosomal analysis for diagnosing MCAs in the NICU and revealed improvements in the diagnostic yield and clinical utility of WGS.
Affiliation(s)
- Misun Yang
- Department of Pediatrics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Cell and Gene Therapy Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Jee Ah Kim
- Department of Laboratory Medicine and Genetics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Department of Laboratory Medicine, Kangbuk Samsung Hospital, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Heui Seung Jo
- Department of Pediatrics, Kangwon National University Hospital, Kangwon National University School of Medicine, Chuncheon, Korea
| | - Jong-Ho Park
- Clinical Genomics Center, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - So Yoon Ahn
- Department of Pediatrics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Cell and Gene Therapy Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Se In Sung
- Department of Pediatrics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Cell and Gene Therapy Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Won Soon Park
- Department of Pediatrics, CHA Gangnam Medical Center, CHA University, Seoul, Korea
| | - Hye-Won Cho
- Division of Genome Science, Department of Precision Medicine, National Institute of Health, Cheongju, Korea
| | - Jeong-Min Kim
- Division of Genome Science, Department of Precision Medicine, National Institute of Health, Cheongju, Korea
| | - Mi-Hyun Park
- Division of Genome Science, Department of Precision Medicine, National Institute of Health, Cheongju, Korea
| | | | - Ja-Hyun Jang
- Department of Laboratory Medicine and Genetics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea.
| | - Yun Sil Chang
- Department of Pediatrics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Cell and Gene Therapy Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, Korea.
| |
|
11
|
Bertani-Torres W, Lezirovitz K, Alencar-Coutinho D, Pardono E, da Costa SS, Antunes LDN, de Oliveira J, Otto PA, Pingault V, Mingroni-Netto RC. Waardenburg Syndrome: The Contribution of Next-Generation Sequencing to the Identification of Novel Causative Variants. Audiol Res 2023; 14:9-25. [PMID: 38391765 PMCID: PMC10886116 DOI: 10.3390/audiolres14010002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 12/08/2023] [Accepted: 12/13/2023] [Indexed: 02/24/2024]
Abstract
Waardenburg syndrome (WS) is characterized by hearing loss and pigmentary abnormalities of the eyes, hair, and skin. The condition is genetically heterogeneous and is classified into four clinical types, differentiated by the presence of dystopia canthorum in type 1 and its absence in type 2. Additionally, limb musculoskeletal abnormalities and Hirschsprung disease differentiate types 3 and 4, respectively. The genes PAX3, MITF, SOX10, KITLG, EDNRB, and EDN3 are already known to be associated with WS. A proportion of WS patients remains without a molecular diagnosis, especially in type 2. This study aims to pinpoint causative variants using different NGS approaches in a cohort of 26 Brazilian probands with a possible/probable diagnosis of WS1 (8) or WS2 (18). DNA from the patients was first analyzed by exome sequencing. Seven of these families were submitted to trio analysis. For inconclusive cases, we applied an NGS panel targeting WS/neurocristopathy genes. Causative variants were detected in 20 of the 26 probands analyzed: five in PAX3, eight in MITF, two in SOX10, four in EDNRB, and one in ACTG1 (type 2 Baraitser-Winter syndrome, BWS2). In conclusion, in our cohort of patients, the detection rate of the causative variant was 77%, confirming the superior detection power of NGS in genetically heterogeneous diseases.
Affiliation(s)
- William Bertani-Torres
- Centro de Estudos sobre o Genoma Humano e Células Tronco, Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo 05508-090, Brazil
- Department of Embryology and Genetics of Malformations, INSERM (Institut National de la Santé et de la Recherche Médicale) UMR (Unité Mixte de Recherche) 1163, Université Paris-Cité and Institut Imagine, 75015 Paris, France
| | - Karina Lezirovitz
- Otorhinolaryngology Lab-LIM 32, Hospital das Clínicas, Faculdade de Medicina, Universidade de São Paulo, São Paulo 01246-000, Brazil
| | - Danillo Alencar-Coutinho
- Otorhinolaryngology Lab-LIM 32, Hospital das Clínicas, Faculdade de Medicina, Universidade de São Paulo, São Paulo 01246-000, Brazil
| | - Eliete Pardono
- Instituto de Ciências da Saúde, Universidade Paulista UNIP, São Paulo 04026-002, Brazil
- Colégio Miguel de Cervantes, São Paulo 05618-001, Brazil
| | - Silvia Souza da Costa
- Centro de Estudos sobre o Genoma Humano e Células Tronco, Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo 05508-090, Brazil
| | - Larissa do Nascimento Antunes
- Centro de Estudos sobre o Genoma Humano e Células Tronco, Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo 05508-090, Brazil
| | - Judite de Oliveira
- Médecine Génomique des Maladies Rares, AP-HP, Hôpital Necker-Enfants Malades, 75015 Paris, France
| | - Paulo Alberto Otto
- Centro de Estudos sobre o Genoma Humano e Células Tronco, Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo 05508-090, Brazil
| | - Véronique Pingault
- Department of Embryology and Genetics of Malformations, INSERM (Institut National de la Santé et de la Recherche Médicale) UMR (Unité Mixte de Recherche) 1163, Université Paris-Cité and Institut Imagine, 75015 Paris, France
- Médecine Génomique des Maladies Rares, AP-HP, Hôpital Necker-Enfants Malades, 75015 Paris, France
| | - Regina Célia Mingroni-Netto
- Centro de Estudos sobre o Genoma Humano e Células Tronco, Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo 05508-090, Brazil
| |
|
12
|
Antkowiak M, Szydlowski M. Uncovering structural variants associated with body weight and obesity risk in labrador retrievers: a genome-wide study. Front Genet 2023; 14:1235821. [PMID: 37799139 PMCID: PMC10548226 DOI: 10.3389/fgene.2023.1235821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 09/11/2023] [Indexed: 10/07/2023]
Abstract
Although obesity in the domestic dog (Canis lupus familiaris) is known to decrease well-being and shorten lifespan, the genetic risk variants associated with canine obesity remain largely unknown. In our study, which focused on the obesity-prone Labrador Retriever breed, we conducted a genome-wide analysis to identify structural variants linked to body weight and obesity. Obesity status was based on a 5-point body condition score (BCS), and the obese dog group included all dogs with a BCS of 5, along with the dogs with the highest body weight within the BCS 4 group. Data from whole-genome sequencing of 50 dogs, including 28 obese dogs, were bioinformatically analyzed to identify potential structural variants that varied in frequency between obese and healthy dogs. The seven most promising variants were further analyzed by droplet digital PCR in a group of 110 dogs, including 63 obese dogs. Our statistical evidence suggests that common structural mutations in or near six genes, specifically ALPL, KCTD8, SGSM1, SLC12A6, RYR3, and VPS26C, may contribute to the variability observed in body weight and body condition scores among Labrador Retriever dogs. These findings emphasize the need for additional research to validate the associations and explore the specific functions of these genes in relation to canine obesity.
Affiliation(s)
| | - Maciej Szydlowski
- Department of Genetics and Animal Breeding, Poznań University of Life Sciences, Poznań, Poland
| |
|
13
|
Chen X, Liu Y, Lv K, Wang M, Liu X, Li B. FASTdRNA: a workflow for the analysis of ONT direct RNA sequencing. Bioinform Adv 2023; 3:vbad099. [PMID: 37521311 PMCID: PMC10375421 DOI: 10.1093/bioadv/vbad099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 07/11/2023] [Accepted: 07/19/2023] [Indexed: 08/01/2023]
Abstract
Motivation: Direct RNA-seq (dRNA-seq) using Oxford Nanopore Technologies (ONT) has revolutionized transcript mapping by offering enhanced precision due to its long read length. Unlike traditional techniques, dRNA-seq eliminates the need for PCR amplification, reducing the impact of GC bias and preserving valuable physical information about bases, such as RNA modification and poly(A) length estimation. However, the rapid advancement of ONT devices has set higher standards for analytical software, resulting in potential challenges of software incompatibility and reduced efficiency. Results: We present a novel workflow, called FASTdRNA, to manipulate dRNA-seq data efficiently. This workflow comprises two modules: a data preprocessing module and a data analysis module. The data preprocessing module, dRNAmain, encompasses basecalling, mapping, and transcript counting, which are essential for subsequent analyses. The data analysis module consists of a range of downstream analyses that facilitate the estimation of poly(A) length, prediction of RNA modifications, and assessment of alternative splicing events across different conditions with duplication. The FASTdRNA workflow is designed for the Snakemake framework and can be efficiently executed locally or in the cloud. Comparative experiments have demonstrated its superior performance compared to previous methods. This innovative workflow enhances the research capabilities of dRNA-seq data analysis pipelines by optimizing existing processes and expanding the scope of analysis. Availability and implementation: The workflow is freely available at https://github.com/Tomcxf/FASTdRNA under an MIT license. Detailed installation and usage guidance can be found in the GitHub repository.
Affiliation(s)
- Xiaofeng Chen
- Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261000, China
- National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Weifang, Shandong 261325, China
| | - Yongqi Liu
- Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261000, China
- National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Weifang, Shandong 261325, China
| | - Kaiwen Lv
- Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261000, China
- National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Weifang, Shandong 261325, China
| | - Meiling Wang
- Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261000, China
- National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Weifang, Shandong 261325, China
| | - Xiaoqin Liu
- Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261000, China
- National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Weifang, Shandong 261325, China
| | - Bosheng Li
- Corresponding author. Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Peking University Institute of Advanced Agricultural Sciences, 699 Binhu Road, Xiashan Ecological Economic Development District, Weifang, Shandong 261000, China
| |
|
14
|
Gil JV, Such E, Sargas C, Simarro J, Miralles A, Pérez G, de Juan I, Palanca S, Avetisyan G, Santiago M, Fuentes C, Fernández JM, Vicente AI, Romero S, Llop M, Barragán E. Design and Validation of a Custom Next-Generation Sequencing Panel in Pediatric Acute Lymphoblastic Leukemia. Int J Mol Sci 2023; 24:4440. [PMID: 36901871 PMCID: PMC10002321 DOI: 10.3390/ijms24054440] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/20/2023] [Revised: 02/15/2023] [Accepted: 02/21/2023] [Indexed: 03/05/2023]
Abstract
The molecular landscape of acute lymphoblastic leukemia (ALL) is highly heterogeneous, and genetic lesions are clinically relevant for diagnosis, risk stratification, and treatment guidance. Next-generation sequencing (NGS) has become an essential tool for clinical laboratories, where disease-targeted panels are able to capture the most relevant alterations in a cost-effective and fast way. However, comprehensive ALL panels assessing all relevant alterations are scarce. Here, we design and validate an NGS panel including single-nucleotide variants (SNVs), insertion-deletions (indels), copy number variations (CNVs), fusions, and gene expression (ALLseq). ALLseq sequencing metrics were acceptable for clinical use and showed 100% sensitivity and specificity for virtually all types of alterations. The limit of detection was established at a 2% variant allele frequency for SNVs and indels, and at a 0.5 copy number ratio for CNVs. Overall, ALLseq is able to provide clinically relevant information to more than 83% of pediatric patients, making it an attractive tool for the molecular characterization of ALL in clinical settings.
Affiliation(s)
- José Vicente Gil
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
| | - Esperanza Such
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
- Hematology Diagnostic Unit, Hematology Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
- Centro de Investigación Biomédica en Red de Cáncer, CIBERONC CB16/12/00284, Instituto de Salud Carlos III, 28029 Madrid, Spain
| | - Claudia Sargas
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
| | - Javier Simarro
- Accredited Research Group on Clinical and Translational Cancer Research, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
| | - Alberto Miralles
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
| | - Gema Pérez
- Molecular Biology Unit, Clinical Analysis Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
| | - Inmaculada de Juan
- Accredited Research Group on Clinical and Translational Cancer Research, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
- Molecular Biology Unit, Clinical Analysis Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
| | - Sarai Palanca
- Accredited Research Group on Clinical and Translational Cancer Research, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
- Molecular Biology Unit, Clinical Analysis Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
- Department of Biochemistry and Molecular Biology, University of Valencia, 46010 Valencia, Spain
| | - Gayane Avetisyan
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
| | - Marta Santiago
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
| | - Carolina Fuentes
- Accredited Research Group on Clinical and Translational Cancer Research, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
- Onco-Hematology Unit, Pediatrics Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
| | - José María Fernández
- Accredited Research Group on Clinical and Translational Cancer Research, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
- Onco-Hematology Unit, Pediatrics Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
| | - Ana Isabel Vicente
- Hematology Diagnostic Unit, Hematology Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
| | - Samuel Romero
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
| | - Marta Llop
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
- Centro de Investigación Biomédica en Red de Cáncer, CIBERONC CB16/12/00284, Instituto de Salud Carlos III, 28029 Madrid, Spain
- Molecular Biology Unit, Clinical Analysis Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
| | - Eva Barragán
- Accredited Research Group on Hematology, Instituto de Investigación Sanitaria la Fe, 46026 Valencia, Spain
- Centro de Investigación Biomédica en Red de Cáncer, CIBERONC CB16/12/00284, Instituto de Salud Carlos III, 28029 Madrid, Spain
- Molecular Biology Unit, Clinical Analysis Service, Hospital Universitario y Politécnico la Fe, 46026 Valencia, Spain
| |
|
15
|
Gudkov M, Thibaut L, Khushi M, Blue GM, Winlaw DS, Dunwoodie SL, Giannoulatou E. ConanVarvar: a versatile tool for the detection of large syndromic copy number variation from whole-genome sequencing data. BMC Bioinformatics 2023; 24:49. [PMID: 36792982 PMCID: PMC9930243 DOI: 10.1186/s12859-023-05154-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 01/19/2023] [Indexed: 02/17/2023]
Abstract
BACKGROUND: A wide range of tools are available for the detection of copy number variants (CNVs) from whole-genome sequencing (WGS) data. However, none of them focus on clinically relevant CNVs, such as those that are associated with known genetic syndromes. Such variants are often large in size, typically 1-5 Mb, but currently available CNV callers have been developed and benchmarked for the discovery of smaller variants. Thus, the ability of these programs to detect tens of real syndromic CNVs remains largely unknown. RESULTS: Here we present ConanVarvar, a tool which implements a complete workflow for the targeted analysis of large germline CNVs from WGS data. ConanVarvar comes with an intuitive R Shiny graphical user interface and annotates identified variants with information about 56 associated syndromic conditions. We benchmarked ConanVarvar and four other programs on a dataset containing real and simulated syndromic CNVs larger than 1 Mb. In comparison to other tools, ConanVarvar reports 10-30 times fewer false-positive variants without compromising sensitivity and is quicker to run, especially on large batches of samples. CONCLUSIONS: ConanVarvar is a useful instrument for primary analysis in disease sequencing studies, where large CNVs could be the cause of disease.
Affiliation(s)
- Mikhail Gudkov
- Victor Chang Cardiac Research Institute, Sydney, NSW 2010, Australia
- School of Biomedical Engineering, The University of Sydney, Sydney, NSW 2006, Australia
- St Vincent’s Clinical Campus, School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, NSW 2010, Australia
| | - Loïc Thibaut
- Victor Chang Cardiac Research Institute, Sydney, NSW 2010, Australia
- School of Mathematics and Statistics, UNSW Sydney, Sydney, NSW 2052, Australia
| | - Matloob Khushi
- School of Computer Science, The University of Sydney, Sydney, NSW 2006, Australia
| | - Gillian M. Blue
- Sydney Medical School, The University of Sydney, Sydney, NSW 2006, Australia
- Heart Centre for Children, The Children’s Hospital at Westmead, Sydney, NSW 2145, Australia
| | - David S. Winlaw
- Sydney Medical School, The University of Sydney, Sydney, NSW 2006, Australia
- Heart Centre for Children, The Children’s Hospital at Westmead, Sydney, NSW 2145, Australia
| | - Sally L. Dunwoodie
- Victor Chang Cardiac Research Institute, Sydney, NSW 2010, Australia
- St Vincent’s Clinical Campus, School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, NSW 2010, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW 2052, Australia
| | - Eleni Giannoulatou
- Victor Chang Cardiac Research Institute, Sydney, NSW 2010, Australia
- St Vincent’s Clinical Campus, School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, NSW 2010, Australia
| |
|
16
|
Points to consider in the detection of germline structural variants using next-generation sequencing: A statement of the American College of Medical Genetics and Genomics (ACMG). Genet Med 2023; 25:100316. [PMID: 36507974 DOI: 10.1016/j.gim.2022.09.017] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/26/2022] [Revised: 09/29/2022] [Accepted: 09/30/2022] [Indexed: 12/14/2022]
|
17
|
Hassan S, Bahar R, Johan MF, Mohamed Hashim EK, Abdullah WZ, Esa E, Abdul Hamid FS, Zulkafli Z. Next-Generation Sequencing (NGS) and Third-Generation Sequencing (TGS) for the Diagnosis of Thalassemia. Diagnostics (Basel) 2023; 13:373. [PMID: 36766477 PMCID: PMC9914462 DOI: 10.3390/diagnostics13030373] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 12/07/2022] [Revised: 01/11/2023] [Accepted: 01/16/2023] [Indexed: 01/20/2023]
Abstract
Thalassemia is one of the most heterogeneous diseases, with more than a thousand mutation types recorded worldwide. Molecular diagnosis of thalassemia by conventional PCR-based DNA analysis is time- and resource-consuming owing to the phenotype variability, disease complexity, and molecular diagnostic test limitations. Moreover, genetic counseling must be backed up by an extensive diagnosis of the thalassemia-causing phenotype and the possible genetic modifiers. Data from advanced molecular techniques such as targeted sequencing by next-generation sequencing (NGS) and third-generation sequencing (TGS) are more appropriate and valuable for DNA analysis of thalassemia. While NGS is superior to TGS at variant calling thanks to its lower error rates, the longer reads of TGS permit haplotype phasing, which is superior for variant discovery in homologous genes and for CNV calling. The emergence of many cutting-edge machine learning-based bioinformatics tools has improved the accuracy of variant and CNV calling. Constant improvement of these sequencing and bioinformatics methods will enable precise thalassemia detection, especially for CNVs and the homologous HBA and HBG genes. In conclusion, laboratories transitioning from conventional DNA analysis to NGS or TGS and following the guidelines towards a single assay will contribute to a better diagnostic approach to thalassemia.
Affiliation(s)
- Syahzuwan Hassan
- Department of Hematology, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian 16150, Malaysia
- Institute for Medical Research, Shah Alam 40170, Malaysia
| | - Rosnah Bahar
- Department of Hematology, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian 16150, Malaysia
| | - Muhammad Farid Johan
- Department of Hematology, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian 16150, Malaysia
| | | | - Wan Zaidah Abdullah
- Department of Hematology, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian 16150, Malaysia
| | - Ezalia Esa
- Institute for Medical Research, Shah Alam 40170, Malaysia
| | | | - Zefarina Zulkafli
- Department of Hematology, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian 16150, Malaysia
- Corresponding author
| |
|
18
|
Liu G, Yang H, Yuan X. A shortest path-based approach for copy number variation detection from next-generation sequencing data. Front Genet 2023; 13:1084974. [PMID: 36733945 PMCID: PMC9887524 DOI: 10.3389/fgene.2022.1084974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 12/27/2022] [Indexed: 01/18/2023]
Abstract
Copy number variation (CNV) is one of the main structural variations in the human genome and accounts for a considerable proportion of variations. As CNVs can directly or indirectly cause cancer, mental illness, and genetic disease in humans, their effective detection is of great interest in the fields of oncogene discovery, clinical decision-making, bioinformatics, and drug discovery. The advent of next-generation sequencing data makes CNV detection possible, and a large number of CNV detection tools are based on next-generation sequencing data. Due to the complexity (e.g., bias, noise, alignment errors) of next-generation sequencing data and CNV structures, the accuracy of existing methods in detecting CNVs remains low. In this work, we design a new CNV detection approach, called shortest path-based copy number variation (SPCNV), to improve the detection accuracy of CNVs. SPCNV calculates the k nearest neighbors of each read depth and defines the shortest path, shortest path relation, and shortest path cost sets, from which it calculates the mean shortest path cost of each read depth and of its k nearest neighbors. We use the ratio between the mean shortest path cost for each read depth and the mean of the mean shortest path costs of its k nearest neighbors to construct a relative shortest path score formula that assigns a score to each read depth. Based on the score profile, a boxplot is then applied to predict CNVs. The performance of the proposed method is verified by simulation data experiments and compared against several popular methods of the same type. Experimental results show that the proposed method achieves the best balance between recall and precision in each set of simulated samples. To further verify the performance of the proposed method in real application scenarios, we then select real sample data from the 1000 Genomes Project to conduct experiments. The proposed method achieves the best F1-scores in almost all samples. Therefore, the proposed method can be used as a more reliable tool for the routine detection of CNVs.
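The final step described above, applying a boxplot to a score profile, can be sketched as below. This is only a sketch of a generic boxplot outlier rule under stated assumptions; it does not reproduce the SPCNV shortest-path score itself. The scores are made-up values, the function name `boxplot_outliers` is hypothetical, and the conventional 1.5 × IQR whisker and two-sided flagging are assumptions not taken from the abstract.

```python
import statistics


def boxplot_outliers(scores, whisker=1.5):
    """Return indices of scores outside the boxplot whiskers.

    Assumption: whiskers sit at Q1 - whisker*IQR and Q3 + whisker*IQR,
    the usual Tukey convention; the published method may differ.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return [i for i, s in enumerate(scores) if s < lower or s > upper]


# Hypothetical per-bin scores; bin 7 stands out from the rest.
scores = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 3.0, 1.02, 0.98]
candidate_bins = boxplot_outliers(scores)
print(candidate_bins)
```

In a read-depth setting, the flagged indices would correspond to genomic bins whose score deviates strongly from the bulk of the profile and are therefore candidate CNV positions.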
Affiliation(s)
- Guojun Liu
- School of Statistics, Xi’an University of Finance and Economics, Xi’an, China (corresponding author)
| | - Hongzhi Yang
- Medical Imaging Center, Xidian Group Hospital, Xi’an, China
| | - Xiguo Yuan
- Hangzhou Institute of Technology, Xidian University, Hangzhou, China,*Correspondence: Guojun Liu, ; Xiguo Yuan,
| |
Collapse
|
19
|
Kim H, Shim Y, Lee TG, Won D, Choi JR, Shin S, Lee ST. Copy-number analysis by base-level normalization: An intuitive visualization tool for evaluating copy number variations. Clin Genet 2023; 103:35-44. [PMID: 36152294 DOI: 10.1111/cge.14236] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/08/2022] [Revised: 09/19/2022] [Accepted: 09/20/2022] [Indexed: 12/13/2022]
Abstract
Next-generation sequencing (NGS) facilitates comprehensive molecular analyses that help with diagnosing unsolved disorders. In addition to detecting single-nucleotide variations and small insertions/deletions, bioinformatics tools can identify copy number variations (CNVs) in NGS data, which improves the diagnostic yield. However, due to the possibility of false positives, subsequent confirmation tests are generally performed. Here, we introduce Copy-number Analysis by BAse-level NormAlization (CABANA), a visualization tool that allows users to intuitively identify candidate CNVs using the normalized single-base-level read depth calculated from NGS data. To demonstrate how CABANA works, NGS data were obtained from 474 patients with neuromuscular disorders. CNVs were screened using a conventional bioinformatics tool, ExomeDepth, and then we normalized and visualized those data at the single-base level using CABANA, followed by manual inspection by geneticists to filter out false positives and determine candidate CNVs. In doing so, we identified 31 candidate CNVs (7%) in 474 patients and subsequently confirmed all of them to be true using multiplex ligation-dependent probe amplification. The performance of CABANA was deemed acceptable by comparing its diagnostic yield with previous data about neuromuscular disorders. Despite some limitations, we expect CABANA to help researchers accurately identify CNVs and reduce the need for subsequent confirmation testing.
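The idea of a normalized single-base-level read depth can be sketched as follows. The abstract does not specify CABANA's normalization formula, so the scheme here (scaling each sample by its own mean depth, then dividing by the median of equally scaled reference samples at each base) is an assumption, as are the function name `normalized_ratios` and the toy depths.

```python
from statistics import mean, median

def normalized_ratios(sample, references):
    """Per-base depth ratio of a sample against a reference panel.

    Assumption: mean-depth scaling followed by division by the per-base
    reference median; this is an illustrative scheme, not CABANA's own.
    """
    sample_mean = mean(sample)
    sample_scaled = [d / sample_mean for d in sample]
    refs_scaled = []
    for ref in references:
        ref_mean = mean(ref)
        refs_scaled.append([d / ref_mean for d in ref])
    ratios = []
    for i, value in enumerate(sample_scaled):
        baseline = median(ref[i] for ref in refs_scaled)
        ratios.append(value / baseline if baseline > 0 else float("nan"))
    return ratios

# Hypothetical depths over eight bases; bases 4 and 5 mimic a heterozygous deletion.
references = [[100] * 8, [80] * 8, [120] * 8]
sample = [100, 100, 100, 100, 50, 50, 100, 100]
ratios = normalized_ratios(sample, references)
print([round(r, 2) for r in ratios])
```

Plotting such ratios base by base gives a profile in which the deleted bases sit at half the level of their neighbours, which is the kind of pattern a reviewer would look for by eye.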
Affiliation(s)
- Hongkyung Kim
- Department of Laboratory Medicine, Yonsei University College of Medicine, Severance Hospital, Seoul, Republic of Korea
- Yeeun Shim
- Brain Korea 21 PLUS Project for Medical Science, Yonsei University, Seoul, Republic of Korea
- Taek Gyu Lee
- Brain Korea 21 PLUS Project for Medical Science, Yonsei University, Seoul, Republic of Korea
- Dongju Won
- Department of Laboratory Medicine, Yonsei University College of Medicine, Severance Hospital, Seoul, Republic of Korea
- Jong Rak Choi
- Department of Laboratory Medicine, Yonsei University College of Medicine, Severance Hospital, Seoul, Republic of Korea
- Dxome Co. Ltd, Seongnam-si, Gyeonggi-do, Republic of Korea
- Saeam Shin
- Department of Laboratory Medicine, Yonsei University College of Medicine, Severance Hospital, Seoul, Republic of Korea
- Seung-Tae Lee
- Department of Laboratory Medicine, Yonsei University College of Medicine, Severance Hospital, Seoul, Republic of Korea
- Dxome Co. Ltd, Seongnam-si, Gyeonggi-do, Republic of Korea
|
20
|
Chai AH. Whole Genome Sequencing for Detection of Structural Variants in Patients with Retinitis Pigmentosa. Methods Mol Biol 2022; 2560:73-79. [PMID: 36481884 DOI: 10.1007/978-1-0716-2651-1_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Abstract
Retinitis pigmentosa (RP) is a group of inherited retinal diseases characterized by the progressive degeneration of rod then cone photoreceptors. Most of the known mutations that cause RP reside in the protein-coding portions of DNA; however, a growing number of pathogenic mutations have been identified within the non-coding portions. This chapter details a brief method for the detection of structural variants throughout the genome for the identification of novel mutations and to ultimately provide patients with a precise molecular diagnosis.
|
21
|
Davoudi P, Do DN, Rathgeber B, Colombo SM, Sargolzaei M, Plastow G, Wang Z, Karimi K, Hu G, Valipour S, Miar Y. Genome-wide detection of copy number variation in American mink using whole-genome sequencing. BMC Genomics 2022; 23:649. [PMID: 36096727 PMCID: PMC9468235 DOI: 10.1186/s12864-022-08874-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/15/2022] [Accepted: 09/05/2022] [Indexed: 02/06/2023]
Abstract
BACKGROUND Copy number variations (CNVs) represent a major source of genetic diversity and contribute to the phenotypic variation of economically important traits in livestock species. In this study, we report the first genome-wide CNV analysis of American mink using whole-genome sequence data from 100 individuals. The analyses were performed with three complementary software programs: CNVpytor, DELLY and Manta. RESULTS A total of 164,733 CNVs (144,517 deletions and 20,216 duplications) were identified, representing 5378 CNV regions (CNVR) after merging overlapping CNVs and covering 47.3 Mb (1.9%) of the mink autosomal genome. Gene Ontology and KEGG pathway enrichment analyses of 1391 genes that overlapped CNVR revealed a potential role of CNVs in a wide range of biological, molecular and cellular functions, e.g., pathways related to growth (regulation of actin cytoskeleton, and cAMP signaling pathways), behavior (axon guidance, circadian entrainment, and glutamatergic synapse), lipid metabolism (phospholipid binding, sphingolipid metabolism and regulation of lipolysis in adipocytes), and immune response (Wnt signaling, Fc receptor signaling, and GTPase regulator activity pathways). Furthermore, several CNVR-harbored genes were associated with fur characteristics and development (MYO5A, RAB27B, FGF12, SLC7A11, EXOC2) and with immune system processes (SWAP70, FYN, ORAI1, TRPM2, and FOXO3). CONCLUSIONS This study presents the first genome-wide CNV map of American mink. We identified 5378 CNVR in the mink genome and investigated genes that overlapped with CNVR. The results suggest potential links with mink behaviour as well as a possible impact on fur quality and immune response. Overall, the results provide new resources for mink genome analysis, serving as a guideline for future investigations in which genomic structural variations are present.
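Merging overlapping CNVs into CNV regions, as described in the abstract above, is an interval-merging problem. The sketch below assumes that any overlap of at least one position on the same chromosome joins two calls; the study may have used a different criterion, and the function name `merge_into_cnvrs` and the coordinates are illustrative only.

```python
def merge_into_cnvrs(cnvs):
    """Merge overlapping (chrom, start, end) calls into CNV regions.

    Assumption: calls on the same chromosome that share at least one
    position are merged; the published merging rule may differ.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        last = regions[-1] if regions else None
        if last and last[0] == chrom and start <= last[2]:
            last[2] = max(last[2], end)  # extend the current region
        else:
            regions.append([chrom, start, end])  # open a new region
    return [tuple(r) for r in regions]

# Hypothetical calls pooled across individuals.
calls = [
    ("chr1", 100, 500),
    ("chr1", 400, 900),
    ("chr1", 2000, 2500),
    ("chr2", 100, 300),
    ("chr1", 850, 1000),
]
cnvrs = merge_into_cnvrs(calls)
print(cnvrs)
```

The same collapsing explains why a large number of individual calls (164,733 in the abstract, the sum of 144,517 deletions and 20,216 duplications) can reduce to a far smaller number of regions.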
Affiliation(s)
- Pourya Davoudi
- Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
- Duy Ngoc Do
- Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
- Bruce Rathgeber
- Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
- Stefanie M Colombo
- Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
- Mehdi Sargolzaei
- Department of Pathobiology, University of Guelph, Guelph, ON, Canada
- Select Sires Inc., Plain City, OH, USA
- Graham Plastow
- Livestock Gentec, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Zhiquan Wang
- Livestock Gentec, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Karim Karimi
- Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
- Guoyu Hu
- Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
- Shafagh Valipour
- Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
- Younes Miar
- Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
|
22
|
Gao T, Chen F, Li M. Sequencing of cerebrospinal fluid in non-small-cell lung cancer patients with leptomeningeal metastasis: A systematic review. Cancer Med 2022; 12:2248-2261. [PMID: 36000927 PMCID: PMC9939157 DOI: 10.1002/cam4.5163] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/26/2022] [Revised: 07/14/2022] [Accepted: 08/12/2022] [Indexed: 11/07/2022]
Abstract
Leptomeningeal metastasis (LM) refers to the dissemination of malignant cells in the subarachnoid space, pia, and arachnoid mater and is a severe condition associated with metastatic solid tumors. The most common solid tumor that develops into LM is lung cancer, and the incidence has increased in patients with advanced non-small-cell lung cancer (NSCLC) with targetable mutations. However, tissue biopsy of LM is inaccessible, leading to a paucity of genomic profiles of LM to guide targeted treatments and explore biological mechanisms. In recent years, liquid biopsy has come to be considered a minimally invasive and dynamic method to trace the genomic alterations of cancer cells, and some studies have started to sequence cerebrospinal fluid (CSF) in patients with LM to reveal targetable mutations and genomic profiles. In this review, we focus on studies that sequenced CSF in NSCLC patients with LM and summarize the sequencing results and their commonalities. Our review provides evidence that sequencing of CSF, the only way to reveal the genomic landscape of LM, is a promising management method in LM patients to dynamically guide targeted therapy and monitor intracranial tumor response. Furthermore, it reveals a unique genomic profile of LM, including driver genes, drug-resistance mutations, and a number of copy number variations. Sequencing of CSF in LM patients seems to provide more comprehensive genomic information than expected, and the biological significance behind the genomic alterations needs further study.
Affiliation(s)
- Tianqi Gao
- Department of Oncology, The Second Hospital of Dalian Medical University, Dalian, China
- Fengxi Chen
- Department of Oncology, The Second Hospital of Dalian Medical University, Dalian, China
- Man Li
- Department of Oncology, The Second Hospital of Dalian Medical University, Dalian, China
|
23
|
Benito-Sánchez B, Barroso A, Fernández V, Mercadillo F, Núñez-Torres R, Pita G, Pombo L, Morales-Chamorro R, Cano-Cano JM, Urioste M, González-Neira A, Osorio A. Apparent regional differences in the spectrum of BARD1 pathogenic variants in Spanish population and importance of copy number variants. Sci Rep 2022; 12:8547. [PMID: 35595798 PMCID: PMC9122922 DOI: 10.1038/s41598-022-12480-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/05/2022] [Accepted: 05/11/2022] [Indexed: 12/22/2022]
Abstract
Only up to 25% of the cases in which there is a familial aggregation of breast and/or ovarian cancer are explained by germline mutations in the well-known high-risk genes BRCA1 and BRCA2. Recently, the BRCA1-associated ring domain (BARD1) gene, whose product partners with BRCA1 in DNA repair, has been confirmed as a moderate-risk breast cancer susceptibility gene. Taking advantage of next-generation sequencing techniques, and with the purpose of defining the whole spectrum of possible pathogenic variants (PVs) in this gene, we performed a comprehensive mutational analysis of BARD1 in a cohort of 1946 Spanish patients who fulfilled the criteria to be tested for germline pathogenic mutations in BRCA1 and BRCA2. We identified 22 different rare germline variants, 5 of them being clearly pathogenic or likely pathogenic large deletions, which account for 0.26% of the patients tested. Our results show that the prevalence and spectrum of mutations in the BARD1 gene might vary between different regions of Spain and highlight the relevance of testing for copy number variations.
Affiliation(s)
- B Benito-Sánchez
- Familial Cancer Clinical Unit, Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- A Barroso
- Familial Cancer Clinical Unit, Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- V Fernández
- Familial Cancer Clinical Unit, Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- F Mercadillo
- Familial Cancer Clinical Unit, Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- R Núñez-Torres
- Human Genotyping Unit (CEGEN), Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- G Pita
- Human Genotyping Unit (CEGEN), Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- L Pombo
- Medical Oncology Section, Universitary Hospital Complex of Albacete, Albacete, Spain
- R Morales-Chamorro
- Medical Oncology Section, Hospitalary Complex La Mancha Centro, Alcázar de San Juan, Ciudad Real, Spain
- J M Cano-Cano
- Medical Oncology Service, Universitary General Hospital of Ciudad Real, Ciudad Real, Spain
- M Urioste
- Familial Cancer Clinical Unit, Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- A González-Neira
- Human Genotyping Unit (CEGEN), Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- A Osorio
- Familial Cancer Clinical Unit, Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
- Spanish Network On Rare Diseases (CIBERER), 28029, Madrid, Spain
- Familial Cancer Clinical Unit, Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), C/Melchor Fernández Almagro 3, 28029, Madrid, Spain
|
24
|
Combining callers improves the detection of copy number variants from whole-genome sequencing. Eur J Hum Genet 2022; 30:178-186. [PMID: 34744167 PMCID: PMC8821561 DOI: 10.1038/s41431-021-00983-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 11/18/2020] [Revised: 09/23/2021] [Accepted: 10/04/2021] [Indexed: 01/03/2023]
Abstract
Copy Number Variants (CNVs) are deletions, duplications or insertions larger than 50 base pairs. They account for a large percentage of the normal genome variation and play major roles in human pathology. While array-based approaches have long been used to detect them in clinical practice, whole-genome sequencing (WGS) bears the promise to allow concomitant exploration of CNVs and smaller variants. However, accurately calling CNVs from WGS remains a difficult computational task, for which a consensus is still lacking. In this paper, we explore practical calling options to reach the best compromise between sensitivity and specificity. We show that callers based on different signals (paired-end reads, split reads, coverage depth) yield complementary results. We suggest approaches combining four selected callers (Manta, Delly, ERDS, CNVnator) and a regenotyping tool (SV2), and show that this is applicable in everyday practice in terms of computation time and further interpretation. We demonstrate the superiority of these approaches over array-based Comparative Genomic Hybridization (aCGH), specifically regarding the lack of resolution in breakpoint definition and the detection of potentially relevant CNVs. Finally, we confirm our results on the NA12878 benchmark genome, as well as one clinically validated sample. In conclusion, we suggest that WGS constitutes a timely and economically valid alternative to the combination of aCGH and whole-exome sequencing.
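One common way to combine the output of several callers is to keep calls that are supported by more than one caller under a reciprocal-overlap criterion. The abstract does not describe the authors' merging rule, so the 50% reciprocal overlap, the two-caller minimum, the helper names, and the toy calls below are assumptions; the caller names serve only as dictionary keys, and no real tool output format is parsed.

```python
def reciprocal_overlap(a, b):
    """Overlap length divided by the longer of two (start, end) intervals.

    A value >= t means the overlap covers at least t of both intervals.
    """
    inter = min(a[1], b[1]) - max(a[0], b[0])
    if inter <= 0:
        return 0.0
    return inter / max(a[1] - a[0], b[1] - b[0])

def supported_calls(calls_by_caller, min_callers=2, min_ro=0.5):
    """Return (call, supporting_callers) pairs meeting the support threshold.

    Assumptions: calls are (chrom, start, end, svtype) tuples; thresholds
    are illustrative. Near-duplicate calls are not collapsed here.
    """
    kept = []
    for caller, calls in calls_by_caller.items():
        for call in calls:
            support = {caller}
            for other, other_calls in calls_by_caller.items():
                if other == caller:
                    continue
                for o in other_calls:
                    same_kind = o[0] == call[0] and o[3] == call[3]
                    if same_kind and reciprocal_overlap(call[1:3], o[1:3]) >= min_ro:
                        support.add(other)
                        break
            if len(support) >= min_callers:
                kept.append((call, sorted(support)))
    return kept

# Hypothetical calls keyed by caller name.
calls_by_caller = {
    "Manta": [("chr1", 1000, 2000, "DEL")],
    "Delly": [("chr1", 1050, 2050, "DEL"), ("chr2", 500, 900, "DUP")],
    "CNVnator": [("chr1", 900, 2100, "DEL")],
}
kept = supported_calls(calls_by_caller)
for call, support in kept:
    print(call, support)
```

Here the chr1 deletion is reported by all three callers and survives, while the chr2 duplication seen by a single caller is dropped. A production pipeline would also collapse the three near-identical chr1 records into one.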
|
25
|
Agaoglu NB, Unal B, Akgun Dogan O, Zolfagharian P, Sharifli P, Karakurt A, Can Senay B, Kizilboga T, Yildiz J, Dinler Doganay G, Doganay L. Determining the accuracy of next generation sequencing based copy number variation analysis in Hereditary Breast and Ovarian Cancer. Expert Rev Mol Diagn 2022; 22:239-246. [PMID: 35240897 DOI: 10.1080/14737159.2022.2048373] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/29/2021] [Accepted: 02/24/2022] [Indexed: 11/04/2022]
Abstract
BACKGROUND Copy number variations (CNVs) are commonly associated with malignancies, including hereditary breast and ovarian cancers. Next-generation sequencing (NGS) provides solutions for CNV detection in a single run. This study aimed to compare the accuracy of CNV detection by an NGS analysis tool against Multiplex Ligation-Dependent Probe Amplification (MLPA). RESEARCH DESIGN AND METHODS In total, 1276 cases were studied by targeted NGS panels and 691 cases (61 calls in 58 NGS-CNV positive and 633 NGS-CNV negative cases) were validated by MLPA. RESULTS Twenty-eight (46%) NGS-CNV positive calls were consistent, whereas 33 (54%) calls showed discordance with MLPA. Two cases were detected as SNVs by NGS and as CNVs by the MLPA analysis. In total, 2% of the cases showed an MLPA-confirmed CNV region in BRCA1/2. The results of this study showed that despite the high false-positive call rate of the NGS-CNV algorithm, there were no false-negative calls. The cases that were determined to be negative by NGS and positive by MLPA were actually carrying SNVs located at the MLPA probe binding sites. CONCLUSION The diagnostic performance of NGS-CNV analysis is promising; however, the need for confirmation by different methods remains.
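The concordance figures in the abstract above can be rechecked with a few lines of arithmetic. The counts are taken from the abstract; treating the share of MLPA-confirmed calls as a per-call positive predictive value is an interpretation made here, not a term the authors use, and the variable names are illustrative.

```python
# Counts reported in the abstract.
confirmed_calls = 28    # NGS-CNV positive calls consistent with MLPA
discordant_calls = 33   # NGS-CNV positive calls not confirmed by MLPA
positive_cases = 58
negative_cases = 633

total_calls = confirmed_calls + discordant_calls
validated_cases = positive_cases + negative_cases

# Share of NGS-CNV positive calls confirmed by MLPA (per-call PPV, by assumption).
ppv = confirmed_calls / total_calls
false_positive_share = discordant_calls / total_calls

print(total_calls, validated_cases)
print(f"confirmed: {ppv:.1%}, discordant: {false_positive_share:.1%}")
```

The totals match the 61 calls and 691 validated cases stated in the abstract, and the two shares round to the reported 46% and 54%.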
Affiliation(s)
- Nihat Bugra Agaoglu
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Department of Medical Genetics, Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Busra Unal
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Ozlem Akgun Dogan
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Department of Pediatric Genetics, Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Payam Zolfagharian
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Pari Sharifli
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Aylin Karakurt
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Burak Can Senay
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Tugba Kizilboga
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Department of Molecular Biology and Genetics, Istanbul Technical University, Istanbul, Turkey
- Jale Yildiz
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Department of Molecular Biology and Genetics, Istanbul Technical University, Istanbul, Turkey
- Gizem Dinler Doganay
- Department of Molecular Biology and Genetics, Istanbul Technical University, Istanbul, Turkey
- Levent Doganay
- Genomic Laboratory (GLAB), Umraniye Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
|
26
|
Pei Z, Deng K, Lei C, Du D, Yu G, Sun X, Xu C, Zhang S. Identifying Balanced Chromosomal Translocations in Human Embryos by Oxford Nanopore Sequencing and Breakpoints Region Analysis. Front Genet 2022; 12:810900. [PMID: 35116057 PMCID: PMC8804325 DOI: 10.3389/fgene.2021.810900] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/08/2021] [Accepted: 12/13/2021] [Indexed: 01/02/2023]
Abstract
Background: Balanced chromosomal aberrations, especially balanced translocations, can cause infertility, recurrent miscarriage or chromosomally defective offspring. Preimplantation genetic testing for structural rearrangement (PGT-SR) has been widely implemented to improve clinical outcomes by selecting euploid embryos for transfer, but embryos with a balanced translocation karyotype are difficult to distinguish from those with a normal karyotype by routine genetic techniques. Method: In the present study, we developed a clinically applicable method for reciprocal translocation carriers to reduce the risk of pregnancy loss. In the preclinical phase, we identified reciprocal translocation breakpoints in the blood of translocation carriers by long-read Oxford Nanopore sequencing, followed by junction-spanning polymerase chain reaction (PCR) and Sanger sequencing. In the clinical phase of embryo diagnosis, aneuploidies and unbalanced translocations were screened by comprehensive chromosomal screening (CCS) with a single nucleotide polymorphism (SNP) microarray, and carrier embryos were diagnosed by junction-spanning PCR and family haplotype linkage analysis of the breakpoint regions. Amniocentesis and cytogenetic analysis of fetuses in the second trimester were performed after embryo transfer to confirm the results obtained with the presented method. Results: All the reciprocal translocation breakpoints were accurately identified by Nanopore sequencing and confirmed by Sanger sequencing. Twelve embryos were biopsied and tested, and the results of junction-spanning PCR and haplotype linkage analysis were consistent. Of the 12 biopsied blastocysts, 6 were aneuploid or unbalanced, three were identified as balanced translocation carriers, and three had normal karyotypes. Two euploid embryos were subsequently transferred back to patients, and late prenatal karyotype analysis of amniotic fluid cells was performed. The outcomes diagnosed by the current approach were fully consistent with the fetal karyotypes. Conclusions: In summary, our study illustrates that chromosomal reciprocal translocations in embryos can be accurately diagnosed. Long-read Nanopore sequencing and breakpoint analysis help to precisely evaluate the genetic risk of disrupted genes and provide a way of selecting embryos with a normal karyotype, especially for couples without a reference.
Affiliation(s)
- Zhenle Pei
- Shanghai Ji Ai Genetics and IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Ke Deng
- Shanghai Ji Ai Genetics and IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Caixai Lei
- Shanghai Ji Ai Genetics and IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Danfeng Du
- Shanghai Ji Ai Genetics and IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Guoliang Yu
- Chigene (Beijing) Translational Medical Research Center Co. Ltd., Beijing, China
- Xiaoxi Sun
- Shanghai Ji Ai Genetics and IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Congjian Xu
- Shanghai Ji Ai Genetics and IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- *Correspondence: Congjian Xu; Shuo Zhang
- Shuo Zhang
- Shanghai Ji Ai Genetics and IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- *Correspondence: Congjian Xu; Shuo Zhang
|
27
|
Xie K, Liu K, Alvi HAK, Chen Y, Wang S, Yuan X. KNNCNV: A K-Nearest Neighbor Based Method for Detection of Copy Number Variations Using NGS Data. Front Cell Dev Biol 2022; 9:796249. [PMID: 35004691 PMCID: PMC8728060 DOI: 10.3389/fcell.2021.796249] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/16/2021] [Accepted: 11/23/2021] [Indexed: 11/19/2022]
Abstract
Copy number variation (CNV) is a well-known type of genomic mutation that is associated with the development of human cancers. Detection of CNVs in the human genome is a crucial step in the pipeline from mutation analysis to cancer diagnosis and treatment. Next-generation sequencing (NGS) data provide an unprecedented opportunity for CNV detection at base-level resolution, and many methods have been developed for CNV detection using NGS data. However, due to the intrinsic complexity of CNV structures and of NGS data itself, accurate detection of CNVs still faces many challenges. In this paper, we present an alternative method, called KNNCNV (K-Nearest Neighbor based CNV detection), for the detection of CNVs using NGS data. Compared to current methods, KNNCNV has several distinctive features: 1) it assigns an outlier score to each genome segment based solely on its first k nearest-neighbor distances, which is not only easy to extend to other data types but also improves the power of discovering CNVs, especially local CNVs that are likely to be masked by their surrounding regions; 2) it employs the variational Bayesian Gaussian mixture model (VBGMM) to transform these scores into a series of binary labels without a user-defined threshold. To evaluate the performance of KNNCNV, we conduct both simulation and real sequencing data experiments and make comparisons with peer methods. The experimental results show that KNNCNV achieves better performance than the other methods in terms of F1-score.
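The first feature described above, scoring each segment from its k nearest-neighbor distances, can be sketched in a few lines. The abstract does not say how the k distances are combined, so taking their mean is an assumption, as are the function name `knn_outlier_scores`, the one-dimensional read-depth values, and k = 3. The VBGMM labeling step is not reproduced here.

```python
def knn_outlier_scores(values, k=3):
    """Score each value by the mean distance to its k nearest neighbours.

    Assumption: a 1-D read-depth value per segment and a simple mean of
    the first k distances; KNNCNV's exact scoring may differ.
    """
    scores = []
    for i, v in enumerate(values):
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical mean read depth per segment; the seventh segment mimics a gain.
depths = [30, 31, 29, 30, 32, 30, 60, 29, 31, 30]
scores = knn_outlier_scores(depths)
top = scores.index(max(scores))
print(top, round(scores[top], 2))
```

Segments whose depth resembles many others receive scores near zero, while the isolated high-depth segment stands out. A model-based step such as the VBGMM named in the abstract would then separate the two groups without a hand-set cutoff.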
Affiliation(s)
- Kun Xie
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Hangzhou Institute of Technology, Xidian University, Hangzhou, China
- Kang Liu
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Haque A K Alvi
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Yuehui Chen
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
- Shuzhen Wang
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Xiguo Yuan
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Hangzhou Institute of Technology, Xidian University, Hangzhou, China
28
|
Shah M, Selvanathan A, Baynam G, Berman Y, Boughtwood T, Freckmann M, Parasivam G, White SM, Grainger N, Kirk EP, Ma ASL, Sachdev R. Paediatric genomic testing: Navigating genomic reports for the general paediatrician. J Paediatr Child Health 2022; 58:8-15. [PMID: 34427008 PMCID: PMC9292248 DOI: 10.1111/jpc.15703] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/03/2021] [Revised: 08/05/2021] [Accepted: 08/06/2021] [Indexed: 11/28/2022]
Abstract
Monogenic rare disorders contribute significantly to paediatric morbidity and mortality, and elucidation of the underlying genetic cause may have benefits for patients, families and clinicians. Advances in genomic technology have enabled diagnostic yields of up to 50% in some paediatric cohorts. This has led to an increase in the uptake of genetic testing across paediatric disciplines. This can place an increased burden on paediatricians, who may now be responsible for interpreting and explaining test results to patients. However, genomic results can be complex, and sometimes inconclusive for the ordering paediatrician. Results may also cause uncertainty and anxiety for patients and their families. The paediatrician's genetic literacy and knowledge of genetic principles are therefore critical to inform discussions with families and guide ongoing patient care. Here, we present four hypothetical case vignettes where genomic testing is undertaken, and discuss possible results and their implications for paediatricians and families. We also provide a list of key terms for paediatricians.
Affiliation(s)
- Margit Shah
- Centre for Clinical Genetics, Sydney Children's Hospital, Sydney Children's Hospitals Network, Sydney, New South Wales, Australia
- Department of Clinical Genetics, Children's Hospital at Westmead, Sydney Children's Hospitals Network, Sydney, New South Wales, Australia
- Faculty of Health and Medical Science, University of Sydney, Sydney, New South Wales, Australia
- Arthavan Selvanathan
- Genetic Metabolic Disorders Service, Sydney Children's Hospitals Network, Sydney, New South Wales, Australia
- Gareth Baynam
- Genetic Services of Western Australia, King Edward Memorial Hospital, Perth, Western Australia, Australia
- Western Australian Register of Developmental Anomalies, King Edward Memorial Hospital, Perth, Western Australia, Australia
- Yemima Berman
- Department of Clinical Genetics, Royal North Shore Hospital, Sydney, New South Wales, Australia
- Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia
- Tiffany Boughtwood
- Australian Genomics, Melbourne, Victoria, Australia
- Murdoch Children’s Research Institute, Melbourne, Victoria, Australia
- Mary‐Louise Freckmann
- Department of Clinical Genetics, Royal North Shore Hospital, Sydney, New South Wales, Australia
- ACT Genetics Service, The Canberra Hospital, Canberra, Australian Capital Territory, Australia
- Gayathri Parasivam
- NSW Health Centre for Genetics Education, Royal North Shore Hospital, Sydney, New South Wales, Australia
- Present address: Women's and Children's Hospital, Adelaide, South Australia, Australia
- Susan M White
- Victorian Clinical Genetics Services, Melbourne, Victoria, Australia
- Department of Paediatrics, University of Melbourne, Melbourne, Victoria, Australia
- Natalie Grainger
- NSW Health Centre for Genetics Education, Royal North Shore Hospital, Sydney, New South Wales, Australia
- Edwin P Kirk
- Centre for Clinical Genetics, Sydney Children's Hospital, Sydney Children's Hospitals Network, Sydney, New South Wales, Australia
- School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia
- NSW Health Pathology Randwick Genomics Laboratory, Sydney, New South Wales, Australia
- Alan SL Ma
- Department of Clinical Genetics, Children's Hospital at Westmead, Sydney Children's Hospitals Network, Sydney, New South Wales, Australia
- Specialty of Genomic Medicine, University of Sydney, Sydney, New South Wales, Australia
- Rani Sachdev
- Centre for Clinical Genetics, Sydney Children's Hospital, Sydney Children's Hospitals Network, Sydney, New South Wales, Australia
- School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia
|
29
|
Identification of Copy Number Alterations from Next-Generation Sequencing Data. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:55-74. [DOI: 10.1007/978-3-030-91836-1_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
30
|
Wang T, Sun J, Zhang X, Wang WJ, Zhou Q. CNV-P: a machine-learning framework for predicting high confident copy number variations. PeerJ 2021; 9:e12564. [PMID: 34917425 PMCID: PMC8645205 DOI: 10.7717/peerj.12564] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/04/2021] [Accepted: 11/08/2021] [Indexed: 12/27/2022]
Abstract
Background: Copy-number variants (CNVs) have been recognized as one of the major causes of genetic disorders. Reliable detection of CNVs from genome sequencing data is in strong demand for disease research. However, current software for detecting CNVs has high false-positive rates and needs further improvement. Methods: Here, we propose a novel post-processing approach for CNV prediction (CNV-P), a machine-learning framework that can efficiently remove false-positive fragments from the results of CNV detection tools. A series of CNV signals around the putative CNV fragments, such as read depth (RD), split reads (SR) and read pairs (RP), were defined as features to train a classifier. Results: The prediction results on several real biological datasets showed that our models could accurately classify the CNVs at over 90% precision and 85% recall, which greatly improves the performance of state-of-the-art algorithms. Furthermore, our results indicate that CNV-P is robust to different CNV sizes and sequencing platforms. Conclusions: Our framework for classifying high-confidence CNVs could improve both basic research and clinical diagnosis of genetic diseases.
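To make the precision and recall figures quoted in the abstract above concrete, here is a minimal sketch of how the two metrics are computed when filtered CNV calls are compared with a validated truth set. It is not the CNV-P implementation; the function name, call identifiers and counts are made-up illustration values.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for predicted calls against a truth set."""
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)
    # Precision: fraction of kept calls that are real.
    precision = true_pos / len(predicted) if predicted else 0.0
    # Recall: fraction of real CNVs that were kept.
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical data: a filter keeps 10 calls, 9 of which are real;
# the truth set holds 12 CNVs in total.
kept = [f"cnv{i}" for i in range(9)] + ["artifact1"]
truth = [f"cnv{i}" for i in range(12)]
p, r = precision_recall(kept, truth)
print(f"precision={p:.2f} recall={r:.2f}")
```

A post-processing filter of this kind typically raises precision by discarding artifacts, at the risk of lowering recall if real CNVs are discarded too, which is why both numbers are reported.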
Affiliation(s)
| | - Jinghua Sun
- BGI-Shenzhen, Shenzhen, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Xiuqing Zhang
- BGI-Shenzhen, Shenzhen, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; Guangdong Enterprise Key Laboratory of Human Disease Genomics, Beishan Industrial Zone, Shenzhen, China
|
31
|
Barcelona-Cabeza R, Sanseverino W, Aiese Cigliano R. isoCNV: in silico optimization of copy number variant detection from targeted or exome sequencing data. BMC Bioinformatics 2021; 22:530. [PMID: 34715772 PMCID: PMC8555218 DOI: 10.1186/s12859-021-04452-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/15/2021] [Accepted: 10/19/2021] [Indexed: 01/02/2023]
Abstract
Background: Accurate copy number variant (CNV) detection is especially challenging for both targeted sequencing (TS) and whole-exome sequencing (WES) data. To maximize performance, the parameters of CNV calling algorithms should be optimized for each specific dataset. This requires validated CNV information obtained using either multiplex ligation-dependent probe amplification (MLPA) or array comparative genomic hybridization (aCGH). These are gold-standard but time-consuming and costly approaches. Results: We present isoCNV, which optimizes the parameters of the DECoN algorithm using only NGS data. The parameter optimization process is performed using an in silico validated CNV dataset obtained from the overlapping calls of three algorithms: CNVkit, panelcn.MOPS and DECoN. We evaluated the performance of our tool and showed that it increases the sensitivity in both TS and WES real datasets. Conclusions: isoCNV provides an easy-to-use pipeline to optimize DECoN that allows the detection of analysis-ready CNVs from a set of DNA alignments obtained under the same conditions. It increases the sensitivity of DECoN without the need for orthogonal methods. isoCNV is available at https://gitlab.com/sequentiateampublic/isocnv.
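The abstract above describes building an in silico truth set from the overlapping calls of three callers. The sketch below illustrates that general idea only; it is not the isoCNV code. The `Call` type, the "any overlap of the same type on the same chromosome" criterion and the example coordinates are all assumptions made for illustration.

```python
from typing import NamedTuple


class Call(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # inclusive
    kind: str   # "DEL" or "DUP"


def overlaps(a: Call, b: Call) -> bool:
    """True if two calls share a chromosome, a type and at least one base."""
    return (a.chrom == b.chrom and a.kind == b.kind
            and a.start <= b.end and b.start <= a.end)


def consensus(reference_calls, *other_callsets):
    """Keep calls from the first caller that are supported by every other caller."""
    return [c for c in reference_calls
            if all(any(overlaps(c, o) for o in calls) for calls in other_callsets)]


# Hypothetical call sets from three callers.
caller_a = [Call("chr1", 100, 500, "DEL"), Call("chr2", 1000, 2000, "DUP")]
caller_b = [Call("chr1", 300, 600, "DEL"), Call("chr2", 1500, 1800, "DEL")]
caller_c = [Call("chr1", 450, 700, "DEL")]

kept = consensus(caller_a, caller_b, caller_c)
print(kept)
```

Only the chr1 deletion survives here: the chr2 duplication has no same-type support from the second caller and none at all from the third. A stricter criterion, such as a minimum reciprocal overlap, would be a natural variation.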
Affiliation(s)
- Rosa Barcelona-Cabeza
- Sequentia Biotech, Carrer de Valencia, Barcelona, Spain; Departamento de Matemáticas, Escuela Técnica Superior de Ingeniería Industrial de Barcelona (ETSEIB), Universitat Politècnica de Catalunya (UPC), Diagonal 647, Barcelona, Spain
|
32
|
Xiao F, Lu Y, Wu B, Liu B, Li G, Zhang P, Zhou Q, Sun J, Wang H, Zhou W. High-Frequency Exon Deletion of DNA Cross-Link Repair 1C Accounting for Severe Combined Immunodeficiency May Be Missed by Whole-Exome Sequencing. Front Genet 2021; 12:677748. [PMID: 34421990 PMCID: PMC8372405 DOI: 10.3389/fgene.2021.677748] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/08/2021] [Accepted: 06/28/2021] [Indexed: 11/18/2022]
Abstract
Next-generation sequencing (NGS) has been used to detect severe combined immunodeficiency (SCID) in patients, and some patients with DNA cross-link repair 1C (DCLRE1C) variants have been identified. Moreover, some compound variants, such as copy number variants (CNV) and single nucleotide variants (SNV), have been reported. The purpose of this study was to expand the genetic data related to patients with SCID carrying the compound DCLRE1C variant. Whole-exome sequencing (WES) was performed for genetic analysis, and variants were verified by performing Sanger sequencing or quantitative PCR. Moreover, we searched PubMed and summarized the data of the reported variants. Four SCID patients with DCLRE1C variants were identified in this study. WES revealed a homozygous deletion in the DCLRE1C gene from exons 1–5 in patient 1, exons 1–3 deletion and a novel rare variant (c.92T>C, p.L31P) in patient 2, exons 1–3 deletion and a novel rare variant (c.328C>G, p.L110V) in patient 3, and exons 1–4 deletion and a novel frameshift variant (c.449dup, p.His151Alafs*20) in patient 4. Based on the literature review, exons 1–3 were recognized as a hotspot region for deletions. Moreover, we found that compound variations (CNV + SNV) accounted for approximately 7% of all variants. When patients are screened for T-cell receptor excision circles (TRECs), NGS can be used to expand genetic testing. Deletion of the DCLRE1C gene should not be ignored when a variant has been found in patients with SCID.
Affiliation(s)
- Feifan Xiao
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Yulan Lu
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Bingbing Wu
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Bo Liu
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Gang Li
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Ping Zhang
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Qinhua Zhou
- Department of Immunology, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Jinqiao Sun
- Department of Immunology, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Huijun Wang
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
| | - Wenhao Zhou
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China; Key Laboratory of Neonatal Diseases, Ministry of Health, Department of Neonates, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China
|
33
|
Applying Bioinformatic Platforms, In Vitro, and In Vivo Functional Assays in the Characterization of Genetic Variants in the GH/IGF Pathway Affecting Growth and Development. Cells 2021; 10:2063. [PMID: 34440832 PMCID: PMC8392544 DOI: 10.3390/cells10082063] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/10/2021] [Revised: 08/06/2021] [Accepted: 08/09/2021] [Indexed: 02/07/2023]
Abstract
Heritability accounts for over 80% of adult human height, indicating that genetic variability is the main determinant of stature. The rapid technological development of Next-Generation Sequencing (NGS), particularly Whole Exome Sequencing (WES), has resulted in the characterization of several genetic conditions affecting growth and development. The greatest challenge of NGS remains the high number of candidate variants identified. In silico bioinformatic tools represent the first approach for classifying these variants. However, solving the complicated problem of variant interpretation requires the use of experimental approaches such as in vitro and, when needed, in vivo functional assays. In this review, we will discuss a rational approach to apply to the gene variants identified in children with growth and developmental defects including: (i) bioinformatic tools; (ii) in silico modeling tools; (iii) in vitro functional assays; and (iv) the development of in vivo models. While bioinformatic tools are useful for a preliminary selection of potentially pathogenic variants, in vitro—and sometimes also in vivo—functional assays are further required to unequivocally determine the pathogenicity of a novel genetic variant. This long, time-consuming, and expensive process is the only scientifically proven method to determine causality between a genetic variant and a human genetic disease.
|
34
|
Linderman MD, Paudyal C, Shakeel M, Kelley W, Bashir A, Gelb BD. NPSV: A simulation-driven approach to genotyping structural variants in whole-genome sequencing data. Gigascience 2021; 10:giab046. [PMID: 34195837 PMCID: PMC8246072 DOI: 10.1093/gigascience/giab046] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/21/2020] [Revised: 05/04/2021] [Accepted: 06/07/2021] [Indexed: 12/20/2022]
Abstract
BACKGROUND: Structural variants (SVs) play a causal role in numerous diseases but are difficult to detect and accurately genotype (determine zygosity) in whole-genome next-generation sequencing data. SV genotypers that assume that the aligned sequencing data uniformly reflect the underlying SV or use existing SV call sets as training data can only partially account for variant and sample-specific biases. RESULTS: We introduce NPSV, a machine learning-based approach for genotyping previously discovered SVs that uses next-generation sequencing simulation to model the combined effects of the genomic region, sequencer, and alignment pipeline on the observed SV evidence. We evaluate NPSV alongside existing SV genotypers on multiple benchmark call sets. We show that NPSV consistently achieves or exceeds state-of-the-art genotyping accuracy across SV call sets, samples, and variant types. NPSV can specifically identify putative de novo SVs in a trio context and is robust to offset SV breakpoints. CONCLUSIONS: Growing SV databases and the increasing availability of SV calls from long-read sequencing make stand-alone genotyping of previously identified SVs an increasingly important component of genome analyses. By treating potential biases as a "black box" that can be simulated, NPSV provides a framework for accurately genotyping a broad range of SVs in both targeted and genome-scale applications.
Affiliation(s)
- Michael D Linderman
- Department of Computer Science, Middlebury College, 14 Old Chapel Road, Middlebury, VT 05753, USA
| | - Crystal Paudyal
- Department of Computer Science, Middlebury College, 14 Old Chapel Road, Middlebury, VT 05753, USA
| | - Musab Shakeel
- Department of Computer Science, Middlebury College, 14 Old Chapel Road, Middlebury, VT 05753, USA
| | - William Kelley
- Department of Computer Science, Middlebury College, 14 Old Chapel Road, Middlebury, VT 05753, USA
| | - Ali Bashir
- Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043, USA
| | - Bruce D Gelb
- Mindich Child Health and Development Institute and the Departments of Pediatrics and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levy Place, Box 1040, New York, NY 10029, USA
|
35
|
Navrkalova V, Plevova K, Hynst J, Pal K, Mareckova A, Reigl T, Jelinkova H, Vrzalova Z, Stranska K, Pavlova S, Panovska A, Janikova A, Doubek M, Kotaskova J, Pospisilova S. LYmphoid NeXt-Generation Sequencing (LYNX) Panel: A Comprehensive Capture-Based Sequencing Tool for the Analysis of Prognostic and Predictive Markers in Lymphoid Malignancies. J Mol Diagn 2021; 23:959-974. [PMID: 34082072 DOI: 10.1016/j.jmoldx.2021.05.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/05/2020] [Revised: 04/21/2021] [Accepted: 05/03/2021] [Indexed: 02/07/2023]
Abstract
B-cell neoplasms represent a clinically heterogeneous group of hematologic malignancies with considerably diverse genomic architecture recently endorsed by next-generation sequencing (NGS) studies. Because multiple genetic defects have a potential or confirmed clinical impact, a tendency toward more comprehensive testing of diagnostic, prognostic, and predictive markers is desired. This study introduces the design, validation, and implementation of an integrative, custom-designed, capture-based NGS panel titled LYmphoid NeXt-generation sequencing (LYNX) for the analysis of standard and novel molecular markers in the most common lymphoid neoplasms (chronic lymphocytic leukemia, acute lymphoblastic leukemia, diffuse large B-cell lymphoma, follicular lymphoma, and mantle cell lymphoma). A single LYNX test provides the following: i) accurate detection of mutations in all coding exons and splice sites of 70 lymphoma-related genes with a sensitivity of 5% variant allele frequency, ii) reliable identification of large genome-wide (≥6 Mb) and recurrent chromosomal aberrations (≥300 kb) in at least 20% of the clonal cell fraction, iii) the assessment of immunoglobulin and T-cell receptor gene rearrangements, and iv) lymphoma-specific translocation detection. Dedicated bioinformatic pipelines were designed to detect all markers mentioned above. The LYNX panel represents a comprehensive, up-to-date tool suitable for routine testing of lymphoid neoplasms with research and clinical applicability. It allows a wide adoption of capture-based targeted NGS in clinical practice and personalized management of patients with lymphoproliferative diseases.
Affiliation(s)
- Veronika Navrkalova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic; Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Karla Plevova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic; Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Institute of Medical Genetics and Genomics, Faculty of Medicine, Masaryk University and University Hospital Brno, Brno, Czech Republic
| | - Jakub Hynst
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Karol Pal
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine II - Hematology and Oncology, University Medical Center Schleswig-Holstein, Kiel, Germany
| | - Andrea Mareckova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic
| | - Tomas Reigl
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Hana Jelinkova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic
| | - Zuzana Vrzalova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic; Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Kamila Stranska
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic
| | - Sarka Pavlova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic; Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Anna Panovska
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic
| | - Andrea Janikova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic
| | - Michael Doubek
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic; Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Institute of Medical Genetics and Genomics, Faculty of Medicine, Masaryk University and University Hospital Brno, Brno, Czech Republic
| | - Jana Kotaskova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic; Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Sarka Pospisilova
- Department of Internal Medicine - Hematology and Oncology, Masaryk University and University Hospital Brno, Brno, Czech Republic; Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Institute of Medical Genetics and Genomics, Faculty of Medicine, Masaryk University and University Hospital Brno, Brno, Czech Republic.
|
36
|
Krude H, Mundlos S, Øien NC, Opitz R, Schuelke M. What can go wrong in the non-coding genome and how to interpret whole genome sequencing data. MED GENET-BERLIN 2021; 33:121-131. [PMID: 38836035 PMCID: PMC11007630 DOI: 10.1515/medgen-2021-2071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/02/2021] [Accepted: 06/24/2021] [Indexed: 06/06/2024]
Abstract
Whole exome sequencing discovers causative mutations in less than 50% of rare disease patients, suggesting the presence of additional mutations in the non-coding genome. So far, non-coding mutations have been identified in less than 0.2% of individuals with genetic diseases listed in the ClinVar database and exhibit highly diverse molecular mechanisms. In contrast to our capability to sequence the whole genome, our ability to discover and functionally confirm such non-coding mutations is lagging behind severely. We discuss the problems and present examples of confirmed mutations in deep intronic sequences, non-coding triplet repeats, enhancers, and larger structural variants and highlight their proposed disease mechanisms. Finally, we discuss the type of data that would be required to establish non-coding mutation detection in routine diagnostics.
Affiliation(s)
- Heiko Krude
- Institute of Experimental Pediatric Endocrinology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Stefan Mundlos
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Nancy Christine Øien
- Department of Neuropediatrics, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Robert Opitz
- Institute of Experimental Pediatric Endocrinology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Markus Schuelke
- Department of Neuropediatrics, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- NeuroCure Cluster of Excellence, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
|
37
|
Nandolo W, Mészáros G, Wurzinger M, Banda LJ, Gondwe TN, Mulindwa HA, Nakimbugwe HN, Clark EL, Woodward-Greene MJ, Liu M, Liu GE, Van Tassell CP, Rosen BD, Sölkner J. Detection of copy number variants in African goats using whole genome sequence data. BMC Genomics 2021; 22:398. [PMID: 34051743 PMCID: PMC8164248 DOI: 10.1186/s12864-021-07703-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/24/2020] [Accepted: 05/11/2021] [Indexed: 12/21/2022]
Abstract
Background: Copy number variations (CNV) are a significant source of variation in the genome and are therefore essential to the understanding of genetic characterization. The aim of this study was to develop a fine-scaled copy number variation map for African goats. We used sequence data from multiple breeds and from multiple African countries. Results: A total of 253,553 CNV (244,876 deletions and 8677 duplications) were identified, corresponding to an overall average of 1393 CNV per animal. The mean CNV length was 3.3 kb, with a median of 1.3 kb. There was substantial differentiation between the populations for some CNV, suggestive of the effect of population-specific selective pressures. A total of 6231 global CNV regions (CNVR) were found across all animals, representing 59.2 Mb (2.4%) of the goat genome. About 1.6% of the CNVR were present in all 34 breeds and 28.7% were present in all 5 geographical areas across Africa, where animals had been sampled. The CNVR had genes that were highly enriched in important biological functions, molecular functions, and cellular components including retrograde endocannabinoid signaling, glutamatergic synapse and circadian entrainment. Conclusions: This study presents the first fine CNV map of African goat based on WGS data and adds to the growing body of knowledge on the genetic characterization of goats. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07703-1.
Affiliation(s)
- Wilson Nandolo
- University of Natural Resources and Life Sciences, Vienna, Austria; Lilongwe University of Agriculture and Natural Resources, Lilongwe, Malawi
| | - Gábor Mészáros
- University of Natural Resources and Life Sciences, Vienna, Austria
| | - Maria Wurzinger
- University of Natural Resources and Life Sciences, Vienna, Austria
| | - Liveness J Banda
- Lilongwe University of Agriculture and Natural Resources, Lilongwe, Malawi
| | - Timothy N Gondwe
- Lilongwe University of Agriculture and Natural Resources, Lilongwe, Malawi
| | | | | | - Emily L Clark
- The Roslin Institute, University of Edinburgh, Edinburgh, Scotland, UK
| | - M Jennifer Woodward-Greene
- Animal Genomics and Improvement Laboratory, USDA-ARS, Beltsville, MD, USA; National Agricultural Library, USDA-ARS, Beltsville, MD, USA
| | - Mei Liu
- Animal Genomics and Improvement Laboratory, USDA-ARS, Beltsville, MD, USA
| | | | - George E Liu
- Animal Genomics and Improvement Laboratory, USDA-ARS, Beltsville, MD, USA
| | | | - Benjamin D Rosen
- Animal Genomics and Improvement Laboratory, USDA-ARS, Beltsville, MD, USA.
| | - Johann Sölkner
- University of Natural Resources and Life Sciences, Vienna, Austria
|
38
|
Gong T, Hayes VM, Chan EKF. Detection of somatic structural variants from short-read next-generation sequencing data. Brief Bioinform 2021; 22:bbaa056. [PMID: 32379294 PMCID: PMC8138798 DOI: 10.1093/bib/bbaa056] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 11/10/2019] [Revised: 03/05/2020] [Accepted: 03/29/2020] [Indexed: 01/09/2023]
Abstract
Somatic structural variants (SVs), which are variants that typically impact >50 nucleotides, play a significant role in cancer development and evolution but are notoriously more difficult to detect than small variants from short-read next-generation sequencing (NGS) data. This is due to a combination of challenges attributed to the purity of tumour samples, tumour heterogeneity, limitations of short-read information from NGS and sequence alignment ambiguities. In spite of active development of SV detection tools (callers) over the past few years, each method has inherent advantages and limitations. In this review, we highlight some of the important factors affecting somatic SV detection and compare the performance of seven commonly used SV callers. In particular, we focus on the extent of change in sensitivity and precision for detecting different SV types and size ranges from samples with differing variant allele frequencies and sequencing depths of coverage. We highlight the reasons why some SV callers perform well in some settings but not others, allowing our evaluation findings to be extended beyond the seven SV callers examined in this paper. As the importance of large SVs becomes increasingly recognized in cancer genomics, this paper provides a timely review of some of the most impactful factors influencing somatic SV detection that should be considered when choosing SV callers.
Affiliation(s)
| | - Vanessa M Hayes
- Corresponding authors: Eva K.F. Chan, New South Wales Health Pathology, Newcastle, NSW 2300, Australia. E-mail: ; Vanessa M. Hayes, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia. Tel.: +61-2-9355-5841; Fax: +61 2-2-9295-8151; E-mail:
| | - Eva K F Chan
- Corresponding authors: Eva K.F. Chan, New South Wales Health Pathology, Newcastle, NSW 2300, Australia. E-mail: ; Vanessa M. Hayes, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia. Tel.: +61-2-9355-5841; Fax: +61 2-2-9295-8151; E-mail:
|
39
|
Moreno-Cabrera JM, Del Valle J, Castellanos E, Feliubadaló L, Pineda M, Serra E, Capellá G, Lázaro C, Gel B. CNVfilteR: an R/bioconductor package to identify false positives produced by germline NGS CNV detection tools. Bioinformatics 2021; 37:4227-4229. [PMID: 33983414 PMCID: PMC9502136 DOI: 10.1093/bioinformatics/btab356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2020] [Revised: 03/06/2021] [Accepted: 05/12/2021] [Indexed: 11/14/2022]
Abstract
Germline copy-number variants (CNVs) are relevant mutations for multiple genetics fields, such as the study of hereditary diseases. However, available benchmarks show that all next-generation sequencing (NGS) CNV calling tools produce false positives. We developed CNVfilteR, an R package that uses the single nucleotide variant calls usually obtained in germline NGS pipelines to identify those false positives. The package can detect both false deletions and false duplications. We evaluated CNVfilteR performance on callsets generated by 13 CNV calling tools on 3 whole-genome sequencing and 541 panel samples, showing a decrease of up to 44.8% in false positives and consistent F1-score increase. Using CNVfilteR to detect false-positive calls can improve the overall performance of existing CNV calling pipelines. AVAILABILITY: CNVfilteR is released under Artistic-2.0 License. Source code and documentation are freely available at Bioconductor (http://www.bioconductor.org/packages/CNVfilteR). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
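The abstract above says that single nucleotide variant calls can expose false-positive CNV calls. One way to picture this, sketched below in Python, rests on a simple observation: a region reduced to a single copy cannot carry heterozygous SNVs, so a heterozygous deletion call containing several of them is suspect. This is a simplified illustration and not the CNVfilteR algorithm or its R API; the function name, the allele-frequency window and the `min_het` threshold are hypothetical.

```python
def is_suspect_deletion(deletion, snvs, het_range=(0.3, 0.7), min_het=2):
    """Flag a deletion call that contains heterozygous-looking SNVs.

    deletion: (chrom, start, end), coordinates inclusive
    snvs: iterable of (chrom, position, alt_allele_frequency)
    het_range, min_het: hypothetical thresholds chosen for illustration
    """
    chrom, start, end = deletion
    low, high = het_range
    het_inside = [
        s for s in snvs
        if s[0] == chrom and start <= s[1] <= end and low <= s[2] <= high
    ]
    return len(het_inside) >= min_het


# Hypothetical SNV calls: two heterozygous sites inside the first deletion.
snvs = [("chr1", 1200, 0.48), ("chr1", 3000, 0.52), ("chr1", 9000, 0.50)]
print(is_suspect_deletion(("chr1", 1000, 5000), snvs))  # suspect call
print(is_suspect_deletion(("chr1", 8000, 8500), snvs))  # no contradicting SNVs
```

An analogous check for duplications would look at whether allele frequencies sit near 50% rather than near the one-third and two-thirds values expected with three copies; that extension is left out to keep the sketch short.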
Affiliation(s)
- José Marcos Moreno-Cabrera
- Hereditary Cancer Group, Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Campus, Ruti Badalona Barcelona, Can Spain.,Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge-IDIBELL, L'Hospitalet de Llobregat, Barcelona, Spain.,Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Madrid, Spain
| | - Jesús Del Valle
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge-IDIBELL, L'Hospitalet de Llobregat, Barcelona, Spain.,Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Madrid, Spain
| | - Elisabeth Castellanos
- Hereditary Cancer Group, Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Campus, Ruti Badalona Barcelona, Can Spain.,Clinical Genomics Unit, Clinical Genetics Service, Northern Metropolitan Clinical Laboratory, Germans Trias i Pujol University Hospital (HUGTiP), Ruti, Campus Badalona Barcelona, Can Spain
| | - Lidia Feliubadaló
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge-IDIBELL, L'Hospitalet de Llobregat, Barcelona, Spain.,Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Madrid, Spain
| | - Marta Pineda
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge-IDIBELL, L'Hospitalet de Llobregat, Barcelona, Spain.,Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Madrid, Spain
| | - Eduard Serra
- Hereditary Cancer Group, Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Campus, Ruti Badalona Barcelona, Can Spain.,Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Madrid, Spain
| | - Gabriel Capellá
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge-IDIBELL, L'Hospitalet de Llobregat, Barcelona, Spain.,Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Madrid, Spain
| | - Conxi Lázaro
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge-IDIBELL, L'Hospitalet de Llobregat, Barcelona, Spain.,Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Madrid, Spain
| | - Bernat Gel
- Hereditary Cancer Group, Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Campus Can Ruti, Badalona, Barcelona, Spain
| |
|
40
|
Cayuela H, Dorant Y, Mérot C, Laporte M, Normandeau E, Gagnon-Harvey S, Clément M, Sirois P, Bernatchez L. Thermal adaptation rather than demographic history drives genetic structure inferred by copy number variants in a marine fish. Mol Ecol 2021; 30:1624-1641. [PMID: 33565147 DOI: 10.1111/mec.15835] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/06/2020] [Revised: 01/15/2021] [Accepted: 02/01/2021] [Indexed: 12/22/2022]
Abstract
Increasing evidence shows that structural variants represent an overlooked aspect of genetic variation with consequential evolutionary roles. Among those, copy number variants (CNVs), including duplicated genomic regions and transposable elements (TEs), may contribute to local adaptation and/or reproductive isolation among divergent populations. These mechanisms imply that CNVs could be used to infer neutral and/or adaptive population genetic structure, the study of which has been restricted to microsatellites, mitochondrial DNA and amplified fragment length polymorphism markers in the past and, more recently, single nucleotide polymorphisms (SNPs). Taking advantage of recent developments allowing CNV analysis from RAD-seq data, we investigated how variation in fitness-related traits, local environmental conditions and demographic history are associated with CNVs, and how subsequent copy number variation drives population genetic structure in a marine fish, the capelin (Mallotus villosus). We collected 1538 DNA samples from 35 sampling sites in the North Atlantic Ocean and identified 6620 putative CNVs. We found associations between CNVs and the gonadosomatic index, suggesting that six duplicated regions could affect female fitness by modulating oocyte production. We also detected 105 CNV candidates associated with water temperature, among which 20% corresponded to genomic regions located within the sequence of protein-coding genes, suggesting local adaptation to cold water by means of gene sequence amplification. We also identified 175 CNVs associated with the divergence of three previously defined parapatric glacial lineages, of which 24% were located within protein-coding genes, making those loci potential candidates for reproductive isolation. Lastly, our analyses unveiled a hierarchical, complex CNV population structure determined by temperature and local geography, which was in stark contrast to that inferred based on SNPs in a previous study. Our findings underline the complementarity of those two types of genomic variation in population genomics studies.
Affiliation(s)
- Hugo Cayuela
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada.,Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Yann Dorant
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Claire Mérot
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Martin Laporte
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Eric Normandeau
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Stéphane Gagnon-Harvey
- Département des sciences fondamentales, Université du Québec à Chicoutimi, Chicoutimi, QC, Canada
| | - Marie Clément
- Center for Fisheries Ecosystems Research, Fisheries and Marine Institute of Memorial, University of Newfoundland, St. John's, NL, Canada.,Labrador Institute of Memorial University of Newfoundland, Happy Valley-Goose Bay, NL, Canada
| | - Pascal Sirois
- Département des sciences fondamentales, Université du Québec à Chicoutimi, Chicoutimi, QC, Canada
| | - Louis Bernatchez
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| |
|
41
|
Robust Benchmark Structural Variant Calls of An Asian Using the State-of-art Long Fragment Sequencing Technologies. GENOMICS PROTEOMICS & BIOINFORMATICS 2021; 20:192-204. [PMID: 33662625 PMCID: PMC9510867 DOI: 10.1016/j.gpb.2020.10.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/22/2020] [Revised: 09/17/2020] [Accepted: 12/26/2020] [Indexed: 12/12/2022]
Abstract
The importance of structural variants (SVs) for human phenotypes and diseases is now recognized. Although a variety of SV detection platforms and strategies that vary in sensitivity and specificity have been developed, few benchmarking procedures are available to confidently assess their performances in biological and clinical research. To facilitate the validation and application of these SV detection approaches, we established an Asian reference material by characterizing the genome of an Epstein-Barr virus (EBV)-immortalized B lymphocyte line along with identified benchmark regions and high-confidence SV calls. We established a high-confidence SV callset with 8938 SVs by integrating four alignment-based SV callers, including 109× Pacific Biosciences (PacBio) continuous long reads (CLRs), 22× PacBio circular consensus sequencing (CCS) reads, 104× Oxford Nanopore Technologies (ONT) long reads, and 114× Bionano optical mapping platform, and one de novo assembly-based SV caller using CCS reads. A total of 544 randomly selected SVs were validated by PCR amplification and Sanger sequencing, demonstrating the robustness of our SV calls. Combining trio-binning-based haplotype assemblies, we established an SV benchmark for identifying false negatives and false positives by constructing the continuous high-confidence regions (CHCRs), which covered 1.46 gigabase pairs (Gb) and 6882 SVs supported by at least one diploid haplotype assembly. Establishing high-confidence SV calls for a benchmark sample that has been characterized by multiple technologies provides a valuable resource for investigating SVs in human biology, disease, and clinical research.
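One way a benchmark callset like this is typically used to count false positives and false negatives can be sketched as follows. This is a minimal illustration, not the procedure from the paper: the reciprocal-overlap matching rule, the 0.5 threshold, and the names `reciprocal_overlap` and `compare_to_benchmark` are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) with half-open coordinates; a hypothetical representation.
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions (0.0 if no overlap)."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def compare_to_benchmark(
    calls: List[Interval], benchmark: List[Interval], min_ro: float = 0.5
) -> Dict[str, int]:
    """Greedily match each call to at most one unmatched benchmark SV."""
    matched = set()
    tp = fp = 0
    for call in calls:
        hit = None
        for i, truth in enumerate(benchmark):
            if i not in matched and reciprocal_overlap(call, truth) >= min_ro:
                hit = i
                break
        if hit is None:
            fp += 1  # call has no benchmark counterpart
        else:
            matched.add(hit)
            tp += 1
    fn = len(benchmark) - len(matched)  # benchmark SVs no call recovered
    return {"tp": tp, "fp": fp, "fn": fn}


benchmark = [("chr1", 100, 200), ("chr1", 500, 900)]
calls = [("chr1", 110, 210), ("chr2", 100, 200)]
result = compare_to_benchmark(calls, benchmark)
```

In practice such a comparison would be restricted to calls falling inside the high-confidence regions, since calls outside them cannot be scored either way.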
|
42
|
Searles Quick VB, Wang B, State MW. Leveraging large genomic datasets to illuminate the pathobiology of autism spectrum disorders. Neuropsychopharmacology 2021; 46:55-69. [PMID: 32668441 PMCID: PMC7688655 DOI: 10.1038/s41386-020-0768-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 05/09/2020] [Revised: 06/26/2020] [Accepted: 07/06/2020] [Indexed: 12/15/2022]
Abstract
"Big data" approaches in the form of large-scale human genomic studies have led to striking advances in autism spectrum disorder (ASD) genetics. As in many other psychiatric syndromes, advances in genotyping technology, allowing for inexpensive genome-wide assays, have confirmed the contribution of polygenic inheritance involving common alleles of small effect, a handful of which have now been definitively identified. However, the past decade of gene discovery in ASD has been most notable for the application, in large family-based cohorts, of high-density microarray studies of submicroscopic chromosomal structure as well as high-throughput DNA sequencing, leading to the identification of an increasingly long list of risk regions and genes disrupted by rare, de novo germline mutations of large effect. This genomic architecture offers particular advantages for the illumination of biological mechanisms but also presents distinctive challenges. While the tremendous locus heterogeneity and functional pleiotropy associated with the more than 100 identified ASD-risk genes and regions are daunting, a growing armamentarium of comprehensive, large, foundational -omics databases, across species and capturing developmental trajectories, is increasingly contributing to a deeper understanding of ASD pathology.
Affiliation(s)
- Veronica B Searles Quick
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Belinda Wang
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Matthew W State
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA.
| |
|
43
|
Moreno-Cabrera JM, Del Valle J, Feliubadaló L, Pineda M, González S, Campos O, Cuesta R, Brunet J, Serra E, Capellà G, Gel B, Lázaro C. Screening of CNVs using NGS data improves mutation detection yield and decreases costs in genetic testing for hereditary cancer. J Med Genet 2020; 59:75-78. [PMID: 33219106 DOI: 10.1136/jmedgenet-2020-107366] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/23/2020] [Revised: 09/23/2020] [Accepted: 09/24/2020] [Indexed: 01/23/2023]
Abstract
INTRODUCTION Germline CNVs are important contributors to hereditary cancer. In genetic diagnostics, multiplex ligation-dependent probe amplification (MLPA) is commonly used to identify them. However, MLPA is time-consuming and expensive when applied to many genes, so many routine laboratories test only a subset of genes of interest. METHODS AND RESULTS We evaluated a next-generation sequencing (NGS)-based CNV detection tool (DECoN) as first-tier screening to decrease costs and turnaround time and expand CNV analysis to all genes of clinical interest in our diagnostics routine. We used DECoN in a retrospective cohort of 1860 patients where a limited number of genes were previously analysed by MLPA, and in a prospective cohort of 2041 patients, without MLPA analysis. In the retrospective cohort, 6 new CNVs were identified and confirmed by MLPA. In the prospective cohort, 19 CNVs were identified and confirmed by MLPA; 8 of these would have been missed by our previous MLPA-restricted detection strategy. Also, the number of genes tested by MLPA across all samples decreased by 93.0% in the prospective cohort. CONCLUSION Including an in silico germline NGS CNV detection tool improved our genetic diagnostics strategy in hereditary cancer, both increasing the number of CNVs detected and reducing turnaround time and costs.
Affiliation(s)
- José Marcos Moreno-Cabrera
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain.,Hereditary Cancer Group, Program for Predictive and Personalized Medicine of Cancer - Germans Trias i Pujol Research Institute (PMPPC-IGTP), Campus Can Ruti, Badalona, Spain
| | - Jesús Del Valle
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Lidia Feliubadaló
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Marta Pineda
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Sara González
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Olga Campos
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Raquel Cuesta
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Joan Brunet
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain.,Hereditary Cancer Program, Catalan Institute of Oncology, IDIBGi, Girona, Spain
| | - Eduard Serra
- Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain.,Hereditary Cancer Group, Program for Predictive and Personalized Medicine of Cancer - Germans Trias i Pujol Research Institute (PMPPC-IGTP), Campus Can Ruti, Badalona, Spain
| | - Gabriel Capellà
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Bernat Gel
- Hereditary Cancer Group, Program for Predictive and Personalized Medicine of Cancer - Germans Trias i Pujol Research Institute (PMPPC-IGTP), Campus Can Ruti, Badalona, Spain
| | - Conxi Lázaro
- Hereditary Cancer Program, Joint Program on Hereditary Cancer, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge - IDIBELL-ONCOBELL, L'Hospitalet de Llobregat, Spain.,Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| |
|
44
|
Zhuang X, Ye R, So MT, Lam WY, Karim A, Yu M, Ngo ND, Cherny SS, Tam PKH, Garcia-Barcelo MM, Tang CSM, Sham PC. A random forest-based framework for genotyping and accuracy assessment of copy number variations. NAR Genom Bioinform 2020; 2:lqaa071. [PMID: 33575619 PMCID: PMC7671382 DOI: 10.1093/nargab/lqaa071] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/25/2020] [Revised: 08/18/2020] [Accepted: 08/26/2020] [Indexed: 12/24/2022] Open
Abstract
Detection of copy number variations (CNVs) is essential for uncovering genetic factors underlying human diseases. However, CNV detection by current methods is prone to error, and precisely identifying CNVs from paired-end whole genome sequencing (WGS) data is still challenging. Here, we present a framework, CNV-JACG, for Judging the Accuracy of CNVs and Genotyping using paired-end WGS data. CNV-JACG is based on a random forest model trained on 21 distinctive features characterizing the CNV region and its breakpoints. Using the data from the 1000 Genomes Project, Genome in a Bottle Consortium, the Human Genome Structural Variation Consortium and in-house technical replicates, we show that CNV-JACG has superior sensitivity over the latest genotyping method, SV2, particularly for the small CNVs (≤1 kb). We also demonstrate that CNV-JACG outperforms SV2 in terms of Mendelian inconsistency in trios and concordance between technical replicates. Our study suggests that CNV-JACG would be a useful tool in assessing the accuracy of CNVs to meet the ever-growing needs for uncovering the missing heritability linked to CNVs.
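The Mendelian inconsistency metric mentioned above can be illustrated with a short sketch. It assumes biallelic CNV genotypes encoded as allele pairs (0 = reference, 1 = CNV allele) at an autosomal locus; the function names and the encoding are hypothetical and are not taken from CNV-JACG or SV2.

```python
from itertools import product
from typing import List, Tuple

Genotype = Tuple[int, int]  # e.g. (0, 1) for a heterozygous CNV; assumed encoding


def is_mendelian_consistent(child: Genotype, father: Genotype, mother: Genotype) -> bool:
    """True if the child could inherit one allele from each parent."""
    target = sorted(child)
    return any(sorted((f, m)) == target for f, m in product(father, mother))


def mendelian_inconsistency_rate(trios: List[Tuple[Genotype, Genotype, Genotype]]) -> float:
    """Fraction of (child, father, mother) genotype triples violating Mendelian inheritance."""
    if not trios:
        return 0.0
    bad = sum(not is_mendelian_consistent(c, f, m) for c, f, m in trios)
    return bad / len(trios)


trios = [
    ((0, 1), (0, 0), (0, 1)),  # consistent: CNV allele from the mother
    ((1, 1), (0, 0), (0, 1)),  # inconsistent: father cannot transmit the CNV allele
    ((0, 0), (0, 1), (0, 1)),  # consistent
    ((0, 1), (1, 1), (1, 1)),  # inconsistent: both parents are homozygous for the CNV
]
rate = mendelian_inconsistency_rate(trios)
```

A lower rate across many trios suggests fewer genotyping errors, although true de novo events also register as inconsistencies under this simple rule.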
Affiliation(s)
- Xuehan Zhuang
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Rui Ye
- Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Man-Ting So
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Wai-Yee Lam
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Anwarul Karim
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Michelle Yu
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Ngoc Diem Ngo
- National Hospital of Pediatrics, Ha Noi 100000, Vietnam
| | - Stacey S Cherny
- Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Paul Kwong-Hang Tam
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Clara Sze-Man Tang
- Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Pak Chung Sham
- Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| |
|
45
|
Lujan SA, Longley MJ, Humble MH, Lavender CA, Burkholder A, Blakely EL, Alston CL, Gorman GS, Turnbull DM, McFarland R, Taylor RW, Kunkel TA, Copeland WC. Ultrasensitive deletion detection links mitochondrial DNA replication, disease, and aging. Genome Biol 2020; 21:248. [PMID: 32943091 PMCID: PMC7500033 DOI: 10.1186/s13059-020-02138-5] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 04/07/2020] [Accepted: 08/07/2020] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Acquired human mitochondrial genome (mtDNA) deletions are symptoms and drivers of focal mitochondrial respiratory deficiency, a pathological hallmark of aging and late-onset mitochondrial disease. RESULTS To decipher connections between these processes, we create LostArc, an ultrasensitive method for quantifying deletions in circular mtDNA molecules. LostArc reveals 35 million deletions (~ 470,000 unique spans) in skeletal muscle from 22 individuals with and 19 individuals without pathogenic variants in POLG. This nuclear gene encodes the catalytic subunit of replicative mitochondrial DNA polymerase γ. Ablation, the deleted mtDNA fraction, suffices to explain skeletal muscle phenotypes of aging and POLG-derived disease. Unsupervised bioinformatic analyses reveal distinct age- and disease-correlated deletion patterns. CONCLUSIONS These patterns implicate replication by DNA polymerase γ as the deletion driver and suggest little purifying selection against mtDNA deletions by mitophagy in postmitotic muscle fibers. Observed deletion patterns are best modeled as mtDNA deletions initiated by replication fork stalling during strand displacement mtDNA synthesis.
Affiliation(s)
- Scott A Lujan
- Genome Integrity and Structural Biology Laboratory, DNA Replication Fidelity Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
| | - Matthew J Longley
- Genome Integrity and Structural Biology Laboratory, Mitochondrial DNA Replication Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
| | - Margaret H Humble
- Genome Integrity and Structural Biology Laboratory, Mitochondrial DNA Replication Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
| | - Christopher A Lavender
- Integrative Bioinformatics, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
| | - Adam Burkholder
- Integrative Bioinformatics, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
| | - Emma L Blakely
- Wellcome Centre for Mitochondrial Research, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- NHS Highly Specialised Mitochondrial Diagnostic Laboratory, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, NE1 4LP, UK
| | - Charlotte L Alston
- Wellcome Centre for Mitochondrial Research, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- NHS Highly Specialised Mitochondrial Diagnostic Laboratory, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, NE1 4LP, UK
| | - Grainne S Gorman
- Wellcome Centre for Mitochondrial Research, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
| | - Doug M Turnbull
- Wellcome Centre for Mitochondrial Research, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
| | - Robert McFarland
- Wellcome Centre for Mitochondrial Research, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
| | - Robert W Taylor
- Wellcome Centre for Mitochondrial Research, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- NHS Highly Specialised Mitochondrial Diagnostic Laboratory, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, NE1 4LP, UK
| | - Thomas A Kunkel
- Genome Integrity and Structural Biology Laboratory, DNA Replication Fidelity Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
| | - William C Copeland
- Genome Integrity and Structural Biology Laboratory, Mitochondrial DNA Replication Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA.
| |
|
46
|
Spealman P, Burrell J, Gresham D. Inverted duplicate DNA sequences increase translocation rates through sequencing nanopores resulting in reduced base calling accuracy. Nucleic Acids Res 2020; 48:4940-4945. [PMID: 32255181 PMCID: PMC7229812 DOI: 10.1093/nar/gkaa206] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/25/2019] [Revised: 03/14/2020] [Accepted: 04/03/2020] [Indexed: 12/27/2022] Open
Abstract
Inverted duplicated DNA sequences are a common feature of structural variants (SVs) and copy number variants (CNVs). Analysis of CNVs containing inverted duplicated DNA sequences using nanopore sequencing identified recurrent aberrant behavior characterized by low confidence, incorrect and missed base calls. Inverted duplicate DNA sequences in both yeast and human samples were observed to have systematic elevation in the electrical current detected at the nanopore, increased translocation rates and decreased sampling rates. The coincidence of inverted duplicated DNA sequences with dramatically reduced sequencing accuracy and an increased translocation rate suggests that secondary DNA structures may interfere with the dynamics of transit of the DNA through the nanopore.
Affiliation(s)
- Pieter Spealman
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
| | - Jaden Burrell
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
| | - David Gresham
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
| |
|
47
|
Dorant Y, Cayuela H, Wellband K, Laporte M, Rougemont Q, Mérot C, Normandeau E, Rochette R, Bernatchez L. Copy number variants outperform SNPs to reveal genotype–temperature association in a marine species. Mol Ecol 2020; 29:4765-4782. [PMID: 32803780 DOI: 10.1111/mec.15565] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 01/28/2020] [Revised: 07/16/2020] [Accepted: 07/21/2020] [Indexed: 12/12/2022]
Affiliation(s)
- Yann Dorant
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Hugo Cayuela
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Kyle Wellband
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Martin Laporte
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Quentin Rougemont
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Claire Mérot
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Eric Normandeau
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| | - Rémy Rochette
- Department of Biology, University of New Brunswick, Saint John, NB, Canada
| | - Louis Bernatchez
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
| |
|
48
|
Jang H, Lee H. Multiresolution correction of GC bias and application to identification of copy number alterations. Bioinformatics 2020; 35:3890-3897. [PMID: 30865265 DOI: 10.1093/bioinformatics/btz174] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/24/2018] [Revised: 03/03/2019] [Accepted: 03/12/2019] [Indexed: 01/03/2023] Open
Abstract
MOTIVATION Whole-genome sequencing (WGS) data are affected by various sequencing biases such as GC bias and mappability bias. These biases degrade performance on detection of genetic variations such as copy number alterations. The existing methods use a relation between the GC proportion and depth of coverage (DOC) of markers by means of regression models. Nonetheless, severity of the GC bias varies from sample to sample. We developed a new method for correction of GC bias on the basis of multiresolution analysis. We used a translation-invariant wavelet transform to decompose biased raw signals into high- and low-frequency coefficients. Then, we modeled the relation between GC proportion and DOC of the genomic regions and constructed new control DOC signals that reflect the GC bias. The control DOC signals are used for normalizing genomic sequences by correcting the GC bias. RESULTS When we applied our method to simulated sequencing data with various degrees of GC bias, our method showed more robust performance on correcting the GC bias than the other methods did. We also applied our method to real-world cancer sequencing datasets and successfully identified cancer-related focal alterations even when cancer genomes were not normalized to normal control samples. In conclusion, our method can be employed for WGS data with different degrees of GC bias. AVAILABILITY AND IMPLEMENTATION The code is available at http://gcancer.org/wabico. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
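For orientation, the conventional idea of relating GC proportion to depth of coverage can be sketched with a simple per-bin median scaling. This is explicitly not the wavelet-based multiresolution method described in the abstract; it is a common baseline stand-in, and the function name `gc_correct` and the 0.05 bin width are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List


def gc_correct(depths: List[float], gc_fractions: List[float], bin_width: float = 0.05) -> List[float]:
    """Scale each window's depth by (global median / median depth of its GC bin)."""
    if len(depths) != len(gc_fractions):
        raise ValueError("depths and gc_fractions must have the same length")
    if not depths:
        return []

    # Group window depths by GC bin.
    bins: Dict[int, List[float]] = defaultdict(list)
    for depth, gc in zip(depths, gc_fractions):
        bins[int(gc / bin_width)].append(depth)

    global_median = median(depths)
    corrected = []
    for depth, gc in zip(depths, gc_fractions):
        bin_median = median(bins[int(gc / bin_width)])
        # Windows in a bin with zero median depth carry no usable signal.
        corrected.append(depth * global_median / bin_median if bin_median > 0 else 0.0)
    return corrected


# Toy data: high-GC windows are sequenced twice as deeply as low-GC windows.
depths = [20, 20, 40, 40]
gc_fractions = [0.32, 0.33, 0.52, 0.53]
corrected = gc_correct(depths, gc_fractions)
```

After correction, the toy windows share one depth level, so any remaining deviation in real data would be more plausibly attributed to copy number than to GC content.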
Affiliation(s)
- Ho Jang
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, South Korea
| | - Hyunju Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, South Korea
| |
|
49
|
Wei YC, Huang GH. CONY: A Bayesian procedure for detecting copy number variations from sequencing read depths. Sci Rep 2020; 10:10493. [PMID: 32591545 PMCID: PMC7319969 DOI: 10.1038/s41598-020-64353-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/11/2019] [Accepted: 04/15/2020] [Indexed: 12/26/2022] Open
Abstract
Copy number variations (CNVs) are genomic structural mutations consisting of abnormal numbers of fragment copies. Read-depth signals from next-generation sequencing mirror these variants. Some tools used to predict CNVs by depth have been published, but most of these tools can be applied to only a specific data type due to modeling limitations. We develop a tool for copy number variation detection by a Bayesian procedure, i.e., CONY, that adopts a Bayesian hierarchical model and an efficient reversible-jump Markov chain Monte Carlo inference algorithm for whole-genome sequencing read-depth data. CONY can be applied not only to individual samples for estimating the absolute number of copies but also to case-control pairs for detecting patient-specific variations. We evaluate the performance of CONY and compare CONY with competing approaches through simulations and by using experimental data from the 1000 Genomes Project. CONY outperforms the other methods in terms of accuracy in both single-sample and paired-samples analyses. In addition, CONY performs well regardless of whether the data coverage is high or low. CONY is useful for detecting both absolute and relative CNVs from read-depth data. The package is available at https://github.com/weiyuchung/CONY.
Affiliation(s)
- Yu-Chung Wei
- Graduate Institute of Statistics and Information Science, National Changhua University of Education, No.1 Jinde Road, Changhua City, Changhua County, 50007, Taiwan
| | - Guan-Hua Huang
- Institute of Statistics, National Chiao Tung University, 1001 University Road, Hsinchu, 30010, Taiwan.
| |
|
50
|
Evaluation of CNV detection tools for NGS panel data in genetic diagnostics. Eur J Hum Genet 2020; 28:1645-1655. [PMID: 32561899 PMCID: PMC7784926 DOI: 10.1038/s41431-020-0675-z] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Received: 12/23/2019] [Revised: 04/21/2020] [Accepted: 04/28/2020] [Indexed: 01/01/2023] Open
Abstract
Although germline copy-number variants (CNVs) are the genetic cause of multiple hereditary diseases, detecting them from targeted next-generation sequencing (NGS) data remains a challenge. Existing tools perform well for large CNVs but struggle with single and multi-exon alterations. The aim of this work is to evaluate CNV calling tools working on gene panel NGS data and their suitability as a screening step before orthogonal confirmation in genetic diagnostics strategies. Five tools (DECoN, CoNVaDING, panelcn.MOPS, ExomeDepth, and CODEX2) were tested against four genetic diagnostics datasets (two in-house and two external) for a total of 495 samples with 231 single and multi-exon validated CNVs. The evaluation was performed using the default and sensitivity-optimized parameters. Results showed that most tools were highly sensitive and specific, but the performance was dataset dependent. When evaluating them in our diagnostics scenario, DECoN and panelcn.MOPS detected all CNVs with the exception of one mosaic CNV missed by DECoN. However, DECoN outperformed panelcn.MOPS in specificity, achieving values greater than 0.90 when using the optimized parameters. In our in-house datasets, DECoN and panelcn.MOPS showed the highest performance for CNV screening before orthogonal confirmation. Benchmarking and optimization code is freely available at https://github.com/TranslationalBioinformaticsIGTP/CNVbenchmarkeR.
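The sensitivity and specificity figures reported in such an evaluation can be computed along the following lines. This is a minimal sketch under assumed conventions: each region of interest is a (sample, exon) pair, and the function name `evaluate_calls` is hypothetical rather than part of CNVbenchmarkeR.

```python
from typing import Dict, Set, Tuple

Region = Tuple[str, int]  # (sample ID, exon index); an assumed unit of evaluation


def evaluate_calls(truth: Set[Region], called: Set[Region], all_regions: Set[Region]) -> Dict[str, float]:
    """Compute confusion counts, sensitivity and specificity per region."""
    tp = len(truth & called)                 # validated CNVs that were called
    fn = len(truth - called)                 # validated CNVs that were missed
    fp = len(called - truth)                 # calls without a validated CNV
    tn = len(all_regions - truth - called)   # normal regions left uncalled
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn,
            "sensitivity": sensitivity, "specificity": specificity}


# Toy panel: two samples, five exons each.
all_regions = {(sample, exon) for sample in ("S1", "S2") for exon in range(1, 6)}
truth = {("S1", 2), ("S2", 4)}
called = {("S1", 2), ("S2", 5)}
metrics = evaluate_calls(truth, called, all_regions)
```

For a screening step followed by orthogonal confirmation, sensitivity is usually the priority, because a false positive costs one confirmatory test while a false negative is a missed diagnosis.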
|