1
De La Vega FM, Irvine SA, Anur P, Potts K, Kraft L, Torres R, Kang P, Truong S, Lee Y, Han S, Onuchic V, Han J. Benchmarking of germline copy number variant callers from whole genome sequencing data for clinical applications. BIOINFORMATICS ADVANCES 2025; 5:vbaf071. [PMID: 40248358 PMCID: PMC12005901 DOI: 10.1093/bioadv/vbaf071] [Received: 07/20/2024] [Revised: 03/11/2025] [Accepted: 04/08/2025] [Indexed: 04/19/2025]
Abstract
Motivation: Whole-genome sequencing (WGS) is increasingly preferred for clinical applications due to its comprehensive coverage, effectiveness in detecting copy number variants (CNVs), and declining costs. However, systematic evaluations of WGS CNV callers tailored to germline clinical testing, where high sensitivity and confirmation of reported CNVs are essential, remain necessary. Clinical reporting typically emphasizes CNVs affecting coding regions over precise breakpoint detection. This study benchmarks several short-read WGS CNV detection tools using reference cell lines to inform their clinical use. Results: While tools vary in sensitivity (7%-83%) and precision (1%-76%), few meet the sensitivity needed for clinical testing. Callers generally perform better for deletions (up to 88% sensitivity) than duplications (up to 47% sensitivity), with poor detection of duplications under 5 kb. Notably, for CNVs in genes commonly included in clinical panels, significantly improved sensitivity and precision were observed when benchmarking against 25 cell lines with known CNVs. DRAGEN v4.2 high-sensitivity CNV calls, post-processed with custom filters, achieved 100% sensitivity and 77% precision on the optimized gene panel after excluding recurring artifacts. This level of performance may support clinical use with orthogonal confirmation of reportable CNVs, pending validation on laboratory-specific samples. Availability and implementation: The data underlying this article are available in the European Nucleotide Archive under project accession PRJEB87628.
Affiliation(s)
- Francisco M De La Vega
  - Tempus AI, Inc., Chicago, IL 60654, United States
  - Department of Biomedical Data Sciences, Stanford University School of Medicine, Palo Alto, CA 94304, United States
- Sean A Irvine
  - Real Time Genomics, Ltd., Hamilton 3204, New Zealand
- Pavana Anur
  - Tempus AI, Inc., Chicago, IL 60654, United States
- Kelly Potts
  - Tempus AI, Inc., Chicago, IL 60654, United States
- Lewis Kraft
  - Tempus AI, Inc., Chicago, IL 60654, United States
- Raul Torres
  - Tempus AI, Inc., Chicago, IL 60654, United States
- Peter Kang
  - Tempus AI, Inc., Chicago, IL 60654, United States
- Sean Truong
  - Illumina, Inc., San Diego, CA 92122, United States
- Yeonghun Lee
  - Illumina, Inc., San Diego, CA 92122, United States
- Shunhua Han
  - Illumina, Inc., San Diego, CA 92122, United States
- James Han
  - Illumina, Inc., San Diego, CA 92122, United States
2
Kopernik A, Sayganova M, Zobkova G, Doroschuk N, Smirnova A, Molodtsova-Zolotukhina D, Sagaydak O, Ryzhkova O, Kutsev S, Groznova O, Melikyan L, Bondarchuk E, Woroncow M, Albert E, Bogdanov V, Volchkov P. Sanger validation of WGS variants. Sci Rep 2025; 15:3621. [PMID: 39881150 PMCID: PMC11779820 DOI: 10.1038/s41598-025-87814-x] [Received: 04/20/2024] [Accepted: 01/22/2025] [Indexed: 01/31/2025]
Abstract
With the development of next-generation sequencing (NGS) technologies it became possible to simultaneously analyze millions of variants. Despite the quality improvement, it is generally still required to confirm the variants before reporting. However, in recent years the dominant idea is that one could define the quality thresholds for "high quality" variants which do not require orthogonal validation. Despite that, no works to date report the concordance between variants from whole genome sequencing and their gold-standard Sanger validation. In this study we analyzed the concordance for 1756 WGS variants in order to establish the appropriate thresholds for high-quality variants filtering. Resulting thresholds allowed us to drastically reduce the number of variants which require validation, to 4.8% and 1.2% of the initial set for caller-agnostic (DP, AF) and caller-dependent (QUAL) thresholds, respectively.
Affiliation(s)
- Arina Kopernik
  - Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia, 125315
- Mariia Sayganova
  - Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia, 125315
- Daria Molodtsova-Zolotukhina
  - Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia, 125315
  - Evogen LLC, Moscow, Russia
- Oxana Ryzhkova
  - Research Centre for Medical Genetics, Moscow, Russia, 115478
- Sergey Kutsev
  - Research Centre for Medical Genetics, Moscow, Russia, 115478
- Olga Groznova
  - Veltischev Research and Clinical Institute for Pediatrics and Pediatric Surgery on the Pirogov Russian National Research Medical University of the Ministry of Health of the Russian Federation, Moscow, Russia
  - Charity Fund for Medical and Social Genetic Aid Projects «Life Genome», Moscow, Russia
  - The Pirogov Russian National Research Medical University of the Ministry of Health of the Russian Federation, Moscow, Russia
- Lyusya Melikyan
  - Veltischev Research and Clinical Institute for Pediatrics and Pediatric Surgery on the Pirogov Russian National Research Medical University of the Ministry of Health of the Russian Federation, Moscow, Russia
  - Charity Fund for Medical and Social Genetic Aid Projects «Life Genome», Moscow, Russia
- Elizaveta Bondarchuk
  - Veltischev Research and Clinical Institute for Pediatrics and Pediatric Surgery on the Pirogov Russian National Research Medical University of the Ministry of Health of the Russian Federation, Moscow, Russia
  - Charity Fund for Medical and Social Genetic Aid Projects «Life Genome», Moscow, Russia
- Eugene Albert
  - Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia, 125315
- Viktor Bogdanov
  - Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia, 125315
- Pavel Volchkov
  - Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia, 125315
3
Chevrier S, Richard C, Mille M, Bertrand D, Boidot R. Nanopore adaptive sampling accurately detects nucleotide variants and improves the characterization of large-scale rearrangement for the diagnosis of cancer predisposition. Clin Transl Med 2025; 15:e70138. [PMID: 39783935 PMCID: PMC11714230 DOI: 10.1002/ctm2.70138] [Received: 09/23/2024] [Revised: 11/14/2024] [Accepted: 12/05/2024] [Indexed: 01/12/2025]
Abstract
BACKGROUND Molecular diagnosis has become highly significant for patient management in oncology. METHODS Here, 30 well-characterized clinical germline samples were studied with adaptive sampling to enrich the full sequence of 152 cancer predisposition genes. Sequencing was performed on Oxford Nanopore (ONT) R10.4.1 MinION flowcells with the Q20+ chemistry. RESULTS In our cohort, 11 samples had large-scale rearrangements (LSR), which were all detected with ONT sequencing. In addition to perfectly detecting the locus of the LSR, we found a known MLPA amplification of exon 13 in the BRCA1 (NM_7294) gene corresponded to a duplication in tandem of both exons 12 and 13 of the reference NM_7300. Similarly, in another sample with a known total deletion of the BRCA1 gene, ONT sequencing highlighted this complete deletion was the consequence of a large deletion of almost 140 000 bp carrying over five different genes. ONT sequencing was also able to detect all pathogenic nucleotide variants present in 16 samples at low coverage. As we analyzed complete genes and more genes than with short-read sequencing, we detected novel unknown variants. We randomly selected six new variants with a coverage larger than 10× and an average quality higher than 14, and confirmed all of them by Sanger sequencing, suggesting that variants detected with ONT (coverage >10× and quality score >14) could be considered as real variants. CONCLUSIONS We showed that ONT adaptive sampling sequencing is suitable for the analysis of germline alterations, improves characterization of LSR, and detects single nucleotide variations even at low coverage. KEY POINTS Adaptive sampling is suitable for the analysis of germline alterations. Improves the characterization of Large Scale Rearrangement and detects SNV at a minimum coverage of 10x. Allows flexibility of sequencing.
Affiliation(s)
- Sandy Chevrier
  - Unit of Molecular Biology, Georges‐François Leclerc Cancer center, UNICANCER, Dijon, France
- Corentin Richard
  - Unit of Molecular Biology, Georges‐François Leclerc Cancer center, UNICANCER, Dijon, France
- Romain Boidot
  - Unit of Molecular Biology, Georges‐François Leclerc Cancer center, UNICANCER, Dijon, France
4
Azeem A, Ahmed AN, Khan N, Voutsina N, Ullah I, Ubeyratna N, Yasin M, Baple EL, Crosby AH, Rawlins LE, Saleha S. Investigating the genetic basis of hereditary spastic paraplegia and cerebellar Ataxia in Pakistani families. BMC Neurol 2024; 24:354. [PMID: 39304850 DOI: 10.1186/s12883-024-03855-1] [Received: 04/02/2024] [Accepted: 09/06/2024] [Indexed: 09/22/2024]
Abstract
BACKGROUND Hereditary Spastic Paraplegias (HSPs) and Hereditary Cerebellar Ataxias (HCAs) are progressive neurodegenerative disorders encompassing a spectrum of neurogenetic conditions with significant overlaps of clinical features. Spastic ataxias are a group of conditions that have features of both cerebellar ataxia and spasticity, and these conditions are frequently clinically challenging to distinguish. Accurate genetic diagnosis is crucial but challenging, particularly in resource-limited settings. This study aims to investigate the genetic basis of HSPs and HCAs in Pakistani families. METHODS Families from Khyber Pakhtunkhwa with at least two members showing HSP or HCA phenotypes, and who had not previously been analyzed genetically, were included. Families were referred for genetic analysis by local neurologists based on the proband's clinical features and signs of a potential genetic neurodegenerative disorder. Whole Exome Sequencing (WES) and Sanger sequencing were then used to identify and validate genetic variants, and to analyze variant segregation within families to determine inheritance patterns. The mean age of onset and standard deviation were calculated to assess variability among affected individuals, and the success rate was compared with literature reports using differences in proportions and Cohen's h. RESULTS Pathogenic variants associated with these conditions were identified in five of eight families, segregating according to autosomal recessive inheritance. These variants included previously reported SACS c.2182C>T, p.(Arg728*), FA2H c.159_176del, p.(Arg53_Ile58del) and SPG11 c.2146C>T, p.(Gln716*) variants, and two previously unreported variants in SACS c.2229del, p.(Phe743Leufs*8) and ZFYVE26 c.1926_1941del, p.(Tyr643Metfs*2). Additionally, FA2H and SPG11 variants were found to have recurrent occurrences, suggesting a potential founder effect within the Pakistani population. Onset age among affected individuals ranged from 1 to 14 years (M = 6.23, SD = 3.96). The diagnostic success rate was 62.5%, with moderate effect sizes compared to previous studies. CONCLUSIONS The findings of this study expand the genotypic and phenotypic spectrum of HSPs and HCAs in Pakistan and emphasize the importance of utilizing exome/genome sequencing for accurate diagnosis or to support accurate differential diagnosis. This approach can improve genetic counseling and clinical management, addressing the challenges of diagnosing neurodegenerative disorders in resource-limited settings.
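As a rough illustration of the effect-size measure named in the abstract above, Cohen's h compares two proportions after an arcsine square-root transform, h = 2·asin(√p1) − 2·asin(√p2). The sketch below assumes only that standard definition; the function name `cohens_h` and the comparator rate of 0.50 are hypothetical and are not taken from the article, which does not state the literature rates in its abstract.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between two proportions on the arcsine scale."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")
    phi1 = 2.0 * math.asin(math.sqrt(p1))
    phi2 = 2.0 * math.asin(math.sqrt(p2))
    return phi1 - phi2


# Observed diagnostic rate from the abstract: five of eight families (62.5%).
observed = 5 / 8
# Hypothetical comparator rate, chosen only for illustration.
comparator = 0.50

h = cohens_h(observed, comparator)
print(f"Cohen's h = {h:.3f}")
```

By Cohen's conventional benchmarks, |h| near 0.2 is read as small, 0.5 as medium, and 0.8 as large; the value printed here depends entirely on the assumed comparator and says nothing about the article's own comparisons.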
Affiliation(s)
- Arfa Azeem
  - Department of Biotechnology and Genetic Engineering, Kohat University of Science and Technology, Kohat, 26000, Khyber Pakhtunkhwa, Pakistan
- Asif Naveed Ahmed
  - Department of Biotechnology and Genetic Engineering, Kohat University of Science and Technology, Kohat, 26000, Khyber Pakhtunkhwa, Pakistan
- Niamat Khan
  - Department of Biotechnology and Genetic Engineering, Kohat University of Science and Technology, Kohat, 26000, Khyber Pakhtunkhwa, Pakistan
- Nikol Voutsina
  - Medical Research, RILD Wellcome Wolfson Centre (Level 4), Royal Devon and Exeter NHS Foundation Trust, Exeter, Devon, EX2 5DW, UK
- Irfan Ullah
  - Department of Neurology, Khyber Teaching Hospital, Peshawar, 25000, Khyber Pakhtunkhwa, Pakistan
- Nishanka Ubeyratna
  - Medical Research, RILD Wellcome Wolfson Centre (Level 4), Royal Devon and Exeter NHS Foundation Trust, Exeter, Devon, EX2 5DW, UK
- Muhammad Yasin
  - Department of Biotechnology and Genetic Engineering, Kohat University of Science and Technology, Kohat, 26000, Khyber Pakhtunkhwa, Pakistan
- Emma L Baple
  - Medical Research, RILD Wellcome Wolfson Centre (Level 4), Royal Devon and Exeter NHS Foundation Trust, Exeter, Devon, EX2 5DW, UK
- Andrew H Crosby
  - Medical Research, RILD Wellcome Wolfson Centre (Level 4), Royal Devon and Exeter NHS Foundation Trust, Exeter, Devon, EX2 5DW, UK
- Lettie E Rawlins
  - Medical Research, RILD Wellcome Wolfson Centre (Level 4), Royal Devon and Exeter NHS Foundation Trust, Exeter, Devon, EX2 5DW, UK
  - Peninsula Clinical Genetics Service, Royal Devon & Exeter Hospital (Heavitree), Exeter, UK
- Shamim Saleha
  - Department of Biotechnology and Genetic Engineering, Kohat University of Science and Technology, Kohat, 26000, Khyber Pakhtunkhwa, Pakistan
5
Wicklund CAL, Ramos ER. Equity in the Laboratory: Expanding the Role of Genetic Counselors. J Appl Lab Med 2024; 9:187-190. [PMID: 38167760 DOI: 10.1093/jalm/jfad087] [Received: 06/26/2023] [Accepted: 08/25/2023] [Indexed: 01/05/2024]
Affiliation(s)
- Erica R Ramos
  - Medical and Scientific Affairs, Genome Medical, Inc., South San Francisco, CA, United States
6
Choate LA, Koleilat A, Harris K, Vidal-Folch N, Guenzel A, Newman J, Peterson BJ, Peterson SE, Rice CS, Train LJ, Hasadsri L, Marcou CA, Moyer AM, Baudhuin LM. Confirmation of Insertion, Deletion, and Deletion-Insertion Variants Detected by Next-Generation Sequencing. Clin Chem 2023; 69:1155-1162. [PMID: 37566393 DOI: 10.1093/clinchem/hvad110] [Received: 03/16/2023] [Accepted: 07/03/2023] [Indexed: 08/12/2023]
Abstract
BACKGROUND Despite clinically demonstrated accuracy in next generation sequencing (NGS) data, many clinical laboratories continue to confirm variants with Sanger sequencing, which increases cost of testing and turnaround time. Several studies have assessed the accuracy of NGS in detecting single nucleotide variants; however, less has been reported about insertion, deletion, and deletion-insertion variants (indels). METHODS We performed a retrospective analysis from 2015-2022 of indel results from a subset of NGS targeted gene panel tests offered through the Mayo Clinic Genomics Laboratories. We compared results from NGS and Sanger sequencing of indels observed in clinical runs and during the intra-assay validation of the tests. RESULTS Results demonstrated 100% concordance between NGS and Sanger sequencing for over 490 indels (217 unique), ranging in size from 1 to 68 basepairs (bp). The majority of indels were deletions (77%) and 1 to 5 bp in length (90%). Variant frequencies ranged from 11.4% to 67.4% and 85.1% to 100% for heterozygous and homozygous variants, respectively, with a median depth of coverage of 2562×. A subset of indels (7%) were located in complex regions of the genome, and these were accurately detected by NGS. We also demonstrated 100% reproducibility of indel detection (n = 179) during intra-assay validation. CONCLUSIONS Together this data demonstrates that reportable indel variants up to 68 bp can be accurately assessed using NGS, even when they occur in complex regions. Depending on the complexity of the region or variant, Sanger sequence confirmation of indels is usually not necessary if the variants meet appropriate coverage and allele frequency thresholds.
Affiliation(s)
- Lauren A Choate
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Alaa Koleilat
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Kimberley Harris
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Noemi Vidal-Folch
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Adam Guenzel
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Jessica Newman
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Brenda J Peterson
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Sandra E Peterson
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Christopher S Rice
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Laura J Train
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Linda Hasadsri
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Cherisse A Marcou
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Ann M Moyer
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Linnea M Baudhuin
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States