1
Barthélémy D, Belmonte E, Pilla LD, Bardel C, Duport E, Gautier V, Payen L. Direct Comparative Analysis of a Pharmacogenomics Panel with PacBio Hifi® Long-Read and Illumina Short-Read Sequencing. J Pers Med 2023; 13:1655. [PMID: 38138882; PMCID: PMC10744512; DOI: 10.3390/jpm13121655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 11/10/2023] [Accepted: 11/23/2023] [Indexed: 12/24/2023] Open
Abstract
BACKGROUND Pharmacogenetics (PGx) aims to determine genetic signatures that can be used in clinical settings to individualize treatment for each patient, including anti-cancer drugs, anti-psychotics, and painkillers. A better understanding of how genetic variants affect the function or expression of the corresponding protein permits prediction of the pharmacological response: responders, non-responders, and those with adverse drug reactions (ADRs). OBJECTIVE This work provides a comparison between innovative long-read sequencing (LRS) and short-read sequencing (SRS) techniques. METHODS AND MATERIALS The gene panel captured using PacBio HiFi® sequencing was tested on thirteen clinical samples on GENTYANE's platform. SRS, using a comprehensive pharmacogenetics panel, was performed in routine settings at the Civil Hospitals of Lyon. We focused on the analysis of complex regions, including copy number variations (CNVs), structural variants, repeated regions, and phasing-haplotyping for three key pharmacogenes: CYP2D6, UGT1A1, and NAT2. RESULTS Variants and the corresponding expected star (*) alleles were reported. Although only 38.4% concordance was found for haplotype determination and 61.5% for diplotype determination, this did not affect the metabolism scoring. LRS was more accurate in detecting the CYP2D6*5 haplotype in the presence of the duplicated wild-type CYP2D6*2 form. Total concordance was obtained for UGT1A1 TA repeat detection. Direct phasing using the LRS approach allowed us to correct certain NAT2 profiles. CONCLUSIONS Combined with an optimized variant-calling pipeline and direct phasing analysis, LRS is a robust technique for PGx analysis that can minimize the risk of mis-haplotyping.
Affiliation(s)
- David Barthélémy
  - Institut of Pharmaceutical and Biological Sciences of Lyon, Claude Bernard Lyon I, 69373 Lyon, France; (D.B.); (C.B.)
  - Department of Biochemistry and Molecular Biology, Lyon-Sud Hospital, Hospices Civils de Lyon, Réseau Francophone de Pharmacogénétique (RNPGx), 69495 Pierre-Bénite, France; (L.D.P.); (E.D.)
  - Center for Innovation in Cancerology of Lyon (CICLY) EA 3738, Faculty of Medicine and Maieutic Lyon Sud, Claude Bernard University Lyon I, 69921 Oullins, France
- Elodie Belmonte
  - Plateforme Génotypage et Séquençage en Auvergne (GENTYANE) UMR 1095 Génétique, Diversité Ecophysiologie des Céréales INRAE, Université Clermont Auvergne, 63100 Clermont Ferrand, France; (E.B.); (V.G.)
- Laurie Di Pilla
  - Department of Biochemistry and Molecular Biology, Lyon-Sud Hospital, Hospices Civils de Lyon, Réseau Francophone de Pharmacogénétique (RNPGx), 69495 Pierre-Bénite, France; (L.D.P.); (E.D.)
- Claire Bardel
  - Institut of Pharmaceutical and Biological Sciences of Lyon, Claude Bernard Lyon I, 69373 Lyon, France; (D.B.); (C.B.)
  - Department of Bioinformatics, Hospices Civils de Lyon, 69008 Lyon, France
- Eve Duport
  - Department of Biochemistry and Molecular Biology, Lyon-Sud Hospital, Hospices Civils de Lyon, Réseau Francophone de Pharmacogénétique (RNPGx), 69495 Pierre-Bénite, France; (L.D.P.); (E.D.)
- Veronique Gautier
  - Plateforme Génotypage et Séquençage en Auvergne (GENTYANE) UMR 1095 Génétique, Diversité Ecophysiologie des Céréales INRAE, Université Clermont Auvergne, 63100 Clermont Ferrand, France; (E.B.); (V.G.)
- Léa Payen
  - Institut of Pharmaceutical and Biological Sciences of Lyon, Claude Bernard Lyon I, 69373 Lyon, France; (D.B.); (C.B.)
  - Department of Biochemistry and Molecular Biology, Lyon-Sud Hospital, Hospices Civils de Lyon, Réseau Francophone de Pharmacogénétique (RNPGx), 69495 Pierre-Bénite, France; (L.D.P.); (E.D.)
  - Center for Innovation in Cancerology of Lyon (CICLY) EA 3738, Faculty of Medicine and Maieutic Lyon Sud, Claude Bernard University Lyon I, 69921 Oullins, France
2
Adelson RP, Renton AE, Li W, Barzilai N, Atzmon G, Goate AM, Davies P, Freudenberg-Hua Y. Empirical design of a variant quality control pipeline for whole genome sequencing data using replicate discordance. Sci Rep 2019; 9:16156. [PMID: 31695094; PMCID: PMC6834861; DOI: 10.1038/s41598-019-52614-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/15/2019] [Accepted: 10/18/2019] [Indexed: 12/29/2022] Open
Abstract
The success of next-generation sequencing depends on the accuracy of variant calls. Few objective protocols exist for QC following variant calling from whole genome sequencing (WGS) data. After applying QC filtering based on Genome Analysis Tool Kit (GATK) best practices, we used genotype discordance of eight samples that were sequenced twice each to evaluate the proportion of potentially inaccurate variant calls. We designed a QC pipeline involving hard filters to improve replicate genotype concordance, which indicates improved accuracy of genotype calls. Our pipeline analyzes the efficacy of each filtering step. We initially applied this strategy to well-characterized variants from the ClinVar database, and subsequently to the full WGS dataset. The genome-wide biallelic pipeline removed 82.11% of discordant and 14.89% of concordant genotypes, and improved the concordance rate from 98.53% to 99.69%. The variant-level read depth filter most improved the genome-wide biallelic concordance rate. We also adapted this pipeline for triallelic sites, given the increasing proportion of multiallelic sites as sample sizes increase. For triallelic sites containing only SNVs, the concordance rate improved from 97.68% to 99.80%. Our QC pipeline removes many potentially false positive calls that pass in GATK, and may inform future WGS studies prior to variant effect analysis.
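The genome-wide biallelic figures quoted in this abstract can be cross-checked with a short calculation. The sketch below is not the authors' pipeline. It assumes that the concordance rate is the share of concordant genotypes among all compared genotypes, and that the two removal percentages apply to the discordant and concordant genotypes present before filtering. The function name `concordance_after_filter` is hypothetical.

```python
def concordance_after_filter(rate_before, frac_discordant_removed, frac_concordant_removed):
    """Return the replicate concordance rate after filtering.

    rate_before: concordance rate before filtering, as a fraction in [0, 1].
    frac_discordant_removed: fraction of discordant genotypes the filter removes.
    frac_concordant_removed: fraction of concordant genotypes the filter removes.
    Assumption: rate = concordant / (concordant + discordant).
    """
    # Genotypes that survive the filter, per unit of genotypes compared.
    concordant_kept = rate_before * (1.0 - frac_concordant_removed)
    discordant_kept = (1.0 - rate_before) * (1.0 - frac_discordant_removed)
    return concordant_kept / (concordant_kept + discordant_kept)


# Values taken from the abstract: 98.53% before filtering,
# 82.11% of discordant and 14.89% of concordant genotypes removed.
rate_after = concordance_after_filter(0.9853, 0.8211, 0.1489)
print(f"{rate_after:.2%}")  # prints 99.69%
```

Under these assumptions the result matches the reported post-filter concordance of 99.69%, which suggests the three percentages in the abstract are mutually consistent.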
Affiliation(s)
- Robert P Adelson
  - Litwin-Zucker Center for Alzheimer's Disease, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, New York, 11030, USA
- Alan E Renton
  - Ronald M. Loeb Center for Alzheimer's Disease and Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Wentian Li
  - Robert S. Boas Center for Genomics & Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, New York, 11030, USA
- Nir Barzilai
  - Robert S. Boas Center for Genomics & Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, New York, 11030, USA
- Gil Atzmon
  - Institute for Aging Research, Albert Einstein College of Medicine, Bronx, New York, 10461, USA
  - Faculty of Natural Sciences, University of Haifa, Haifa, 31905, Israel
- Alison M Goate
  - Ronald M. Loeb Center for Alzheimer's Disease and Departments of Neuroscience, Genetics and Genomic Sciences, and Neurology, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Peter Davies
  - Litwin-Zucker Center for Alzheimer's Disease, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, New York, 11030, USA
- Yun Freudenberg-Hua
  - Litwin-Zucker Center for Alzheimer's Disease, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, New York, 11030, USA
  - Division of Geriatric Psychiatry, Zucker Hillside Hospital, Northwell Health, Glen Oaks, New York, 11004, USA
3
Tom N, Tom O, Malcikova J, Pavlova S, Kubesova B, Rausch T, Kolarik M, Benes V, Bystry V, Pospisilova S. ToTem: a tool for variant calling pipeline optimization. BMC Bioinformatics 2018; 19:243. [PMID: 29940847; PMCID: PMC6020218; DOI: 10.1186/s12859-018-2227-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/03/2018] [Accepted: 05/31/2018] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND High-throughput bioinformatics analyses of next-generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall. RESULTS Here we introduce ToTem, a tool for automated pipeline optimization. ToTem is a stand-alone web application with a comprehensive graphical user interface (GUI). ToTem is written in Java and PHP with an underlying connection to a MySQL database. Its primary role is to automatically generate, execute and benchmark different variant calling pipeline settings. Our tool allows an analysis to be started from any level of the process, with the possibility of plugging in almost any tool or code. To prevent over-fitting of pipeline parameters, ToTem ensures their reproducibility by using cross-validation techniques that penalize the final precision, recall and F-measure. The results are presented as interactive graphs and tables, allowing an optimal pipeline to be selected based on the user's priorities. Using ToTem, we were able to optimize somatic variant calling from ultra-deep targeted gene sequencing (TGS) data and germline variant detection in whole genome sequencing (WGS) data. CONCLUSIONS ToTem is a tool for automated pipeline optimization which is freely available as a web application at https://totem.software.
Affiliation(s)
- Nikola Tom
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
  - Department of Internal Medicine - Hematology and Oncology, Medical Faculty, Masaryk University and University Hospital Brno, Brno, Czech Republic
- Ondrej Tom
  - Department of Computer Science, Faculty of Science, Palacky University, Olomouc, Czech Republic
- Jitka Malcikova
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
  - Department of Internal Medicine - Hematology and Oncology, Medical Faculty, Masaryk University and University Hospital Brno, Brno, Czech Republic
- Sarka Pavlova
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
  - Department of Internal Medicine - Hematology and Oncology, Medical Faculty, Masaryk University and University Hospital Brno, Brno, Czech Republic
- Blanka Kubesova
  - Department of Internal Medicine - Hematology and Oncology, Medical Faculty, Masaryk University and University Hospital Brno, Brno, Czech Republic
- Tobias Rausch
  - Genomics Core Facility, European Molecular Biology Laboratory, Heidelberg, Germany
- Miroslav Kolarik
  - Department of Computer Science, Faculty of Science, Palacky University, Olomouc, Czech Republic
- Vladimir Benes
  - Genomics Core Facility, European Molecular Biology Laboratory, Heidelberg, Germany
- Vojtech Bystry
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Sarka Pospisilova
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
  - Department of Internal Medicine - Hematology and Oncology, Medical Faculty, Masaryk University and University Hospital Brno, Brno, Czech Republic
4
High-depth whole genome sequencing of an Ashkenazi Jewish reference panel: enhancing sensitivity, accuracy, and imputation. Hum Genet 2018; 137:343-355. [PMID: 29705978; DOI: 10.1007/s00439-018-1886-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 01/06/2018] [Accepted: 04/21/2018] [Indexed: 12/31/2022]
Abstract
While increasingly large reference panels for genome-wide imputation have been recently made available, the degree to which imputation accuracy can be enhanced by population-specific reference panels remains an open question. Here, we sequenced at full-depth (≥ 30×), across two platforms (Illumina X Ten and Complete Genomics, Inc.), a moderately large (n = 738) cohort of samples drawn from the Ashkenazi Jewish population. We developed a series of quality control steps to optimize sensitivity, specificity, and comprehensiveness of variant calls in the reference panel, and then tested the accuracy of imputation against target cohorts drawn from the same population. Quality control (QC) thresholds for the Illumina X Ten platform were identified that permitted highly accurate calling of single nucleotide variants across 94% of the genome. QC procedures also identified numerous regions that are poorly mapped using current reference or alternate assemblies. After stringent QC, the population-specific reference panel produced more accurate and comprehensive imputation results relative to publicly available, large cosmopolitan reference panels, especially in the range of rare variants that may be most critical to further progress in mapping of complex phenotypes. The population-specific reference panel also permitted enhanced filtering of clinically irrelevant variants from personal genomes.
5
Yang Y, Botton MR, Scott ER, Scott SA. Sequencing the CYP2D6 gene: from variant allele discovery to clinical pharmacogenetic testing. Pharmacogenomics 2017; 18:673-685. [PMID: 28470112; DOI: 10.2217/pgs-2017-0033] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Indexed: 01/08/2023] Open
Abstract
CYP2D6 is one of the most studied enzymes in the field of pharmacogenetics. The CYP2D6 gene is highly polymorphic with over 100 catalogued star (*) alleles, and clinical CYP2D6 testing is increasingly accessible and supported by practice guidelines. However, the degree of variation at the CYP2D6 locus and homology with its pseudogenes make interrogating CYP2D6 by short-read sequencing challenging. Moreover, accurate prediction of CYP2D6 metabolizer status necessitates analysis of duplicated alleles when an increased copy number is detected. These challenges have recently been overcome by long-read CYP2D6 sequencing; however, such platforms are not widely available. This review highlights the genomic complexities of CYP2D6, current sequencing methods and the evolution of CYP2D6 from allele discovery to clinical pharmacogenetic testing.
Affiliation(s)
- Yao Yang
  - Department of Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Genomics & Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Mariana R Botton
  - Department of Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Erick R Scott
  - Department of Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Genomics & Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Stuart A Scott
  - Department of Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA