1
Chirinskaite AV, Rotov AY, Ermolaeva ME, Tkachenko LA, Vaganova AN, Danilov LG, Fedoseeva KN, Kostin NA, Sopova JV, Firsov ML, Leonova EI. Does Background Matter? A Comparative Characterization of Mouse Models of Autosomal Retinitis Pigmentosa rd1 and Pde6b-KO. Int J Mol Sci 2023; 24:17180. [PMID: 38139011 PMCID: PMC10742838 DOI: 10.3390/ijms242417180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/02/2023] [Accepted: 12/04/2023] [Indexed: 12/24/2023] Open
Abstract
Many retinal degenerative diseases result in vision impairment or permanent blindness due to photoreceptor loss or dysfunction. It has been observed that Pde6brd1 mice (rd1), which carry a spontaneous nonsense mutation in the pde6b gene, have a strong phenotypic similarity to patients suffering from autosomal recessive retinitis pigmentosa. In this study, we present a novel mouse model of retinitis pigmentosa generated through pde6b gene knockout using CRISPR/Cas9 technology. We compare this Pde6b-KO mouse model to the rd1 mouse model to gain insights into the progression of retinal degeneration. The functional assessment of the mouse retina and the tracking of degeneration dynamics were performed using electrophysiological methods, while retinal morphology was analyzed through histology techniques. Interestingly, the Pde6b-KO mouse model demonstrated a higher amplitude of photoresponse than the rd1 model of the same age. At postnatal day 12, the thickness of the photoreceptor layer in both mouse models did not significantly differ from that of control animals; however, by day 15, a substantial reduction was observed. Notably, the decline in the number of photoreceptors in the rd1 model occurred at a significantly faster rate. These findings suggest that the C3H background may play a significant role in the early stages of retinal degeneration.
Affiliation(s)
- Angelina V. Chirinskaite
  - Center of Transgenesis and Genome Editing, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
- Alexander Yu. Rotov
  - Laboratory of Evolution of Sense Organs, Sechenov Institute of Evolutionary Physiology and Biochemistry, Russian Academy of Sciences, Thorez Ave., 44, 194223 St. Petersburg, Russia
- Mariia E. Ermolaeva
  - Laboratory of Evolution of Sense Organs, Sechenov Institute of Evolutionary Physiology and Biochemistry, Russian Academy of Sciences, Thorez Ave., 44, 194223 St. Petersburg, Russia
- Lyubov A. Tkachenko
  - Department of Cytology and Histology, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
- Anastasia N. Vaganova
  - Institute of Translational Biomedicine, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
- Lavrentii G. Danilov
  - Department of Genetics and Biotechnology, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
- Ksenia N. Fedoseeva
  - Resource Center “Molecular and Cell Technologies”, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
- Nicolay A. Kostin
  - Resource Center “Molecular and Cell Technologies”, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
- Julia V. Sopova
  - Center of Transgenesis and Genome Editing, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
  - Laboratory of Amyloid Biology, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
- Michael L. Firsov
  - Laboratory of Evolution of Sense Organs, Sechenov Institute of Evolutionary Physiology and Biochemistry, Russian Academy of Sciences, Thorez Ave., 44, 194223 St. Petersburg, Russia
- Elena I. Leonova
  - Center of Transgenesis and Genome Editing, St. Petersburg State University, Universitetskaja Emb., 7/9, 199034 St. Petersburg, Russia
2
An empirical pipeline for personalized diagnosis of Lafora disease mutations. iScience 2021; 24:103276. [PMID: 34755096 PMCID: PMC8564118 DOI: 10.1016/j.isci.2021.103276] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/10/2021] [Revised: 09/14/2021] [Accepted: 10/12/2021] [Indexed: 11/23/2022] Open
Abstract
Lafora disease (LD) is a fatal childhood dementia characterized by progressive myoclonic epilepsy manifesting in the teenage years, rapid neurological decline, and death typically within ten years of onset. Mutations in either EPM2A, encoding the glycogen phosphatase laforin, or EPM2B, encoding the E3 ligase malin, cause LD. Whole exome sequencing has revealed many EPM2A variants associated with late-onset or slower disease progression. We established an empirical pipeline for characterizing the functional consequences of laforin missense mutations in vitro using complementary biochemical approaches. Analysis of 26 mutations revealed distinct functional classes associated with different outcomes that were supported by clinical cases. For example, F321C and G279C mutations have attenuated functional defects and are associated with slow progression. This pipeline enabled rapid characterization and classification of newly identified EPM2A mutations, providing clinicians and researchers genetic information to guide treatment of LD patients.
Highlights
- Lafora disease (LD) patients present with varying clinical progression
- LD missense mutations differentially affect laforin function
- An empirical in vitro pipeline is used to classify laforin missense mutations
- Patient progression can be predicted based on mutation class
3
Woerner AC, Gallagher RC, Vockley J, Adhikari AN. The Use of Whole Genome and Exome Sequencing for Newborn Screening: Challenges and Opportunities for Population Health. Front Pediatr 2021; 9:663752. [PMID: 34350142 PMCID: PMC8326411 DOI: 10.3389/fped.2021.663752] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 02/03/2021] [Accepted: 06/07/2021] [Indexed: 01/01/2023] Open
Abstract
Newborn screening (NBS) is a population-based program with a goal of reducing the burden of disease for conditions with significant clinical impact on neonates. Screening tests were originally developed and implemented one at a time, but newer methods have allowed the use of multiplex technologies to expand additions more rapidly to standard panels. Recent improvements in next-generation sequencing are also evolving rapidly from first focusing on individual genes, then panels, and finally all genes as encompassed by whole exome and genome sequencing. The intersection of these two technologies brings the revolutionary possibility of identifying all genetic disorders in newborns, allowing implementation of therapies at the optimum time regardless of symptoms. This article reviews the history of newborn screening and early studies examining the use of whole genome and exome sequencing as a screening tool. Lessons learned from these studies are discussed, along with technical, ethical, and societal challenges to broad implementation.
Affiliation(s)
- Audrey C Woerner
  - Department of Pediatrics, University of Pittsburgh Medical Center Children's Hospital of Pittsburgh, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Renata C Gallagher
  - Department of Pediatrics, University of California, San Francisco, San Francisco, CA, United States
- Jerry Vockley
  - Department of Pediatrics, University of Pittsburgh Medical Center Children's Hospital of Pittsburgh, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
  - Department of Human Genetics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, PA, United States
- Aashish N Adhikari
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, United States
  - Artificial Intelligence Lab, Illumina Inc, Foster City, CA, United States
4
Adhikari AN. Gene-specific features enhance interpretation of mutational impact on acid α-glucosidase enzyme activity. Hum Mutat 2019; 40:1507-1518. [PMID: 31228295 DOI: 10.1002/humu.23846] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/14/2019] [Revised: 05/21/2019] [Accepted: 06/17/2019] [Indexed: 01/30/2023]
Abstract
We present a computational model for predicting mutational impact on enzymatic activity of human acid α-glucosidase (GAA), an enzyme associated with Pompe disease. Using a model that combines features specific to GAA with other general evolutionary and physicochemical features, we made blind predictions of enzymatic activity relative to wildtype human GAA for >300 GAA mutants, as part of the Critical Assessment of Genome Interpretation 5 GAA challenge. We found that gene-specific features can improve the performance of existing impact prediction tools that mostly rely on general features for pathogenicity prediction. The majority of the poorly predicted mutants that lower wildtype GAA enzyme activity occurred on the surface of the GAA protein. We also found that gene-specific features were uncorrelated with existing methods and provided orthogonal information for interpreting the origin of pathogenicity, particularly in variants that are poorly predicted by existing general methods. Specific variants in GAA, when investigated in the context of its protein structure, suggested gene-specific information, such as the disruption of local backbone torsional geometry and of particular sidechain-sidechain hydrogen bonds, as potential sources of pathogenicity.
Affiliation(s)
- Aashish N Adhikari
  - Department of Plant and Microbial Biology, University of California, Berkeley, California
5
Giacopuzzi E, Laffranchi M, Berardelli R, Ravasio V, Ferrarotti I, Gooptu B, Borsani G, Fra A. Real-world clinical applicability of pathogenicity predictors assessed on SERPINA1 mutations in alpha-1-antitrypsin deficiency. Hum Mutat 2018; 39:1203-1213. [DOI: 10.1002/humu.23562] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 02/28/2018] [Revised: 05/20/2018] [Accepted: 06/05/2018] [Indexed: 01/08/2023]
Affiliation(s)
- Edoardo Giacopuzzi
  - Division of Biology and Genetics; Department of Molecular and Translational Medicine; University of Brescia; Brescia, Italy
- Mattia Laffranchi
  - Experimental Oncology and Immunology; Department of Molecular and Translational Medicine; University of Brescia; Brescia, Italy
- Romina Berardelli
  - Experimental Oncology and Immunology; Department of Molecular and Translational Medicine; University of Brescia; Brescia, Italy
- Viola Ravasio
  - Division of Biology and Genetics; Department of Molecular and Translational Medicine; University of Brescia; Brescia, Italy
- Ilaria Ferrarotti
  - Centre for Diagnosis of Inherited Alpha-1 Antitrypsin Deficiency; Department of Internal Medicine and Therapeutics; University of Pavia; Pavia, Italy
- Bibek Gooptu
  - Leicester Institute of Structural and Chemical Biology / NIHR Leicester BRC - Respiratory; University of Leicester; Leicester, UK
- Giuseppe Borsani
  - Division of Biology and Genetics; Department of Molecular and Translational Medicine; University of Brescia; Brescia, Italy
- Annamaria Fra
  - Experimental Oncology and Immunology; Department of Molecular and Translational Medicine; University of Brescia; Brescia, Italy
6
Bergougnoux A, Taulan-Cadars M, Claustres M, Raynal C. Current and future molecular approaches in the diagnosis of cystic fibrosis. Expert Rev Respir Med 2018; 12:415-426. [PMID: 29580110 DOI: 10.1080/17476348.2018.1457438] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/01/2023]
Abstract
INTRODUCTION Cystic Fibrosis is among the first diseases to have general population genetic screening tests and one of the most common indications of prenatal and preimplantation genetic diagnosis for single gene disorders. During the past twenty years, thanks to the evolution of diagnostic techniques, our knowledge of CFTR genetics and pathophysiological mechanisms involved in cystic fibrosis has significantly improved. Areas covered: Sanger sequencing and quantitative methods greatly contributed to the identification of more than 2,000 sequence variations reported worldwide in the CFTR gene. We are now entering a new technological age with the generalization of high throughput approaches such as Next Generation Sequencing and Droplet Digital PCR technologies in diagnostics laboratories. These powerful technologies open up new perspectives for scanning the entire CFTR locus, exploring modifier factors that possibly influence the clinical evolution of patients, and for preimplantation and prenatal diagnosis. Expert commentary: Such breakthroughs would, however, require powerful bioinformatics tools and relevant functional tests of variants for analysis and interpretation of the resulting data. Ultimately, an optimal use of all those resources may improve patient care and therapeutic decision-making.
Affiliation(s)
- Anne Bergougnoux
  - Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire de Montpellier, Montpellier, France
  - EA 7402, Université de Montpellier, Montpellier, France
- Caroline Raynal
  - Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire de Montpellier, Montpellier, France
7
Guidugli L, Shimelis H, Masica DL, Pankratz VS, Lipton GB, Singh N, Hu C, Monteiro AN, Lindor NM, Goldgar DE, Karchin R, Iversen ES, Couch FJ. Assessment of the Clinical Relevance of BRCA2 Missense Variants by Functional and Computational Approaches. Am J Hum Genet 2018. [DOI: 10.1016/j.ajhg.2017.12.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 09/30/2022] Open
8
Guidugli L, Shimelis H, Masica DL, Pankratz VS, Lipton GB, Singh N, Hu C, Monteiro ANA, Lindor NM, Goldgar DE, Karchin R, Iversen ES, Couch FJ. Assessment of the Clinical Relevance of BRCA2 Missense Variants by Functional and Computational Approaches. Am J Hum Genet 2018; 102:233-248. [PMID: 29394989 DOI: 10.1016/j.ajhg.2017.12.013] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 09/12/2017] [Accepted: 12/18/2017] [Indexed: 11/30/2022] Open
Abstract
Many variants of uncertain significance (VUS) have been identified in BRCA2 through clinical genetic testing. VUS pose a significant clinical challenge because the contribution of these variants to cancer risk has not been determined. We conducted a comprehensive assessment of VUS in the BRCA2 C-terminal DNA binding domain (DBD) by using a validated functional assay of BRCA2 homologous recombination (HR) DNA-repair activity and defined a classifier of variant pathogenicity. Among 139 variants evaluated, 54 had ≥99% probability of pathogenicity, and 73 had ≥95% probability of neutrality. Functional assay results were compared with predictions of variant pathogenicity from the Align-GVGD protein-sequence-based prediction algorithm, which has been used for variant classification. Relative to the HR assay, Align-GVGD significantly (p < 0.05) over-predicted pathogenic variants. We subsequently combined functional and Align-GVGD prediction results in a Bayesian hierarchical model (VarCall) to estimate the overall probability of pathogenicity for each VUS. In addition, to predict the effects of all other BRCA2 DBD variants and to prioritize variants for functional studies, we used the endoPhenotype-Optimized Sequence Ensemble (ePOSE) algorithm to train classifiers for BRCA2 variants by using data from the HR functional assay. Together, the results show that systematic functional assays in combination with in silico predictors of pathogenicity provide robust tools for clinical annotation of BRCA2 VUS.
Affiliation(s)
- Lucia Guidugli
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
- Hermela Shimelis
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
- David L Masica
  - Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
- Vernon S Pankratz
  - Division of Nephrology, University of New Mexico, Albuquerque, NM 87131, USA
- Gary B Lipton
  - Department of Statistical Science, Duke University, Durham, NC 27708, USA
- Namit Singh
  - Department of Structural Biology, University of California, San Diego, San Diego, CA 92093, USA
- Chunling Hu
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
- Alvaro N A Monteiro
  - Cancer Epidemiology Program, H. Lee Moffitt Cancer Center, Tampa, FL 33612, USA
- Noralane M Lindor
  - Department of Health Sciences Research, Mayo Clinic, Scottsdale, AZ 85259, USA
- David E Goldgar
  - Huntsman Cancer Institute and Department of Dermatology, University of Utah, Salt Lake City, UT 84132, USA
- Rachel Karchin
  - Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
  - Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD 21218, USA
- Edwin S Iversen
  - Department of Statistical Science, Duke University, Durham, NC 27708, USA
- Fergus J Couch
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
9
Niroula A, Vihinen M. PON-P and PON-P2 predictor performance in CAGI challenges: Lessons learned. Hum Mutat 2017; 38:1085-1091. [PMID: 28224672 DOI: 10.1002/humu.23199] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 08/24/2016] [Revised: 01/25/2017] [Accepted: 02/17/2017] [Indexed: 01/14/2023]
Abstract
Computational tools are widely used for ranking and prioritizing variants for characterizing their disease relevance. Since numerous tools have been developed, they have to be properly assessed before being applied. Critical Assessment of Genome Interpretation (CAGI) experiments have significantly contributed toward the assessment of prediction methods for various tasks. Within and outside the CAGI, we have addressed several questions that facilitate development and assessment of variation interpretation tools. These areas include collection and distribution of benchmark datasets, their use for systematic large-scale method assessment, and the development of guidelines for reporting methods and their performance. For us, CAGI has provided a chance to experiment with new ideas, test the application areas of our methods, and network with other prediction method developers. In this article, we discuss our experiences and lessons learned from the various CAGI challenges. We describe our approaches, their performance, and impact of CAGI on our research. Finally, we discuss some of the possibilities that CAGI experiments have opened up and make some suggestions for future experiments.
Affiliation(s)
- Abhishek Niroula
  - Protein Structure and Bioinformatics Group, Department of Experimental Medical Science, Lund University, Lund, Sweden
- Mauno Vihinen
  - Protein Structure and Bioinformatics Group, Department of Experimental Medical Science, Lund University, Lund, Sweden
10
Niroula A, Vihinen M. Predicting Severity of Disease-Causing Variants. Hum Mutat 2017; 38:357-364. [PMID: 28070986 DOI: 10.1002/humu.23173] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/04/2016] [Revised: 12/07/2016] [Accepted: 01/06/2017] [Indexed: 12/22/2022]
Abstract
Most diseases, including those of genetic origin, express a continuum of severity. Clinical interventions for numerous diseases are based on the severity of the phenotype. Predicting severity due to genetic variants could facilitate diagnosis and choice of therapy. Although computational predictions have been used as evidence for classifying the disease relevance of genetic variants, special tools for predicting disease severity in large scale are missing. Here, we manually curated a dataset containing variants leading to severe and less severe phenotypes and studied the abilities of variation impact predictors to distinguish between them. We found that these tools cannot separate the two groups of variants. Then, we developed a novel machine-learning-based method, PON-PS (http://structure.bmc.lu.se/PON-PS), for the classification of amino acid substitutions associated with benign, severe, and less severe phenotypes. We tested the method using an independent test dataset and variants in four additional proteins. For distinguishing severe and nonsevere variants, PON-PS showed an accuracy of 61% in the test dataset, which is higher than for existing tolerance prediction methods. PON-PS is the first generic tool developed for this task. The tool can be used together with other evidence for improving diagnosis and prognosis and for prioritization of preventive interventions, clinical monitoring, and molecular tests.
Affiliation(s)
- Abhishek Niroula
  - Department of Experimental Medical Science, Lund University, Lund, SE-22184, Sweden
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, Lund, SE-22184, Sweden
11
Masica DL, Karchin R. Towards Increasing the Clinical Relevance of In Silico Methods to Predict Pathogenic Missense Variants. PLoS Comput Biol 2016; 12:e1004725. [PMID: 27171182 PMCID: PMC4865359 DOI: 10.1371/journal.pcbi.1004725] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Indexed: 01/01/2023] Open
Affiliation(s)
- David L. Masica
  - Department of Biomedical Engineering and The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Rachel Karchin
  - Department of Biomedical Engineering and The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland, United States of America
  - Department of Oncology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
12
Niroula A, Vihinen M. Variation Interpretation Predictors: Principles, Types, Performance, and Choice. Hum Mutat 2016; 37:579-97. [DOI: 10.1002/humu.22987] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Received: 11/16/2015] [Accepted: 03/07/2016] [Indexed: 12/18/2022]
Affiliation(s)
- Abhishek Niroula
  - Department of Experimental Medical Science; Lund University; BMC B13 Lund SE-22184 Sweden
- Mauno Vihinen
  - Department of Experimental Medical Science; Lund University; BMC B13 Lund SE-22184 Sweden
13
Masica DL, Sosnay PR, Raraigh KS, Cutting GR, Karchin R. Missense variants in CFTR nucleotide-binding domains predict quantitative phenotypes associated with cystic fibrosis disease severity. Hum Mol Genet 2014; 24:1908-17. [PMID: 25489051 DOI: 10.1093/hmg/ddu607] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022] Open
Abstract
Predicting the impact of genetic variation on human health remains an important and difficult challenge. Often, algorithmic classifiers are tasked with predicting binary traits (e.g. positive or negative for a disease) from missense variation. Though useful, this arrangement is limiting and contrived, because human diseases often comprise a spectrum of severities, rather than a discrete partitioning of patient populations. Furthermore, labeling variants as causal or benign can be error prone, which is problematic for training supervised learning algorithms (the so-called garbage in, garbage out phenomenon). We explore the potential value of training classifiers using continuous-valued quantitative measurements, rather than binary traits. Using 20 variants from cystic fibrosis transmembrane conductance regulator (CFTR) nucleotide-binding domains and six quantitative measures of cystic fibrosis (CF) severity, we trained classifiers to predict CF severity from CFTR variants. Employing cross validation, classifier prediction and measured clinical/functional values were significantly correlated for four of six quantitative traits (correlation P-values from 1.35 × 10⁻⁴ to 4.15 × 10⁻³). Classifiers were also able to stratify variants by three clinically relevant risk categories with 85-100% accuracy, depending on which of the six quantitative traits was used for training. Finally, we characterized 11 additional CFTR variants using clinical sweat chloride testing, two functional assays, or all three diagnostics, and validated our classifier using blind prediction. Predictions were within the measured sweat chloride range for seven of eight variants, and captured the differential impact of specific variants on the two functional assays. This work demonstrates a promising and novel framework for assessing the impact of genetic variation.
Affiliation(s)
- David L Masica
  - Department of Biomedical Engineering and Institute for Computational Medicine, The Johns Hopkins University, Baltimore, MD, USA
- Rachel Karchin
  - Department of Biomedical Engineering and Institute for Computational Medicine, The Johns Hopkins University, Baltimore, MD, USA
  - Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
14
Masica DL, Li S, Douville C, Manola J, Ferris RL, Burtness B, Forastiere AA, Koch WM, Chung CH, Karchin R. Predicting survival in head and neck squamous cell carcinoma from TP53 mutation. Hum Genet 2014; 134:497-507. [PMID: 25108461 DOI: 10.1007/s00439-014-1470-0] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 04/16/2014] [Accepted: 07/17/2014] [Indexed: 12/20/2022]
Abstract
For TP53-mutated head and neck squamous cell carcinomas (HNSCCs), the codon and specific amino acid sequence change resulting from a patient's mutation can be prognostic. Thus, developing a framework to predict patient survival for specific mutations in TP53 would be valuable. There are many bioinformatics and functional methods for predicting the phenotypic impact of genetic variation, but their overall clinical value remains unclear. Here, we assess the ability of 15 different methods to predict HNSCC patient survival from TP53 mutation, using TP53 mutation and clinical data from patients enrolled in E4393 by the Eastern Cooperative Oncology Group (ECOG), which investigated whether TP53 mutations in surgical margins were predictive of disease recurrence. These methods include: server-based computational tools SIFT, PolyPhen-2, and Align-GVGD; our in-house POSE and VEST algorithms; the rules devised in Poeta et al. with and without considerations for splice-site mutations; location of mutation in the DNA-bound TP53 protein structure; and a functional assay measuring WAF1 transactivation in TP53-mutated yeast. We assessed method performance using overall survival (OS) and progression-free survival (PFS) from 420 HNSCC patients, of whom 224 had TP53 mutations. Each mutation was categorized as "disruptive" or "non-disruptive". For each method, we compared the outcome between the disruptive group vs. the non-disruptive group. The rules devised by Poeta et al. with or without our splice-site modification were observed to be superior to others. While the differences in OS (disruptive vs. non-disruptive) appear to be marginally significant (Poeta rules + splice rules, P = 0.089; Poeta rules, P = 0.053), both algorithms identified the disruptive group as having significantly worse PFS outcome (Poeta rules + splice rules, P = 0.011; Poeta rules, P = 0.027). In general, prognostic performance was low among assessed methods. 
Further studies are required to develop and validate methods that can predict functional and clinical significance of TP53 mutations in HNSCC patients.
Affiliation(s)
- David L Masica
  - Department of Biomedical Engineering, Institute for Computational Medicine, The Johns Hopkins University, Baltimore, MD, USA
15
Grosu DS, Hague L, Chelliserry M, Kruglyak KM, Lenta R, Klotzle B, San J, Goldstein WM, Moturi S, Devers P, Woolworth J, Peters E, Elashoff B, Stoerker J, Wolff DJ, Friedman KJ, Highsmith WE, Lin E, Ong FS. Clinical investigational studies for validation of a next-generation sequencing in vitro diagnostic device for cystic fibrosis testing. Expert Rev Mol Diagn 2014; 14:605-22. [DOI: 10.1586/14737159.2014.916618] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 01/05/2023]
16
McKeone R, Wikstrom M, Kiel C, Rakoczy PE. Assessing the correlation between mutant rhodopsin stability and the severity of retinitis pigmentosa. Mol Vis 2014; 20:183-99. [PMID: 24520188 PMCID: PMC3919671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2013] [Accepted: 02/05/2014] [Indexed: 10/27/2022] Open
Abstract
PURPOSE Following a previous study that demonstrated a correlation between rhodopsin stability and the severity of retinitis pigmentosa (RP), we investigated whether predictions of severity can be improved with a regional analysis of this correlation. The association between changes to the stability of the protein and the relative amount of rhodopsin reaching the plasma membrane was assessed. METHODS Crystallography-based estimations of mutant rhodopsin stability were compared with descriptions in the scientific literature of the visual function of mutation carriers to determine the extent of associations between rhodopsin stability and clinical phenotype. To test the findings of this analysis, three residues of a green fluorescent protein (GFP) tagged rhodopsin plasmid were targeted with site-directed random mutagenesis to generate mutant variants with a range of stability changes. These plasmids were transfected into HEK-293 cells, and then flow cytometry was used to measure rhodopsin on the cells' plasma membrane. The GFP signal was used to measure the ratio between this membrane-bound rhodopsin and total cellular rhodopsin. FoldX stability predictions were then compared with the surface staining data and clinical data from the database to characterize the relationship between rhodopsin stability, the severity of RP, and the expression of rhodopsin at the cell surface. RESULTS There was a strong linear correlation between the scale of the destabilization of mutant variants and the severity of retinal disease. A correlation was also seen in vitro between stability and the amount of rhodopsin at the plasma membrane. Rhodopsin is drastically reduced on the surface of cells transfected with variants that differ in their inherent stability from the wild-type by more than 2 kcal/mol. Below this threshold, surface levels are closer to those of the wild-type. 
CONCLUSIONS There is a correlation between the stability of rhodopsin mutations and disease severity and levels of membrane-bound rhodopsin. Measuring membrane-bound rhodopsin with flow cytometry could improve prognoses for poorly characterized mutations and could provide a platform for measuring the effectiveness of treatments.
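As a rough illustration of the threshold described in the RESULTS, the sketch below flags variants whose predicted stability differs from the wild-type by more than 2 kcal/mol and computes a surface-to-total rhodopsin ratio. It is not the authors' analysis code: the `Variant` class, its field names, and the use of the absolute stability change are assumptions made for illustration, and the example values are placeholders rather than data from the study.

```python
from dataclasses import dataclass

# Threshold quoted in the abstract (kcal/mol).
STABILITY_THRESHOLD_KCAL_PER_MOL = 2.0


@dataclass
class Variant:
    """Hypothetical record for one rhodopsin variant (field names are assumptions)."""
    name: str
    ddg: float             # predicted stability change vs. wild-type, kcal/mol
    surface_signal: float  # surface-staining signal, arbitrary units
    gfp_signal: float      # GFP signal, used here as a proxy for total rhodopsin


def surface_ratio(variant: Variant) -> float:
    """Ratio of membrane-bound to total cellular rhodopsin for one variant."""
    if variant.gfp_signal <= 0:
        raise ValueError("GFP signal must be positive to form a ratio")
    return variant.surface_signal / variant.gfp_signal


def expect_reduced_surface(variant: Variant,
                           threshold: float = STABILITY_THRESHOLD_KCAL_PER_MOL) -> bool:
    """True when stability differs from wild-type by more than the threshold.

    Assumption: 'differs by more than 2 kcal/mol' is read as |ddG| > 2.
    """
    return abs(variant.ddg) > threshold


# Placeholder values for illustration only; these are not measurements.
example = Variant(name="hypothetical-variant", ddg=3.1,
                  surface_signal=10.0, gfp_signal=100.0)
print(example.name, surface_ratio(example), expect_reduced_surface(example))
```

Normalising the surface signal by the GFP signal keeps the comparison independent of how strongly each cell expresses the construct, which is why the ratio is used here instead of the raw surface signal.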
Affiliation(s)
- Richard McKeone: Department of Molecular Ophthalmology, Lions Eye Institute, Perth, Western Australia; Centre for Ophthalmology and Visual Science, University of Western Australia, Perth, Western Australia
- Matthew Wikstrom: Centre for Experimental Immunology, Lions Eye Institute, Perth, Western Australia; Centre for Ophthalmology and Visual Science, University of Western Australia, Perth, Western Australia
- Christina Kiel: EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- P. Elizabeth Rakoczy: Centre for Ophthalmology and Visual Science, University of Western Australia, Perth, Western Australia
17
Defining the disease liability of variants in the cystic fibrosis transmembrane conductance regulator gene. Nat Genet 2013; 45:1160-7. [PMID: 23974870 PMCID: PMC3874936 DOI: 10.1038/ng.2745] [Citation(s) in RCA: 453] [Impact Index Per Article: 37.8] [Received: 01/11/2013] [Accepted: 07/30/2013] [Indexed: 12/16/2022]
Abstract
Allelic heterogeneity in disease-causing genes presents a substantial challenge to the translation of genomic variation to clinical practice. Few of the almost 2,000 variants in the cystic fibrosis transmembrane conductance regulator (CFTR) gene have empirical evidence that they cause cystic fibrosis. To address this gap, we collected both genotype and phenotype data for 39,696 cystic fibrosis patients in registries and clinics in North America and Europe. Among these patients, 159 CFTR variants had an allele frequency of ≥0.01%. These variants were evaluated for both clinical severity and functional consequence with 127 (80%) meeting both clinical and functional criteria consistent with disease. Assessment of disease penetrance in 2,188 fathers of cystic fibrosis patients enabled assignment of 12 of the remaining 32 variants as neutral while the other 20 variants remained indeterminate. This study illustrates that sourcing data directly from well-phenotyped subjects can address the gap in our ability to interpret clinically-relevant genomic variation.
18
Wei Q, Dunbrack RL. The role of balanced training and testing data sets for binary classifiers in bioinformatics. PLoS One 2013; 8:e67863. [PMID: 23874456 PMCID: PMC3706434 DOI: 10.1371/journal.pone.0067863] [Citation(s) in RCA: 145] [Impact Index Per Article: 12.1] [Received: 11/10/2012] [Accepted: 05/23/2013] [Indexed: 12/03/2022]
Abstract
Training and testing of conventional machine learning models on binary classification problems depend on the proportions of the two outcomes in the relevant data sets. This may be especially important in practical terms when real-world applications of the classifier are either highly imbalanced or occur in unknown proportions. Intuitively, it may seem sensible to train machine learning models on data similar to the target data in terms of proportions of the two binary outcomes. However, we show that this is not the case using the example of prediction of deleterious and neutral phenotypes of human missense mutations in human genome data, for which the proportion of the binary outcome is unknown. Our results indicate that using balanced training data (50% neutral and 50% deleterious) results in the highest balanced accuracy (the average of True Positive Rate and True Negative Rate), Matthews correlation coefficient, and area under ROC curves, no matter what the proportions of the two phenotypes are in the testing data. Besides balancing the data by undersampling the majority class, other techniques in machine learning include oversampling the minority class, interpolating minority-class data points and various penalties for misclassifying the minority class. However, these techniques are not commonly used in either the missense phenotype prediction problem or in the prediction of disordered residues in proteins, where the imbalance problem is substantial. The appropriate approach depends on the amount of available data and the specific problem at hand.
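The evaluation metrics named in this abstract can be written out directly. The following is a minimal sketch in plain Python, assuming labels are coded 1 (deleterious) and 0 (neutral); the function names are illustrative and do not come from the paper, and the random undersampling shown is only one simple way to obtain a 50/50 training set.

```python
import math
import random


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Average of the true positive rate and the true negative rate."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def undersample_majority(samples, labels, seed=0):
    """Randomly drop majority-class items so both classes have equal size.

    Assumes labels contain only 0 and 1. Returns a shuffled list of
    (sample, label) pairs.
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n = min(len(by_class[0]), len(by_class[1]))
    balanced = [(s, label)
                for label, items in by_class.items()
                for s in rng.sample(items, n)]
    rng.shuffle(balanced)
    return balanced
```

On an imbalanced test set, plain accuracy can look high for a classifier that mostly predicts the majority class, whereas balanced accuracy and the Matthews correlation coefficient weight the two classes more evenly, which is the motivation for reporting them.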
Affiliation(s)
- Qiong Wei: Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania, United States of America
- Roland L. Dunbrack: Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania, United States of America
19
McCorvie TJ, Timson DJ. In silico prediction of the effects of mutations in the human UDP-galactose 4'-epimerase gene: towards a predictive framework for type III galactosemia. Gene 2013; 524:95-104. [PMID: 23644136 DOI: 10.1016/j.gene.2013.04.061] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 01/16/2013] [Revised: 03/30/2013] [Accepted: 04/11/2013] [Indexed: 10/26/2022]
Abstract
The enzyme UDP-galactose 4'-epimerase (GALE) catalyses the reversible epimerisation of both UDP-galactose and UDP-N-acetyl-galactosamine. Deficiency of the human enzyme (hGALE) is associated with type III galactosemia. The majority of known mutations in hGALE are missense and private, which makes clinical guidance difficult. In this study, a bioinformatics approach was employed to analyse the structural effects of each mutation using both the UDP-glucose and UDP-N-acetylglucosamine bound structures of the wild-type protein. Changes to the enzyme's overall stability, substrate/cofactor binding and propensity to aggregate were also predicted. Where data were available, these predictions were in good agreement with previous in vitro and in vivo studies and allowed the differentiation of those mutants that severely impair the enzyme's activity against UDP-galactose. Next, this combination of techniques was applied to another twenty-six reported variants from the NCBI dbSNP database that have yet to be studied, in order to predict their effects. This identified p.I14T, p.R184H and p.G302R as likely severely impairing mutations. Although severely impaired mutants were predicted to decrease the protein's stability, overall predicted stability changes only weakly correlated with residual activity against UDP-galactose. This suggests that other effects, such as changes in cofactor and substrate binding, may also contribute to the mechanism of impairment. Finally, this investigation shows that this combination of different in silico approaches is useful in predicting the effects of mutations and that it could be the basis of an initial prediction of likely clinical severity when new hGALE mutants are discovered.
Affiliation(s)
- Thomas J McCorvie: School of Biological Sciences, Queen's University Belfast, Medical Biology Centre, 97 Lisburn Road, Belfast, BT9 7BL, UK