1
Sebastiano MR, Hadano S, Cesca F, Ermondi G. Preclinical alternative drug discovery programs for monogenic rare diseases. Should small molecules or gene therapy be used? The case of hereditary spastic paraplegias. Drug Discov Today 2024; 29:104138. [PMID: 39154774 DOI: 10.1016/j.drudis.2024.104138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 06/28/2024] [Accepted: 08/13/2024] [Indexed: 08/20/2024]
Abstract
Patients diagnosed with rare diseases and their families search desperately to organize drug discovery campaigns. Alternative models that differ from default paradigms offer real opportunities. There are, however, no clear guidelines for the development of such models, which reduces success rates and raises costs. We address the main challenges in making the discovery of new preclinical treatments more accessible, using rare hereditary paraplegia as a paradigmatic case. First, we discuss the necessary expertise, and the patients' clinical and genetic data. Then, we revisit gene therapy, de novo drug development, and drug repurposing, discussing their applicability. Moreover, we explore a pool of recommended in silico tools for pathogenic variant and protein structure prediction, virtual screening, and experimental validation methods, discussing their strengths and weaknesses. Finally, we focus on successful case applications.
Affiliation(s)
- Matteo Rossi Sebastiano
  - University of Torino, Molecular Biotechnology and Health Sciences Department, CASSMedChem, Piazza Nizza, 10138 Torino, Italy
- Shinji Hadano
  - Molecular Neuropathobiology Laboratory, Department of Physiology, Tokai University School of Medicine, Isehara, Japan
- Fabrizia Cesca
  - Department of Life Sciences, University of Trieste, 34127 Trieste, Italy
- Giuseppe Ermondi
  - University of Torino, Molecular Biotechnology and Health Sciences Department, CASSMedChem, Piazza Nizza, 10138 Torino, Italy

2
Olvera-León R, Zhang F, Offord V, Zhao Y, Tan HK, Gupta P, Pal T, Robles-Espinoza CD, Arriaga-González FG, Matsuyama LSAS, Delage E, Dicks E, Ezquina S, Rowlands CF, Turnbull C, Pharoah P, Perry JRB, Jasin M, Waters AJ, Adams DJ. High-resolution functional mapping of RAD51C by saturation genome editing. Cell 2024:S0092-8674(24)00968-1. [PMID: 39299233 DOI: 10.1016/j.cell.2024.08.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 02/29/2024] [Accepted: 08/20/2024] [Indexed: 09/22/2024]
Abstract
Pathogenic variants in RAD51C confer an elevated risk of breast and ovarian cancer, while individuals homozygous for specific RAD51C alleles may develop Fanconi anemia. Using saturation genome editing (SGE), we functionally assess 9,188 unique variants, including >99.5% of all possible coding sequence single-nucleotide alterations. By computing changes in variant abundance and Gaussian mixture modeling (GMM), we functionally classify 3,094 variants to be disruptive and use clinical truth sets to reveal an accuracy/concordance of variant classification >99.9%. Cell fitness was the primary assay readout allowing us to observe a phenomenon where specific missense variants exhibit distinct depletion kinetics potentially suggesting that they represent hypomorphic alleles. We further explored our exhaustive functional map, revealing critical residues on the RAD51C structure and resolving variants found in cancer-segregating kindred. Furthermore, through interrogation of UK Biobank and a large multi-center ovarian cancer cohort, we find significant associations between SGE-depleted variants and cancer diagnoses.
Affiliation(s)
- Rebeca Olvera-León
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK; Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Querétaro, Querétaro, Mexico
- Fang Zhang
  - Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, USA; Developmental Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Victoria Offord
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Yajie Zhao
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK
- Hong Kee Tan
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Prashant Gupta
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Tuya Pal
  - Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center (VUMC)/Vanderbilt-Ingram Cancer Center (VICC), Nashville, TN, USA
- Carla Daniela Robles-Espinoza
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK; Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Querétaro, Querétaro, Mexico
- Fernanda G Arriaga-González
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK; Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Querétaro, Querétaro, Mexico
- Erwan Delage
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Ed Dicks
  - Department of Public Health and Primary Care, University of Cambridge, Robinson Way, Cambridge, UK
- Suzana Ezquina
  - Department of Public Health and Primary Care, University of Cambridge, Robinson Way, Cambridge, UK
- Charlie F Rowlands
  - Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK
- Clare Turnbull
  - Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK; National Cancer Registration and Analysis Service, National Health Service (NHS) England, London, UK; Cancer Genetics Unit, The Royal Marsden NHS Foundation Trust, London, UK
- Paul Pharoah
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- John R B Perry
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK
- Maria Jasin
  - Developmental Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Andrew J Waters
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- David J Adams
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK

3
Ahmad RM, Ali BR, Al-Jasmi F, Al Dhaheri N, Al Turki S, Kizhakkedath P, Mohamad MS. AI-derived comparative assessment of the performance of pathogenicity prediction tools on missense variants of breast cancer genes. Hum Genomics 2024; 18:99. [PMID: 39256852 PMCID: PMC11389290 DOI: 10.1186/s40246-024-00667-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 08/22/2024] [Indexed: 09/12/2024] [Open Access]
Abstract
Single nucleotide variants (SNVs) can exert substantial and extremely variable impacts on various cellular functions, making accurate predictions of their consequences challenging, albeit crucial, especially in clinical settings such as oncology. Laboratory-based experimental methods for assessing these effects are time-consuming and often impractical, highlighting the importance of in-silico tools for variant impact prediction. However, the performance metrics of currently available tools on breast cancer missense variants from benchmarking databases have not been thoroughly investigated, creating a knowledge gap in the accurate prediction of pathogenicity. In this study, the benchmarking datasets ClinVar and HGMD were used to evaluate 21 Artificial Intelligence (AI)-derived in-silico tools. Missense variants in breast cancer genes were extracted from ClinVar and HGMD professional v2023.1. The HGMD dataset focused on pathogenic variants only; to ensure balance, benign variants for the same genes were included from the ClinVar database. Interestingly, our analysis of both datasets revealed variants across genes with varying penetrance levels (low and moderate in addition to high), reinforcing the value of disease-specific tools. The top-performing tools on the ClinVar dataset were MutPred (Accuracy = 0.73), MetaRNN (Accuracy = 0.72), ClinPred (Accuracy = 0.71), Meta-SVM, REVEL, and Fathmm-XF (Accuracy = 0.70). On the HGMD dataset, they were ClinPred (Accuracy = 0.72), MetaRNN (Accuracy = 0.71), CADD (Accuracy = 0.69), Fathmm-MKL (Accuracy = 0.68), and Fathmm-XF (Accuracy = 0.67). These findings offer clinicians and researchers valuable insights for selecting, improving, and developing effective in-silico tools for breast cancer pathogenicity prediction. Bridging this knowledge gap contributes to advancing precision medicine and enhancing diagnostic and therapeutic approaches for breast cancer patients, with potential implications for other conditions.
Affiliation(s)
- Rahaf M Ahmad
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Bassam R Ali
  - Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Fatma Al-Jasmi
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
  - Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Noura Al Dhaheri
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
  - Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Saeed Al Turki
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Praseetha Kizhakkedath
  - Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Mohd Saberi Mohamad
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
  - Center for Engineering Computational Intelligence, Faculty of Engineering and Technology, Multimedia University, Melaka, Malaysia

4
Braddock FL, Gardner JC, Bhattacharyya N, Sanchez-Pintado B, Costa M, Zarouchlioti C, Szabo A, Lišková P, Cheetham ME, Young RD, Thaung C, Davidson AE, Tuft SJ, Hardcastle AJ. Autosomal dominant stromal corneal dystrophy associated with a SPARCL1 missense variant. Eur J Hum Genet 2024:10.1038/s41431-024-01687-8. [PMID: 39169229 DOI: 10.1038/s41431-024-01687-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Revised: 07/29/2024] [Accepted: 08/14/2024] [Indexed: 08/23/2024] [Open Access]
Abstract
Corneal dystrophies are phenotypically and genetically heterogeneous, often resulting in visual impairment caused by corneal opacification. We investigated the genetic cause of an autosomal dominant corneal stromal dystrophy in a pedigree with eight affected individuals in three generations. Affected individuals had diffuse central stromal opacity, with reduced visual acuity in older family members. Histopathology of affected cornea tissue removed during surgery revealed mild stromal textural alterations with alcianophilic deposits. Whole genome sequence data were generated for four affected individuals. No rare variants (MAF < 0.001) were identified in established corneal dystrophy genes. However, a novel heterozygous missense variant in exon 4 of SPARCL1, NM_004684: c.334G>A; p.(Glu112Lys), which is predicted to be damaging, segregated with disease. SPARC-like protein 1 (SPARCL1) is a secreted matricellular protein involved in cell migration, cell adhesion, tissue repair, and remodelling. Interestingly, SPARCL1 has been shown to regulate decorin. Heterozygous variants in DCN, encoding decorin, cause autosomal dominant congenital stromal corneal dystrophy, suggesting a common pathogenic pathway. Therefore, we performed immunohistochemistry to compare SPARCL1 and decorin localisation in corneal tissue from an affected family member and an unaffected control. Strikingly, the level of decorin was significantly decreased in the corneal stroma of the affected tissue, and SPARCL1 appeared to be retained in the epithelium. In summary, we describe a novel autosomal dominant corneal stromal dystrophy associated with a missense variant in SPARCL1, extending the phenotypic and genetic heterogeneity of inherited corneal disease.
Affiliation(s)
- Jessica C Gardner
  - UCL Institute of Ophthalmology, University College London, London, UK
- Marcos Costa
  - UCL Institute of Ophthalmology, University College London, London, UK
- Anita Szabo
  - UCL Institute of Ophthalmology, University College London, London, UK
- Petra Lišková
  - Department of Paediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
  - Department of Ophthalmology, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
- Robert D Young
  - Structural Biophysics Group, School of Optometry & Vision Sciences, Cardiff University, Cardiff, UK
- Caroline Thaung
  - UCL Institute of Ophthalmology, University College London, London, UK
- Alice E Davidson
  - UCL Institute of Ophthalmology, University College London, London, UK
- Stephen J Tuft
  - UCL Institute of Ophthalmology, University College London, London, UK
  - Department of Corneal and External Eye Disease, Moorfields Eye Hospital, London, UK

5
Andhika NS, Biswas S, Hardcastle C, Green DJ, Ramsden SC, Birney E, Black GC, Sergouniotis PI. Using computational approaches to enhance the interpretation of missense variants in the PAX6 gene. Eur J Hum Genet 2024; 32:1005-1013. [PMID: 38849599 PMCID: PMC11292026 DOI: 10.1038/s41431-024-01638-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 04/12/2024] [Accepted: 05/14/2024] [Indexed: 06/09/2024] [Open Access]
Abstract
The PAX6 gene encodes a highly-conserved transcription factor involved in eye development. Heterozygous loss-of-function variants in PAX6 can cause a range of ophthalmic disorders including aniridia. A key molecular diagnostic challenge is that many PAX6 missense changes are presently classified as variants of uncertain significance. While computational tools can be used to assess the effect of genetic alterations, the accuracy of their predictions varies. Here, we evaluated and optimised the performance of computational prediction tools in relation to PAX6 missense variants. Through inspection of publicly available resources (including HGMD, ClinVar, LOVD and gnomAD), we identified 241 PAX6 missense variants that were used for model training and evaluation. The performance of ten commonly used computational tools was assessed and a threshold optimization approach was utilized to determine optimal cut-off values. Validation studies were subsequently undertaken using PAX6 variants from a local database. AlphaMissense, SIFT4G and REVEL emerged as the best-performing predictors; the optimized thresholds of these tools were 0.967, 0.025, and 0.772, respectively. Combining the prediction from these top-three tools resulted in lower performance compared to using AlphaMissense alone. Tailoring the use of computational tools by employing optimized thresholds specific to PAX6 can enhance algorithmic performance. Our findings have implications for PAX6 variant interpretation in clinical settings.
Affiliation(s)
- Nadya S Andhika
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Susmito Biswas
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
  - Manchester Royal Eye Hospital, Manchester University NHS Foundation Trust, Manchester, UK
- Claire Hardcastle
  - Manchester Centre for Genomic Medicine, Saint Mary's Hospital, Manchester University NHS Foundation Trust, Manchester, UK
- David J Green
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Simon C Ramsden
  - Manchester Centre for Genomic Medicine, Saint Mary's Hospital, Manchester University NHS Foundation Trust, Manchester, UK
- Ewan Birney
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK
- Graeme C Black
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
  - Manchester Centre for Genomic Medicine, Saint Mary's Hospital, Manchester University NHS Foundation Trust, Manchester, UK
- Panagiotis I Sergouniotis
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
  - Manchester Royal Eye Hospital, Manchester University NHS Foundation Trust, Manchester, UK
  - Manchester Centre for Genomic Medicine, Saint Mary's Hospital, Manchester University NHS Foundation Trust, Manchester, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK

6
Kováčová M, Hlaváč V, Koževnikovová R, Rauš K, Gatěk J, Souček P. Artificial Intelligence-Driven Prediction Revealed CFTR Associated with Therapy Outcome of Breast Cancer: A Feasibility Study. Oncology 2024:1-12. [PMID: 39025053 DOI: 10.1159/000540395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 07/09/2024] [Indexed: 07/20/2024]
Abstract
INTRODUCTION: In silico tools capable of predicting the functional consequences of genomic differences between individuals, many of which are AI-driven, have been the most effective over the past two decades for non-synonymous single nucleotide variants (nsSNVs). When appropriately selected for the purpose of the study, a high predictive performance can be expected. In this feasibility study, we investigate the distribution of nsSNVs with an allele frequency below 5%. To classify the putative functional consequence, a tier-based filtration led by AI-driven predictors and a scoring system was implemented in the overall decision-making process, resulting in a list of prioritised genes.
METHODS: The study was conducted on breast cancer patients of homogeneous ethnicity. Germline rare variants were sequenced in genes that influence pharmacokinetic parameters of anticancer drugs or molecular signalling pathways in cancer. After AI-driven functional pathogenicity classification and data mining in pharmacogenomic (PGx) databases, variants were collapsed to the gene level and ranked according to their putative deleterious role.
RESULTS: In breast cancer patients, seven of the twelve genes prioritised based on the predictions were found to be associated with response to oncotherapy, histological grade, and tumour subtype. Most importantly, we showed that the group of patients with at least one rare nsSNV in cystic fibrosis transmembrane conductance regulator (CFTR) had significantly reduced disease-free (log rank, p = 0.002) and overall survival (log rank, p = 0.006).
CONCLUSION: AI-driven in silico analysis with PGx data mining provided an effective approach for navigating functional consequences across the germline genetic background, which can be easily integrated into the overall decision-making process for future studies. The study revealed a statistically significant association with numerous clinicopathological parameters, including treatment response. Our study indicates that CFTR may be involved in the processes influencing the effectiveness of oncotherapy or in the malignant progression of the disease itself.
Affiliation(s)
- Mária Kováčová
  - Third Faculty of Medicine, Charles University, Prague, Czechia
- Viktor Hlaváč
  - Laboratory of Pharmacogenomics, Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Pilsen, Czechia
  - Toxicogenomics Unit, National Institute of Public Health, Prague, Czechia
- Karel Rauš
  - Institute for the Care for Mother and Child, Prague, Czechia
- Jiří Gatěk
  - Department of Surgery, EUC Hospital and University of Tomas Bata in Zlin, Zlin, Czechia
- Pavel Souček
  - Laboratory of Pharmacogenomics, Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Pilsen, Czechia
  - Toxicogenomics Unit, National Institute of Public Health, Prague, Czechia

7
Haghshenas S, Bout HJ, Schijns JM, Levy MA, Kerkhof J, Bhai P, McConkey H, Jenkins ZA, Williams EM, Halliday BJ, Huisman SA, Lauffer P, de Waard V, Witteveen L, Banka S, Brady AF, Galazzi E, van Gils J, Hurst ACE, Kaiser FJ, Lacombe D, Martinez-Monseny AF, Fergelot P, Monteiro FP, Parenti I, Persani L, Santos-Simarro F, Simpson BN, Alders M, Robertson SP, Sadikovic B, Menke LA. Menke-Hennekam syndrome; delineation of domain-specific subtypes with distinct clinical and DNA methylation profiles. HGG ADVANCES 2024; 5:100287. [PMID: 38553851 PMCID: PMC11040166 DOI: 10.1016/j.xhgg.2024.100287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 03/26/2024] [Accepted: 03/26/2024] [Indexed: 04/18/2024] [Open Access]
Abstract
CREB-binding protein (CBP, encoded by CREBBP) and its paralog E1A-associated protein (p300, encoded by EP300) are involved in histone acetylation and transcriptional regulation. Variants that produce a null allele or disrupt the catalytic domain of either protein cause Rubinstein-Taybi syndrome (RSTS), while pathogenic missense and in-frame indel variants in parts of exons 30 and 31 cause phenotypes recently described as Menke-Hennekam syndrome (MKHK). To distinguish MKHK subtypes and define their characteristics, molecular and extended clinical data on 82 individuals (54 unpublished) with variants affecting CBP (n = 71) or p300 (n = 11) (NP_004371.2 residues 1,705-1,875 and NP_001420.2 residues 1,668-1,833, respectively) were summarized. Additionally, genome-wide DNA methylation profiles were assessed in DNA extracted from whole peripheral blood from 54 individuals. Most variants clustered closely around the zinc-binding residues of two zinc-finger domains (ZZ and TAZ2) and within the first α helix of the fourth intrinsically disordered linker (ID4) of CBP/p300. Domain-specific methylation profiles were discerned for the ZZ domain in CBP/p300 (found in nine out of 10 tested individuals) and TAZ2 domain in CBP (in 14 out of 20), while a domain-specific diagnostic episignature was refined for the ID4 domain in CBP/p300 (in 21 out of 21). Phenotypes including intellectual disability of varying degree and distinct physical features were defined for each of the regions. These findings demonstrate existence of at least three MKHK subtypes, which are domain specific (MKHK-ZZ, MKHK-TAZ2, and MKHK-ID4) rather than gene specific (CREBBP/EP300). DNA methylation episignatures enable stratification of molecular pathophysiologic entities within a gene or across a family of paralogous genes.
Affiliation(s)
- Sadegheh Haghshenas
  - Verspeeten Clinical Genome Centre, London Health Sciences Centre, London ON N6A 5W9, Canada
- Hidde J Bout
  - Department of Pediatrics, Emma Children's Hospital, Amsterdam UMC, University of Amsterdam, Amsterdam Reproduction and Development Research Institute, 1105 Amsterdam, AZ, the Netherlands
- Josephine M Schijns
  - Department of Pediatrics, Emma Children's Hospital, Amsterdam UMC, University of Amsterdam, Amsterdam Reproduction and Development Research Institute, 1105 Amsterdam, AZ, the Netherlands
- Michael A Levy
  - Verspeeten Clinical Genome Centre, London Health Sciences Centre, London ON N6A 5W9, Canada
- Jennifer Kerkhof
  - Verspeeten Clinical Genome Centre, London Health Sciences Centre, London ON N6A 5W9, Canada
- Pratibha Bhai
  - Verspeeten Clinical Genome Centre, London Health Sciences Centre, London ON N6A 5W9, Canada
- Haley McConkey
  - Verspeeten Clinical Genome Centre, London Health Sciences Centre, London ON N6A 5W9, Canada
- Zandra A Jenkins
  - Department of Women's and Children's Health, Dunedin School of Medicine, University of Otago, Dunedin 9016, New Zealand
- Ella M Williams
  - Department of Women's and Children's Health, Dunedin School of Medicine, University of Otago, Dunedin 9016, New Zealand
- Benjamin J Halliday
  - Department of Women's and Children's Health, Dunedin School of Medicine, University of Otago, Dunedin 9016, New Zealand
- Sylvia A Huisman
  - Department of Pediatrics, Emma Children's Hospital, Amsterdam UMC, University of Amsterdam, Amsterdam Reproduction and Development Research Institute, 1105 Amsterdam, AZ, the Netherlands; Zodiak, Prinsenstichting, Purmerend, JE 1444, the Netherlands
- Peter Lauffer
  - Department of Human Genetics, Amsterdam UMC, University of Amsterdam, Amsterdam Reproduction and Development Research Institute, Amsterdam 1105 AZ, the Netherlands
- Vivian de Waard
  - Department of Medical Biochemistry, Amsterdam UMC, University of Amsterdam, Amsterdam Cardiovascular Sciences, Amsterdam, AZ 1105, the Netherlands
- Laura Witteveen
  - Department of Pediatrics, Emma Children's Hospital, Amsterdam UMC, University of Amsterdam, Amsterdam Reproduction and Development Research Institute, 1105 Amsterdam, AZ, the Netherlands
- Siddharth Banka
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9WL, UK; Manchester Centre for Genomic Medicine, Saint Mary's Hospital, Manchester University NHS Foundation Trust, Manchester M13 9WL, UK
- Angela F Brady
  - North West Thames Regional Genetics Service, Northwick Park Hospital, Harrow HA1 3UJ, UK
- Elena Galazzi
  - Department of Endocrine & Metabolic Diseases, San Luca Hospital, IRCCS Istituto Auxologico Italiano, 20100 Milan, Italy
- Julien van Gils
  - Centre Hospitalier Universitaire Bordeaux, 33404 Bordeaux, France
- Anna C E Hurst
  - Department of Genetics, University of Alabama, Birmingham, AL 35294-0024, USA
- Frank J Kaiser
  - Institute of Human Genetics, University of Duisburg-Essen, 45122 Essen, Germany; Center for Rare Diseases, University Hospital Essen, 45122 Essen, Germany
- Didier Lacombe
  - Centre Hospitalier Universitaire Bordeaux, 33404 Bordeaux, France
- Antonio F Martinez-Monseny
  - Genètica Clínica, Servei de Medicina Genètica i Molecular, Hospital Sant Joan de Déu, 08950 Barcelona, Spain
- Ilaria Parenti
  - Institute of Human Genetics, University of Duisburg-Essen, 45122 Essen, Germany
- Luca Persani
  - Department of Endocrine & Metabolic Diseases, San Luca Hospital, IRCCS Istituto Auxologico Italiano, 20100 Milan, Italy; Department of Medical Biotechnology and Translational Medicine, University of Milan, 20100 Milan, Italy
- Fernando Santos-Simarro
  - Institute of Medical and Molecular Genetics (INGEMM), Hospital Universitario La Paz, IdiPAZ, CIBERER, ISCIII, 28029 Madrid, Spain; Unit of Molecular Diagnostics and Clinical Genetics, Hospital Universitari Son Espases, Health Research Institute of the Balearic Islands (IdISBa), 07120 Palma, Spain
- Brittany N Simpson
  - Department of Pediatrics, Division of Human Genetics, Cincinnati Children's Hospital Medical Center, University of Cincinnati School of Medicine, Cincinnati, OH 45206, USA
- Mariëlle Alders
  - Department of Human Genetics, Amsterdam UMC, University of Amsterdam, Amsterdam Reproduction and Development Research Institute, Amsterdam 1105 AZ, the Netherlands
- Stephen P Robertson
  - Department of Women's and Children's Health, Dunedin School of Medicine, University of Otago, Dunedin 9016, New Zealand
- Bekim Sadikovic
  - Verspeeten Clinical Genome Centre, London Health Sciences Centre, London ON N6A 5W9, Canada; Department of Pathology and Laboratory Medicine, Western University, London, ON N6A3K7, Canada
- Leonie A Menke
  - Department of Pediatrics, Emma Children's Hospital, Amsterdam UMC, University of Amsterdam, Amsterdam Reproduction and Development Research Institute, 1105 Amsterdam, AZ, the Netherlands

8
Brock DC, Wang M, Hussain HMJ, Rauch DE, Marra M, Pennesi ME, Yang P, Everett L, Ajlan RS, Colbert J, Porto FBO, Matynia A, Gorin MB, Koenekoop RK, Lopez I, Sui R, Zou G, Li Y, Chen R. Comparative analysis of in-silico tools in identifying pathogenic variants in dominant inherited retinal diseases. Hum Mol Genet 2024; 33:945-957. [PMID: 38453143 PMCID: PMC11102593 DOI: 10.1093/hmg/ddae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 02/16/2024] [Accepted: 02/19/2024] [Indexed: 03/09/2024] [Open Access]
Abstract
Inherited retinal diseases (IRDs) are a group of rare genetic eye conditions that cause blindness. Despite progress in identifying genes associated with IRDs, improvements are necessary for classifying rare autosomal dominant (AD) disorders. AD diseases are highly heterogeneous, with causal variants being restricted to specific amino acid changes within certain protein domains, making AD conditions difficult to classify. Here, we aim to determine the top-performing in-silico tools for predicting the pathogenicity of AD IRD variants. We annotated variants from ClinVar and benchmarked 39 variant classifier tools on IRD genes, split by inheritance pattern. Using area-under-the-curve (AUC) analysis, we determined the top-performing tools and defined thresholds for variant pathogenicity. Top-performing tools were assessed using genome sequencing on a cohort of participants with IRDs of unknown etiology. MutScore achieved the highest accuracy within AD genes, yielding an AUC of 0.969. When filtering for AD gain-of-function and dominant negative variants, BayesDel had the highest accuracy with an AUC of 0.997. Five participants with variants in NR2E3, RHO, GUCA1A, and GUCY2D were confirmed to have dominantly inherited disease based on pedigree, phenotype, and segregation analysis. We identified two uncharacterized variants in GUCA1A (c.428T>A, p.Ile143Thr) and RHO (c.631C>G, p.His211Asp) in three participants. Our findings support using a multi-classifier approach comprised of new missense classifier tools to identify pathogenic variants in participants with AD IRDs. Our results provide a foundation for improved genetic diagnosis for people with IRDs.
Affiliation(s)
- Daniel C Brock
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Medical Scientist Training Program, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - Meng Wang
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - Hafiz Muhammad Jafar Hussain
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - David E Rauch
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - Molly Marra
- Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
| | - Mark E Pennesi
- Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
| | - Paul Yang
- Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
| | - Lesley Everett
- Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
| | - Radwan S Ajlan
- Department of Ophthalmology, University of Kansas School of Medicine, 3901 Rainbow Blvd, Kansas City, KS 66160, United States
| | - Jason Colbert
- Department of Ophthalmology, University of Kansas School of Medicine, 3901 Rainbow Blvd, Kansas City, KS 66160, United States
| | - Fernanda Belga Ottoni Porto
- INRET Clínica e Centro de Pesquisa, Rua dos Otoni, 735/507 - Santa Efigênia, Belo Horizonte, MG 30150270, Brazil
- Department of Ophthalmology, Santa Casa de Misericórdia de Belo Horizonte, Av. Francisco Sales, 1111 - Santa Efigênia, Belo Horizonte, MG 30150221, Brazil
- Centro Oftalmológico de Minas Gerais, R. Santa Catarina, 941 - Lourdes, Belo Horizonte, MG 30180070, Brazil
| | - Anna Matynia
- College of Optometry, University of Houston, 4401 Martin Luther King Boulevard, Houston, TX 77004, United States
| | - Michael B Gorin
- Jules Stein Eye Institute, University of California Los Angeles, 100 Stein Plaza, Los Angeles, CA 90095, United States
- Department of Ophthalmology, University of California Los Angeles David Geffen School of Medicine, 10833 Le Conte Ave, Los Angeles, CA 90095, United States
| | - Robert K Koenekoop
- McGill Ocular Genetics Laboratory and Centre, Department of Paediatric Surgery, Human Genetics, and Ophthalmology, McGill University Health Centre, 5252 Boul de Maisonneuve ouest, Montreal, QC H4A 3S5, Canada
| | - Irma Lopez
- McGill Ocular Genetics Laboratory and Centre, Department of Paediatric Surgery, Human Genetics, and Ophthalmology, McGill University Health Centre, 5252 Boul de Maisonneuve ouest, Montreal, QC H4A 3S5, Canada
| | - Ruifang Sui
- Department of Ophthalmology, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, WC67+HW Dongcheng, Beijing 100005, China
| | - Gang Zou
- Department of Ophthalmology, Ningxia Eye Hospital, People's Hospital of Ningxia Hui Autonomous Region, First Affiliated Hospital of Northwest University for Nationalities, Ningxia Clinical Research Center on Diseases of Blindness in Eye, F4RJ+43 Xixia District, Yinchuan, Ningxia, China
| | - Yumei Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - Rui Chen
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| |
|
9
|
Gougeard N, Sancho-Vaello E, Fernández-Murga ML, Martínez-Sinisterra B, Loukili-Hassani B, Häberle J, Marco-Marín C, Rubio V. Use of pure recombinant human enzymes to assess the disease-causing potential of missense mutations in urea cycle disorders, applied to N-acetylglutamate synthase deficiency. J Inherit Metab Dis 2024. [PMID: 38740568 DOI: 10.1002/jimd.12747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 04/21/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
N-acetylglutamate synthase (NAGS) makes acetylglutamate, the essential activator of the first, regulatory enzyme of the urea cycle, carbamoyl phosphate synthetase 1 (CPS1). NAGS deficiency (NAGSD) and CPS1 deficiency (CPS1D) present identical phenotypes. However, they must be distinguished, because NAGSD is cured by substitutive therapy with the N-acetyl-L-glutamate analogue N-carbamyl-L-glutamate, while curative therapy of CPS1D requires liver transplantation. Since their differentiation is done genetically, it is important to ascertain the disease-causing potential of CPS1 and NAGS genetic variants. With this goal, we previously carried out site-directed mutagenesis studies with pure recombinant human CPS1. We could not do the same with human NAGS (HuNAGS) because of enzyme instability, leading to our prior utilization of a bacterial NAGS as an imperfect surrogate of HuNAGS. We now use genuine HuNAGS, stabilized as a chimera of its conserved domain (cHuNAGS) with the maltose binding protein (MBP), and produced in Escherichia coli. MBP-cHuNAGS linker cleavage allowed assessment of the enzymatic properties and thermal stability of cHuNAGS, either wild-type or hosting each one of 23 nonsynonymous single-base changes found in NAGSD patients. For all but one change, disease causation was accounted for by the enzymatic alterations identified, including, depending on the variant, loss of arginine activation, increased Km for glutamate, active site inactivation, decreased thermal stability, and protein misfolding. Our present approach outperforms experimental in vitro use of bacterial NAGS or in silico utilization of prediction servers (including AlphaMissense), illustrating with HuNAGS the value for UCDs of using recombinant enzymes for assessing disease-causation and molecular pathogenesis, and for therapeutic guidance.
Affiliation(s)
- Nadine Gougeard
- Instituto de Biomedicina de Valencia, IBV-CSIC, Valencia, Spain
- Group 739, Centro de Investigación Biomédica en Red de Enfermedades Raras, (CIBERER-ISCIII) at the IBV-CSIC, Valencia, Spain
| | | | | | | | | | - Johannes Häberle
- University Children's Hospital Zurich and Children's Research Centre, Zurich, Switzerland
| | - Clara Marco-Marín
- Instituto de Biomedicina de Valencia, IBV-CSIC, Valencia, Spain
- Group 739, Centro de Investigación Biomédica en Red de Enfermedades Raras, (CIBERER-ISCIII) at the IBV-CSIC, Valencia, Spain
| | - Vicente Rubio
- Instituto de Biomedicina de Valencia, IBV-CSIC, Valencia, Spain
- Group 739, Centro de Investigación Biomédica en Red de Enfermedades Raras, (CIBERER-ISCIII) at the IBV-CSIC, Valencia, Spain
| |
|
10
|
Livesey BJ, Badonyi M, Dias M, Frazer J, Kumar S, Lindorff-Larsen K, McCandlish DM, Orenbuch R, Shearer CA, Muffley L, Foreman J, Glazer AM, Lehner B, Marks DS, Roth FP, Rubin AF, Starita LM, Marsh JA. Guidelines for releasing a variant effect predictor. ARXIV 2024:arXiv:2404.10807v1. [PMID: 38699161 PMCID: PMC11065047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Computational methods for assessing the likely impacts of mutations, known as variant effect predictors (VEPs), are widely used in the assessment and interpretation of human genetic variation, as well as in other applications like protein engineering. Many different VEPs have been released to date, and there is tremendous variability in their underlying algorithms and outputs, and in the ways in which the methodologies and predictions are shared. This leads to considerable challenges for end users in knowing which VEPs to use and how to use them. Here, to address these issues, we provide guidelines and recommendations for the release of novel VEPs. Emphasising open-source availability, transparent methodologies, clear variant effect score interpretations, standardised scales, accessible predictions, and rigorous training data disclosure, we aim to improve the usability and interpretability of VEPs, and promote their integration into analysis and evaluation pipelines. We also provide a large, categorised list of currently available VEPs, aiming to facilitate the discovery and encourage the usage of novel methods within the scientific community.
Affiliation(s)
- Benjamin J. Livesey
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Mihaly Badonyi
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Mafalda Dias
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Jonathan Frazer
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Sushant Kumar
- Department of Medical Biophysics, University of Toronto; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - David M. McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Rose Orenbuch
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | | | - Lara Muffley
- Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Julia Foreman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | | | - Ben Lehner
- Wellcome Sanger Institute, Cambridge, UK; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Debora S. Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Boston, MA, USA
| | - Frederick P. Roth
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Alan F. Rubin
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research; Department of Medical Biology, University of Melbourne, Parkville, Australia
| | - Lea M. Starita
- Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Joseph A. Marsh
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| |
|
11
|
Larrea‐Sebal A, Sasiain I, Jebari‐Benslaiman S, Galicia‐Garcia U, Uribe KB, Benito‐Vicente A, Gracia‐Rubio I, Bediaga‐Bañeres H, Arrasate S, Cenarro A, Civeira F, González‐Díaz H, Martín C. OptiMo-LDLr: An Integrated In Silico Model with Enhanced Predictive Power for LDL Receptor Variants, Unraveling Hot Spot Pathogenic Residues. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2305177. [PMID: 38258479 PMCID: PMC10987110 DOI: 10.1002/advs.202305177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 12/11/2023] [Indexed: 01/24/2024]
Abstract
Familial hypercholesterolemia (FH) is an inherited metabolic disease affecting cholesterol metabolism, with 90% of cases caused by mutations in the LDL receptor gene (LDLR), primarily missense mutations. This study aims to integrate six commonly used predictive software tools to create a new model for predicting LDLR mutation pathogenicity and mapping hot spot residues. Six predictive tools are selected: Polyphen-2, SIFT, MutationTaster, REVEL, VARITY, and MLb-LDLr. Software accuracy is tested with the characterized variants annotated in ClinVar, and all models are integrated into a more accurate one using bioinformatic and machine learning techniques. The resulting optimized model presents a specificity of 96.71% and a sensitivity of 98.36%. Hot spot residues with high potential of pathogenicity appear across all domains except for the signal peptide and the O-linked domain. In addition, translating this information onto the 3D structure of the LDLr highlights potentially pathogenic clusters within the different domains, which may be related to specific biological function. The results of this work provide a powerful tool to classify LDLR pathogenic variants. Moreover, an open-access guide user interface (OptiMo-LDLr) is provided to the scientific community. This study shows that combining several predictive software tools results in a more accurate prediction to help clinicians in FH diagnosis.
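To make the reported sensitivity and specificity concrete, the sketch below combines per-tool calls and scores the result against known labels. A simple majority vote stands in for the integration step purely as an assumption; the abstract states only that bioinformatic and machine learning techniques were used. The helper names and the toy calls are hypothetical.

```python
def majority_vote(calls):
    """Combine boolean calls (True = predicted pathogenic) from several tools.
    A strict majority is required; this is a stand-in for the paper's
    machine-learning integration, which the abstract does not specify."""
    return sum(calls) * 2 > len(calls)


def sensitivity_specificity(predicted, truth):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical calls from three tools for four variants.
    per_variant_calls = [
        [True, True, False],
        [True, False, False],
        [False, False, False],
        [True, True, True],
    ]
    truth = [True, True, False, True]  # illustrative reference labels
    combined = [majority_vote(calls) for calls in per_variant_calls]
    print(sensitivity_specificity(combined, truth))
```

Both metrics need at least one pathogenic and one benign reference variant, otherwise the denominators are zero.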
Affiliation(s)
- Asier Larrea‐Sebal
- Biofisika Institute (UPV/EHU, CSIC), Barrio Sarriena s/n., Leioa, Bizkaia 48940, Spain
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, Leioa, Bizkaia 48940, Spain
- Fundación Biofisika Bizkaia, Barrio Sarriena s/n., Leioa, Bizkaia 48940, Spain
| | - Iñaki Sasiain
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, Leioa, Bizkaia 48940, Spain
| | - Shifa Jebari‐Benslaiman
- Biofisika Institute (UPV/EHU, CSIC), Barrio Sarriena s/n., Leioa, Bizkaia 48940, Spain
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, Leioa, Bizkaia 48940, Spain
| | - Unai Galicia‐Garcia
- Biofisika Institute (UPV/EHU, CSIC), Barrio Sarriena s/n., Leioa, Bizkaia 48940, Spain
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, Leioa, Bizkaia 48940, Spain
| | - Kepa B. Uribe
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, Leioa, Bizkaia 48940, Spain
| | - Asier Benito‐Vicente
- Biofisika Institute (UPV/EHU, CSIC), Barrio Sarriena s/n., Leioa, Bizkaia 48940, Spain
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, Leioa, Bizkaia 48940, Spain
| | - Irene Gracia‐Rubio
- Lipid Unit, Hospital Universitario Miguel Servet, IIS Aragon, CIBERCV, Universidad de Zaragoza, Zaragoza 50009, Spain
| | | | - Sonia Arrasate
- Department of Organic and Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
| | - Ana Cenarro
- Lipid Unit, Hospital Universitario Miguel Servet, IIS Aragon, CIBERCV, Universidad de Zaragoza, Zaragoza 50009, Spain
| | - Fernando Civeira
- Lipid Unit, Hospital Universitario Miguel Servet, IIS Aragon, CIBERCV, Universidad de Zaragoza, Zaragoza 50009, Spain
| | - Humberto González‐Díaz
- Biofisika Institute (UPV/EHU, CSIC), Barrio Sarriena s/n., Leioa, Bizkaia 48940, Spain
- Ikerbasque, Basque Foundation for Science, Bilbao, Bizkaia 48013, Spain
| | - Cesar Martín
- Biofisika Institute (UPV/EHU, CSIC), Barrio Sarriena s/n., Leioa, Bizkaia 48940, Spain
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, Leioa, Bizkaia 48940, Spain
| |
|
12
|
Cox SN, Lo Giudice C, Lavecchia A, Poeta ML, Chiara M, Picardi E, Pesole G. Mitochondrial and Nuclear DNA Variants in Amyotrophic Lateral Sclerosis: Enrichment in the Mitochondrial Control Region and Sirtuin Pathway Genes in Spinal Cord Tissue. Biomolecules 2024; 14:411. [PMID: 38672428 PMCID: PMC11048214 DOI: 10.3390/biom14040411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 03/19/2024] [Accepted: 03/23/2024] [Indexed: 04/28/2024] Open
Abstract
Amyotrophic Lateral Sclerosis (ALS) is a progressive disease with prevalent mitochondrial dysfunctions affecting both upper and lower motor neurons in the motor cortex, brainstem, and spinal cord. Despite mitochondria having their own genome (mtDNA), in humans, most mitochondrial genes are encoded by the nuclear genome (nDNA). Our study aimed to simultaneously screen for nDNA and mtDNA genomes to assess for specific variant enrichment in ALS compared to control tissues. Here, we analysed whole exome (WES) and whole genome (WGS) sequencing data from spinal cord tissues, respectively, of 6 and 12 human donors. A total of 31,257 and 301,241 variants in nuclear-encoded mitochondrial genes were identified from WES and WGS, respectively, while mtDNA reads accounted for 73 and 332 variants. Despite technical differences, both datasets consistently revealed a specific enrichment of variants in the mitochondrial Control Region (CR) and in several of these genes directly associated with mitochondrial dynamics or with Sirtuin pathway genes within ALS tissues. Overall, our data support the hypothesis of a variant burden in specific genes, highlighting potential actionable targets for therapeutic interventions in ALS.
Affiliation(s)
- Sharon Natasha Cox
- Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70126 Bari, Italy
| | - Claudio Lo Giudice
- Institute of Biomedical Technologies, National Research Council, 70126 Bari, Italy
| | - Anna Lavecchia
- Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70126 Bari, Italy
| | - Maria Luana Poeta
- Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70126 Bari, Italy
| | - Matteo Chiara
- Department of Biosciences, University of Milan, 20133 Milan, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, 70126 Bari, Italy
| | - Ernesto Picardi
- Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70126 Bari, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, 70126 Bari, Italy
| | - Graziano Pesole
- Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70126 Bari, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, 70126 Bari, Italy
| |
|
13
|
Saez-Matia A, Ibarluzea MG, M-Alicante S, Muguruza-Montero A, Nuñez E, Ramis R, Ballesteros OR, Lasa-Goicuria D, Fons C, Gallego M, Casis O, Leonardo A, Bergara A, Villarroel A. MLe-KCNQ2: An Artificial Intelligence Model for the Prognosis of Missense KCNQ2 Gene Variants. Int J Mol Sci 2024; 25:2910. [PMID: 38474157 DOI: 10.3390/ijms25052910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 02/27/2024] [Accepted: 02/29/2024] [Indexed: 03/14/2024] Open
Abstract
Despite the increasing availability of genomic data and enhanced data analysis procedures, predicting the severity of associated diseases remains elusive in the absence of clinical descriptors. To address this challenge, we have focused on the KV7.2 voltage-gated potassium channel gene (KCNQ2), known for its link to developmental delays and various epilepsies, including self-limited benign familial neonatal epilepsy and epileptic encephalopathy. Genome-wide tools often exhibit a tendency to overestimate deleterious mutations, frequently overlooking tolerated variants, and lack the capacity to discriminate variant severity. This study introduces a novel approach by evaluating multiple machine learning (ML) protocols and descriptors. The combination of genomic information with a novel Variant Frequency Index (VFI) builds a robust foundation for constructing reliable gene-specific ML models. The ensemble model, MLe-KCNQ2, formed through logistic regression, support vector machine, random forest and gradient boosting algorithms, achieves specificity and sensitivity values surpassing 0.95 (AUC-ROC > 0.98). The ensemble MLe-KCNQ2 model also categorizes pathogenic mutations as benign or severe, with an area under the receiver operating characteristic curve (AUC-ROC) above 0.67. This study not only presents a transferable methodology for accurately classifying KCNQ2 missense variants, but also provides valuable insights for clinical counseling and aids in the determination of variant severity. The research context emphasizes the necessity of precise variant classification, especially for genes like KCNQ2, contributing to the broader understanding of gene-specific challenges in the field of genomic research. The MLe-KCNQ2 model stands as a promising tool for enhancing clinical decision making and prognosis in the realm of KCNQ2-related pathologies.
Affiliation(s)
| | - Markel G Ibarluzea
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Donostia International Physics Center, 20018 Donostia, Spain
| | - Sara M-Alicante
- Instituto Biofisika, CSIC-UPV/EHU, 48940 Leioa, Spain
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
| | | | - Eider Nuñez
- Instituto Biofisika, CSIC-UPV/EHU, 48940 Leioa, Spain
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
| | - Rafael Ramis
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Donostia International Physics Center, 20018 Donostia, Spain
| | - Oscar R Ballesteros
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Centro de Física de Materiales CFM, CSIC-UPV/EHU, 20018 Donostia, Spain
| | | | - Carmen Fons
- Pediatric Neurology Department, Sant Joan de Déu Hospital, Institut de Recerca Sant Joan de Déu, Barcelona University, 08950 Barcelona, Spain
| | - Mónica Gallego
- Departamento de Fisiología, Universidad del País Vasco, UPV/EHU, 01006 Vitoria-Gasteiz, Spain
| | - Oscar Casis
- Departamento de Fisiología, Universidad del País Vasco, UPV/EHU, 01006 Vitoria-Gasteiz, Spain
| | - Aritz Leonardo
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Donostia International Physics Center, 20018 Donostia, Spain
| | - Aitor Bergara
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Donostia International Physics Center, 20018 Donostia, Spain
- Centro de Física de Materiales CFM, CSIC-UPV/EHU, 20018 Donostia, Spain
| | | |
|
14
|
Pathan N, Deng WQ, Di Scipio M, Khan M, Mao S, Morton RW, Lali R, Pigeyre M, Chong MR, Paré G. A method to estimate the contribution of rare coding variants to complex trait heritability. Nat Commun 2024; 15:1245. [PMID: 38336875 PMCID: PMC10858280 DOI: 10.1038/s41467-024-45407-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 01/22/2024] [Indexed: 02/12/2024] Open
Abstract
It has been postulated that rare coding variants (RVs; MAF < 0.01) contribute to the "missing" heritability of complex traits. We developed a framework, the Rare variant heritability (RARity) estimator, to assess RV heritability (h2RV) without assuming a particular genetic architecture. We applied RARity to 31 complex traits in the UK Biobank (n = 167,348) and showed that gene-level RV aggregation suffers from 79% (95% CI: 68-93%) loss of h2RV. Using unaggregated variants, 27 traits had h2RV > 5%, with height having the highest h2RV at 21.9% (95% CI: 19.0-24.8%). The total heritability, including common and rare variants, recovered pedigree-based estimates for 11 traits. RARity can estimate gene-level h2RV, enabling the assessment of gene-level characteristics and revealing 11, previously unreported, gene-phenotype relationships. Finally, we demonstrated that in silico pathogenicity prediction (variant-level) and gene-level annotations do not generally enrich for RVs that over-contribute to complex trait variance, and thus, innovative methods are needed to predict RV functionality.
Affiliation(s)
- Nazia Pathan
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, Canada
| | - Wei Q Deng
- Peter Boris Centre for Addictions Research, St. Joseph's Healthcare Hamilton, Hamilton, Canada
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Canada
| | - Matteo Di Scipio
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Canada
| | - Mohammad Khan
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Canada
| | - Shihong Mao
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
| | - Robert W Morton
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, Canada
| | - Ricky Lali
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
| | - Marie Pigeyre
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Canada
| | - Michael R Chong
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, Canada
- Thrombosis and Atherosclerosis Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton, Canada
| | - Guillaume Paré
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada.
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, Canada.
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada.
- Thrombosis and Atherosclerosis Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton, Canada.
| |
|
15
|
Ciesielski TH, Sirugo G, Iyengar SK, Williams SM. Characterizing the pathogenicity of genetic variants: the consequences of context. NPJ Genom Med 2024; 9:3. [PMID: 38195641 PMCID: PMC10776585 DOI: 10.1038/s41525-023-00386-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 12/15/2023] [Indexed: 01/11/2024] Open
Affiliation(s)
- Timothy H Ciesielski
- The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH, USA.
- Mary Ann Swetland Center for Environmental Health at Case Western Reserve University School of Medicine, Cleveland, OH, USA.
- Ronin Institute, Montclair, NJ, USA.
| | - Giorgio Sirugo
- The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Institute of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Division of Translational Medicine and Human Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Sudha K Iyengar
- The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH, USA
- The Department of Genetics and Genome Sciences at Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Cleveland Institute for Computational Biology, Cleveland, OH, USA
| | - Scott M Williams
- The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH, USA
- The Department of Genetics and Genome Sciences at Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Cleveland Institute for Computational Biology, Cleveland, OH, USA
| |
|
16
|
Gunning AC, Wright CF. Evaluating the use of paralogous protein domains to increase data availability for missense variant classification. Genome Med 2023; 15:110. [PMID: 38087376 PMCID: PMC10714540 DOI: 10.1186/s13073-023-01264-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 11/22/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND: Classification of rare missense variants remains an ongoing challenge in genomic medicine. Evidence of pathogenicity is often sparse, and decisions about how to weigh different evidence classes may be subjective. We used a Bayesian variant classification framework to investigate the performance of variant co-localisation, missense constraint, and aggregating data across paralogous protein domains ("meta-domains").
METHODS: We constructed a database of all possible coding single nucleotide variants in the human genome and used PFam predictions to annotate structurally-equivalent positions across protein domains. We counted the number of pathogenic and benign missense variants at these equivalent positions in the ClinVar database, calculated a regional constraint score for each meta-domain, and assessed this approach versus existing missense constraint metrics for classifying variant pathogenicity and benignity.
RESULTS: Alternative pathogenic missense variants at the same amino acid position in the same protein provide strong evidence of pathogenicity (positive likelihood ratio, LR+ = 85). Additionally, clinically annotated pathogenic or benign missense variants at equivalent positions in different proteins can provide moderate evidence of pathogenicity (LR+ = 7) or benignity (LR+ = 5), respectively. Applying these approaches sequentially (through PM5) increases sensitivity for classifying pathogenic missense variants from 27 to 41%. Missense constraint can also provide strong evidence of pathogenicity for some variants, but its absence provides no evidence of benignity.
CONCLUSIONS: We propose using structurally equivalent positions across related protein domains from different genes to augment evidence for variant co-localisation when classifying novel missense variants. Additionally, we advocate adopting a numerical evidence-based approach to integrating diverse data in variant interpretation.
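The positive likelihood ratios quoted in this abstract can be read through the standard odds form of Bayes' rule, sketched below. The function names and the 10% prior probability of pathogenicity are assumptions chosen for illustration; only the LR+ values of 85 and 7 come from the abstract.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity): how much more often the evidence
    is seen for pathogenic variants than for benign ones."""
    if not 0.0 <= sensitivity <= 1.0 or not 0.0 <= specificity < 1.0:
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


def posterior_probability(prior, likelihood_ratio):
    """Update a prior probability with a likelihood ratio via odds:
    posterior odds = prior odds * LR."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    prior = 0.10  # assumed prior probability of pathogenicity, for illustration only
    for lr in (85, 7):  # LR+ values reported in the abstract
        print(lr, round(posterior_probability(prior, lr), 3))
```

Under the assumed prior, an LR+ of 85 moves the probability of pathogenicity far more than an LR+ of 7, which is the sense in which the abstract calls one evidence type strong and the other moderate.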
Affiliation(s)
- Adam Colin Gunning
- Department of Clinical and Biomedical Sciences (Medical School), Faculty of Health and Life Sciences, University of Exeter, RILD, Barrack Road, Exeter, EX2 5DW, UK.
- Exeter Genomics Laboratory, South West Genomic Laboratory Hub, Royal Devon University Healthcare NHS Foundation Trust, RILD, Barrack Road, Exeter, EX2 5DW, UK.
| | - Caroline Fiona Wright
- Department of Clinical and Biomedical Sciences (Medical School), Faculty of Health and Life Sciences, University of Exeter, RILD, Barrack Road, Exeter, EX2 5DW, UK.
17
Wang S, Wang B, Drury V, Drake S, Sun N, Alkhairo H, Arbelaez J, Duhn C, Bal VH, Langley K, Martin J, Hoekstra PJ, Dietrich A, Xing J, Heiman GA, Tischfield JA, Fernandez TV, Owen MJ, O'Donovan MC, Thapar A, State MW, Willsey AJ. Rare X-linked variants carry predominantly male risk in autism, Tourette syndrome, and ADHD. Nat Commun 2023; 14:8077. [PMID: 38057346 PMCID: PMC10700338 DOI: 10.1038/s41467-023-43776-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 11/18/2023] [Indexed: 12/08/2023] Open
Abstract
Autism spectrum disorder (ASD), Tourette syndrome (TS), and attention-deficit/hyperactivity disorder (ADHD) display strong male sex bias, due to a combination of genetic and biological factors, as well as selective ascertainment. While the hemizygous nature of chromosome X (Chr X) in males has long been postulated as a key point of "male vulnerability", rare genetic variation on this chromosome has not been systematically characterized in large-scale whole exome sequencing studies of "idiopathic" ASD, TS, and ADHD. Here, we take advantage of informative recombinations in simplex ASD families to pinpoint risk-enriched regions on Chr X, within which rare maternally-inherited damaging variants carry substantial risk in males with ASD. We then apply a modified transmission disequilibrium test to 13,052 ASD probands and identify a novel high confidence ASD risk gene at exome-wide significance (MAGEC3). Finally, we observe that rare damaging variants within these risk regions carry similar effect sizes in males with TS or ADHD, further clarifying genetic mechanisms underlying male vulnerability in multiple neurodevelopmental disorders that can be exploited for systematic gene discovery.
Affiliation(s)
- Sheng Wang
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Belinda Wang
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Vanessa Drury
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Sam Drake
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Nawei Sun
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Hasan Alkhairo
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Juan Arbelaez
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Clif Duhn
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - Vanessa H Bal
- Graduate School of Applied and Professional Psychology, Rutgers University, New Brunswick, NJ, USA
| | - Kate Langley
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
- School of Psychology, Cardiff University School of Medicine, Cardiff, Wales, UK
| | - Joanna Martin
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
| | - Pieter J Hoekstra
- University of Groningen, University Medical Center Groningen, Department of Child and Adolescent Psychiatry, Groningen, The Netherlands
- Accare Child Study Center, Groningen, The Netherlands
| | - Andrea Dietrich
- University of Groningen, University Medical Center Groningen, Department of Child and Adolescent Psychiatry, Groningen, The Netherlands
- Accare Child Study Center, Groningen, The Netherlands
| | - Jinchuan Xing
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ, USA
| | - Gary A Heiman
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ, USA
| | - Jay A Tischfield
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ, USA
| | - Thomas V Fernandez
- Yale Child Study Center and Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
| | - Michael J Owen
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
| | - Michael C O'Donovan
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
| | - Anita Thapar
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
| | - Matthew W State
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
| | - A Jeremy Willsey
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA.
- Quantitative Biosciences Institute (QBI), University of California, San Francisco, San Francisco, CA, 94143, USA.
18
Ahmad RM, Ali BR, Al-Jasmi F, Sinnott RO, Al Dhaheri N, Mohamad MS. A review of genetic variant databases and machine learning tools for predicting the pathogenicity of breast cancer. Brief Bioinform 2023; 25:bbad479. [PMID: 38149678 PMCID: PMC10782903 DOI: 10.1093/bib/bbad479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 09/22/2023] [Accepted: 12/04/2023] [Indexed: 12/28/2023] Open
Abstract
Studies continue to uncover contributing risk factors for breast cancer (BC) development including genetic variants. Advances in machine learning and big data generated from genetic sequencing can now be used for predicting BC pathogenicity. However, it is unclear which tool developed for pathogenicity prediction is most suited for predicting the impact and pathogenicity of variant effects. A significant challenge is to determine the most suitable data source for each tool since different tools can yield different prediction results with different data inputs. To this end, this work reviews genetic variant databases and tools used specifically for the prediction of BC pathogenicity. We provide a description of existing genetic variants databases and, where appropriate, the diseases for which they have been established. Through example, we illustrate how they can be used for prediction of BC pathogenicity and discuss their associated advantages and disadvantages. We conclude that the tools that are specialized by training on multiple diverse datasets from different databases for the same disease have enhanced accuracy and specificity and are thereby more helpful to the clinicians in predicting and diagnosing BC as early as possible.
Affiliation(s)
- Rahaf M Ahmad
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
| | - Bassam R Ali
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
| | - Fatma Al-Jasmi
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
| | - Richard O Sinnott
- School of Computing and Information System, Faculty of Engineering and Information Technology, The University of Melbourne, Melbourne, Victoria, Australia
| | - Noura Al Dhaheri
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
| | - Mohd Saberi Mohamad
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
19
Larrea-Sebal A, Jebari-Benslaiman S, Galicia-Garcia U, Jose-Urteaga AS, Uribe KB, Benito-Vicente A, Martín C. Predictive Modeling and Structure Analysis of Genetic Variants in Familial Hypercholesterolemia: Implications for Diagnosis and Protein Interaction Studies. Curr Atheroscler Rep 2023; 25:839-859. [PMID: 37847331 PMCID: PMC10618353 DOI: 10.1007/s11883-023-01154-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/15/2023] [Indexed: 10/18/2023]
Abstract
PURPOSE OF REVIEW Familial hypercholesterolemia (FH) is a hereditary condition characterized by elevated levels of low-density lipoprotein cholesterol (LDL-C), which increases the risk of cardiovascular disease if left untreated. This review aims to discuss the role of bioinformatics tools in evaluating the pathogenicity of missense variants associated with FH. Specifically, it highlights the use of predictive models based on protein sequence, structure, evolutionary conservation, and other relevant features in identifying genetic variants within LDLR, APOB, and PCSK9 genes that contribute to FH. RECENT FINDINGS In recent years, various bioinformatics tools have emerged as valuable resources for analyzing missense variants in FH-related genes. Tools such as REVEL, Varity, and CADD use diverse computational approaches to predict the impact of genetic variants on protein function. These tools consider factors such as sequence conservation, structural alterations, and receptor binding to aid in interpreting the pathogenicity of identified missense variants. While these predictive models offer valuable insights, the accuracy of predictions can vary, especially for proteins with unique characteristics that might not be well represented in the databases used for training. This review emphasizes the significance of utilizing bioinformatics tools for assessing the pathogenicity of FH-associated missense variants. Despite their contributions, a definitive diagnosis of a genetic variant necessitates functional validation through in vitro characterization or cascade screening. This step ensures the precise identification of FH-related variants, leading to more accurate diagnoses. Integrating genetic data with reliable bioinformatics predictions and functional validation can enhance our understanding of the genetic basis of FH, enabling improved diagnosis, risk stratification, and personalized treatment for affected individuals. 
The comprehensive approach outlined in this review promises to advance the management of this inherited disorder, potentially leading to better health outcomes for those affected by FH.
Affiliation(s)
- Asier Larrea-Sebal
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
- Fundación Biofisika Bizkaia, 48940, Leioa, Spain
| | - Shifa Jebari-Benslaiman
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
| | - Unai Galicia-Garcia
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
| | - Ane San Jose-Urteaga
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
| | - Kepa B Uribe
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
| | - Asier Benito-Vicente
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
| | - César Martín
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain.
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain.
20
Shirvanizadeh N, Vihinen M. VariBench, new variation benchmark categories and data sets. Front Bioinform 2023; 3:1248732. [PMID: 37795169 PMCID: PMC10546188 DOI: 10.3389/fbinf.2023.1248732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 09/08/2023] [Indexed: 10/06/2023] Open
Affiliation(s)
| | - Mauno Vihinen
- Department of Experimental Medical Science, Lund University, Lund, Sweden
21
Moran AL, Fehilly JD, Blacque O, Kennedy BN. Gene therapy for RAB28: What can we learn from zebrafish? Vision Res 2023; 210:108270. [PMID: 37321111 DOI: 10.1016/j.visres.2023.108270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Revised: 05/12/2023] [Accepted: 05/12/2023] [Indexed: 06/17/2023]
Abstract
The eye is particularly suited to gene therapy due to its accessibility, immunoprivileged state and compartmentalised structure. Indeed, many clinical trials are underway for therapeutic gene strategies for inherited retinal degenerations (IRDs). However, as there are currently 281 genes associated with IRD, there is still a large unmet need for effective therapies for the majority of IRD-causing genes. In humans, RAB28 null and hypomorphic alleles cause autosomal recessive cone-rod dystrophy (arCORD). Previous work demonstrated that restoring wild type zebrafish Rab28 via germline transgenesis, specifically in cone photoreceptors, is sufficient to rescue the defects in outer segment phagocytosis (OSP) observed in zebrafish rab28-/- knockouts (KO). This rescue suggests that gene therapy for RAB28-associated CORD may be successful by RAB28 gene restoration to cones. It also inspired us to critically consider the scenarios in which zebrafish can provide informative preclinical data for development of gene therapies. Thus, this review focuses on RAB28 biology and disease, and delves into both the opportunities and limitations of using zebrafish as a model for both gene therapy development and as a diagnostic tool for patient variants of unknown significance (VUS).
Affiliation(s)
- Ailis L Moran
- UCD School of Biomolecular and Biomedical Science, University College Dublin, Dublin, Ireland; UCD Conway Institute, University College Dublin, Dublin, Ireland
| | - John D Fehilly
- UCD School of Biomolecular and Biomedical Science, University College Dublin, Dublin, Ireland; UCD Conway Institute, University College Dublin, Dublin, Ireland
| | - Oliver Blacque
- UCD School of Biomolecular and Biomedical Science, University College Dublin, Dublin, Ireland; UCD Conway Institute, University College Dublin, Dublin, Ireland
| | - Breandán N Kennedy
- UCD School of Biomolecular and Biomedical Science, University College Dublin, Dublin, Ireland; UCD Conway Institute, University College Dublin, Dublin, Ireland
22
Costantino F, Breban M. Family studies: A useful tool to better understand spondyloarthritis. Joint Bone Spine 2023; 90:105588. [PMID: 37201576 DOI: 10.1016/j.jbspin.2023.105588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 05/12/2023] [Indexed: 05/20/2023]
Abstract
Spondyloarthritis (SpA) is an immune-mediated disease characterized by a high heritability, reflected by strong familial aggregation. Therefore, family studies are a powerful tool for elucidating the genetic basis of SpA. First, they helped to assess the relative importance of genetic and environmental factors and established the polygenic character of the disease. Family-based designs were also historically used to identify genetic factors of susceptibility through linkage analyses. In SpA, three whole-genome linkage studies were published in the 1990's, unfortunately with few consistent results. After having been put aside for several years in favour of case-control GWAS, there is a renewed interest in family-based designs in particular to detect rare variant associations. This review aims at summarizing what family studies have brought to the field of SpA genetics, from genetic epidemiology studies to the most recent rare variant analyses. It also highlights the potential interest of family history of SpA to help diagnosis and detection of patients at high risk to develop the disease.
Affiliation(s)
- Félicie Costantino
- Rheumatology Department, AP-HP, Ambroise-Paré Hospital, 92100 Boulogne-Billancourt, France; Infection & Inflammation, UMR 1173, Inserm, UVSQ/Université Paris Saclay, 78180 Montigny-Le-Bretonneux, France; Laboratory of Excellence INFLAMEX, Université Paris-Centre, Paris, France.
| | - Maxime Breban
- Rheumatology Department, AP-HP, Ambroise-Paré Hospital, 92100 Boulogne-Billancourt, France; Infection & Inflammation, UMR 1173, Inserm, UVSQ/Université Paris Saclay, 78180 Montigny-Le-Bretonneux, France; Laboratory of Excellence INFLAMEX, Université Paris-Centre, Paris, France
23
Livesey BJ, Marsh JA. Updated benchmarking of variant effect predictors using deep mutational scanning. Mol Syst Biol 2023; 19:e11474. [PMID: 37310135 PMCID: PMC10407742 DOI: 10.15252/msb.202211474] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/23/2022] [Revised: 05/30/2023] [Accepted: 06/02/2023] [Indexed: 06/14/2023] Open
Abstract
The assessment of variant effect predictor (VEP) performance is fraught with biases introduced by benchmarking against clinical observations. In this study, building on our previous work, we use independently generated measurements of protein function from deep mutational scanning (DMS) experiments for 26 human proteins to benchmark 55 different VEPs, while introducing minimal data circularity. Many top-performing VEPs are unsupervised methods including EVE, DeepSequence and ESM-1v, a protein language model that ranked first overall. However, the strong performance of recent supervised VEPs, in particular VARITY, shows that developers are taking data circularity and bias issues seriously. We also assess the performance of DMS and unsupervised VEPs for discriminating between known pathogenic and putatively benign missense variants. Our findings are mixed, demonstrating that some DMS datasets perform exceptionally at variant classification, while others are poor. Notably, we observe a striking correlation between VEP agreement with DMS data and performance in identifying clinically relevant variants, strongly supporting the validity of our rankings and the utility of DMS for independent benchmarking.
Affiliation(s)
- Benjamin J Livesey
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Joseph A Marsh
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
24
Ramakrishnan G, Baakman C, Heijl S, Vroling B, van Horck R, Hiraki J, Xue LC, Huynen MA. Understanding structure-guided variant effect predictions using 3D convolutional neural networks. Front Mol Biosci 2023; 10:1204157. [PMID: 37475887 PMCID: PMC10354367 DOI: 10.3389/fmolb.2023.1204157] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/11/2023] [Accepted: 06/22/2023] [Indexed: 07/22/2023] Open
Abstract
Predicting pathogenicity of missense variants in molecular diagnostics remains a challenge despite the available wealth of data, such as evolutionary information, and the wealth of tools to integrate that data. We describe DeepRank-Mut, a configurable framework designed to extract and learn from physicochemically relevant features of amino acids surrounding missense variants in 3D space. For each variant, various atomic and residue-level features are extracted from its structural environment, including sequence conservation scores of the surrounding amino acids, and stored in multi-channel 3D voxel grids which are then used to train a 3D convolutional neural network (3D-CNN). The resultant model gives a probabilistic estimate of whether a given input variant is disease-causing or benign. We find that the performance of our 3D-CNN model, on independent test datasets, is comparable to other widely used resources which also combine sequence and structural features. Based on the 10-fold cross-validation experiments, we achieve an average accuracy of 0.77 on the independent test datasets. We discuss the contribution of the variant neighborhood in the model's predictive power, in addition to the impact of individual features on the model's performance. Two key features: evolutionary information of residues in the variant neighborhood and their solvent accessibilities were observed to influence the predictions. We also highlight how predictions are impacted by the underlying disease mechanisms of missense mutations and offer insights into understanding these to improve pathogenicity predictions. Our study presents aspects to take into consideration when adopting deep learning approaches for protein structure-guided pathogenicity predictions.
Affiliation(s)
- Gayatri Ramakrishnan
- Department of Medical Biosciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Coos Baakman
- Department of Medical Biosciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Li C. Xue
- Department of Medical Biosciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Martijn A. Huynen
- Department of Medical Biosciences, Radboud University Medical Center, Nijmegen, Netherlands
25
Singh AK, Talseth-Palmer B, Xavier A, Scott RJ, Drabløs F, Sjursen W. Detection of germline variants with pathogenic potential in 48 patients with familial colorectal cancer by using whole exome sequencing. BMC Med Genomics 2023; 16:126. [PMID: 37296477 PMCID: PMC10257304 DOI: 10.1186/s12920-023-01562-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Accepted: 05/30/2023] [Indexed: 06/12/2023] Open
Abstract
BACKGROUND Hereditary genetic mutations causing predisposition to colorectal cancer are accountable for approximately 30% of all colorectal cancer cases. However, only a small fraction of these are high penetrant mutations occurring in DNA mismatch repair genes, causing one of several types of familial colorectal cancer (CRC) syndromes. Most of the mutations are low-penetrant variants, contributing to an increased risk of familial colorectal cancer, and they are often found in additional genes and pathways not previously associated with CRC. The aim of this study was to identify such variants, both high-penetrant and low-penetrant ones. METHODS We performed whole exome sequencing on constitutional DNA extracted from blood of 48 patients suspected of familial colorectal cancer and used multiple in silico prediction tools and available literature-based evidence to detect and investigate genetic variants. RESULTS We identified several causative and some potentially causative germline variants in genes known for their association with colorectal cancer. In addition, we identified several variants in genes not typically included in relevant gene panels for colorectal cancer, including CFTR, PABPC1 and TYRO3, which may be associated with an increased risk for cancer. CONCLUSIONS Identification of variants in additional genes that potentially can be associated with familial colorectal cancer indicates a larger genetic spectrum of this disease, not limited only to mismatch repair genes. Usage of multiple in silico tools based on different methods and combined through a consensus approach increases the sensitivity of predictions and narrows down a large list of variants to the ones that are most likely to be significant.
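A consensus approach of the kind described in the conclusions can be sketched as a simple vote over per-tool calls. This is an illustration only: the abstract does not name the tools, thresholds, or voting rule used, so the tool names, the `min_fraction` parameter, and the example calls below are all hypothetical.

```python
from typing import Mapping


def consensus_call(calls: Mapping[str, bool], min_fraction: float = 0.5) -> bool:
    """Return True when more than `min_fraction` of tools flag the variant as damaging."""
    if not calls:
        raise ValueError("at least one tool call is required")
    if not 0.0 <= min_fraction < 1.0:
        raise ValueError("min_fraction must be in [0, 1)")
    damaging = sum(1 for flagged in calls.values() if flagged)
    return damaging / len(calls) > min_fraction


if __name__ == "__main__":
    # Hypothetical per-tool calls for one variant (True = predicted damaging).
    example = {"tool_a": True, "tool_b": True, "tool_c": False}
    print(consensus_call(example))                    # simple majority
    print(consensus_call(example, min_fraction=0.9))  # near-unanimous rule
```

Lowering `min_fraction` favors sensitivity and raising it favors specificity, so the threshold is a design choice that should be set against the intended use of the shortlist.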
Affiliation(s)
- Ashish Kumar Singh
- Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway.
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway.
| | - Bente Talseth-Palmer
- School of Biomedical Science and Pharmacy, Faculty of Health and Medicine, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
- Møre and Romsdal Hospital Trust, Research Unit, Ålesund, Norway
- NSW Health Pathology, Newcastle, Australia
| | - Alexandre Xavier
- School of Biomedical Science and Pharmacy, Faculty of Health and Medicine, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
| | - Rodney J Scott
- School of Biomedical Science and Pharmacy, Faculty of Health and Medicine, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
- NSW Health Pathology, Newcastle, Australia
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
| | - Wenche Sjursen
- Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
26
Hopkins CE, McCormick K, Brock T, Wood M, Ruggiero S, Mcbride K, Kim C, Lawson JA, Helbig I, Bainbridge MN. Clinical variants in Caenorhabditis elegans expressing human STXBP1 reveal a novel class of pathogenic variants and classify variants of uncertain significance. Genet Med Open 2023; 1:100823. [PMID: 38827422 PMCID: PMC11141691 DOI: 10.1016/j.gimo.2023.100823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Purpose Modeling disease variants in animals is useful for drug discovery, understanding disease pathology, and classifying variants of uncertain significance (VUS) as pathogenic or benign. Methods Using Clustered Regularly Interspaced Short Palindromic Repeats, we performed a Whole-gene Humanized Animal Model procedure to replace the coding sequence of the animal model's unc-18 ortholog with the coding sequence for the human STXBP1 gene. Next, we used Clustered Regularly Interspaced Short Palindromic Repeats to introduce precise point variants in the Whole-gene Humanized Animal Model-humanized STXBP1 locus from 3 clinical categories (benign, pathogenic, and VUS). Twenty-six phenotypic features extracted from video recordings were used to train machine learning classifiers on 25 pathogenic and 32 benign variants. Results Using multiple models, we were able to obtain a diagnostic sensitivity near 0.9. Twenty-three VUS were also interrogated and 8 of 23 (34.8%) were observed to be functionally abnormal. Interestingly, unsupervised clustering identified 2 distinct subsets of known pathogenic variants with distinct phenotypic features; both p.Tyr75Cys and p.Arg406Cys cluster away from other variants and show an increase in swim speed compared with hSTXBP1 worms. This leads to the hypothesis that the mechanism of disease for these 2 variants may differ from most STXBP1-mutated patients and may account for some of the clinical heterogeneity observed in the patient population. Conclusion We have demonstrated that automated analysis of a small animal system is an effective, scalable, and fast way to understand functional consequences of variants in STXBP1 and identify variant-specific intensities of aberrant activity suggesting a genotype-to-phenotype correlation is likely to occur in human clinical variations of STXBP1.
Affiliation(s)
| | - Sarah Ruggiero
- Division of Neurology, Children’s Hospital of Philadelphia, Philadelphia, PA
- The Epilepsy NeuroGenetics Initiative (ENGIN), Children’s Hospital of Philadelphia, Philadelphia, PA
- Department of Biomedical and Health Informatics (DBHi), Children’s Hospital of Philadelphia, Philadelphia, PA
- University of Pennsylvania, Neuroscience Program, Philadelphia, PA
| | - Ingo Helbig
- Division of Neurology, Children’s Hospital of Philadelphia, Philadelphia, PA
- The Epilepsy NeuroGenetics Initiative (ENGIN), Children’s Hospital of Philadelphia, Philadelphia, PA
- Department of Biomedical and Health Informatics (DBHi), Children’s Hospital of Philadelphia, Philadelphia, PA
- University of Pennsylvania, Neuroscience Program, Philadelphia, PA
| | - Matthew N. Bainbridge
- Codified Genomics, LLC, Houston, TX
- Rady Children’s Institute for Genomic Medicine, San Diego, CA
27
Vivekanandam V, Ellmers R, Jayaseelan D, Houlden H, Männikkö R, Hanna MG. In silico versus functional characterization of genetic variants: lessons from muscle channelopathies. Brain 2023; 146:1316-1321. [PMID: 36382348 DOI: 10.1093/brain/awac431] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/15/2022] [Revised: 10/04/2022] [Accepted: 11/06/2022] [Indexed: 11/17/2022]
Abstract
Accurate determination of the pathogenicity of missense genetic variants of uncertain significance is a huge challenge for implementing genetic data in clinical practice. In silico predictive tools are used to score variants' pathogenicity. However, their value in clinical settings is often unclear, as they have not usually been validated against robust functional assays. We compared nine widely used in silico predictive tools, including more recently developed tools (EVE and REVEL) with detailed cell-based electrophysiology, for 126 CLCN1 variants discovered in patients with the skeletal muscle channelopathy myotonia congenita. We found poor accuracy for most tools. The highest accuracy was obtained with MutationTaster (84.58%) and REVEL (82.54%). Both of these scores showed poor specificity, although specificity was better using EVE. Combining methods based on concordance improved performance overall but still lacked specificity. Our calculated statistics for the predictive tools were different to reported values for other genes in the literature, suggesting that the utility of the tools varies between genes. Overall, current predictive tools for this chloride channel are not reliable for clinical use, and tools with better specificity are urgently required. Improving the accuracy of predictive tools is a wider issue and a huge challenge for effective clinical implementation of genetic data.
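The accuracy and specificity figures quoted in this abstract are the kind of summary statistics that come from comparing each tool's calls against the functional assay. The following is a minimal Python sketch of that comparison, not the authors' analysis: the function names and the six toy calls are hypothetical and are not data from the study.

```python
def confusion_counts(predicted, functional):
    """Count (TP, FP, TN, FN).

    True means 'predicted pathogenic' in `predicted` and
    'functionally abnormal' in `functional`.
    """
    tp = fp = tn = fn = 0
    for pred, truth in zip(predicted, functional):
        if pred and truth:
            tp += 1
        elif pred and not truth:
            fp += 1
        elif not pred and not truth:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion counts.

    Assumes at least one truly abnormal and one truly normal variant,
    otherwise a division by zero occurs.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy calls for six variants (illustration only).
tool_calls = [True, True, True, False, True, False]
assay_calls = [True, True, False, False, False, True]

metrics = summary_metrics(*confusion_counts(tool_calls, assay_calls))
print(metrics)
```

A tool that labels most variants as pathogenic can keep sensitivity high while specificity stays low, which is why the abstract reports the two separately rather than accuracy alone.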
Affiliation(s)
- Vinojini Vivekanandam
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Rebecca Ellmers
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Dipa Jayaseelan
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Henry Houlden
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Roope Männikkö
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Michael G Hanna
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
28
Cannon S, Williams M, Gunning AC, Wright CF. Evaluation of in silico pathogenicity prediction tools for the classification of small in-frame indels. BMC Med Genomics 2023; 16:36. [PMID: 36855133 PMCID: PMC9972633 DOI: 10.1186/s12920-023-01454-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 02/09/2023] [Indexed: 03/02/2023]
Abstract
BACKGROUND: The use of in silico pathogenicity predictions as evidence when interpreting genetic variants is widely accepted as part of standard variant classification guidelines. Although numerous algorithms have been developed and evaluated for classifying missense variants, in-frame insertions/deletions (indels) have been much less well studied. METHODS: We created a dataset of 3964 small (< 100 bp) indels predicted to result in in-frame amino acid insertions or deletions using data from gnomAD v3.1 (minor allele frequency of 1-5%), ClinVar and the Deciphering Developmental Disorders (DDD) study. We used this dataset to evaluate the performance of nine pathogenicity predictor tools: CADD, CAPICE, FATHMM-indel, MutPred-Indel, MutationTaster2021, PROVEAN, SIFT-indel, VEST-indel and VVP. RESULTS: Our dataset consisted of 2224 benign/likely benign and 1740 pathogenic/likely pathogenic variants from gnomAD (n = 809), ClinVar (n = 2882) and DDD (n = 273). We were able to generate scores across all tools for 91% of the variants, with areas under the ROC curve (AUC) of 0.81-0.96 based on the published recommended thresholds. To avoid biases caused by inclusion of our dataset in the tools' training data, we also evaluated just DDD variants not present in either gnomAD or ClinVar (70 pathogenic and 81 benign). Using this subset, the AUC of all tools decreased substantially to 0.64-0.87. Several of the tools performed similarly; however, VEST-indel had the highest AUCs of 0.93 (full dataset) and 0.87 (DDD subset). CONCLUSIONS: Algorithms designed for predicting the pathogenicity of in-frame indels perform well enough to aid clinical variant classification in a similar manner to missense prediction tools.
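For readers less familiar with the AUC values compared here, the area under the ROC curve equals the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one. The Python sketch below computes it directly from that definition. It is an illustration only, not the study's pipeline: the function name and example scores are hypothetical, and it assumes that higher scores mean "more likely pathogenic", which does not hold for every tool (some report lower scores for more deleterious variants, so those scores would need to be negated first).

```python
def roc_auc(pathogenic_scores, benign_scores):
    """AUC as the fraction of (pathogenic, benign) pairs ranked correctly.

    Ties contribute 0.5. Assumes higher score = more likely pathogenic.
    """
    if not pathogenic_scores or not benign_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pathogenic_scores:
        for b in benign_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(pathogenic_scores) * len(benign_scores))


# Hypothetical scores (illustration only, not data from the study).
example_auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(round(example_auc, 3))
```

The pairwise loop is quadratic in the number of variants, which is acceptable for a few thousand variants; a rank-based implementation would be preferable for larger sets.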
Affiliation(s)
- S Cannon
- Department of Clinical and Biomedical Sciences (Medical School), Faculty of Health and Life Sciences, University of Exeter, Research, Innovation, Learning and Development Building, Royal Devon and Exeter Hospital, Barrack Road, Exeter, EX2 5DW, UK
- M Williams
- Department of Clinical and Biomedical Sciences (Medical School), Faculty of Health and Life Sciences, University of Exeter, Research, Innovation, Learning and Development Building, Royal Devon and Exeter Hospital, Barrack Road, Exeter, EX2 5DW, UK
- A C Gunning
- Department of Clinical and Biomedical Sciences (Medical School), Faculty of Health and Life Sciences, University of Exeter, Research, Innovation, Learning and Development Building, Royal Devon and Exeter Hospital, Barrack Road, Exeter, EX2 5DW, UK
- C F Wright
- Department of Clinical and Biomedical Sciences (Medical School), Faculty of Health and Life Sciences, University of Exeter, Research, Innovation, Learning and Development Building, Royal Devon and Exeter Hospital, Barrack Road, Exeter, EX2 5DW, UK.
29
Sun J, Kulandaisamy A, Liu J, Hu K, Gromiha MM, Zhang Y. Machine learning in computational modelling of membrane protein sequences and structures: From methodologies to applications. Comput Struct Biotechnol J 2023; 21:1205-1226. [PMID: 36817959 PMCID: PMC9932300 DOI: 10.1016/j.csbj.2023.01.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/16/2022] [Revised: 01/16/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023]
Abstract
Membrane proteins mediate a wide spectrum of biological processes, such as signal transduction and cell communication. Due to the arduous and costly nature inherent to the experimental process, membrane proteins have long been devoid of well-resolved atomic-level tertiary structures and, consequently, the understanding of their functional roles underlying a multitude of life activities has been hampered. Currently, computational tools dedicated to furthering the structure-function understanding are primarily focused on utilizing intelligent algorithms to address a variety of site-wise prediction problems (e.g., topology and interaction sites), but are scattered across different computing sources. Moreover, the recent advent of deep learning techniques has immensely expedited the development of computational tools for membrane protein-related prediction problems. Given the growing number of applications optimized particularly by manifold deep neural networks, we herein provide a review on the current status of computational strategies mainly in membrane protein type classification, topology identification, interaction site detection, and pathogenic effect prediction. Meanwhile, we provide an overview of how the entire prediction process proceeds, including database collection, data pre-processing, feature extraction, and method selection. This review is expected to be useful for developing more extendable computational tools specific to membrane proteins.
Affiliation(s)
- Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Headington, Oxford OX3 7LD, UK
- Arulsamy Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Jacklyn Liu
- UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK
- Kai Hu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
- M. Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India (corresponding author)
- Yuan Zhang
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China (corresponding author)
30
Valverde-Hernández JC, Flores-Cruz A, Chavarría-Soley G, Silva de la Fuente S, Campos-Sánchez R. Frequencies of variants in genes associated with dyslipidemias identified in Costa Rican genomes. Front Genet 2023; 14:1114774. [PMID: 37065472 PMCID: PMC10098023 DOI: 10.3389/fgene.2023.1114774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 03/14/2023] [Indexed: 04/18/2023]
Abstract
Dyslipidemias are risk factors in diseases of significant importance to public health, such as atherosclerosis, a condition that contributes to the development of cardiovascular disease. Unhealthy lifestyles, the pre-existence of diseases, and the accumulation of genetic variants in some loci contribute to the development of dyslipidemia. The genetic causality behind these diseases has been studied primarily in populations with extensive European ancestry. Only some studies have explored this topic in Costa Rica, and none have focused on identifying variants that can alter blood lipid levels and quantifying their frequency. To fill this gap, this study focused on identifying variants in 69 genes involved in lipid metabolism using genomes from two studies in Costa Rica. We contrasted the allelic frequencies with those of groups reported in the 1000 Genomes Project and gnomAD and identified potential variants that could influence the development of dyslipidemias. In total, we detected 2,600 variants in the evaluated regions. However, after various filtering steps, we obtained 18 variants that have the potential to alter the function of 16 genes; of these, nine have pharmacogenomic or protective implications, eight have high risk in Variant Effect Predictor, and eight were found in other Latin American genetic studies of lipid alterations and the development of dyslipidemia. Some of these variants have been linked to changes in blood lipid levels in other global studies and databases. In future studies, we propose to confirm at least 40 variants of interest from 23 genes in a larger cohort from Costa Rica and Latin American populations to determine their relevance regarding the genetic burden for dyslipidemia. Additionally, more complex studies should arise that include diverse clinical, environmental, and genetic data from patients and controls and functional validation of the variants.
Affiliation(s)
- Andrés Flores-Cruz
- Centro de Investigación en Biología Celular y Molecular, University of Costa Rica, San José, Costa Rica
- Gabriela Chavarría-Soley
- Centro de Investigación en Biología Celular y Molecular, University of Costa Rica, San José, Costa Rica
- Escuela de Biología, University of Costa Rica, San José, Costa Rica
- Sandra Silva de la Fuente
- Centro de Investigación en Biología Celular y Molecular, University of Costa Rica, San José, Costa Rica
- Rebeca Campos-Sánchez
- Centro de Investigación en Biología Celular y Molecular, University of Costa Rica, San José, Costa Rica
- *Correspondence: Rebeca Campos-Sánchez,
31
Alenezi WM, Fierheller CT, Serruya C, Revil T, Oros KK, Subramanian DN, Bruce J, Spiegelman D, Pugh T, Campbell IG, Mes-Masson AM, Provencher D, Foulkes WD, Haffaf ZE, Rouleau G, Bouchard L, Greenwood CMT, Ragoussis J, Tonin PN. Genetic analyses of DNA repair pathway associated genes implicate new candidate cancer predisposing genes in ancestrally defined ovarian cancer cases. Front Oncol 2023; 13:1111191. [PMID: 36969007 PMCID: PMC10030840 DOI: 10.3389/fonc.2023.1111191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 02/06/2023] [Indexed: 03/29/2023]
Abstract
Not all familial ovarian cancer (OC) cases are explained by pathogenic germline variants in known risk genes. A candidate gene approach involving DNA repair pathway genes was applied to identify rare recurring pathogenic variants in familial OC cases not associated with known OC risk genes from a population exhibiting genetic drift. Whole exome sequencing (WES) data of 15 OC cases from 13 families tested negative for pathogenic variants in known OC risk genes were investigated for candidate variants in 468 DNA repair pathway genes. Filtering and prioritization criteria were applied to WES data to select top candidates for further analyses. Candidates were genotyped in ancestry defined study groups of 214 familial and 998 sporadic OC or breast cancer (BC) cases and 1025 population-matched controls and screened for additional carriers in 605 population-matched OC cases. The candidate genes were also analyzed in WES data from 937 familial or sporadic OC cases of diverse ancestries. Top candidate variants in ERCC5, EXO1, FANCC, NEIL1 and NTHL1 were identified in 5/13 (39%) OC families. Collectively, candidate variants were identified in 7/435 (1.6%) sporadic OC cases and 1/566 (0.2%) sporadic BC cases versus 1/1025 (0.1%) controls. Additional carriers were identified in 6/605 (0.9%) OC cases. Tumour DNA from ERCC5, NEIL1 and NTHL1 variant carriers exhibited loss of the wild-type allele. Carriers of various candidate variants in these genes were identified in 31/937 (3.3%) OC cases of diverse ancestries versus 0-0.004% in cancer-free controls. The strategy of applying a candidate gene approach in a population exhibiting genetic drift identified new candidate OC predisposition variants in DNA repair pathway genes.
Affiliation(s)
- Wejdan M. Alenezi
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Cancer Research Program, Centre for Translational Biology, The Research Institute of McGill University Health Centre, Montreal, QC, Canada
- Department of Medical Laboratory Technology, Taibah University, Medina, Saudi Arabia
- Caitlin T. Fierheller
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Cancer Research Program, Centre for Translational Biology, The Research Institute of McGill University Health Centre, Montreal, QC, Canada
- Corinne Serruya
- Cancer Research Program, Centre for Translational Biology, The Research Institute of McGill University Health Centre, Montreal, QC, Canada
- Timothée Revil
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill Genome Centre, McGill University, Montreal, QC, Canada
- Kathleen K. Oros
- Lady Davis Institute for Medical Research of the Jewish General Hospital, Montreal, QC, Canada
- Deepak N. Subramanian
- Cancer Genetics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Jeffrey Bruce
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Dan Spiegelman
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Montreal Neurological Institute, McGill University, Montreal, QC, Canada
- Trevor Pugh
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Ian G. Campbell
- Cancer Genetics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
- Anne-Marie Mes-Masson
- Centre de recherche du Centre hospitalier de l’Université de Montréal and Institut du cancer de Montréal, Montreal, QC, Canada
- Department of Medicine, Université de Montréal, Montreal, QC, Canada
- Diane Provencher
- Centre de recherche du Centre hospitalier de l’Université de Montréal and Institut du cancer de Montréal, Montreal, QC, Canada
- Division of Gynecologic Oncology, Université de Montréal, Montreal, QC, Canada
- William D. Foulkes
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Cancer Research Program, Centre for Translational Biology, The Research Institute of McGill University Health Centre, Montreal, QC, Canada
- Lady Davis Institute for Medical Research of the Jewish General Hospital, Montreal, QC, Canada
- Department of Medical Genetics, McGill University Health Centre, Montreal, QC, Canada
- Department of Medicine, McGill University, Montreal, QC, Canada
- Gerald Bronfman Department of Oncology, McGill University, Montreal, QC, Canada
- Zaki El Haffaf
- Centre de recherche du Centre hospitalier de l’Université de Montréal and Institut du cancer de Montréal, Montreal, QC, Canada
- Service de Médecine Génique, Centre Hospitalier de l’Université de Montréal, Montreal, QC, Canada
- Guy Rouleau
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Montreal Neurological Institute, McGill University, Montreal, QC, Canada
- Luigi Bouchard
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
- Department of Medical Biology, Centres intégrés universitaires de santé et de services sociaux du Saguenay-Lac-Saint-Jean hôpital Universitaire de Chicoutimi, Saguenay, QC, Canada
- Centre de Recherche du Centre hospitalier l’Université de Sherbrooke, Sherbrooke, QC, Canada
- Celia M. T. Greenwood
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Lady Davis Institute for Medical Research of the Jewish General Hospital, Montreal, QC, Canada
- Gerald Bronfman Department of Oncology, McGill University, Montreal, QC, Canada
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada
- Jiannis Ragoussis
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill Genome Centre, McGill University, Montreal, QC, Canada
- Patricia N. Tonin
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Cancer Research Program, Centre for Translational Biology, The Research Institute of McGill University Health Centre, Montreal, QC, Canada
- Department of Medicine, McGill University, Montreal, QC, Canada
- *Correspondence: Patricia N. Tonin,
32
Pejaver V, Byrne AB, Feng BJ, Pagel KA, Mooney SD, Karchin R, O'Donnell-Luria A, Harrison SM, Tavtigian SV, Greenblatt MS, Biesecker LG, Radivojac P, Brenner SE. Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria. Am J Hum Genet 2022; 109:2163-2177. [PMID: 36413997 PMCID: PMC9748256 DOI: 10.1016/j.ajhg.2022.10.013] [Citation(s) in RCA: 149] [Impact Index Per Article: 74.5] [Received: 03/18/2022] [Accepted: 10/21/2022] [Indexed: 11/23/2022]
Abstract
Recommendations from the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) for interpreting sequence variants specify the use of computational predictors as "supporting" level of evidence for pathogenicity or benignity using criteria PP3 and BP4, respectively. However, score intervals defined by tool developers, and ACMG/AMP recommendations that require the consensus of multiple predictors, lack quantitative support. Previously, we described a probabilistic framework that quantified the strengths of evidence (supporting, moderate, strong, very strong) within ACMG/AMP recommendations. We have extended this framework to computational predictors and introduce a new standard that converts a tool's scores to PP3 and BP4 evidence strengths. Our approach is based on estimating the local positive predictive value and can calibrate any computational tool or other continuous-scale evidence on any variant type. We estimate thresholds (score intervals) corresponding to each strength of evidence for pathogenicity and benignity for thirteen missense variant interpretation tools, using carefully assembled independent data sets. Most tools achieved supporting evidence level for both pathogenic and benign classification using newly established thresholds. Multiple tools reached score thresholds justifying moderate and several reached strong evidence levels. One tool reached very strong evidence level for benign classification on some variants. Based on these findings, we provide recommendations for evidence-based revisions of the PP3 and BP4 ACMG/AMP criteria using individual tools and future assessment of computational methods for clinical interpretation.
Affiliation(s)
- Vikas Pejaver
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98195, USA
- Alicia B Byrne
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Bing-Jian Feng
- Department of Dermatology, University of Utah, Salt Lake City, UT 84132, USA; Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
- Kymberleigh A Pagel
- The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, MD 21218, USA
- Sean D Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98195, USA
- Rachel Karchin
- The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, MD 21218, USA; Departments of Biomedical Engineering, Oncology, and Computer Science, The Johns Hopkins University, Baltimore, MD 21218, USA
- Anne O'Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA 02115, USA
- Steven M Harrison
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Ambry Genetics, Aliso Viejo, CA 92656, USA
- Sean V Tavtigian
- Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, UT 84112, USA
- Marc S Greenblatt
- Department of Medicine and University of Vermont Cancer Center, University of Vermont, Larner College of Medicine, Burlington, VT 05405, USA
- Leslie G Biesecker
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA.
- Steven E Brenner
- Department of Plant and Microbial Biology and Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
33
Beaumont RN, Wright CF. Estimating diagnostic noise in panel-based genomic analysis. Genet Med 2022; 24:2042-2050. [PMID: 35920826 DOI: 10.1016/j.gim.2022.06.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/23/2022] [Revised: 06/21/2022] [Accepted: 06/21/2022] [Indexed: 11/26/2022]
Abstract
PURPOSE: Gene panels with a series of strict variant filtering rules are often used for clinical analysis of exomes and genomes. Panel sizes vary, affecting the test's sensitivity and specificity. We investigated the background rate of candidate variants in a population setting using gene panels developed to diagnose a range of heterogeneous monogenic diseases. METHODS: We used the Gene2Phenotype database with the Variant Effect Predictor plugin to identify rare nonsynonymous variants in exome sequence data from 200,643 individuals in UK Biobank. We evaluated 5 clinically curated gene panels of varying sizes (50-1700 genes). RESULTS: Bigger gene panels resulted in more prioritized variants, varying from an average of approximately 0.3 to 3.5 variants per person. The number of individuals with prioritized variants varied linearly with coding sequence length for monoallelic genes (∼300 individuals per 1000 base pairs) and quadratically for biallelic genes, with notable outliers. CONCLUSION: Although large gene panels may be the best strategy to maximize diagnostic yield in genetically heterogeneous diseases, they frequently prioritize likely benign variants requiring follow-up. Most individuals have ≥1 rare nonsynonymous variant in panels containing >500 disease genes. Extreme caution should be applied when interpreting candidate variants, particularly in the absence of relevant phenotypes.
Affiliation(s)
- Robin N Beaumont
- Institute of Biomedical and Clinical Science, College of Medicine and Health, University of Exeter Medical School, University of Exeter, Exeter, United Kingdom
- Caroline F Wright
- Institute of Biomedical and Clinical Science, College of Medicine and Health, University of Exeter Medical School, University of Exeter, Exeter, United Kingdom.
34
Katsonis P, Wilhelm K, Williams A, Lichtarge O. Genome interpretation using in silico predictors of variant impact. Hum Genet 2022; 141:1549-1577. [PMID: 35488922 PMCID: PMC9055222 DOI: 10.1007/s00439-022-02457-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/02/2021] [Accepted: 04/17/2022] [Indexed: 02/06/2023]
Abstract
Estimating the effects of variants found in disease driver genes opens the door to personalized therapeutic opportunities. Clinical associations and laboratory experiments can only characterize a tiny fraction of all the available variants, leaving the majority as variants of unknown significance (VUS). In silico methods bridge this gap by providing instant estimates on a large scale, most often based on the numerous genetic differences between species. Despite concerns that these methods may lack reliability in individual subjects, their numerous practical applications over cohorts suggest they are already helpful and have a role to play in genome interpretation when used at the proper scale and context. In this review, we aim to gain insights into the training and validation of these variant effect predicting methods and illustrate representative types of experimental and clinical applications. Objective performance assessments using various datasets that are not yet published indicate the strengths and limitations of each method. These show that cautious use of in silico variant impact predictors is essential for addressing genome interpretation challenges.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Kevin Wilhelm
- Graduate School of Biomedical Sciences, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Biochemistry, Human Genetics and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Pharmacology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
35
Guo A, Lun P, Chen J, Li Q, Chang K, Li T, Pan D, Zhang J, Zhou J, Wang K, Zhang Q, Yang Q, Gao C, Wu C, Jian X, Wen Y, Wang Z, Shi Y, Zhao X, Sun P, Li Z. Association analysis of risk genes identified by SCHEMA with schizophrenia in the Chinese Han population. Psychiatr Genet 2022; 32:188-193. [PMID: 36125369 DOI: 10.1097/ypg.0000000000000321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Schizophrenia is a chronic brain disorder. Previously, the Schizophrenia Exome Sequencing Meta-analysis (SCHEMA) consortium identified the 10 highest-risk genes related to schizophrenia. This study aimed to analyze the relationship between the 10 highest-risk genes identified by SCHEMA and schizophrenia in a Chinese population. METHODS: A total of 225 variants in 10 genes were screened in a Chinese population of 6836 using a customized array. All variants were annotated with the Variant Effect Predictor tool, and the functional impacts of missense variants were assessed based on Sorting Intolerant From Tolerant (SIFT) and PolyPhen-2 scores. The SHEsisPlus tool was used to analyze the association between risk genes and schizophrenia at the locus and gene levels. RESULTS: At the locus level, no missense variants significantly related to schizophrenia were found, but we detected three missense variants that appeared only in cases: TRIO p.Arg1185Gln, RB1CC1 p.Arg1514Cys, and HERC1 p.Val4517Leu. At the gene level, the five genes with more than one variant analyzed (TRIO, RB1CC1, HERC1, GRIN2A, and CACNA1G) were kept for the gene-level association analysis. Only the association between RB1CC1 and schizophrenia reached a significant level (OR = 1.634; 95% CI, 1.062-2.516; P = 0.025). CONCLUSION: In this study, we determined that RB1CC1 might be a risk gene for schizophrenia in the Chinese population. Our results provide new evidence for recognizing the correlation of these risk genes with schizophrenia in the Chinese population.
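To make the reported effect size easier to read, an odds ratio and its 95% confidence interval can be computed from a 2x2 table of carriers and non-carriers in cases and controls. The Python sketch below uses the standard Woolf (log) method as an assumption; the abstract does not state how SHEsisPlus derives its gene-level estimate, and the function name and counts here are hypothetical rather than taken from the study.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    z=1.96 gives an approximate 95% interval. All four cells must be
    positive; zero cells would need a continuity correction, which this
    sketch does not apply.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (illustration only, not data from the study).
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.3f}; 95% CI, {ci_low:.3f}-{ci_high:.3f}")
```

An interval that excludes 1, as in the RB1CC1 result quoted above, corresponds to a nominally significant association at the 5% level; the toy counts here produce an interval that includes 1.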
Affiliation(s)
- Aiguo Guo
- School of Basic Medicine, Qingdao University
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
- Peng Lun
- Department of Neurosurgery, The Affiliated Hospital of Qingdao University, Qingdao University, Qingdao
| | - Jianhua Chen
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
| | - Qinghua Li
- Department of Neurosurgery, The Affiliated Hospital of Qingdao University, Qingdao University, Qingdao
| | - Kaihui Chang
- School of Basic Medicine, Qingdao University
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
| | - Teng Li
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
- School of Public Health, Qingdao University, Qingdao
| | - Dun Pan
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
| | - Jinmai Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
| | - Juan Zhou
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
| | - Ke Wang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
| | - Qian Zhang
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
| | - Qiangzhen Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
| | - Chengwen Gao
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
| | - Chuanhong Wu
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
| | - Xuemin Jian
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
| | - Yanqin Wen
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
| | - Zhuo Wang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
| | - Yongyong Shi
- School of Basic Medicine, Qingdao University
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
- Institute of Social Cognitive and Behavioral Sciences, Shanghai Jiao Tong University
- Institute of Neuropsychiatric Science and Systems Biological Medicine, Shanghai Jiao Tong University, Shanghai, China
| | - Xiangzhong Zhao
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
| | - Peng Sun
- Department of Neurosurgery, The Affiliated Hospital of Qingdao University, Qingdao University, Qingdao
| | - Zhiqiang Li
- School of Basic Medicine, Qingdao University
- The Affiliated Hospital of Qingdao University and Biomedical Sciences Institute of Qingdao University (Qingdao Branch of SJTU Bio-X Institutes), Qingdao University
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai
- School of Public Health, Qingdao University, Qingdao
- Institute of Social Cognitive and Behavioral Sciences, Shanghai Jiao Tong University
- Institute of Neuropsychiatric Science and Systems Biological Medicine, Shanghai Jiao Tong University, Shanghai, China
| |
|
36
|
Goetz M, Schröter J, Dattner T, Brennenstuhl H, Lenz D, Opladen T, Hörster F, Okun JG, Hoffmann GF, Kölker S, Staufner C. Genotypic and phenotypic spectrum of cytosolic phosphoenolpyruvate carboxykinase deficiency. Mol Genet Metab 2022; 137:18-25. [PMID: 35868242 DOI: 10.1016/j.ymgme.2022.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 07/07/2022] [Accepted: 07/07/2022] [Indexed: 10/17/2022]
Abstract
OBJECTIVES Pathogenic biallelic variants in PCK1 coding for the cytosolic phosphoenolpyruvate carboxykinase (PEPCK-C) cause PEPCK-C deficiency, a rare disorder of gluconeogenesis presenting with hypoglycemia, lactic acidosis, and hepatopathy. To date, there has been no systematic analysis of its phenotypic, biochemical, and genetic spectrum. METHODS All currently published individuals and a novel patient with genetically confirmed PEPCK-C deficiency were included. Clinical, biochemical, and genetic findings were analyzed. Protein and in-silico prediction score modeling was applied to analyze potential variant effects. RESULTS Thirty-two individuals from 25 families were found, including one previously unreported patient. The typical biochemical pattern was hypoglycemia triggered by catabolic situations, elevated urinary concentrations of tricarboxylic acid cycle metabolites, mildly elevated alanine and aspartate aminotransferase and elevated lactate concentrations in serum. Plasma glutamine concentrations were elevated in some patients and may be a suitable marker for newborn screening. With adequate treatment, biochemical abnormalities usually normalized following a hypoglycemic episode. Symptom onset usually occurred in infancy with a broad range from neonatal age to adulthood. Regardless of the genotype, different phenotypes with a broad clinical spectrum were found. To date, eight genotypes with nine different PCK1 variants were identified, of which alleles with the recurrent variant c.925G>A; p.(Gly309Arg) are predominant and appear to be endemic in the Finnish population. Protein modeling suggests altered manganese- and substrate-binding as superordinate pathomechanisms. CONCLUSIONS Environmental factors appear to be the main determinant for the phenotype in patients with biallelic variants in PCK1. Based on the biochemical pattern, PEPCK-C deficiency is a recognizable cause of childhood hypoglycemia. It is a treatable disease and early diagnosis is important to prevent metabolic derailment and morbidity. Newborn screening can identify at least a sub-cohort of affected individuals through elevated glutamine concentrations in dry blood.
Affiliation(s)
- M Goetz
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - J Schröter
- Division of Pediatric Epileptology, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - T Dattner
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - H Brennenstuhl
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - D Lenz
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - T Opladen
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - F Hörster
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - J G Okun
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - G F Hoffmann
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - S Kölker
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - C Staufner
- Division of Child Neurology and Metabolic Disorders, Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany.
| |
|
37
|
Barbosa P, Ribeiro M, Carmo-Fonseca M, Fonseca A. Clinical significance of genetic variation in hypertrophic cardiomyopathy: comparison of computational tools to prioritize missense variants. Front Cardiovasc Med 2022; 9:975478. [PMID: 36061567 PMCID: PMC9433717 DOI: 10.3389/fcvm.2022.975478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] Open
Abstract
Hypertrophic cardiomyopathy (HCM) is a common heart disease associated with sudden cardiac death. Early diagnosis is critical to identify patients who may benefit from implantable cardioverter defibrillator therapy. Although genetic testing is an integral part of the clinical evaluation and management of patients with HCM and their families, in many cases the genetic analysis fails to identify a disease-causing mutation. This is in part due to difficulties in classifying newly detected rare genetic variants as well as variants of unknown significance (VUS). Multiple computational algorithms have been developed to predict the potential pathogenicity of genetic variants, but their relative performance in HCM has not been comprehensively assessed. Here, we compared the performance of 39 currently available prediction tools in distinguishing between high-confidence HCM-causing missense variants and benign variants, and we developed an easy-to-use tool to perform variant prediction benchmarks based on annotated VCF files (VETA). Our results show that tool performance increases after HCM-specific calibration of thresholds. After excluding potential biases due to type 1 circularity, we identified ClinPred, MISTIC, FATHMM, MPC and MetaLR as the five best-performing tools in discriminating HCM-associated variants. We propose combining these tools in order to prioritize unknown HCM missense variants that should be closely followed up in the clinic.
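To make the idea of disease-specific threshold calibration concrete, the sketch below picks the score cutoff that maximizes balanced accuracy on a labeled variant set. The scores, labels, function name, and the choice of balanced accuracy as the objective are all assumptions made for illustration; the abstract does not state how VETA calibrates thresholds.

```python
def calibrate_threshold(scores, labels):
    """Return (cutoff, balanced accuracy) for the best-scoring cutoff.

    A variant is predicted pathogenic when its score is >= cutoff.
    labels use 1 for pathogenic and 0 for benign; both classes must be present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_bal_acc = None, -1.0
    # Try every observed score as a candidate cutoff, lowest first.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        bal_acc = 0.5 * (tp / positives + tn / negatives)
        if bal_acc > best_bal_acc:  # ties keep the lower cutoff
            best_cutoff, best_bal_acc = cutoff, bal_acc
    return best_cutoff, best_bal_acc


# Hypothetical predictor scores and ground-truth labels.
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, bal_acc = calibrate_threshold(scores, labels)
print(cutoff, bal_acc)
```

A cutoff chosen this way should be evaluated on variants that were not used for calibration; otherwise the reported performance is optimistic.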
Affiliation(s)
- Pedro Barbosa
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina da Universidade de Lisboa, Lisboa, Portugal
| | - Marta Ribeiro
- Department of Bioengineering and iBB-Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
| | - Maria Carmo-Fonseca
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina da Universidade de Lisboa, Lisboa, Portugal
| | - Alcides Fonseca
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
- GenoMed - Diagnósticos de Medicina Molecular, Lisboa, Portugal
| |
|
38
|
Caswell RC, Gunning AC, Owens MM, Ellard S, Wright CF. Assessing the clinical utility of protein structural analysis in genomic variant classification: experiences from a diagnostic laboratory. Genome Med 2022; 14:77. [PMID: 35869530 PMCID: PMC9308257 DOI: 10.1186/s13073-022-01082-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/15/2021] [Accepted: 07/04/2022] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND The widespread clinical application of genome-wide sequencing has resulted in many new diagnoses for rare genetic conditions, but testing regularly identifies variants of uncertain significance (VUS). The remarkable rise in the amount of genomic data has been paralleled by a rise in the number of protein structures that are now publicly available, which may have clinical utility for the interpretation of missense and in-frame insertions or deletions. METHODS Within a UK National Health Service genomic medicine diagnostic laboratory, we investigated the number of VUS over a 5-year period that were evaluated using protein structural analysis and how often this analysis aided variant classification. RESULTS We found 99 novel missense and in-frame variants across 67 genes that were initially classified as VUS by our diagnostic laboratory using standard variant classification guidelines and for which further analysis of protein structure was requested. Evidence from protein structural analysis was used in the re-assessment of 64 variants, of which 47 were subsequently reclassified as pathogenic or likely pathogenic and 17 remained as VUS. We identified several case studies where protein structural analysis aided variant interpretation by predicting disease mechanisms that were consistent with the observed phenotypes, including loss-of-function through thermodynamic destabilisation or disruption of ligand binding, and gain-of-function through de-repression or escape from proteasomal degradation. CONCLUSIONS We have shown that using in silico protein structural analysis can aid classification of VUS and give insights into the mechanisms of pathogenicity. Based on our experience, we propose a generic evidence-based workflow for incorporating protein structural information into diagnostic practice to facilitate variant classification.
Affiliation(s)
- Richard C Caswell
- Exeter Genomics Laboratory, Royal Devon University Healthcare NHS Foundation Trust, Exeter, EX2 5DW, UK.
| | - Adam C Gunning
- Exeter Genomics Laboratory, Royal Devon University Healthcare NHS Foundation Trust, Exeter, EX2 5DW, UK
- Institute of Biomedical and Clinical Science, University of Exeter School of Medicine, Exeter, EX2 5DW, UK
| | - Martina M Owens
- Exeter Genomics Laboratory, Royal Devon University Healthcare NHS Foundation Trust, Exeter, EX2 5DW, UK
| | - Sian Ellard
- Exeter Genomics Laboratory, Royal Devon University Healthcare NHS Foundation Trust, Exeter, EX2 5DW, UK
- Institute of Biomedical and Clinical Science, University of Exeter School of Medicine, Exeter, EX2 5DW, UK
| | - Caroline F Wright
- Institute of Biomedical and Clinical Science, University of Exeter School of Medicine, Exeter, EX2 5DW, UK.
| |
|
39
|
A Comprehensive Evaluation of the Performance of Prediction Algorithms on Clinically Relevant Missense Variants. Int J Mol Sci 2022; 23:ijms23147946. [PMID: 35887294 PMCID: PMC9322961 DOI: 10.3390/ijms23147946] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/08/2022] [Revised: 07/09/2022] [Accepted: 07/15/2022] [Indexed: 12/12/2022] Open
Abstract
The rapid integration of genomic technologies in clinical diagnostics has resulted in the detection of a multitude of missense variants whose clinical significance is often unknown. As a result, a plethora of computational tools have been developed to facilitate variant interpretation. However, choosing an appropriate software from such a broad range of tools can be challenging; therefore, systematic benchmarking with high-quality, independent datasets is critical. Using three independent benchmarking datasets compiled from the ClinVar database, we evaluated the performance of ten widely used prediction algorithms with missense variants from 21 clinically relevant genes, including BRCA1 and BRCA2. A fourth dataset consisting of 1053 missense variants was also used to investigate the impact of type 1 circularity on their performance. The performance of the prediction algorithms varied widely across datasets. Based on Matthews Correlation Coefficient and Area Under the Curve, SNPs&GO and PMut consistently displayed an overall above-average performance across the datasets. Most of the tools demonstrated greater sensitivity and negative predictive values at the expense of lower specificity and positive predictive values. We also demonstrated that type 1 circularity significantly impacts the performance of these tools and, if not accounted for, may confound the selection of the best performing algorithms.
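The metrics named in this abstract all derive from a single confusion matrix. The sketch below computes sensitivity, specificity, positive and negative predictive values, and the Matthews Correlation Coefficient from hypothetical counts; the function name and the numbers are illustrative and do not come from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard classification metrics from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),   # pathogenic variants correctly flagged
        "specificity": tn / (tn + fp),   # benign variants correctly cleared
        "ppv": tp / (tp + fp),           # flagged variants that are pathogenic
        "npv": tn / (tn + fn),           # cleared variants that are benign
        # MCC is conventionally set to 0 when any marginal total is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for one predictor on one benchmark set.
metrics = binary_metrics(tp=90, fp=40, tn=60, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts sensitivity (0.900) and NPV (0.857) exceed specificity (0.600) and PPV (0.692), the same pattern the abstract reports for most tools: a predictor that over-calls pathogenicity.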
|
40
|
New Developments and Possibilities in Reanalysis and Reinterpretation of Whole Exome Sequencing Datasets for Unsolved Rare Diseases Using Machine Learning Approaches. Int J Mol Sci 2022; 23:ijms23126792. [PMID: 35743235 PMCID: PMC9224427 DOI: 10.3390/ijms23126792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Revised: 06/13/2022] [Accepted: 06/15/2022] [Indexed: 11/21/2022] Open
Abstract
Rare diseases impact the lives of 300 million people in the world. Rapid advances in bioinformatics and genomic technologies have enabled the discovery of causes of 20–30% of rare diseases. However, most rare diseases have remained as unsolved enigmas to date. Newer tools and availability of high throughput sequencing data have enabled the reanalysis of previously undiagnosed patients. In this review, we have systematically compiled the latest developments in the discovery of the genetic causes of rare diseases using machine learning methods. Importantly, we have detailed methods available to reanalyze existing whole exome sequencing data of unsolved rare diseases. We have identified different reanalysis methodologies to solve problems associated with sequence alterations/mutations, variation re-annotation, protein stability, splice isoform malfunctions and oligogenic analysis. In addition, we give an overview of new developments in the field of rare disease research using whole genome sequencing data and other omics.
|
41
|
Livesey BJ, Marsh JA. Interpreting protein variant effects with computational predictors and deep mutational scanning. Dis Model Mech 2022; 15:275742. [PMID: 35736673 PMCID: PMC9235876 DOI: 10.1242/dmm.049510] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Indexed: 12/25/2022] Open
Abstract
Computational predictors of genetic variant effect have advanced rapidly in recent years. These programs provide clinical and research laboratories with a rapid and scalable method to assess the likely impacts of novel variants. However, it can be difficult to know to what extent we can trust their results. To benchmark their performance, predictors are often tested against large datasets of known pathogenic and benign variants. These benchmarking data may overlap with the data used to train some supervised predictors, which leads to data re-use or circularity, resulting in inflated performance estimates for those predictors. Furthermore, new predictors are usually found by their authors to be superior to all previous predictors, which suggests some degree of computational bias in their benchmarking. Large-scale functional assays known as deep mutational scans provide one possible solution to this problem, providing independent datasets of variant effect measurements. In this Review, we discuss some of the key advances in predictor methodology, current benchmarking strategies and how data derived from deep mutational scans can be used to overcome the issue of data circularity. We also discuss the ability of such functional assays to directly predict clinical impacts of mutations and how this might affect the future need for variant effect predictors.
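One simple guard against the data re-use described here is to drop from the benchmark any variant that also appears in a predictor's training data. The sketch below shows that filter under the assumption that a variant is identified by gene and protein change; the variant records and the function name are hypothetical.

```python
def remove_training_overlap(benchmark, training):
    """Return benchmark variants whose (gene, change) is absent from training.

    Each variant is a (gene, protein_change, label) tuple. The label is
    ignored when matching, so a variant is excluded even if it was labeled
    differently in the two sets.
    """
    training_keys = {(gene, change) for gene, change, _label in training}
    return [v for v in benchmark if (v[0], v[1]) not in training_keys]


# Hypothetical records for illustration only.
training = [
    ("GENE_A", "p.Arg10Gln", "pathogenic"),
    ("GENE_B", "p.Val20Leu", "benign"),
]
benchmark = [
    ("GENE_A", "p.Arg10Gln", "pathogenic"),  # overlaps with training
    ("GENE_A", "p.Ser30Cys", "benign"),
    ("GENE_C", "p.Gly40Arg", "pathogenic"),
]
independent = remove_training_overlap(benchmark, training)
print(independent)
```

This only removes exact variant-level overlap. It cannot help when a predictor's training set is unpublished, which is one reason independent measurements such as deep mutational scans are attractive as benchmarks.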
Affiliation(s)
- Benjamin J Livesey
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Joseph A Marsh
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
| |
|
42
|
Scafuri B, Verdino A, D'Arminio N, Marabotti A. Computational methods to assist in the discovery of pharmacological chaperones for rare diseases. Brief Bioinform 2022; 23:6590149. [PMID: 35595532 DOI: 10.1093/bib/bbac198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/22/2022] [Revised: 04/13/2022] [Accepted: 04/28/2022] [Indexed: 12/21/2022] Open
Abstract
Pharmacological chaperones are chemical compounds able to bind proteins and stabilize them against denaturation and subsequent degradation. Some pharmacological chaperones have been approved, or are under investigation, for the treatment of rare inborn errors of metabolism caused by genetic mutations that can destabilize the structure of the protein encoded by the affected gene. Given the general lack of pharmacological treatments for rare diseases, expectations for this type of compound are high. However, their discovery is not straightforward. In this review, we focus on the computational methods that can assist and accelerate the search for these compounds, and we present examples in which these methods were successfully applied to discover promising molecules belonging to this new category of pharmacologically active compounds.
Affiliation(s)
- Bernardina Scafuri
- Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
| | - Anna Verdino
- Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
| | - Nancy D'Arminio
- Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
| | - Anna Marabotti
- Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
| |
|
43
|
Xue Y, Zeng C, Ge P, Liu C, Li J, Zhang Y, Zhang D, Zhang Q, Zhao J. Association of RNF213 Variants With Periventricular Anastomosis in Moyamoya Disease. Stroke 2022; 53:2906-2916. [PMID: 35543128 DOI: 10.1161/strokeaha.121.038066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
BACKGROUND The pathogenic mechanisms of periventricular anastomosis (PA) in moyamoya disease remain unknown. Here, we aimed to describe the angiographic profiles of PA and their relationships with really interesting new gene (RING) finger protein 213 (RNF213) genotypes. METHODS We conducted a retrospective cohort study of moyamoya disease patients consecutively recruited between June 2019 and January 2021 in Beijing Tiantan Hospital, Capital Medical University, China. The C-terminal region of RNF213 was sequenced. Angiographic characteristics of PA vessels (lenticulostriate artery, thalamotuberal artery, thalamoperforating artery, anterior choroidal artery, and posterior choroidal artery) were compared between different groups of RNF213 genotypes. The dilatation and extension of PA vessels were measured by using PA score (positive, score 1-5; negative, score 0). Multivariate regression analysis was conducted to assess variables associated with PA score. In addition, gene expression of RNF213 in human brain regions was evaluated from the Allen Human Brain Atlas. RESULTS Among 260 patients (484 hemispheres), 71.2% carried no RNF213 rare and novel variants, 20.0% carried p.R4810K heterozygotes, and 8.8% carried other rare and novel variants. PA scores in patients with p.R4810K and other rare and novel variants were significantly higher than in wild-type patients (P<0.001). Age (odds ratio [OR], 0.958 [95% CI, 0.942-0.974]; P<0.001), platelet count (OR, 0.996 [95% CI, 0.992-0.999]; P=0.027), p.R4810K variant (OR, 2.653 [95% CI, 1.514-4.649]; P=0.001), other rare and novel variants (OR, 3.197 [95% CI, 1.012-10.094]; P=0.048), Suzuki stage ≥4 (OR, 1.941 [95% CI, 1.138-3.309]; P=0.015), and posterior cerebral artery involvement (OR, 1.827 [95% CI, 1.020-3.271]; P=0.043) were significantly correlated with PA score. High expression of RNF213 was detected in the periventricular area. CONCLUSIONS RNF213 variants were confirmed to be associated with PA in moyamoya disease. Individuals with RNF213 p.R4810K heterozygotes and other C-terminal region rare variants exhibited different angiographic phenotypes, compared with wild-type patients.
Affiliation(s)
- Yimeng Xue
- Savaid Medical School, University of Chinese Academy of Sciences, Beijing
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Center of Stroke, Beijing Institute for Brain Disorders, China
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| | - Chaofan Zeng
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Center of Stroke, Beijing Institute for Brain Disorders, China
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| | - Peicong Ge
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Center of Stroke, Beijing Institute for Brain Disorders, China
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| | - Chenglong Liu
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Center of Stroke, Beijing Institute for Brain Disorders, China
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| | - Junsheng Li
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Center of Stroke, Beijing Institute for Brain Disorders, China
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| | - Yan Zhang
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Center of Stroke, Beijing Institute for Brain Disorders, China
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| | - Dong Zhang
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Center of Stroke, Beijing Institute for Brain Disorders, China
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| | - Qian Zhang
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| | - Jizong Zhao
- Savaid Medical School, University of Chinese Academy of Sciences, Beijing
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, China
- China National Clinical Research Center for Neurological Diseases, Beijing
- Center of Stroke, Beijing Institute for Brain Disorders, China
- Beijing Key Laboratory of Translational Medicine for Cerebrovascular Disease, China
- Beijing Translational Engineering Center for 3D Printer in Clinical Neuroscience, China
| |
|
44
|
The Genetic and Molecular Analyses of RAD51C and RAD51D Identifies Rare Variants Implicated in Hereditary Ovarian Cancer from a Genetically Unique Population. Cancers (Basel) 2022; 14:2251. [PMID: 35565380 PMCID: PMC9104874 DOI: 10.3390/cancers14092251] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/18/2022] [Revised: 04/13/2022] [Accepted: 04/20/2022] [Indexed: 12/03/2022] Open
Abstract
This study aimed to identify candidate variants in the ovarian cancer (OC) predisposing genes RAD51C and RAD51D by investigating French Canadians (FC), a population with a unique genetic architecture. Candidates were identified by whole exome sequencing analysis of 17 OC families and 53 early-onset OC cases. Carrier frequencies were determined by the genetic analysis of 100 OC or hereditary breast and ovarian cancer (HBOC) families, 438 sporadic OC cases and 1025 controls. Variants of unknown function were assayed for their biological impact and/or cellular sensitivity to olaparib. RAD51C c.414G>C;p.Leu138Phe and c.705G>T;p.Lys235Asn and RAD51D c.137C>G;p.Ser46Cys, c.620C>T;p.Ser207Leu and c.694C>T;p.Arg232Ter were identified in 17.6% of families and 11.3% of early-onset cases. The highest carrier frequency was observed in OC families (1/44, 2.3%) and sporadic cases (15/438, 3.4%) harbouring RAD51D c.620C>T versus controls (1/1025, 0.1%). Carriers of c.620C>T (n = 7), c.705G>T (n = 2) and c.137C>G (n = 1) were identified in another 538 FC OC cases. RAD51C c.705G>T affected splicing by skipping exon four, while RAD51D p.Ser46Cys affected protein stability and conferred olaparib sensitivity. Genetic and functional assays implicate RAD51C c.705G>T and RAD51D c.137C>G as likely pathogenic variants in OC. The high carrier frequency of RAD51D c.620C>T in FC OC cases validates previous findings. Our findings further support the role of RAD51C and RAD51D in hereditary OC.
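The carrier frequencies quoted for RAD51D c.620C>T follow directly from the counts given in the abstract. A minimal sketch of that arithmetic is shown here; the function and variable names are illustrative, not taken from the study.

```python
# Carrier counts quoted in the abstract for RAD51D c.620C>T: (carriers, total tested).
counts = {
    "OC families": (1, 44),
    "sporadic OC cases": (15, 438),
    "controls": (1, 1025),
}


def carrier_frequency_percent(carriers, total):
    """Return the carrier frequency as a percentage, rounded to one decimal place."""
    return round(100 * carriers / total, 1)


frequencies = {
    group: carrier_frequency_percent(carriers, total)
    for group, (carriers, total) in counts.items()
}

for group, pct in frequencies.items():
    print(f"{group}: {pct}%")
```

The rounded values match the percentages reported in the abstract (2.3%, 3.4% and 0.1%).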
|
45
|
Lee JH. Invertebrate Model Organisms as a Platform to Investigate Rare Human Neurological Diseases. Exp Neurobiol 2022; 31:1-16. [PMID: 35256540 PMCID: PMC8907251 DOI: 10.5607/en22003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/22/2022] [Revised: 02/07/2022] [Accepted: 02/07/2022] [Indexed: 01/16/2023] Open
Abstract
Patients suffering from rare human diseases often go through a painful journey to find a definitive molecular diagnosis, a prerequisite for appropriate cures. With a novel variant isolated from a single patient, determining its pathogenicity to end such a "diagnostic odyssey" requires multi-step processes involving experts in diverse areas, including clinicians, bioinformaticians and research scientists. Recent efforts in building large-scale genomic databases and in silico prediction platforms have facilitated the identification of potentially pathogenic variants causative of rare human diseases with a Mendelian basis. However, the functional significance of individual variants remains elusive in many cases, thus requiring the incorporation of versatile and rapid model organism (MO)-based platforms for functional analyses. In this review, the current scope of rare disease research is briefly discussed. In addition, an overview of invertebrate MOs and their key features relevant to rare neurological diseases is provided, covering the characteristics of two representative invertebrate MOs, Drosophila melanogaster and Caenorhabditis elegans, as well as the challenges they face. Finally, recently developed research networks integrating these MOs in collaborative research are portrayed, with an array of bioinformatic analyses embedded. The comprehensive survey of MO-based research activities provided in this review will help in designing a well-structured analysis of candidate genes or potentially pathogenic variants for their roles in rare neurological diseases in the future.
Affiliation(s)
- Ji-Hye Lee
- Department of Oral Pathology & Life Science in Dentistry, School of Dentistry, Pusan National University, Yangsan 50612, Korea; Dental Life Science Institute, Pusan National University, Yangsan 50612, Korea; Periodontal Disease Signaling Network Research Center, Pusan National University, Yangsan 50612, Korea
| |
|
46
|
A machine learning approach based on ACMG/AMP guidelines for genomic variant classification and prioritization. Sci Rep 2022; 12:2517. [PMID: 35169226 PMCID: PMC8847497 DOI: 10.1038/s41598-022-06547-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 03/03/2021] [Accepted: 01/07/2022] [Indexed: 01/19/2023] Open
Abstract
Genomic variant interpretation is a critical step of the diagnostic procedure, often supported by the application of tools that may predict the damaging impact of each variant or provide a guidelines-based classification. We propose the application of Machine Learning methodologies, in particular Penalized Logistic Regression, to support variant classification and prioritization. Our approach combines ACMG/AMP guidelines for germline variant interpretation with variant annotation features and provides a probabilistic score of pathogenicity, thus supporting the prioritization and classification of variants that would be interpreted as uncertain by the ACMG/AMP guidelines. We compared different approaches in terms of variant prioritization and classification on different datasets, showing that our data-driven approach is able to resolve more variants of uncertain significance (VUS) than guidelines-based approaches and in silico prediction tools.
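To make the idea of a penalized logistic regression producing a probabilistic pathogenicity score more concrete, here is a minimal sketch. It is not the authors' model: the abstract does not state the penalty type, so an L2 (ridge) penalty is assumed, and the two features, the toy data and all function names are hypothetical.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_l2_logistic(X, y, lam=0.1, lr=0.5, steps=2000):
    """Fit logistic regression by gradient descent with an L2 penalty on the weights.

    The intercept is left unpenalized. lam, lr and steps are illustrative defaults.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # The lam * wj term is the gradient of the penalty (lam / 2) * ||w||^2.
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_proba(w, b, x):
    """Return the modelled probability that variant x is pathogenic."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy data: [in silico score in 0..1, one guideline criterion met (0/1)].
X = [[0.9, 1], [0.8, 1], [0.7, 0], [0.2, 0], [0.1, 0], [0.3, 1]]
y = [1, 1, 1, 0, 0, 0]  # 1 = pathogenic, 0 = benign (made-up labels)

w, b = fit_l2_logistic(X, y)
p_high = predict_proba(w, b, [0.95, 1])
p_low = predict_proba(w, b, [0.05, 0])
print(f"high-evidence variant: {p_high:.2f}, low-evidence variant: {p_low:.2f}")
```

The penalty shrinks the weights toward zero, which keeps the probabilities away from 0 and 1 on small training sets. A graded score of this kind is what allows uncertain variants to be ranked rather than only classified.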
|
47
|
Foreman J, Brent S, Perrett D, Bevan AP, Hunt SE, Cunningham F, Hurles ME, Firth HV. DECIPHER: Supporting the interpretation and sharing of rare disease phenotype-linked variant data to advance diagnosis and research. Hum Mutat 2022; 43:682-697. [PMID: 35143074 PMCID: PMC9303633 DOI: 10.1002/humu.24340] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/30/2021] [Revised: 01/17/2022] [Accepted: 02/07/2022] [Indexed: 11/12/2022]
Abstract
DECIPHER (https://www.deciphergenomics.org) is a free web platform for sharing anonymised phenotype-linked variant data from rare disease patients. Its dynamic interpretation interfaces contextualise genomic and phenotypic data to enable more informed variant interpretation, incorporating international standards for variant classification. DECIPHER supports almost all types of germline and mosaic variation in the nuclear and mitochondrial genome: sequence variants, short tandem repeats, copy-number variants and large structural variants. Patient phenotypes are deposited using Human Phenotype Ontology (HPO) terms, supplemented by quantitative data, which is aggregated to derive gene-specific phenotypic summaries. It hosts data from >250 projects from ~40 countries, openly sharing >40,000 patient records containing >51,000 variants and >172,000 phenotype terms. The rich phenotype-linked variant data in DECIPHER drives rare disease research and diagnosis by enabling patient matching within DECIPHER and with other resources, and has been cited in >2,600 publications. In this paper, we describe the types of data deposited to DECIPHER, the variant interpretation tools, and patient matching interfaces which make DECIPHER an invaluable rare disease resource.
Affiliation(s)
- Julia Foreman
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Simon Brent
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Daniel Perrett
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Andrew P Bevan
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Matthew E Hurles
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Helen V Firth
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom; East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, United Kingdom
| |
|
48
|
Tamana S, Xenophontos M, Minaidou A, Stephanou C, Harteveld CL, Bento C, Traeger-Synodinos J, Fylaktou I, Yasin NM, Abdul Hamid FS, Esa E, Halim-Fikri H, Zilfalil BA, Kakouri AC, Kleanthous M, Kountouris P. Evaluation of in silico predictors on short nucleotide variants in HBA1, HBA2, and HBB associated with haemoglobinopathies. eLife 2022; 11:79713. [PMID: 36453528 PMCID: PMC9731569 DOI: 10.7554/elife.79713] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/23/2022] [Accepted: 10/31/2022] [Indexed: 12/03/2022] Open
Abstract
Haemoglobinopathies are the commonest monogenic diseases worldwide and are caused by variants in the globin gene clusters. With over 2400 variants detected to date, their interpretation using the American College of Medical Genetics and Genomics (ACMG)/Association for Molecular Pathology (AMP) guidelines is challenging, and computational evidence can provide valuable input about their functional annotation. While many in silico predictors have already been developed, their performance varies for different genes and diseases. In this study, we evaluate 31 in silico predictors using a dataset of 1627 variants in HBA1, HBA2, and HBB. By varying the decision threshold for each tool, we analyse their performance (a) as binary classifiers of pathogenicity and (b) by using different non-overlapping pathogenic and benign thresholds for their optimal use in the ACMG/AMP framework. Our results show that CADD, Eigen-PC, and REVEL are the overall top performers, with the former reaching a moderate strength level for pathogenic prediction. Eigen-PC and REVEL achieve the highest accuracies for missense variants, while CADD is also a reliable predictor of non-missense variants. Moreover, SpliceAI is the top-performing splicing predictor, reaching a strong level of evidence, while GERP++ and phyloP are the most accurate conservation tools. This study provides evidence about the optimal use of computational tools in globin gene clusters under the ACMG/AMP framework.
Affiliation(s)
- Stella Tamana
- Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | - Maria Xenophontos
- Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | - Anna Minaidou
- Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | - Coralea Stephanou
- Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | - Cornelis L Harteveld
- Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus; Leiden University Medical Center, Leiden, Netherlands
| | - Celeste Bento
- Centro Hospitalar e Universitário de Coimbra, Coimbra, Portugal
| | | | - Irene Fylaktou
- Division of Endocrinology, Metabolism and Diabetes, First Department of Pediatrics, National and Kapodistrian University of Athens, Athens, Greece
| | - Norafiza Mohd Yasin
- Haematology Unit, Cancer Research Centre, Institute for Medical Research, National Institutes of Health (NIH), Ministry of Health Malaysia, Selangor, Malaysia
| | - Faidatul Syazlin Abdul Hamid
- Haematology Unit, Cancer Research Centre, Institute for Medical Research, National Institutes of Health (NIH), Ministry of Health Malaysia, Selangor, Malaysia
| | - Ezalia Esa
- Haematology Unit, Cancer Research Centre, Institute for Medical Research, National Institutes of Health (NIH), Ministry of Health Malaysia, Selangor, Malaysia
| | - Hashim Halim-Fikri
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kelantan, Malaysia
| | - Bin Alwi Zilfalil
- Human Genome Centre, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kelantan, Malaysia
| | - Andrea C Kakouri
- Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | | | - Marina Kleanthous
- Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | - Petros Kountouris
- Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| |
|
49
|
Loong L, Cubuk C, Choi S, Allen S, Torr B, Garrett A, Loveday C, Durkie M, Callaway A, Burghel GJ, Drummond J, Robinson R, Berry IR, Wallace A, Eccles DM, Tischkowitz M, Ellard S, Ware JS, Hanson H, Turnbull C. Quantifying prediction of pathogenicity for within-codon concordance (PM5) using 7541 functional classifications of BRCA1 and MSH2 missense variants. Genet Med 2021; 24:552-563. [PMID: 34906453 PMCID: PMC8896276 DOI: 10.1016/j.gim.2021.11.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/04/2021] [Revised: 10/21/2021] [Accepted: 11/12/2021] [Indexed: 11/18/2022] Open
Abstract
PURPOSE: Conditions and thresholds applied for evidence weighting of within-codon concordance (PM5) for pathogenicity vary widely between laboratories and expert groups. Because of the sparseness of available clinical classifications, there is little evidence for variation in practice. METHODS: We used as a truthset 7541 dichotomous functional classifications of BRCA1 and MSH2, spanning 311 codons of BRCA1 and 918 codons of MSH2, generated from large-scale functional assays that have been shown to correlate excellently with clinical classifications. We assessed PM5 at 5 stringencies with incorporation of 8 in silico tools. For each analysis, we quantified a positive likelihood ratio (pLR, true positive rate/false positive rate), the predictive value of PM5-lookup in ClinVar compared with the functional truthset. RESULTS: pLR was 16.3 (10.6-24.9) for variants for which there was exactly 1 additional colocated deleterious variant on ClinVar, and the variant under examination was equally or more damaging when analyzed using BLOSUM62. pLR was 71.5 (37.8-135.3) for variants for which there were 2 or more colocated deleterious ClinVar variants, and the variant under examination was equally or more damaging than at least 1 colocated variant when analyzed using BLOSUM62. CONCLUSION: These analyses support the graded use of PM5, with potential to use it at higher evidence weighting where more stringent criteria are met.
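The positive likelihood ratio defined in the abstract (true positive rate divided by false positive rate) can be computed from a 2x2 table of criterion calls against a truth set. The sketch below is illustrative only. The counts are hypothetical, and the interval uses the common log-scale normal approximation, which is an assumption and may differ from the method used in the study.

```python
import math


def positive_likelihood_ratio(tp, fn, fp, tn, z=1.96):
    """Return (pLR, lower, upper) from a 2x2 table of calls versus a truth set.

    pLR = true positive rate / false positive rate. The interval is the
    log-scale normal approximation; tp and fp must be non-zero.
    """
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    plr = tpr / fpr
    se_log = math.sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    return plr, plr * math.exp(-z * se_log), plr * math.exp(z * se_log)


# Hypothetical counts, not taken from the study:
# criterion met among functionally deleterious variants (tp) and tolerated ones (fp).
plr, lower, upper = positive_likelihood_ratio(tp=80, fn=20, fp=10, tn=90)
print(f"pLR = {plr:.1f} ({lower:.1f}-{upper:.1f})")
```

A pLR well above 1 means that meeting the criterion is much more common among deleterious variants than among tolerated ones, which is the basis for weighting the criterion more strongly.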
Affiliation(s)
- Lucy Loong
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom
| | - Cankut Cubuk
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom
| | - Subin Choi
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom
| | - Sophie Allen
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom
| | - Beth Torr
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom
| | - Alice Garrett
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom
| | - Chey Loveday
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom
| | - Miranda Durkie
- Sheffield Diagnostic Genetics Service, NHS North East and Yorkshire Genomic Laboratory Hub, Sheffield Children's NHS Foundation Trust, Sheffield, United Kingdom
| | - Alison Callaway
- Wessex Regional Genetics Laboratory, Salisbury NHS Foundation Trust, Salisbury, United Kingdom; Human Genetics and Genomic Medicine, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
| | - George J Burghel
- Manchester Centre for Genomic Medicine and North West Genomic Laboratory Hub, Manchester University NHS Foundation Trust, Manchester, United Kingdom
| | - James Drummond
- East Genomic Laboratory Hub, Cambridge University Hospitals Genomic Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
| | - Rachel Robinson
- North East and Yorkshire Genomic Laboratory Hub, The Leeds Teaching Hospitals NHS Trust, Leeds, United Kingdom
| | - Ian R Berry
- Bristol Genetics Laboratory, Pathology Sciences, Southmead Hospital, North Bristol NHS Trust, Bristol, United Kingdom
| | - Andrew Wallace
- Manchester Centre for Genomic Medicine and North West Genomic Laboratory Hub, Manchester University NHS Foundation Trust, Manchester, United Kingdom
| | - Diana M Eccles
- Cancer Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
| | - Marc Tischkowitz
- Department of Medical Genetics, NIHR Research Cambridge Biomedical Research Centre, University of Cambridge, Cambridge, United Kingdom
| | - Sian Ellard
- Department of Molecular Genetics, Royal Devon and Exeter NHS Foundation Trust, Exeter, United Kingdom
| | - James S Ware
- National Heart and Lung Institute, Faculty of Medicine, and MRC London Institute of Medical Sciences, Imperial College London, London, United Kingdom; NIHR Royal Brompton Cardiovascular Research Centre, Royal Brompton and Harefield NHS Foundation Trust, London, United Kingdom
| | - Helen Hanson
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom; Department of Clinical Genetics, St. George's University Hospitals NHS Foundation Trust, London, United Kingdom
| | - Clare Turnbull
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, United Kingdom; Cancer Genetics Unit, The Royal Marsden NHS Foundation Trust, London, United Kingdom.
| |
|
50
|
A Study on the Genetics of Primary Ciliary Dyskinesia. J Clin Med 2021; 10:5102. [PMID: 34768622 PMCID: PMC8584573 DOI: 10.3390/jcm10215102] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/10/2021] [Revised: 10/27/2021] [Accepted: 10/28/2021] [Indexed: 11/16/2022] Open
Abstract
Primary ciliary dyskinesia (PCD) is a poorly understood disorder. It is primarily autosomal recessive and is prevalent in tribal communities of the United Arab Emirates due to consanguineous marriages. This retrospective study aimed to assess the pathogenicity of the genetic variants of PCD in indigenous patients with significant clinical respiratory problems. Pathogenicity scores of variants obtained from the chart review were consolidated using the Ensembl Variant Effect Predictor. The multidimensional dataset of scores was clustered into three groups based on their pathogenicity. Sequence alignment and the Jensen–Shannon Divergence (JSD) were generated to evaluate the amino acid conservation at the site of the variation. One hundred and twelve variants of 28 genes linked to PCD were identified in 66 patients. Twenty-two variants were double heterozygous, two triple heterozygous, and seven homozygous. Of the thirteen novel variants, two, c.11839+1G>A in dynein, axonemal, heavy chain 11 (DNAH11) and p.Lys92Trpfs in dynein, axonemal, intermediate chain 1 (DNAI1), were associated with dextrocardia with situs inversus, and one, p.Gly21Val in coiled-coil domain-containing protein 40 (CCDC40), with absent inner dynein arms. Homozygous C1orf127:p.Arg113Ter (rs558323413) was also associated with laterality defects in two related patients. The majority of variants were missense involving conserved residues with a median JSD score of 0.747. Homology models of two deleterious variants in the stalk of DNAH11, p.Gly3102Asp and p.Leu3127Arg, revealed structural importance of the conserved glycine and leucine. These results define potentially damaging PCD variants in the region. Future studies, however, are needed to fully comprehend the genetic underpinnings of PCD.
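As a rough illustration of how a Jensen–Shannon divergence can serve as a conservation score, the sketch below compares the amino acid distribution of one alignment column with a background distribution. It is a simplified sketch, not the study's pipeline: the uniform background, the lack of pseudocounts and sequence weighting, and the example columns are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two distributions, bounded in [0, 1] with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def column_distribution(column):
    """Observed amino acid frequencies in one alignment column (gaps are ignored)."""
    counts = Counter(res for res in column if res in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Assumed uniform background; real conservation scores typically use empirical frequencies.
BACKGROUND = [1 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)

# Hypothetical alignment columns: one fully conserved, one highly variable.
conserved = jensen_shannon_divergence(column_distribution("GGGGGGGG"), BACKGROUND)
variable = jensen_shannon_divergence(column_distribution("GASTLVKE"), BACKGROUND)
print(f"conserved column: {conserved:.3f}, variable column: {variable:.3f}")
```

A column dominated by a single residue diverges strongly from the background and scores close to 1, whereas a column that resembles the background scores close to 0. Under this reading, a median score of 0.747 indicates that the affected residues are generally well conserved.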
|