1
Kafkas Ş, Abdelhakim M, Althagafi A, Toonsi S, Alghamdi M, Schofield PN, Hoehndorf R. The application of Large Language Models to the phenotype-based prioritization of causative genes in rare disease patients. Sci Rep 2025; 15:15093. [PMID: 40301638; PMCID: PMC12041562; DOI: 10.1038/s41598-025-99539-y] [Received: 06/25/2024] [Accepted: 04/21/2025] [Indexed: 05/01/2025] Open access
Abstract
Computational methods for identifying gene-disease associations can use both genomic and phenotypic information to prioritize genes and variants that may be associated with genetic diseases. Phenotype-based methods commonly rely on comparing phenotypes observed in a patient with databases of genotype-to-phenotype associations using measures of semantic similarity. They are constrained by the quality and completeness of these resources as well as the quality and completeness of patient phenotype annotation. Genotype-to-phenotype associations used by these methods are largely derived from the literature and coded using phenotype ontologies. Large Language Models (LLMs) have been trained on large amounts of text and data and have shown their potential to answer complex questions across multiple domains. Here, we evaluate the effectiveness of LLMs in prioritizing disease-associated genes compared to existing bioinformatics methods. We show that LLMs can prioritize disease-associated genes as well as, or better than, dedicated bioinformatics methods relying on pre-defined phenotype similarity, when gene sets range from 5 to 100 candidates. We apply our approach to a cohort of undiagnosed patients with rare diseases and show that LLMs can be used to provide diagnostic support that helps in identifying plausible candidate genes. Our results show that LLMs may offer an alternative to traditional bioinformatics methods to prioritize disease-associated genes based on disease phenotypes. They may therefore enhance diagnostic accuracy and simplify the diagnostic process for rare genetic diseases.
Affiliation(s)
- Şenay Kafkas
  - Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
- Marwa Abdelhakim
  - Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
- Azza Althagafi
  - Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - Computer Science Department, College of Computers and Information Technology, Taif University, 26571, Taif, Saudi Arabia
- Sumyyah Toonsi
  - Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - KAUST Center of Excellence for Generative AI, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
- Malak Alghamdi
  - Medical Genetic Division, Department of Pediatrics, College of Medicine, King Saud University, 11461, Riyadh, Saudi Arabia
- Paul N Schofield
  - Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, CB2 3EG, UK
- Robert Hoehndorf
  - Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
  - KAUST Center of Excellence for Generative AI, King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
2
Fratzl-Zelman N, Blouin S, Kornak U, Hartmann MA, Kurth AA, Zwerina J. Bone quality in pycnodysostosis: micropetrosis, locally distorted osteocyte lacuno-canalicular network, and heterogenous mineralization pattern in an adult female patient with multiple fractures. JBMR Plus 2025; 9:ziaf015. [PMID: 40144453; PMCID: PMC11937824; DOI: 10.1093/jbmrpl/ziaf015] [Received: 10/30/2024] [Revised: 01/03/2025] [Accepted: 01/16/2025] [Indexed: 03/28/2025] Open access
Abstract
Pycnodysostosis is a very rare skeletal dysplasia caused by biallelic loss-of-function mutations in cathepsin K, a proteolytic enzyme highly expressed by osteoclasts. Deficiency of cathepsin K impairs bone resorption and further bone remodeling, leading to progressive osteosclerosis and bone fragility. Moreover, cathepsin K is also expressed by mature osteocytes. Whether the density, size, and viability of osteocytes and the osteocyte lacuno-canalicular network (OLCN) are also altered, thereby impacting bone quality in pycnodysostosis, has not been explored. We used light microscopy, quantitative backscattered electron imaging, and confocal laser scanning microscopy to examine bone material obtained from a 57-yr-old female patient during surgical correction after femoral head fracture. The cortex consisted of a compact shell of multilayered collagen fibrils oriented in parallel to the periosteum, reflecting vigorous primary bone apposition, multiple osteons with concentrically ordered lamellae, and scattered patches of woven bone. The trabecular area was very dense, with trabecular bone volume varying locally from 30.3% to 67.4%. The bone matrix was overmineralized (average calcium content: +7.5% versus reference values, with a 5-fold increase of highly mineralized areas >27 weight % calcium). Numerous multinucleated osteoclasts and fringes of demineralized matrix were viewed on bone surfaces. The density (number/mm²: 193 to 223) and area (20 μm²) of the osteocyte lacunae and their canalicular length (0.05 μm/μm³ bone volume) were within normal range. However, numerous bone packets exhibited (hyper)mineralized osteocyte lacunae (micropetrosis), resulting in a locally disrupted OLCN. In summary, our data indicate that in pycnodysostosis not only is osteoclast function impaired but osteocyte viability is also decreased, leading to micropetrosis, distorted OLCN, and a heterogeneous mineralization pattern. Thus, osteoclasts and osteocytes both contribute to reduced bone quality. However, the presence of a dense osteocyte network in large areas of the sample indicates that cathepsin K is not essential for the formation of the OLCN.
Affiliation(s)
- Nadja Fratzl-Zelman
  - Ludwig Boltzmann Institute of Osteology at the Hanusch Hospital of OEGK and AUVA Trauma Center Meidling, 1st Medical Department Hanusch Hospital, 1140 Vienna, Austria
  - Vienna Bone and Growth Center, Vienna, Austria
- Stéphane Blouin
  - Ludwig Boltzmann Institute of Osteology at the Hanusch Hospital of OEGK and AUVA Trauma Center Meidling, 1st Medical Department Hanusch Hospital, 1140 Vienna, Austria
  - Vienna Bone and Growth Center, Vienna, Austria
- Uwe Kornak
  - Institute of Human Genetics, University Medical Center Göttingen, 37075 Göttingen, Germany
- Markus A Hartmann
  - Ludwig Boltzmann Institute of Osteology at the Hanusch Hospital of OEGK and AUVA Trauma Center Meidling, 1st Medical Department Hanusch Hospital, 1140 Vienna, Austria
  - Vienna Bone and Growth Center, Vienna, Austria
- Andreas A Kurth
  - Center for Orthopaedics and Trauma Surgery, Marienhaus Klinikum Mainz, 55131 Mainz, Germany
- Jochen Zwerina
  - Ludwig Boltzmann Institute of Osteology at the Hanusch Hospital of OEGK and AUVA Trauma Center Meidling, 1st Medical Department Hanusch Hospital, 1140 Vienna, Austria
  - Vienna Bone and Growth Center, Vienna, Austria
3
Yang H, Fu H, Zhang M, Liu Y, He YO, Wang C, Cheng L. EnrichDO: a global weighted model for Disease Ontology enrichment analysis. Gigascience 2025; 14:giaf021. [PMID: 40139908; PMCID: PMC11945307; DOI: 10.1093/gigascience/giaf021] [Received: 10/03/2024] [Revised: 12/18/2024] [Accepted: 02/14/2025] [Indexed: 03/29/2025] Open access
Abstract
BACKGROUND: Disease Ontology (DO) has been widely studied in biomedical research and clinical practice to describe the roles of genes. DO enrichment analysis is an effective means to discover associations between genes and diseases. Compared to hundreds of Gene Ontology (GO)-based enrichment analysis methods, however, DO-based methods are relatively scarce, and most current DO-based approaches are term-for-term and thus are unable to solve over-enrichment problems caused by the "true-path" rule. RESULTS: Here, we describe a novel double-weighted model, EnrichDO, which leverages the latest annotations of the human genome with DO terms and integrates DO graph topology on a global scale. Compared to classic enrichment methods (mainly for GO) and existing DO-based enrichment tools, EnrichDO performs better in both GO and DO enrichment analysis cases. It can accurately identify more specific terms, without ignoring the truly associated parent terms, as shown in the Alzheimer's disease (AD) case (AD ranked first). Moreover, both a simulated test and a data perturbation test validate the accuracy and robustness of EnrichDO. Finally, EnrichDO is applied to other types of datasets to expand its application, including gene expression profile datasets, a host gene set of microorganisms, and hallmark gene sets. Based on the findings reported here, EnrichDO shows significant improvement across all experimental results. CONCLUSIONS: EnrichDO provides an effective DO enrichment analysis model for gaining insight into the significance of a particular gene set in the context of disease. To increase the usability of EnrichDO, we have developed an R-based software package, which is freely available through Bioconductor (https://bioconductor.org/packages/release/bioc/html/EnrichDO.html) or at https://github.com/liangcheng-hrbmu/EnrichDO.
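For orientation, the "term-for-term" enrichment that this abstract contrasts EnrichDO with is commonly a one-sided hypergeometric test applied to each ontology term independently. The following is a minimal sketch of that baseline only, not of EnrichDO's double-weighted model; the function name and the gene counts in the usage line are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes annotated to the term
    sample:     size of the query gene set
    overlap:    query genes annotated to the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probability mass from `overlap` to the maximum
    # possible overlap. math.comb returns 0 when k > n, so impossible
    # configurations contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 150 annotated to a term,
# a 200-gene query set, 12 of which carry the term.
p_value = hypergeom_enrichment_p(20000, 150, 200, 12)
```

Because each term is tested in isolation, ancestors of an enriched term tend to score as enriched too (every gene annotated to a child term is also annotated to its parents under the true-path rule), which is the over-enrichment problem the abstract refers to.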
Affiliation(s)
- Haixiu Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Hongyu Fu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Meiyi Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Yangyang Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Yongqun Oliver He
  - Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Chao Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
  - National Health Commission (NHC) Key Laboratory of Molecular Probes and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin 150028, China
4
Beckwith MA, Danis D, Bridges Y, Jacobsen JOB, Smedley D, Robinson PN. Leveraging clinical intuition to improve accuracy of phenotype-driven prioritization. Genet Med 2025; 27:101292. [PMID: 39396132; PMCID: PMC11843448; DOI: 10.1016/j.gim.2024.101292] [Received: 12/14/2023] [Revised: 10/03/2024] [Accepted: 10/04/2024] [Indexed: 10/14/2024] Open access
Abstract
PURPOSE: Clinical intuition is commonly incorporated into the differential diagnosis as an assessment of the likelihood of candidate diagnoses based either on the patient population being seen in a specific clinic or on the signs and symptoms of the initial presentation. Algorithms to support diagnostic sequencing in individuals with a suspected rare genetic disease do not yet incorporate intuition and instead assume that each Mendelian disease has an equal pretest probability. METHODS: The LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) algorithm calculates the likelihood ratio of clinical manifestations represented by Human Phenotype Ontology terms to rank candidate diagnoses. The initial version of LIRICAL assumed an equal pretest probability for each disease in its calculation of the posttest probability (where the test is diagnostic exome or genome sequencing). We introduce Clinical Intuition for Likelihood Ratios (ClintLR), an extension of the LIRICAL algorithm that boosts the pretest probability of groups of related diseases deemed to be more likely. RESULTS: The average rank of the correct diagnosis in simulations using ClintLR showed a statistically significant improvement over a range of adjustment factors. CONCLUSION: ClintLR successfully encodes clinical intuition to improve ranking of rare diseases in diagnostic sequencing. ClintLR is freely available at https://github.com/TheJacksonLaboratory/ClintLR.
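The effect of boosting a pretest probability can be illustrated with Bayes' rule in odds form (posttest odds = pretest odds × likelihood ratio). The sketch below is an illustration under stated assumptions, not ClintLR's implementation: the renormalisation step, the function names, the adjustment factor, and the likelihood ratios are all hypothetical.

```python
def boosted_pretest(diseases, favoured, factor):
    """Start from equal weights, multiply the weight of each favoured disease
    by `factor`, and renormalise so the probabilities sum to 1.
    (The renormalisation scheme is an assumption made for this sketch.)"""
    weights = {d: (factor if d in favoured else 1.0) for d in diseases}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}


def posttest_probability(pretest, likelihood_ratio):
    """Bayes' rule in odds form; requires 0 < pretest < 1."""
    pretest_odds = pretest / (1.0 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical composite likelihood ratios for four candidate diseases.
# A and B are tied on phenotype evidence alone.
likelihood_ratios = {"A": 20.0, "B": 20.0, "C": 1.0, "D": 0.5}

# The clinician considers the disease group containing B three times as likely.
pretest = boosted_pretest(likelihood_ratios, favoured={"B"}, factor=3.0)
posttest = {
    d: posttest_probability(pretest[d], lr) for d, lr in likelihood_ratios.items()
}
ranking = sorted(posttest, key=posttest.get, reverse=True)
```

With equal pretest probabilities A and B would tie; after the boost, B moves ahead of A even though the phenotype evidence is unchanged, which is the kind of rank shift the abstract evaluates.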
Affiliation(s)
- Daniel Danis
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT
- Yasemin Bridges
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- Julius O B Jacobsen
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- Damian Smedley
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- Peter N Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT; Berlin Institute of Health, Charité-Universitätsmedizin Berlin, Berlin, Germany
5
Ozcelik F, Dundar MS, Yildirim AB, Henehan G, Vicente O, Sánchez-Alcázar JA, Gokce N, Yildirim DT, Bingol NN, Karanfilska DP, Bertelli M, Pojskic L, Ercan M, Kellermayer M, Sahin IO, Greiner-Tollersrud OK, Tan B, Martin D, Marks R, Prakash S, Yakubi M, Beccari T, Lal R, Temel SG, Fournier I, Ergoren MC, Mechler A, Salzet M, Maffia M, Danalev D, Sun Q, Nei L, Matulis D, Tapaloaga D, Janecke A, Bown J, Cruz KS, Radecka I, Ozturk C, Nalbantoglu OU, Sag SO, Ko K, Arngrimsson R, Belo I, Akalin H, Dundar M. The impact and future of artificial intelligence in medical genetics and molecular medicine: an ongoing revolution. Funct Integr Genomics 2024; 24:138. [PMID: 39147901; DOI: 10.1007/s10142-024-01417-9] [Received: 07/02/2024] [Revised: 08/01/2024] [Accepted: 08/05/2024] [Indexed: 08/17/2024]
Abstract
Artificial intelligence (AI) platforms have emerged as pivotal tools in genetics and molecular medicine, as in many other fields. The growth in patient data, identification of new diseases and phenotypes, discovery of new intracellular pathways, availability of greater sets of omics data, and the need to continuously analyse them have led to the development of new AI platforms. AI continues to weave its way into the fabric of genetics with the potential to unlock new discoveries and enhance patient care. This technology is setting the stage for breakthroughs across various domains, including dysmorphology, rare hereditary diseases, cancers, clinical microbiomics, the investigation of zoonotic diseases, and omics studies in all medical disciplines. AI's role in facilitating a deeper understanding of these areas heralds a new era of personalised medicine, where treatments and diagnoses are tailored to the individual's molecular features, offering a more precise approach to combating genetic or acquired disorders. The significance of these AI platforms is growing as they assist healthcare professionals in the diagnostic and treatment processes, marking a pivotal shift towards more informed, efficient, and effective medical practice. In this review, we explore the range of AI tools available and show how they have become vital in various sectors of genomic research supporting clinical decisions.
Affiliation(s)
- Firat Ozcelik
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Mehmet Sait Dundar
  - Department of Electrical and Computer Engineering, Graduate School of Engineering and Sciences, Abdullah Gul University, Kayseri, Turkey
- A Baki Yildirim
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Gary Henehan
  - School of Food Science and Environmental Health, Technological University of Dublin, Dublin, Ireland
- Oscar Vicente
  - Institute for the Conservation and Improvement of Valencian Agrodiversity (COMAV), Universitat Politècnica de València, Valencia, Spain
- José A Sánchez-Alcázar
  - Centro de Investigación Biomédica en Red: Enfermedades Raras, Centro Andaluz de Biología del Desarrollo (CABD-CSIC-Universidad Pablo de Olavide), Instituto de Salud Carlos III, Sevilla, Spain
- Nuriye Gokce
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Duygu T Yildirim
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Nurdeniz Nalbant Bingol
  - Department of Translational Medicine, Institute of Health Sciences, Bursa Uludag University, Bursa, Turkey
- Dijana Plaseska Karanfilska
  - Research Centre for Genetic Engineering and Biotechnology, Macedonian Academy of Sciences and Arts, Skopje, Macedonia
- Lejla Pojskic
  - Institute for Genetic Engineering and Biotechnology, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- Mehmet Ercan
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Miklos Kellermayer
  - Department of Biophysics and Radiation Biology, Faculty of Medicine, Semmelweis University, Budapest, Hungary
- Izem Olcay Sahin
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Busra Tan
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Donald Martin
  - University Grenoble Alpes, CNRS, TIMC-IMAG/SyNaBi (UMR 5525), Grenoble, France
- Robert Marks
  - Avram and Stella Goldstein-Goren Department of Biotechnology Engineering, Ben-Gurion University of the Negev, Be'er Sheva, Israel
- Satya Prakash
  - Department of Biomedical Engineering, University of McGill, Montreal, QC, Canada
- Mustafa Yakubi
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Tommaso Beccari
  - Department of Pharmaceutical Sciences, University of Perugia, Perugia, Italy
- Ratnesh Lal
  - Neuroscience Research Institute, University of California, Santa Barbara, USA
- Sehime G Temel
  - Department of Translational Medicine, Institute of Health Sciences, Bursa Uludag University, Bursa, Turkey
  - Department of Medical Genetics, Bursa Uludag University Faculty of Medicine, Bursa, Turkey
  - Department of Histology and Embryology, Faculty of Medicine, Bursa Uludag University, Bursa, Turkey
- Isabelle Fournier
  - Réponse Inflammatoire et Spectrométrie de Masse-PRISM, University of Lille, Lille, France
- M Cerkez Ergoren
  - Department of Medical Genetics, Near East University Faculty of Medicine, Nicosia, Cyprus
- Adam Mechler
  - Department of Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia
- Michel Salzet
  - Réponse Inflammatoire et Spectrométrie de Masse-PRISM, University of Lille, Lille, France
- Michele Maffia
  - Department of Experimental Medicine, University of Salento, Via Lecce-Monteroni, Lecce, 73100, Italy
- Dancho Danalev
  - University of Chemical Technology and Metallurgy, Sofia, Bulgaria
- Qun Sun
  - Department of Food Science and Technology, Sichuan University, Chengdu, China
- Lembit Nei
  - School of Engineering Tallinn University of Technology, Tartu College, Tartu, Estonia
- Daumantas Matulis
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Dana Tapaloaga
  - Faculty of Veterinary Medicine, University of Agronomic Sciences and Veterinary Medicine of Bucharest, Bucharest, Romania
- Andres Janecke
  - Department of Paediatrics I, Medical University of Innsbruck, Innsbruck, Austria
  - Division of Human Genetics, Medical University of Innsbruck, Innsbruck, Austria
- James Bown
  - School of Science, Engineering and Technology, Abertay University, Dundee, UK
- Iza Radecka
  - School of Science, Faculty of Science and Engineering, University of Wolverhampton, Wolverhampton, UK
- Celal Ozturk
  - Department of Software Engineering, Erciyes University, Kayseri, Turkey
- Ozkan Ufuk Nalbantoglu
  - Department of Computer Engineering, Engineering Faculty, Erciyes University, Kayseri, Turkey
- Sebnem Ozemri Sag
  - Department of Medical Genetics, Bursa Uludag University Faculty of Medicine, Bursa, Turkey
- Kisung Ko
  - Department of Medicine, College of Medicine, Chung-Ang University, Seoul, Korea
- Reynir Arngrimsson
  - Iceland Landspitali University Hospital, University of Iceland, Reykjavik, Iceland
- Isabel Belo
  - Centre of Biological Engineering, University of Minho, Braga, Portugal
- Hilal Akalin
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Munis Dundar
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
6
Kakar N, Rehman FU, Kaur R, Bhavani GS, Goyal M, Shah H, Kaur K, Sodhi KS, Kubisch C, Borck G, Panigrahi I, Girisha KM, Kornak U, Spielmann M. Multi-gene panel sequencing in highly consanguineous families and patients with congenital forms of skeletal dysplasias. Clin Genet 2024; 106:47-55. [PMID: 38378010; DOI: 10.1111/cge.14509] [Received: 08/15/2023] [Revised: 02/05/2024] [Accepted: 02/07/2024] [Indexed: 02/22/2024]
Abstract
Skeletal dysplasias (SKDs) are a heterogeneous group of more than 750 genetic disorders characterized by abnormal development, growth, and maintenance of bones or cartilage in the human skeleton. SKDs are often caused by variants in early patterning genes, are in many cases part of multiple malformation syndromes, and occur in combination with non-skeletal phenotypes. The aim of this study was to investigate the underlying genetic cause of congenital SKDs in highly consanguineous Pakistani families, as well as in sporadic and familial SKD cases from India, using multigene panel sequencing analysis. We therefore performed panel sequencing of 386 bone-related genes in 7 highly consanguineous families from Pakistan and 27 cases from India affected with SKDs. In the highly consanguineous families, we were able to identify the underlying genetic cause in five out of seven families, resulting in a diagnostic yield of 71%. In the sporadic and familial SKD cases, we identified 12 causative variants, corresponding to a diagnostic yield of 44%. The genetic heterogeneity in our cohorts was very high and we were able to detect various types of variants, including missense, nonsense, and frameshift variants, across multiple genes known to cause different types of SKDs. In conclusion, panel sequencing proved to be a highly effective way to decipher the genetic basis of SKDs in highly consanguineous families as well as sporadic and/or familial cases from South Asia. Furthermore, our findings expand the allelic spectrum of skeletal dysplasias.
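The two diagnostic yields quoted in this abstract follow directly from the reported counts. A small arithmetic check, assuming (as the abstract implies but does not state outright) that the 12 causative variants correspond to 12 solved cases among the 27 Indian cases; the function name is hypothetical.

```python
def diagnostic_yield(solved, total):
    """Percentage of families or cases with an identified genetic cause."""
    return 100.0 * solved / total


# Counts taken from the abstract.
pakistani_families = diagnostic_yield(5, 7)   # consanguineous families
indian_cases = diagnostic_yield(12, 27)       # sporadic and familial cases

print(f"{pakistani_families:.0f}% and {indian_cases:.0f}%")
```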
Affiliation(s)
- Naseebullah Kakar
  - Institut für Humangenetik, Universitätsklinikum Schleswig-Holstein, University of Lübeck and University of Kiel, Lübeck, Germany
  - Department of Biotechnology, BUITEMS, Quetta, Pakistan
  - Institute of Human Genetics, Ulm University, Ulm, Germany
- Fazal Ur Rehman
  - Department of Pathology, Bolan Medical College, Quetta, Pakistan
- Ramandeep Kaur
  - Department of Pediatrics, APC, PGIMER, Chandigarh, India
- Gandham SriLakshmi Bhavani
  - Department of Medical Genetics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
- Manisha Goyal
  - Pediatrics Genetic & Research Laboratory, Department of Pediatrics, Lok Nayak Hospital, New Delhi, India
- Hitesh Shah
  - Department of Pediatric Orthopedics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
- Karandeep Kaur
  - Department of Pediatrics, APC, PGIMER, Chandigarh, India
- Christian Kubisch
  - Institute of Human Genetics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Guntram Borck
  - Institute of Human Genetics, Ulm University, Ulm, Germany
- Katta Mohan Girisha
  - Department of Medical Genetics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
- Uwe Kornak
  - Institute of Human Genetics, University Medical Center Göttingen, Göttingen, Germany
- Malte Spielmann
  - Institut für Humangenetik, Universitätsklinikum Schleswig-Holstein, University of Lübeck and University of Kiel, Lübeck, Germany
7
Faviez C, Chen X, Garcelon N, Zaidan M, Billot K, Petzold F, Faour H, Douillet M, Rozet JM, Cormier-Daire V, Attié-Bitach T, Lyonnet S, Saunier S, Burgun A. Objectivizing issues in the diagnosis of complex rare diseases: lessons learned from testing existing diagnosis support systems on ciliopathies. BMC Med Inform Decis Mak 2024; 24:134. [PMID: 38789985; PMCID: PMC11127295; DOI: 10.1186/s12911-024-02538-8] [Received: 01/30/2024] [Accepted: 05/17/2024] [Indexed: 05/26/2024] Open access
Abstract
BACKGROUND: There are approximately 8,000 different rare diseases that affect roughly 400 million people worldwide. Many of these patients experience delayed diagnosis. Ciliopathies are rare monogenic disorders characterized by a significant phenotypic and genetic heterogeneity that raises an important challenge for clinical diagnosis. Diagnosis support systems (DSS) applied to electronic health record (EHR) data may help identify undiagnosed patients, which is of paramount importance to improve patients' care. Our objective was to evaluate three online-accessible rare disease DSSs using phenotypes derived from EHRs for the diagnosis of ciliopathies. METHODS: Two datasets of ciliopathy cases, either proven or suspected, and two datasets of controls were used to evaluate the DSSs. Patient phenotypes were automatically extracted from their EHRs and converted to Human Phenotype Ontology terms. We tested the ability of the DSSs to diagnose cases in contrast to controls based on Orphanet ontology. RESULTS: A total of 79 cases and 38 controls were selected. Performances of the DSSs on ciliopathy real-world data (best DSS with area under the ROC curve = 0.72) were not as good as published performances on the test set used in the DSS development phase. None of these systems obtained results which could be described as "expert-level". Patients with multisystemic symptoms were generally easier to diagnose than patients with isolated symptoms. Diseases easily confused with ciliopathy generally affected multiple organs and had overlapping phenotypes. Four challenges need to be considered to improve the performances: to make the DSSs interoperable with EHR systems, to validate the performances in real-life settings, to deal with data quality, and to leverage methods and resources for rare and complex diseases. CONCLUSION: Our study provides insights into the complexities of diagnosing highly heterogeneous rare diseases and offers lessons derived from evaluating existing DSSs in real-world settings. These insights are not only beneficial for ciliopathy diagnosis but also hold relevance for the enhancement of DSSs for various complex rare disorders, by guiding the development of more clinically relevant rare disease DSSs that could support early diagnosis and ultimately make more patients eligible for treatment.
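The area under the ROC curve reported in this abstract can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counting one half. A minimal sketch of that pairwise computation follows; the scores are invented for illustration and are not the study's data, and the function name is hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs in which the case
    scores higher than the control; tied pairs count 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented DSS scores (higher means "more likely ciliopathy").
cases = [0.9, 0.8, 0.4]
controls = [0.5, 0.3]
auc = roc_auc(cases, controls)
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, so the best value of 0.72 reported above sits between the two.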
Affiliation(s)
- Carole Faviez
- Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France.
- HeKA, Inria Paris, Paris, F-75012, France.
- Universite Paris Cite, Paris, France.
| | - Xiaoyi Chen
- Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
- HeKA, Inria Paris, Paris, F-75012, France
- Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
| | - Nicolas Garcelon
- Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
- HeKA, Inria Paris, Paris, F-75012, France
- Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
| | - Mohamad Zaidan
- Service de Néphrologie, Dialyse et Transplantation, Hôpital Universitaire Bicêtre, Assistance Publique-Hôpitaux de Paris (AP-HP), Kremlin Bicêtre, F-94270, France
| | - Katy Billot
- Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
| | - Friederike Petzold
- Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Division of Nephrology, University of Leipzig Medical Center, Leipzig, Germany
| | - Hassan Faour
- Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
| | - Maxime Douillet
- Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
| | - Jean-Michel Rozet
- Laboratory of Genetics in Ophthalmology, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
| | - Valérie Cormier-Daire
- Reference Centre for Constitutional Bone Diseases, laboratory of Osteochondrodysplasia, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Service de médecine génomique des maladies rares, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
| | - Tania Attié-Bitach
- Service d'Histologie-Embryologie-Cytogénétique, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
| | - Stanislas Lyonnet
- Service de médecine génomique des maladies rares, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
- Laboratory of Embryology and Genetics of Congenital Malformations, INSERM UMR 1163, Imagine Institute, Paris Cité, Paris, F-75015, France
| | - Sophie Saunier
- Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
| | - Anita Burgun
- Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
- HeKA, Inria Paris, Paris, F-75012, France
- Department of Medical Informatics, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
|
8
|
Althagafi A, Zhapa-Camacho F, Hoehndorf R. Prioritizing genomic variants through neuro-symbolic, knowledge-enhanced learning. Bioinformatics 2024; 40:btae301. [PMID: 38696757 PMCID: PMC11132820 DOI: 10.1093/bioinformatics/btae301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 04/05/2024] [Accepted: 04/30/2024] [Indexed: 05/04/2024] Open
Abstract
MOTIVATION Whole-exome and genome sequencing have become common tools in diagnosing patients with rare diseases. Despite their success, this approach leaves many patients undiagnosed. A common argument is that more disease variants still await discovery, or the novelty of disease phenotypes results from a combination of variants in multiple disease-related genes. Interpreting the phenotypic consequences of genomic variants relies on information about gene functions, gene expression, physiology, and other genomic features. Phenotype-based methods to identify variants involved in genetic diseases combine molecular features with prior knowledge about the phenotypic consequences of altering gene functions. While phenotype-based methods have been successfully applied to prioritizing variants, such methods are based on known gene-disease or gene-phenotype associations as training data and are applicable to genes that have phenotypes associated, thereby limiting their scope. In addition, phenotypes are not assigned uniformly by different clinicians, and phenotype-based methods need to account for this variability. RESULTS We developed an Embedding-based Phenotype Variant Predictor (EmbedPVP), a computational method to prioritize variants involved in genetic diseases by combining genomic information and clinical phenotypes. EmbedPVP leverages a large amount of background knowledge from human and model organisms about molecular mechanisms through which abnormal phenotypes may arise. Specifically, EmbedPVP incorporates phenotypes linked to genes, functions of gene products, and the anatomical site of gene expression, and systematically relates them to their phenotypic effects through neuro-symbolic, knowledge-enhanced machine learning. We demonstrate EmbedPVP's efficacy on a large set of synthetic genomes and genomes matched with clinical information. 
AVAILABILITY AND IMPLEMENTATION EmbedPVP and all evaluation experiments are freely available at https://github.com/bio-ontology-research-group/EmbedPVP.
Affiliation(s)
- Azza Althagafi
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955, Saudi Arabia
- Computer Science Program, Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955, Saudi Arabia
- Computer Science Department, College of Computers and Information Technology, Taif University, Taif 26571, Saudi Arabia
| | - Fernando Zhapa-Camacho
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955, Saudi Arabia
- Computer Science Program, Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955, Saudi Arabia
| | - Robert Hoehndorf
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955, Saudi Arabia
- Computer Science Program, Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955, Saudi Arabia
- SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955, Saudi Arabia
|
9
|
Mao D, Liu C, Wang L, Ai-Ouran R, Deisseroth C, Pasupuleti S, Kim SY, Li L, Rosenfeld JA, Meng L, Burrage LC, Wangler MF, Yamamoto S, Santana M, Perez V, Shukla P, Eng CM, Lee B, Yuan B, Xia F, Bellen HJ, Liu P, Liu Z. AI-MARRVEL - A Knowledge-Driven AI System for Diagnosing Mendelian Disorders. NEJM AI 2024; 1:10.1056/aioa2300009. [PMID: 38962029 PMCID: PMC11221788 DOI: 10.1056/aioa2300009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/05/2024]
Abstract
BACKGROUND Diagnosing genetic disorders requires extensive manual curation and interpretation of candidate variants, a labor-intensive task even for trained geneticists. Although artificial intelligence (AI) shows promise in aiding these diagnoses, existing AI tools have only achieved moderate success for primary diagnosis. METHODS AI-MARRVEL (AIM) uses a random-forest machine-learning classifier trained on over 3.5 million variants from thousands of diagnosed cases. AIM additionally incorporates expert-engineered features into training to recapitulate the intricate decision-making processes in molecular diagnosis. The online version of AIM is available at https://ai.marrvel.org. To evaluate AIM, we benchmarked it with diagnosed patients from three independent cohorts. RESULTS AIM improved the rate of accurate genetic diagnosis, doubling the number of solved cases as compared with benchmarked methods, across three distinct real-world cohorts. To better identify diagnosable cases from the unsolved pools accumulated over time, we designed a confidence metric on which AIM achieved a precision rate of 98% and identified 57% of diagnosable cases out of a collection of 871 cases. Furthermore, AIM's performance improved after being fine-tuned for targeted settings including recessive disorders and trio analysis. Finally, AIM demonstrated potential for novel disease gene discovery by correctly predicting two newly reported disease genes from the Undiagnosed Diseases Network. CONCLUSIONS AIM achieved superior accuracy compared with existing methods for genetic diagnosis. We anticipate that this tool may aid in primary diagnosis, reanalysis of unsolved cases, and the discovery of novel disease genes. (Funded by the NIH Common Fund and others.).
Affiliation(s)
- Dongxue Mao
- Department of Pediatrics, Baylor College of Medicine, Houston
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
| | - Chaozhong Liu
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
- Graduate School of Biomedical Sciences, Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston
| | - Linhua Wang
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
- Graduate School of Biomedical Sciences, Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston
| | - Rami Ai-Ouran
- Department of Pediatrics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
- Department of Data Science and AI, Al Hussein Technical University, Amman, Jordan
| | - Cole Deisseroth
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
| | - Sasidhar Pasupuleti
- Department of Pediatrics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
| | - Seon Young Kim
- Department of Pediatrics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
| | - Lucian Li
- Department of Pediatrics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
| | - Jill A Rosenfeld
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
| | - Linyan Meng
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Baylor Genetics, Houston
| | - Lindsay C Burrage
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
| | - Michael F Wangler
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
| | - Shinya Yamamoto
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
| | - Christine M Eng
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Baylor Genetics, Houston
| | - Brendan Lee
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
| | - Bo Yuan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Human Genome Sequencing Center, Baylor College of Medicine, Houston
| | - Fan Xia
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Baylor Genetics, Houston
| | - Hugo J Bellen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
- Department of Neuroscience, Baylor College of Medicine, Houston
| | - Pengfei Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston
- Baylor Genetics, Houston
| | - Zhandong Liu
- Department of Pediatrics, Baylor College of Medicine, Houston
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston
|
10
|
Jerschow E, Dubin R, Chen CC, iAkushev A, Sehanobish E, Asad M, Chiarella SE, Porcelli SA, Greally J. Aspirin-exacerbated respiratory disease is associated with variants in filaggrin, epithelial integrity, and cellular interactions. J Allergy Clin Immunol Glob 2024; 3:100205. [PMID: 38317805 PMCID: PMC10838899 DOI: 10.1016/j.jacig.2024.100205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 11/15/2023] [Accepted: 12/01/2023] [Indexed: 02/07/2024]
Abstract
Background Previous studies have determined that up to 6% of patients with aspirin-exacerbated respiratory disease (AERD) have a family history of AERD, indicating a possible link with genetic polymorphisms. However, whole exome sequencing (WES) studies of such associations are currently lacking. Objectives We sought to examine whether WES can identify pathogenic variants associated with AERD. Methods Diagnoses of AERD were confirmed in patients with nasal polyps and asthma. WES was performed using an Illumina sequencing platform. Human Phenotype Ontology terms were used to define the patients' phenotypes. Exomiser was used to annotate, filter, and prioritize possible disease-causing genetic variants. Results Of 39 patients with AERD, 41% reported a family history of asthma and 5% reported a family history of AERD. Pathogenic exome variants in the filaggrin gene (FLG) were found in 2 patients (5%). Other variants not known to be pathogenic were detected in an additional 16 patients (41%) in genes related to epithelial integrity and cellular interactions, including genes encoding desmoglein 3 (DSG3), dynein axonemal heavy chain 9 (DNAH9), collagen type VII alpha 1 chain (COL7A1), collagen type XVII alpha 1 chain (COL17A1), chromodomain helicase DNA binding protein-7 (CHD7), TSC complex subunit 2/tuberous sclerosis-2 protein (TSC2), P-selectin (SELP), and platelet-derived growth factor receptor-alpha (PDGFRA). Conclusion WES identified a monogenic susceptibility to AERD in 5% of patients with FLG pathogenic variants. Other variants not previously identified as pathogenic were found in genes relevant to epithelial integrity and cellular interactions and may further reveal genetic factors that contribute to this condition.
Affiliation(s)
- Elina Jerschow
- Mayo Clinic, Rochester, Minn
- Albert Einstein College of Medicine, Bronx, NY
|
11
|
Wei WQ, Rowley R, Wood A, MacArthur J, Embi PJ, Denaxas S. Improving reporting standards for phenotyping algorithm in biomedical research: 5 fundamental dimensions. J Am Med Inform Assoc 2024; 31:1036-1041. [PMID: 38269642 PMCID: PMC10990558 DOI: 10.1093/jamia/ocae005] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/03/2023] [Revised: 12/12/2023] [Accepted: 01/08/2024] [Indexed: 01/26/2024] Open
Abstract
INTRODUCTION Phenotyping algorithms enable the interpretation of complex health data and definition of clinically relevant phenotypes; they have become crucial in biomedical research. However, the lack of standardization and transparency inhibits the cross-comparison of findings among different studies, limits large scale meta-analyses, confuses the research community, and prevents the reuse of algorithms, which results in duplication of efforts and the waste of valuable resources. RECOMMENDATIONS Here, we propose five independent fundamental dimensions of phenotyping algorithms-complexity, performance, efficiency, implementability, and maintenance-through which researchers can describe, measure, and deploy any algorithms efficiently and effectively. These dimensions must be considered in the context of explicit use cases and transparent methods to ensure that they do not reflect unexpected biases or exacerbate inequities.
Affiliation(s)
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
| | - Robb Rowley
- National Human Genome Research Institute, Bethesda, MD 20892, United States
| | - Angela Wood
- Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB2 1TN, United Kingdom
| | - Jacqueline MacArthur
- British Heart Foundation Data Science Center, Health Data Research, London, NW1 2BE, United Kingdom
| | - Peter J Embi
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
| | - Spiros Denaxas
- British Heart Foundation Data Science Center, Health Data Research, London, NW1 2BE, United Kingdom
- Institute of Health Informatics, University College London, London, WC1E 6BT, United Kingdom
|
12
|
Duyzend MH, Cacheiro P, Jacobsen JO, Giordano J, Brand H, Wapner RJ, Talkowski ME, Robinson PN, Smedley D. Improving prenatal diagnosis through standards and aggregation. Prenat Diagn 2024; 44:454-464. [PMID: 38242839 PMCID: PMC11006584 DOI: 10.1002/pd.6522] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/26/2023] [Revised: 12/17/2023] [Accepted: 12/22/2023] [Indexed: 01/21/2024]
Abstract
Advances in sequencing and imaging technologies enable enhanced assessment in the prenatal space, with a goal to diagnose and predict the natural history of disease, to direct targeted therapies, and to implement clinical management, including transfer of care, election of supportive care, and selection of surgical interventions. The current lack of standardization and aggregation stymies variant interpretation and gene discovery, which hinders the provision of prenatal precision medicine, leaving clinicians and patients without an accurate diagnosis. With large amounts of data generated, it is imperative to establish standards for data collection, processing, and aggregation. Aggregated and homogeneously processed genetic and phenotypic data permits dissection of the genomic architecture of prenatal presentations of disease and provides a dataset on which data analysis algorithms can be tuned to the prenatal space. Here we discuss the importance of generating aggregate data sets and how the prenatal space is driving the development of interoperable standards and phenotype-driven tools.
Affiliation(s)
- Michael H. Duyzend
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Pilar Cacheiro
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Julius O.B. Jacobsen
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Jessica Giordano
- Department of Obstetrics & Gynecology, Columbia University Medical Center, New York, NY, USA
| | - Harrison Brand
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neurology, Harvard Medical School, Boston, MA, USA
| | - Ronald J. Wapner
- Department of Obstetrics & Gynecology, Columbia University Medical Center, New York, NY, USA
| | - Michael E. Talkowski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neurology, Harvard Medical School, Boston, MA, USA
- Program in Biological and Biomedical Sciences, Division of Medical Sciences, Harvard Medical School, Boston, MA, USA
- Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, Harvard Medical School, Boston, MA, USA
| | - Peter N. Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA
| | - Damian Smedley
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
|
13
|
Yuan X, Su J, Wang J, Dai B, Sun Y, Zhang K, Li Y, Chuan J, Tang C, Yu Y, Gong Q. Refined preferences of prioritizers improve intelligent diagnosis for Mendelian diseases. Sci Rep 2024; 14:2845. [PMID: 38310124 PMCID: PMC10838329 DOI: 10.1038/s41598-024-53461-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 01/31/2024] [Indexed: 02/05/2024] Open
Abstract
Phenotype-guided gene prioritizers have proved a highly efficient approach to identifying causal genes for Mendelian diseases. In our previous study, we preliminarily evaluated the performance of ten prioritizers. However, all the selected software was run with default settings and in singleton mode. With a large-scale family dataset from the Deciphering Developmental Disorders (DDD) project (N = 305) and an in-house trio cohort (N = 152), the four best performers in our prior study, Exomiser, PhenIX, AMELIE, and LIRICAL, were further assessed through parameter optimization and/or the use of trio mode. The in-depth assessment revealed high diagnostic yields of the four prioritizers with refined preferences, each alone or together: (1) 83.3-91.8% of the causal genes were presented among the first ten candidates in the final ranking lists of the four tools; (2) over 97.7% of the causal genes were successfully captured within the top 50 by at least one of the four tools. Exomiser did best in directly hitting the target (ranking the causal gene at the very top), while LIRICAL displayed a predominant overall detection capability. In addition, cases affected by low-penetrance and high-frequency pathogenic variants were found to be misjudged during the automated prioritization process. The discovery of these limitations sheds light on specific directions for future enhancement of causal-gene ranking tools.
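The "top ten" and "top 50" figures quoted in this abstract are top-k hit rates: the share of cases in which the causal gene appears at or above rank k in a tool's output. A minimal sketch of that calculation follows, assuming each case is a ranked gene list paired with one known causal gene. The function names and the placeholder gene labels are hypothetical and are not taken from the paper or from any of the four tools.

```python
from typing import List, Optional, Sequence, Tuple


def causal_rank(ranked_genes: Sequence[str], causal_gene: str) -> Optional[int]:
    """Return the 1-based rank of the causal gene, or None if it is absent."""
    try:
        return list(ranked_genes).index(causal_gene) + 1
    except ValueError:
        return None


def top_k_rate(ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of cases whose causal gene is ranked within the top k."""
    if not ranks:
        return 0.0
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Toy data (hypothetical): (ranked candidate list, known causal gene) per case.
cases: List[Tuple[List[str], str]] = [
    (["GENE_A", "GENE_B", "GENE_C"], "GENE_A"),  # causal gene at rank 1
    (["GENE_D", "GENE_E", "GENE_F"], "GENE_F"),  # causal gene at rank 3
    (["GENE_G", "GENE_H"], "GENE_Z"),            # causal gene not returned
    (["GENE_I", "GENE_J", "GENE_K"], "GENE_J"),  # causal gene at rank 2
]

ranks = [causal_rank(ranked, causal) for ranked, causal in cases]

for k in (1, 2, 3):
    print(f"top-{k} rate: {top_k_rate(ranks, k):.2f}")
```

Cases in which the causal gene is missing from the list count as misses at every k, which keeps the denominator equal to the full cohort size.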
Affiliation(s)
- Xiao Yuan
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
| | - Jieqiong Su
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
| | - Jing Wang
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
| | - Bing Dai
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
| | - Yanfang Sun
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
| | - Keke Zhang
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
| | - Yinghua Li
- Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, Guangdong, China
| | - Jun Chuan
- Genetalks Biotech. Co., Ltd., Changsha, Hunan, China
| | - Chunyan Tang
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
| | - Yan Yu
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
| | - Qiang Gong
- Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
|
14
|
Yang J, Shu L, Han M, Pan J, Chen L, Yuan T, Tan L, Shu Q, Duan H, Li H. RDmaster: A novel phenotype-oriented dialogue system supporting differential diagnosis of rare disease. Comput Biol Med 2024; 169:107924. [PMID: 38181610 DOI: 10.1016/j.compbiomed.2024.107924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 12/18/2023] [Accepted: 01/01/2024] [Indexed: 01/07/2024]
Abstract
BACKGROUND Clinicians often lack the necessary expertise to differentially diagnose multiple underlying rare diseases (RDs) due to their complex and overlapping clinical features, leading to misdiagnoses and delayed treatments. The aim of this study is to develop a novel electronic differential diagnostic support system for RDs. METHOD By integrating two Bayesian diagnostic methods, a candidate list was generated with enhanced clinical interpretability for the subsequent Q&A-based differential diagnosis (DDX). To achieve an efficient Q&A dialogue strategy, we introduce a novel metric named the adaptive information gain and Gini index (AIGGI) to evaluate the expected gain of interrogated phenotypes within real-time diagnostic states. RESULTS This DDX tool, called RDmaster, has been implemented as a web-based platform (http://rdmaster.nbscn.org/). A diagnostic trial involving 238 published RD patients revealed that RDmaster outperformed existing RD diagnostic tools, as well as ChatGPT, and was shown to enhance diagnostic accuracy through its Q&A system. CONCLUSIONS RDmaster offers an effective multi-omics differential diagnostic technique and outperforms existing tools and popular large language models, particularly enhancing differential diagnosis by collecting diagnostically beneficial phenotypes.
Affiliation(s)
- Jian Yang
- Clinical Data Center, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China; The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
| | - Liqi Shu
- Rhode Island Hospital, Warren Alpert Medical School of Brown University, Rhode Island, USA
| | - Mingyu Han
- Neonatal Department, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China
| | - Jiarong Pan
- Neonatal Department, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China
| | - Lihua Chen
- Neonatal Department, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China
| | - Tianming Yuan
- Neonatal Department, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China
| | - Linhua Tan
- Surgical Intensive Care Unit, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China
| | - Qiang Shu
- Clinical Data Center, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China
| | - Huilong Duan
- The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
| | - Haomin Li
- Clinical Data Center, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China
|
15
|
Canavati C, Sherill-Rofe D, Kamal L, Bloch I, Zahdeh F, Sharon E, Terespolsky B, Allan IA, Rabie G, Kawas M, Kassem H, Avraham KB, Renbaum P, Levy-Lahad E, Kanaan M, Tabach Y. Using multi-scale genomics to associate poorly annotated genes with rare diseases. Genome Med 2024; 16:4. [PMID: 38178268 PMCID: PMC10765705 DOI: 10.1186/s13073-023-01276-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Accepted: 12/15/2023] [Indexed: 01/06/2024] Open
Abstract
BACKGROUND Next-generation sequencing (NGS) has significantly transformed the landscape of identifying disease-causing genes associated with genetic disorders. However, a substantial portion of sequenced patients remains undiagnosed. This may be attributed not only to the challenges posed by harder-to-detect variants, such as non-coding and structural variations, but also to the existence of variants in genes not previously associated with the patient's clinical phenotype. This study introduces EvORanker, an algorithm that integrates unbiased data from 1,028 eukaryotic genomes to link mutated genes to clinical phenotypes. METHODS EvORanker utilizes clinical data, multi-scale phylogenetic profiling, and other omics data to prioritize disease-associated genes. It was evaluated on solved exomes and simulated genomes, compared with existing methods, and applied to 6,260 knockout genes with mouse phenotypes lacking human associations. Additionally, EvORanker was made accessible as a user-friendly web tool. RESULTS In the analyzed exomic cohort, EvORanker accurately identified the "true" disease gene as the top candidate in 69% of cases and within the top 5 candidates in 95% of cases, consistent with results from the simulated dataset. Notably, EvORanker outperformed existing methods, particularly for poorly annotated genes. In the case of the 6,260 knockout genes with mouse phenotypes, EvORanker linked 41% of these genes to observed human disease phenotypes. Furthermore, in two unsolved cases, EvORanker successfully identified DLGAP2 and LPCAT3 as disease candidates for previously uncharacterized genetic syndromes. CONCLUSIONS We highlight clade-based phylogenetic profiling as a powerful systematic approach for prioritizing potential disease genes. Our study showcases the efficacy of EvORanker in associating poorly annotated genes to disease phenotypes observed in patients. The EvORanker server is freely available at https://ccanavati.shinyapps.io/EvORanker/.
Affiliation(s)
- Christina Canavati
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
| | - Dana Sherill-Rofe
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
| | - Lara Kamal
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
- Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, 6997801, Israel
| | - Idit Bloch
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
| | - Fouad Zahdeh
- Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem, 91031, Israel
| | - Elad Sharon
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
| | - Batel Terespolsky
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
- Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem, 91031, Israel
| | - Islam Abu Allan
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
| | - Grace Rabie
- Hereditary Research Laboratory and Department of Life Sciences, Bethlehem University, Bethlehem, 72372, Palestine
| | - Mariana Kawas
- Hereditary Research Laboratory and Department of Life Sciences, Bethlehem University, Bethlehem, 72372, Palestine
| | - Hanin Kassem
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
| | - Karen B Avraham
- Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, 6997801, Israel
| | - Paul Renbaum
- Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem, 91031, Israel
| | - Ephrat Levy-Lahad
- Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem, 91031, Israel
- Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
| | - Moien Kanaan
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
- Hereditary Research Laboratory and Department of Life Sciences, Bethlehem University, Bethlehem, 72372, Palestine
| | - Yuval Tabach
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
|
16
|
Stirnemann JJ, Besson R, Spaggiari E, Rojo S, Loge F, Peyro-Saint-Paul H, Allassonniere S, Le Pennec E, Hutchinson C, Sebire N, Ville Y. Development and clinical validation of real-time artificial intelligence diagnostic companion for fetal ultrasound examination. Ultrasound Obstet Gynecol 2023; 62:353-360. [PMID: 37161503 DOI: 10.1002/uog.26242] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/28/2022] [Revised: 02/13/2023] [Accepted: 03/20/2023] [Indexed: 05/11/2023]
Abstract
OBJECTIVE Prenatal diagnosis of a rare disease on ultrasound relies on a physician's ability to remember an intractable amount of knowledge. We developed a real-time decision support system (DSS) that suggests, at each step of the examination, the next phenotypic feature to assess, optimizing the diagnostic pathway to the smallest number of possible diagnoses. The objective of this study was to evaluate the performance of this real-time DSS using clinical data. METHODS This validation study was conducted on a database of 549 perinatal phenotypes collected from two referral centers (one in France and one in the UK). Inclusion criteria were: at least one anomaly was visible on fetal ultrasound after 11 weeks' gestation; the anomaly was confirmed postnatally; an associated rare disease was confirmed or ruled out based on postnatal/postmortem investigation, including physical examination, genetic testing and imaging; and, when confirmed, the syndrome was known by the DSS software. The cases were assessed retrospectively by the software, using either the full phenotype as a single input, or a stepwise input of phenotypic features, as prompted by the software, mimicking its use in a real-life clinical setting. Adjudication of discordant cases, in which there was disagreement between the DSS output and the postnatally confirmed ('ascertained') diagnosis, was performed by a panel of external experts. The proportion of ascertained diagnoses within the software's top-10 differential diagnoses output was evaluated, as well as the sensitivity and specificity of the software to select correctly as its best guess a syndromic or isolated condition. RESULTS The dataset covered 110/408 (27%) diagnoses within the software's database, yielding a cumulative prevalence of 83%. For syndromic cases, the ascertained diagnosis was within the top-10 list in 93% and 83% of cases using the full-phenotype and stepwise input, respectively, after adjudication. 
The full-phenotype and stepwise approaches were associated, respectively, with a specificity of 94% and 96% and a sensitivity of 99% and 84%. The stepwise approach required an average of 13 queries to reach the final set of diagnoses. CONCLUSIONS The DSS showed high performance when applied to real-world data. This validation study suggests that such software can improve perinatal care, efficiently providing complex and otherwise overlooked knowledge to care-providers involved in ultrasound-based prenatal diagnosis. © 2023 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.
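The evaluation measures named in this abstract (the proportion of confirmed diagnoses within the top-10 list, plus sensitivity and specificity for the syndromic-versus-isolated call) can be illustrated with a minimal sketch. The function names and the toy inputs are hypothetical and are not taken from the study or its software.

```python
from typing import List, Sequence, Tuple


def top_k_rate(ranked_lists: Sequence[Sequence[str]],
               truths: Sequence[str], k: int = 10) -> float:
    """Fraction of cases whose confirmed diagnosis is among the first k suggestions."""
    hits = sum(1 for ranked, truth in zip(ranked_lists, truths)
               if truth in ranked[:k])
    return hits / len(truths)


def sensitivity_specificity(predicted: Sequence[bool],
                            actual: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity and specificity of a binary call (True = syndromic)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


# Toy, made-up cases: each list is a ranked differential diagnosis.
toy_rankings: List[List[str]] = [["A", "B", "C"], ["D", "E"], ["F", "G", "H"]]
toy_truths = ["B", "X", "H"]
print(top_k_rate(toy_rankings, toy_truths, k=2))

toy_predicted = [True, True, False, False, True]
toy_actual = [True, False, False, False, True]
print(sensitivity_specificity(toy_predicted, toy_actual))
```

Keeping the ranking metric separate from the binary metric mirrors the two questions the abstract asks: whether the right diagnosis appears in the list, and whether the best guess correctly separates syndromic from isolated conditions.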
Affiliation(s)
- J J Stirnemann
- Department of Obstetrics and Maternal-Fetal Medicine, Necker-Enfants Malades Hospital, AP-HP, Paris, France
- EA7328 Université de Paris, IMAGINE Institute, Paris, France
| | | | - E Spaggiari
- Department of Obstetrics and Maternal-Fetal Medicine, Necker-Enfants Malades Hospital, AP-HP, Paris, France
- EA7328 Université de Paris, IMAGINE Institute, Paris, France
- Department of Histology-Embryology and Cytogenetics, Unit of Embryo and Fetal Pathology, Necker-Enfants Malades Hospital, AP-HP, Paris, France
| | | | | | | | - S Allassonniere
- School of Medicine, Université de Paris, INRIA EPI HEKA, INSERM UMR 1138, Sorbonne Université, Paris, France
- Center for Applied Mathematics, Ecole Polytechnique, Institut Polytechnique de Paris, Paris, France
| | - E Le Pennec
- Center for Applied Mathematics, Ecole Polytechnique, Institut Polytechnique de Paris, Paris, France
- Xpop, INRIA Saclay Center, Paris, France
| | - C Hutchinson
- NIHR Great Ormond Street Hospital Biomedical Research Centre, London, UK
| | - N Sebire
- NIHR Great Ormond Street Hospital Biomedical Research Centre, London, UK
| | - Y Ville
- Department of Obstetrics and Maternal-Fetal Medicine, Necker-Enfants Malades Hospital, AP-HP, Paris, France
- EA7328 Université de Paris, IMAGINE Institute, Paris, France
|
17
|
Dingemans AJM, Hinne M, Truijen KMG, Goltstein L, van Reeuwijk J, de Leeuw N, Schuurs-Hoeijmakers J, Pfundt R, Diets IJ, den Hoed J, de Boer E, Coenen-van der Spek J, Jansen S, van Bon BW, Jonis N, Ockeloen CW, Vulto-van Silfhout AT, Kleefstra T, Koolen DA, Campeau PM, Palmer EE, Van Esch H, Lyon GJ, Alkuraya FS, Rauch A, Marom R, Baralle D, van der Sluijs PJ, Santen GWE, Kooy RF, van Gerven MAJ, Vissers LELM, de Vries BBA. PhenoScore quantifies phenotypic variation for rare genetic diseases by combining facial analysis with other clinical features using a machine-learning framework. Nat Genet 2023; 55:1598-1607. [PMID: 37550531 PMCID: PMC11414844 DOI: 10.1038/s41588-023-01469-w] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 09/06/2022] [Accepted: 07/05/2023] [Indexed: 08/09/2023]
Abstract
Several molecular and phenotypic algorithms exist that establish genotype-phenotype correlations, including facial recognition tools. However, no unified framework that investigates both facial data and other phenotypic data directly from individuals exists. We developed PhenoScore: an open-source, artificial intelligence-based phenomics framework, combining facial recognition technology with Human Phenotype Ontology data analysis to quantify phenotypic similarity. Here we show PhenoScore's ability to recognize distinct phenotypic entities by establishing recognizable phenotypes for 37 of 40 investigated syndromes against clinical features observed in individuals with other neurodevelopmental disorders and show it is an improvement on existing approaches. PhenoScore provides predictions for individuals with variants of unknown significance and enables sophisticated genotype-phenotype studies by testing hypotheses on possible phenotypic (sub)groups. PhenoScore confirmed previously known phenotypic subgroups caused by variants in the same gene for SATB1, SETBP1 and DEAF1 and provides objective clinical evidence for two distinct ADNP-related phenotypes, already established functionally.
Affiliation(s)
- Alexander J M Dingemans
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
- Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
| | - Max Hinne
- Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
| | - Kim M G Truijen
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Lia Goltstein
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Jeroen van Reeuwijk
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Nicole de Leeuw
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Janneke Schuurs-Hoeijmakers
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Rolph Pfundt
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Illja J Diets
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Joery den Hoed
- Language and Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
| | - Elke de Boer
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Jet Coenen-van der Spek
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Sandra Jansen
- Department of Human Genetics, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands
| | - Bregje W van Bon
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Noraly Jonis
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Charlotte W Ockeloen
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Anneke T Vulto-van Silfhout
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Tjitske Kleefstra
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - David A Koolen
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Philippe M Campeau
- Department of Pediatrics, University of Montreal, Montreal, Quebec, Canada
| | - Elizabeth E Palmer
- Faculty of Medicine and Health, UNSW Sydney, Sydney, New South Wales, Australia
- Sydney Children's Hospitals Network, Sydney, New South Wales, Australia
| | - Hilde Van Esch
- Center for Human Genetics, University Hospitals Leuven, University of Leuven, Leuven, Belgium
| | - Gholson J Lyon
- Department of Human Genetics and George A. Jervis Clinic, Institute for Basic Research in Developmental Disabilities (IBR), Staten Island, NY, USA
- Biology PhD Program, The Graduate Center, The City University of New York, New York City, NY, USA
| | - Fowzan S Alkuraya
- Department of Translational Genomics, Center for Genomic Medicine, King Faisal Specialist Hospital and Research Center, Riyadh, Saudi Arabia
| | - Anita Rauch
- Institute of Medical Genetics, University of Zürich, Zürich, Switzerland
| | - Ronit Marom
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Diana Baralle
- Faculty of Medicine, University of Southampton, Southampton, UK
| | | | - Gijs W E Santen
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, the Netherlands
| | - R Frank Kooy
- Department of Medical Genetics, University of Antwerp, Antwerp, Belgium
| | - Marcel A J van Gerven
- Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
| | - Lisenka E L M Vissers
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands.
| | - Bert B A de Vries
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands.
|
18
|
Henry OJ, Stödberg T, Båtelson S, Rasi C, Stranneheim H, Wedell A. Individualised human phenotype ontology gene panels improve clinical whole exome and genome sequencing analytical efficacy in a cohort of developmental and epileptic encephalopathies. Mol Genet Genomic Med 2023; 11:e2167. [PMID: 36967109 PMCID: PMC10337286 DOI: 10.1002/mgg3.2167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 02/21/2023] [Accepted: 03/01/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND The majority of genetic epilepsies remain unsolved in terms of specific genotype. Phenotype-based genomic analyses have shown potential to strengthen genomic analysis in various ways, including improving analytical efficacy. METHODS We have tested a standardised phenotyping method termed 'Phenomodels' for integrating deep-phenotyping information with our in-house developed clinical whole exome/genome sequencing analytical pipeline. Phenomodels includes a user-friendly epilepsy phenotyping template and an objective measure for selecting which template terms to include in individualised Human Phenotype Ontology (HPO) gene panels. In a pilot study of 38 previously solved cases of developmental and epileptic encephalopathies, we compared the sensitivity and specificity of the individualised HPO gene panels with the clinical epilepsy gene panel. RESULTS The Phenomodels template showed high sensitivity for capturing relevant phenotypic information, where 37/38 individuals' HPO gene panels included the causative gene. The HPO gene panels also had far fewer variants to assess than the epilepsy gene panel. CONCLUSION We have demonstrated a viable approach for incorporating standardised phenotype information into clinical genomic analyses, which may enable more efficient analysis.
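The idea of an individualised HPO gene panel can be sketched as the union of the genes annotated to a patient's selected phenotype terms, with panel sensitivity measured as the share of solved cases whose causative gene falls inside the panel. This is only an illustration under stated assumptions: the term and gene identifiers are placeholders, the term-to-gene mapping is invented, and the abstract's "objective measure" for choosing template terms is not reproduced here (the selected terms are taken as given).

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_gene_panel(selected_terms: Iterable[str],
                     term_to_genes: Dict[str, Set[str]]) -> Set[str]:
    """Union of all genes annotated to the selected phenotype terms."""
    panel: Set[str] = set()
    for term in selected_terms:
        panel |= term_to_genes.get(term, set())
    return panel


def panel_sensitivity(cases: List[Tuple[List[str], str]],
                      term_to_genes: Dict[str, Set[str]]) -> float:
    """Fraction of solved cases whose causative gene is in their own panel."""
    captured = sum(1 for terms, causative in cases
                   if causative in build_gene_panel(terms, term_to_genes))
    return captured / len(cases)


# Hypothetical annotation table (placeholder identifiers, not real HPO data).
toy_annotations = {
    "TERM_1": {"GENE_A", "GENE_B"},
    "TERM_2": {"GENE_B", "GENE_C"},
    "TERM_3": {"GENE_D"},
}
# Each case: (selected terms, known causative gene).
toy_cases = [
    (["TERM_1", "TERM_2"], "GENE_C"),
    (["TERM_3"], "GENE_A"),
]
print(sorted(build_gene_panel(["TERM_1", "TERM_2"], toy_annotations)))
print(panel_sensitivity(toy_cases, toy_annotations))
```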
Affiliation(s)
- Olivia J. Henry
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
| | - Tommy Stödberg
- Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
- Department of Pediatric Neurology, Karolinska University Hospital, Stockholm, Sweden
| | - Sofia Båtelson
- Department of Pediatric Neurology, Karolinska University Hospital, Stockholm, Sweden
| | - Chiara Rasi
- Science for Life Laboratory, Department of Microbiology, Tumour and Cell Biology, Karolinska Institutet, Stockholm, Sweden
| | - Henrik Stranneheim
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Science for Life Laboratory, Department of Microbiology, Tumour and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Centre for Inherited Metabolic Diseases, Karolinska University Hospital, Stockholm, Sweden
| | - Anna Wedell
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Centre for Inherited Metabolic Diseases, Karolinska University Hospital, Stockholm, Sweden
|
19
|
Thessen AE, Cooper L, Swetnam TL, Hegde H, Reese J, Elser J, Jaiswal P. Using knowledge graphs to infer gene expression in plants. Front Artif Intell 2023; 6:1201002. [PMID: 37384147 PMCID: PMC10298150 DOI: 10.3389/frai.2023.1201002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Accepted: 05/23/2023] [Indexed: 06/30/2023] Open
Abstract
Introduction Climate change is already affecting ecosystems around the world and forcing us to adapt to meet societal needs. The speed with which climate change is progressing necessitates a massive scaling up of the number of species with understood genotype-environment-phenotype (G×E×P) dynamics in order to increase ecosystem and agriculture resilience. An important part of predicting phenotype is understanding the complex gene regulatory networks present in organisms. Previous work has demonstrated that knowledge about one species can be applied to another using ontologically-supported knowledge bases that exploit homologous structures and homologous genes. These types of structures that can apply knowledge about one species to another have the potential to enable the massive scaling up that is needed through in silico experimentation. Methods We developed one such structure, a knowledge graph (KG) using information from Planteome and the EMBL-EBI Expression Atlas that connects gene expression, molecular interactions, functions, and pathways to homology-based gene annotations. Our preliminary analysis uses data from gene expression studies in Arabidopsis thaliana and Populus trichocarpa plants exposed to drought conditions. Results A graph query identified 16 pairs of homologous genes in these two taxa, some of which show opposite patterns of gene expression in response to drought. As expected, analysis of the upstream cis-regulatory region of these genes revealed that homologs with similar expression behavior had conserved cis-regulatory regions and potential interaction with similar trans-elements, unlike homologs that changed their expression in opposite ways. Discussion This suggests that even though the homologous pairs share common ancestry and functional roles, predicting expression and phenotype through homology inference needs careful consideration of integrating cis and trans-regulatory components in the curated and inferred knowledge graph.
Affiliation(s)
- Anne E. Thessen
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
| | - Laurel Cooper
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States
| | - Tyson L. Swetnam
- BIO5 Institute, University of Arizona, Tucson, AZ, United States
| | - Harshad Hegde
- Environmental Genomics and Systems Biology Division, Berkeley Lab (DOE), Berkeley, CA, United States
| | - Justin Reese
- Environmental Genomics and Systems Biology Division, Berkeley Lab (DOE), Berkeley, CA, United States
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States
|
20
|
Badonyi M, Marsh JA. Buffering of genetic dominance by allele-specific protein complex assembly. Science Advances 2023; 9:eadf9845. [PMID: 37256959 PMCID: PMC10413657 DOI: 10.1126/sciadv.adf9845] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/02/2023] [Accepted: 04/24/2023] [Indexed: 06/02/2023]
Abstract
Protein complex assembly often occurs while subunits are being translated, resulting in complexes whose subunits were translated from the same mRNA in an allele-specific manner. It has thus been hypothesized that such cotranslational assembly may counter the assembly-mediated dominant-negative effect, whereby co-assembly of mutant and wild-type subunits "poisons" complex activity. Here, we show that cotranslationally assembling subunits are much less likely to be associated with autosomal dominant relative to recessive disorders, and that subunits with dominant-negative disease mutations are significantly depleted in cotranslational assembly compared to those associated with loss-of-function mutations. We also find that complexes with known dominant-negative effects tend to expose their interfaces late during translation, lessening the likelihood of cotranslational assembly. Finally, by combining complex properties with other features, we trained a computational model for predicting proteins likely to be associated with non-loss-of-function disease mechanisms, which we believe will be of considerable utility for protein variant interpretation.
Affiliation(s)
- Mihaly Badonyi
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
|
21
|
Hallowell N, Badger S, McKay F, Kerasidou A, Nellåker C. Democratising or disrupting diagnosis? Ethical issues raised by the use of AI tools for rare disease diagnosis. SSM. Qualitative Research in Health 2023; 3:100240. [PMID: 37426704 PMCID: PMC10323712 DOI: 10.1016/j.ssmqr.2023.100240] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/12/2022] [Revised: 02/13/2023] [Accepted: 02/13/2023] [Indexed: 07/11/2023]
Abstract
Computational phenotyping (CP) technology uses facial recognition algorithms to classify and potentially diagnose rare genetic disorders on the basis of digitised facial images. This AI technology has a number of research as well as clinical applications, such as supporting diagnostic decision-making. Using the example of CP, we examine stakeholders' views of the benefits and costs of using AI as a diagnostic tool within the clinic. Through a series of in-depth interviews (n = 20) with clinicians, clinical researchers, data scientists, and industry and support group representatives, we report stakeholder views regarding the adoption of this technology in a clinical setting. While most interviewees were supportive of employing CP as a diagnostic tool in some capacity, we observed ambivalence around the potential for artificial intelligence to overcome diagnostic uncertainty in a clinical context. Thus, while there was widespread agreement amongst interviewees concerning the public benefits of AI-assisted diagnosis, namely its potential to increase diagnostic yield and enable faster, more objective and accurate diagnoses by upskilling non-specialists, thereby enabling access to diagnosis that is potentially lacking, interviewees also raised concerns about ensuring algorithmic reliability, expunging algorithmic bias, and the possibility that the use of AI could result in deskilling the specialist clinical workforce. We conclude that, prior to widespread clinical implementation, ongoing reflection is needed regarding the trade-offs required to determine acceptable levels of bias, and that diagnostic AI tools should only be employed as an assistive technology within the dysmorphology clinic.
Affiliation(s)
- Nina Hallowell
- The Ethox Centre and Wellcome Centre for Ethics & Humanities, Nuffield Department of Population Health and Big Data Institute, University of Oxford, UK
| | | | - Francis McKay
- The Ethox Centre and Wellcome Centre for Ethics & Humanities, Nuffield Department of Population Health and Big Data Institute, University of Oxford, UK
| | - Angeliki Kerasidou
- The Ethox Centre and Wellcome Centre for Ethics & Humanities, Nuffield Department of Population Health and Big Data Institute, University of Oxford, UK
| | - Christoffer Nellåker
- Nuffield Department of Women's and Reproductive Health and Big Data Institute, University of Oxford, UK
|
22
|
Yang J, Shu L, Duan H, Li H. A Robust Phenotype-driven Likelihood Ratio Analysis Approach Assisting Interpretable Clinical Diagnosis of Rare Diseases. J Biomed Inform 2023; 142:104372. [PMID: 37105510 DOI: 10.1016/j.jbi.2023.104372] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/12/2022] [Revised: 02/20/2023] [Accepted: 04/20/2023] [Indexed: 04/29/2023]
Abstract
Phenotype-based prioritization of candidate genes and diseases has become a well-established approach for multi-omics diagnostics of rare diseases. Most current algorithms exploit semantic analysis and probabilistic statistics based on the Human Phenotype Ontology and are commonly superior to naive search methods. However, these algorithms often lack interpretability and do not perform well in real clinical scenarios, because query terms are noisy and imprecise and because individuals may not display all phenotypes of their disease. We present a Phenotype-driven Likelihood Ratio analysis approach (PheLR) assisting interpretable clinical diagnosis of rare diseases. With a likelihood ratio paradigm, PheLR estimates the posterior probability of candidate diseases and how much a phenotypic feature contributes to the prioritization result. Benchmarked using simulated and realistic patients, PheLR shows significant advantages over current approaches and is robust to noise and inaccuracy. To facilitate clinical practice and visualized differential diagnosis, PheLR is implemented as an online web tool (http://phelr.nbscn.org).
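The general likelihood ratio paradigm mentioned in this abstract can be sketched as follows: each observed feature contributes a likelihood ratio, the ratios multiply the prior odds of a candidate disease, and the logarithm of each ratio shows how much that feature moved the result. This is a generic textbook formulation that assumes conditionally independent features; it is not PheLR's actual model, and all frequencies and the prior below are made-up values.

```python
import math
from typing import Dict


def feature_lr(freq_in_disease: float, freq_in_others: float) -> float:
    """Likelihood ratio of an observed feature: P(feature | disease) / P(feature | not disease)."""
    return freq_in_disease / freq_in_others


def posterior_probability(prior: float, likelihood_ratios: Dict[str, float]) -> float:
    """Posterior probability from prior odds multiplied by all feature likelihood ratios."""
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios.values())
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


def contributions(likelihood_ratios: Dict[str, float]) -> Dict[str, float]:
    """Per-feature contribution to the log odds (positive values support the disease)."""
    return {name: math.log(lr) for name, lr in likelihood_ratios.items()}


# Hypothetical features with invented frequencies.
toy_lrs = {
    "feature_1": feature_lr(0.8, 0.05),
    "feature_2": feature_lr(0.5, 0.1),
}
print(posterior_probability(0.01, toy_lrs))
print(contributions(toy_lrs))
```

Working in log odds keeps the per-feature contributions additive, which is what makes this style of scoring easy to explain to a clinician.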
Affiliation(s)
- Jian Yang
- Clinical Data Center, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China; The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
| | - Liqi Shu
- Rhode Island Hospital, Warren Alpert Medical School of Brown University, Rhode Island, USA
| | - Huilong Duan
- The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
| | - Haomin Li
- Clinical Data Center, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China.
|
23
|
Xiao T, Dong X, Lu Y, Zhou W. High-Resolution and Multidimensional Phenotypes Can Complement Genomics Data to Diagnose Diseases in the Neonatal Population. Phenomics (Cham, Switzerland) 2023; 3:204-215. [PMID: 37197647 PMCID: PMC10110825 DOI: 10.1007/s43657-022-00071-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 08/03/2022] [Accepted: 08/08/2022] [Indexed: 05/19/2023]
Abstract
Advances in genomic medicine have greatly improved our understanding of human diseases. However, the phenome is not well understood. High-resolution and multidimensional phenotypes have shed light on the mechanisms underlying neonatal diseases in greater detail and have the potential to optimize clinical strategies. In this review, we first highlight the value of analyzing traditional phenotypes using a data science approach in the neonatal population. We then discuss recent research on high-resolution, multidimensional, and structured phenotypes in neonatal critical diseases. Finally, we briefly introduce current technologies available for the analysis of multidimensional data and the value that can be provided by integrating these data into clinical practice. In summary, a time series of the multidimensional phenome can improve our understanding of disease mechanisms and diagnostic decision-making, stratify patients, and provide clinicians with optimized strategies for therapeutic intervention; however, the available technologies for collecting multidimensional data and the best platform for connecting multiple modalities should be considered.
Affiliation(s)
- Tiantian Xiao
- Division of Neonatology, Children’s Hospital of Fudan University, National Children’s Medical Center, 399 Wanyuan Road, Shanghai, 201102 China
- Department of Neonatology, Chengdu Women’s and Children’s Central Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610000 China
| | - Xinran Dong
- Center for Molecular Medicine, Pediatric Research Institute, Children’s Hospital of Fudan University, National Children’s Medical Center, Shanghai, 201102 China
| | - Yulan Lu
- Center for Molecular Medicine, Pediatric Research Institute, Children’s Hospital of Fudan University, National Children’s Medical Center, Shanghai, 201102 China
| | - Wenhao Zhou
- Division of Neonatology, Children’s Hospital of Fudan University, National Children’s Medical Center, 399 Wanyuan Road, Shanghai, 201102 China
- Center for Molecular Medicine, Pediatric Research Institute, Children’s Hospital of Fudan University, National Children’s Medical Center, Shanghai, 201102 China
|
24
|
Chandak P, Huang K, Zitnik M. Building a knowledge graph to enable precision medicine. Sci Data 2023; 10:67. [PMID: 36732524 PMCID: PMC9893183 DOI: 10.1038/s41597-023-01960-3] [Citation(s) in RCA: 96] [Impact Index Per Article: 48.0] [Received: 05/13/2022] [Accepted: 01/11/2023] [Indexed: 02/04/2023] Open
Abstract
Developing personalized diagnostic strategies and targeted treatments requires a deep understanding of disease biology and the ability to dissect the relationship between molecular and genetic factors and their phenotypic consequences. However, such knowledge is fragmented across publications, non-standardized repositories, and evolving ontologies describing various scales of biological organization between genotypes and clinical phenotypes. Here, we present PrimeKG, a multimodal knowledge graph for precision medicine analyses. PrimeKG integrates 20 high-quality resources to describe 17,080 diseases with 4,050,249 relationships representing ten major biological scales, including disease-associated protein perturbations, biological processes and pathways, anatomical and phenotypic scales, and the entire range of approved drugs with their therapeutic action, considerably expanding previous efforts in disease-rooted knowledge graphs. PrimeKG contains an abundance of 'indications', 'contraindications', and 'off-label use' drug-disease edges that are lacking in other knowledge graphs and can support AI analyses of how drugs affect disease-associated networks. We supplement PrimeKG's graph structure with language descriptions of clinical guidelines to enable multimodal analyses and provide instructions for continual updates of PrimeKG as new data become available.
Affiliation(s)
- Payal Chandak
- Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA, 02139, USA
| | - Kexin Huang
- Department of Computer Science, Stanford University, Stanford, CA, 94305, USA
| | - Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Harvard University, Boston, MA, 02115, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
- Harvard Data Science Initiative, Cambridge, MA, 02138, USA.
|
25
|
Affiliation(s)
- Xing Chen
- Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou 221116, China; School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
| | - Li Huang
- The Future Laboratory, Tsinghua University, Beijing 100084, China
|
26
|
Papakonstantinou E, Efthymiou V, Dragoumani K, Christodoulou M, Vlachakis D. Collaborative Platforms and Matchmaking Algorithms for Research and Education, Establishment, and Optimization of Consortia. Advances in Experimental Medicine and Biology 2023; 1424:125-133. [PMID: 37486486 DOI: 10.1007/978-3-031-31982-2_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2023]
Abstract
Matchmaking plays an important role in the rational allocation of resources in several fields, ranging from market operation to people's daily lives. Matchmakers have evolved through artificial intelligence technologies and are being introduced in numerous aspects of industry, research, and academia to solve decision problems, support research innovation design, and build robust and efficient networks. The goal of this report is to describe the collaborative platforms and matchmaking algorithms for research and education, as well as the establishment and optimization of consortia.
Affiliation(s)
- Eleni Papakonstantinou
- Department of Biotechnology, Laboratory of Genetics, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
| | - Vasiliki Efthymiou
- University Research Institute of Maternal and Child Health & Precision Medicine, National and Kapodistrian University of Athens, "Aghia Sophia" Children's Hospital, Athens, Greece
| | - Konstantina Dragoumani
- Department of Biotechnology, Laboratory of Genetics, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
| | - Maria Christodoulou
- Department of Biotechnology, Laboratory of Genetics, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
| | - Dimitrios Vlachakis
- Department of Biotechnology, Laboratory of Genetics, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece.
- University Research Institute of Maternal and Child Health & Precision Medicine, National and Kapodistrian University of Athens, "Aghia Sophia" Children's Hospital, Athens, Greece.
- Division of Endocrinology and Metabolism, Center of Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, Athens, Greece.
|
27
|
Cisterna A, González-Vidal A, Ruiz D, Ortiz J, Gómez-Pascual A, Chen Z, Nalls M, Faghri F, Hardy J, Díez I, Maietta P, Álvarez S, Ryten M, Botía JA. PhenoExam: gene set analyses through integration of different phenotype databases. BMC Bioinformatics 2022; 23:567. [PMID: 36587217 PMCID: PMC9805686 DOI: 10.1186/s12859-022-05122-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 12/22/2022] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Gene set enrichment analysis (detecting phenotypic terms that emerge as significant in a set of genes) plays an important role in bioinformatics focused on diseases of genetic basis. To facilitate phenotype-oriented gene set analysis, we developed PhenoExam, a freely available R package for tool developers and a web interface for users, which performs: (1) phenotype and disease enrichment analysis on a gene set; (2) measures statistically significant phenotype similarities between gene sets and (3) detects significant differential phenotypes or disease terms across different databases. RESULTS PhenoExam generates sensitive and accurate phenotype enrichment analyses. It is also effective in segregating gene sets or Mendelian diseases with very similar phenotypes. We tested the tool with two similar diseases (Parkinson and dystonia), to show phenotype-level similarities but also potentially interesting differences. Moreover, we used PhenoExam to validate computationally predicted new genes potentially associated with epilepsy. CONCLUSIONS We developed PhenoExam, a freely available R package and Web application, which performs phenotype enrichment and disease enrichment analysis on gene set G, measures statistically significant phenotype similarities between pairs of gene sets G and G' and detects statistically significant exclusive phenotypes or disease terms, across different databases. We proved with simulations and real cases that it is useful to distinguish between gene sets or diseases with very similar phenotypes. Github R package URL is https://github.com/alexcis95/PhenoExam . Shiny App URL is https://alejandrocisterna.shinyapps.io/phenoexamweb/ .
Affiliation(s)
- Alejandro Cisterna
- Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain
| | - Aurora González-Vidal
- Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain
| | - Daniel Ruiz
- Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain
| | - Jordi Ortiz
- Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain
| | - Alicia Gómez-Pascual
- Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain
| | - Zhongbo Chen
- Department of Neurodegenerative Disease, UCL, Institute of Neurology, London, UK
| | - Mike Nalls
- Data Tecnica International LLC, Glen Echo, MD, USA
- Laboratory of Neurogenetics, NIA/NIH, Bethesda, MD, USA
- Center for Alzheimer's and Related Dementias, NIH, Bethesda, MD, USA
| | - Faraz Faghri
- Data Tecnica International LLC, Glen Echo, MD, USA
- Laboratory of Neurogenetics, NIA/NIH, Bethesda, MD, USA
- Center for Alzheimer's and Related Dementias, NIH, Bethesda, MD, USA
| | - John Hardy
- Department of Neurodegenerative Disease, UCL, Institute of Neurology, London, UK
- Reta Lila Weston Institute, UCL Queen Square Institute of Neurology, London, UK
- UCL Movement Disorders Centre, University College London, London, UK
- Institute for Advanced Study, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Irene Díez
- NIMGenetics Genómica y Medicina S.L, Madrid, Spain
| | | | - Sara Álvarez
- NIMGenetics Genómica y Medicina S.L, Madrid, Spain
| | - Mina Ryten
- Department of Neurodegenerative Disease, UCL, Institute of Neurology, London, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, University College London, London, UK
- Genetics and Genomic Medicine, Great Ormond Street Institute of Child Health, University College London, London, WC1E 6BT, UK
| | - Juan A Botía
- Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain.
- Department of Neurodegenerative Disease, UCL, Institute of Neurology, London, UK.
|
28
|
Phenotype-aware prioritisation of rare Mendelian disease variants. Trends Genet 2022; 38:1271-1283. [PMID: 35934592 PMCID: PMC9950798 DOI: 10.1016/j.tig.2022.07.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/31/2022] [Revised: 06/06/2022] [Accepted: 07/05/2022] [Indexed: 01/24/2023]
Abstract
A molecular diagnosis from the analysis of sequencing data in rare Mendelian diseases has a huge impact on the management of patients and their families. Numerous patient phenotype-aware variant prioritisation (VP) tools have been developed to help automate this process, and shorten the diagnostic odyssey, but performance statistics on real patient data are limited. Here we identify, assess, and compare the performance of all up-to-date, freely available, and programmatically accessible tools using a whole-exome, retinal disease dataset from 134 individuals with a molecular diagnosis. All tools were able to identify around two-thirds of the genetic diagnoses as the top-ranked candidate, with LIRICAL performing best overall. Finally, we discuss the challenges that must be overcome, given that most cases remain undiagnosed after current state-of-the-art practices.
|
29
|
Yu H, Zhang G, Yu S, Wu W. Wiedemann-Steiner Syndrome: Case Report and Review of Literature. CHILDREN (BASEL, SWITZERLAND) 2022; 9:children9101545. [PMID: 36291481 PMCID: PMC9600770 DOI: 10.3390/children9101545] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/28/2022] [Revised: 10/04/2022] [Accepted: 10/08/2022] [Indexed: 11/07/2022]
Abstract
Wiedemann–Steiner syndrome (WDSTS) is an autosomal dominant disorder with a broad and variable phenotypic spectrum characterized by intellectual disability, prenatal and postnatal growth retardation, hypertrichosis, characteristic facial features, behavioral problems, and congenital anomalies involving different systems. Here, we report a five-year-old boy who was diagnosed with WDSTS based on the results of Trio-based whole-exome sequencing and an assessment of his clinical features. He had intellectual disability, short stature, hirsutism, and atypical facial features, including a low hairline, down-slanting palpebral fissures, hypertelorism, long eyelashes, broad and arching eyebrows, synophrys, a bulbous nose, a broad nasal tip, and dental/oral anomalies. However, not all individuals with WDSTS exhibit the classic phenotype, so the spectrum of the disorder can vary widely from relatively atypical facial features to multiple systemic symptoms. Here, we summarize the clinical and molecular spectrum, diagnosis and differential diagnosis, long-term management, and care planning of WDSTS to improve the awareness of both pediatricians and clinical geneticists and to promote the diagnosis and treatment of the disease.
|
30
|
Costantini A, Mäkitie RE, Hartmann MA, Fratzl-Zelman N, Zillikens MC, Kornak U, Søe K, Mäkitie O. Early-Onset Osteoporosis: Rare Monogenic Forms Elucidate the Complexity of Disease Pathogenesis Beyond Type I Collagen. J Bone Miner Res 2022; 37:1623-1641. [PMID: 35949115 PMCID: PMC9542053 DOI: 10.1002/jbmr.4668] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/12/2022] [Revised: 07/22/2022] [Accepted: 08/01/2022] [Indexed: 12/05/2022]
Abstract
Early-onset osteoporosis (EOOP), characterized by low bone mineral density (BMD) and fractures, affects children, premenopausal women and men aged <50 years. EOOP may be secondary to a chronic illness, long-term medication, nutritional deficiencies, etc. If no such cause is identified, EOOP is regarded primary and may then be related to rare variants in genes playing a pivotal role in bone homeostasis. If the cause remains unknown, EOOP is considered idiopathic. The scope of this review is to guide through clinical and genetic diagnostics of EOOP, summarize the present knowledge on rare monogenic forms of EOOP, and describe how analysis of bone biopsy samples can lead to a better understanding of the disease pathogenesis. The diagnostic pathway of EOOP is often complicated and extensive assessments may be needed to reliably exclude secondary causes. Due to the genetic heterogeneity and overlapping features in the various genetic forms of EOOP and other bone fragility disorders, the genetic diagnosis usually requires the use of next-generation sequencing to investigate several genes simultaneously. Recent discoveries have elucidated the complexity of disease pathogenesis both regarding genetic architecture and bone tissue-level pathology. Two rare monogenic forms of EOOP are due to defects in genes partaking in the canonical WNT pathway: LRP5 and WNT1. Variants in the genes encoding plastin-3 (PLS3) and sphingomyelin synthase 2 (SGMS2) have also been found in children and young adults with skeletal fragility. The molecular mechanisms leading from gene defects to clinical manifestations are often not fully understood. Detailed analysis of patient-derived transiliac bone biopsies gives valuable information to understand disease pathogenesis, distinguishes EOOP from other bone fragility disorders, and guides in patient management, but is not widely available in clinical settings. 
Despite the great advances in this field, EOOP remains an insufficiently explored entity and further research is needed to optimize diagnostic and therapeutic approaches. © 2022 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).
Affiliation(s)
- Alice Costantini
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden.,Paris Cité University, INSERM UMR1163, Institut Imagine, Paris, France
| | - Riikka E Mäkitie
- Folkhälsan Institute of Genetics, Helsinki, Finland.,Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Helsinki, Finland.,Department of Otorhinolaryngology-Head and Neck Surgery, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
| | - Markus A Hartmann
- Ludwig Boltzmann Institute of Osteology at Hanusch Hospital of OEGK and AUVA Trauma Centre Meidling, 1st Medical Department Hanusch Hospital, Vienna, Austria.,Vienna Bone and Growth Center, Vienna, Austria
| | - Nadja Fratzl-Zelman
- Ludwig Boltzmann Institute of Osteology at Hanusch Hospital of OEGK and AUVA Trauma Centre Meidling, 1st Medical Department Hanusch Hospital, Vienna, Austria.,Vienna Bone and Growth Center, Vienna, Austria
| | - M Carola Zillikens
- Bone Center, Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands
| | - Uwe Kornak
- Institute of Human Genetics, University Medical Center Göttingen, Göttingen, Germany
| | - Kent Søe
- Clinical Cell Biology, Department of Pathology, Odense University Hospital, Odense, Denmark.,Clinical Cell Biology, Pathology Research Unit, Department of Clinical Research, University of Southern Denmark, Odense, Denmark.,Department of Molecular Medicine, University of Southern Denmark, Odense, Denmark
| | - Outi Mäkitie
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden.,Folkhälsan Institute of Genetics, Helsinki, Finland.,Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Helsinki, Finland.,Children's Hospital and Pediatric Research Center, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
|
31
|
Jacobsen JOB, Kelly C, Cipriani V, Genomics England Research Consortium, Mungall CJ, Reese J, Danis D, Robinson PN, Smedley D. Phenotype-driven approaches to enhance variant prioritization and diagnosis of rare disease. Hum Mutat 2022; 43:1071-1081. [PMID: 35391505 PMCID: PMC9288531 DOI: 10.1002/humu.24380] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 10/13/2021] [Revised: 01/25/2022] [Accepted: 04/03/2022] [Indexed: 11/20/2022]
Abstract
Rare disease diagnostics and disease gene discovery have been revolutionized by whole-exome and genome sequencing but identifying the causative variant(s) from the millions in each individual remains challenging. The use of deep phenotyping of patients and reference genotype-phenotype knowledge, alongside variant data such as allele frequency, segregation, and predicted pathogenicity, has proved an effective strategy to tackle this issue. Here we review the numerous tools that have been developed to automate this approach and demonstrate the power of such an approach on several thousand diagnosed cases from the 100,000 Genomes Project. Finally, we discuss the challenges that need to be overcome if we are going to improve detection rates and help the majority of patients that still remain without a molecular diagnosis after state-of-the-art genomic interpretation.
Affiliation(s)
- Julius O. B. Jacobsen
- William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Catherine Kelly
- William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Valentina Cipriani
- William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | | | - Christopher J. Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Justin Reese
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Daniel Danis
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
| | | | - Damian Smedley
- William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
|
32
|
Stürznickel J, Heider F, Delsmann A, Gödel M, Grünhagen J, Huber TB, Kornak U, Amling M, Oheim R. Clinical Spectrum of Hereditary Hypophosphatemic Rickets With Hypercalciuria (HHRH). J Bone Miner Res 2022; 37:1580-1591. [PMID: 35689455 DOI: 10.1002/jbmr.4630] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/30/2021] [Revised: 05/19/2022] [Accepted: 06/04/2022] [Indexed: 11/11/2022]
Abstract
Hereditary hypophosphatemic rickets with hypercalciuria (HHRH) represents an FGF23-independent disease caused by biallelic variants in the solute carrier family 34-member 3 (SLC34A3) gene. HHRH is characterized by chronic hypophosphatemia and an increased risk for nephrocalcinosis and rickets/osteomalacia, muscular weakness, and secondary limb deformity. Biochemical changes, but no relevant skeletal changes, have been reported for heterozygous SLC34A3 carriers. Therefore, we assessed the characteristics of individuals with biallelic and monoallelic SLC34A3 variants. In 8 index patients and 5 family members, genetic analysis was performed using a custom gene panel. The skeletal assessment comprised biochemical parameters, areal bone mineral density (aBMD), and bone microarchitecture. Pathogenic SLC34A3 variants were revealed in 7 of 13 individuals (2 homozygous, 5 heterozygous), whereas 3 of 13 carried monoallelic variants of unknown significance. Whereas both homozygous individuals had nephrocalcinosis, only one displayed a skeletal phenotype consistent with HHRH. Reduced to low-normal phosphate levels, decreased tubular reabsorption of phosphate (TRP), and high-normal to elevated values of 1,25-(OH)2-D3 accompanied by normal cFGF23 levels were revealed independently of mutational status. Interestingly, individuals with nephrocalcinosis showed significantly increased calcium excretion and 1,25-(OH)2-D3 levels but normal phosphate reabsorption. Furthermore, aBMD Z-score <-2.0 was revealed in 4 of 8 heterozygous carriers, and HR-pQCT analysis showed a moderate decrease in structural parameters. Our findings highlight the clinical relevance also of monoallelic SLC34A3 variants, including their potential skeletal impairment. Calcium excretion and 1,25-(OH)2-D3 levels, but not TRP, were associated with nephrocalcinosis.
Future studies should investigate the effects of distinct SLC34A3 variants and optimize treatment and monitoring regimens to prevent nephrocalcinosis and skeletal deterioration. © 2022 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).
Affiliation(s)
- Julian Stürznickel
- Department of Osteology and Biomechanics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.,Department of Trauma and Orthopaedic Surgery, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Fiona Heider
- Department of Osteology and Biomechanics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Alena Delsmann
- Department of Osteology and Biomechanics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Markus Gödel
- III. Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Johannes Grünhagen
- Labor Berlin Charité Vivantes GmbH-corporate member of Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Tobias B Huber
- III. Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Uwe Kornak
- Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany.,BIH Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany.,Institute of Human Genetics, University Medical Center Göttingen, Göttingen, Germany
| | - Michael Amling
- Department of Osteology and Biomechanics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Ralf Oheim
- Department of Osteology and Biomechanics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.,Martin Zeitz Center for Rare Diseases, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
|
33
|
Havrilla JM, Singaravelu A, Driscoll DM, Minkovsky L, Helbig I, Medne L, Wang K, Krantz I, Desai BR. PheNominal: an EHR-integrated web application for structured deep phenotyping at the point of care. BMC Med Inform Decis Mak 2022; 22:198. [PMID: 35902925 PMCID: PMC9335954 DOI: 10.1186/s12911-022-01927-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/13/2022] [Accepted: 07/06/2022] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND Clinical phenotype information greatly facilitates genetic diagnostic interpretations pipelines in disease. While post-hoc extraction using natural language processing on unstructured clinical notes continues to improve, there is a need to improve point-of-care collection of patient phenotypes. Therefore, we developed "PheNominal", a point-of-care web application, embedded within Epic electronic health record (EHR) workflows, to permit capture of standardized phenotype data. METHODS Using bi-directional web services available within commercial EHRs, we developed a lightweight web application that allows users to rapidly browse and identify relevant terms from the Human Phenotype Ontology (HPO). Selected terms are saved discretely within the patient's EHR, permitting reuse both in clinical notes as well as in downstream diagnostic and research pipelines. RESULTS In the 16 months since implementation, PheNominal was used to capture discrete phenotype data for over 1500 individuals and 11,000 HPO terms during clinic and inpatient encounters for a genetic diagnostic consultation service within a quaternary-care pediatric academic medical center. An average of 7 HPO terms were captured per patient. Compared to a manual workflow, the average time to enter terms for a patient was reduced from 15 to 5 min per patient, and there were fewer annotation errors. CONCLUSIONS Modern EHRs support integration of external applications using application programming interfaces. We describe a practical application of these interfaces to facilitate deep phenotype capture in a discrete, structured format within a busy clinical workflow. Future versions will include a vendor-agnostic implementation using FHIR. We describe pilot efforts to integrate structured phenotyping through controlled dictionaries into diagnostic and research pipelines, reducing manual effort for phenotype documentation and reducing errors in data entry.
Affiliation(s)
- James M. Havrilla
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Anbumalar Singaravelu
- Emerging Technology and Transformation Team, Information Services, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Dennis M. Driscoll
- Emerging Technology and Transformation Team, Information Services, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Leonard Minkovsky
- Emerging Technology and Transformation Team, Information Services, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Ingo Helbig
- Division of Neurology, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children’s Hospital of Philadelphia, Philadelphia, USA; Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Neurology, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Livija Medne
- Roberts Individualized Medical Genetics Center, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Kai Wang
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Ian Krantz
- Roberts Individualized Medical Genetics Center, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Bimal R. Desai
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
|
34
|
Yates T, Lain A, Campbell J, FitzPatrick DR, Simpson TI. Creation and evaluation of full-text literature-derived, feature-weighted disease models of genetically determined developmental disorders. Database (Oxford) 2022; 2022:baac038. [PMID: 35670729 PMCID: PMC9216525 DOI: 10.1093/database/baac038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Revised: 03/26/2022] [Accepted: 05/25/2022] [Indexed: 11/24/2022]
Abstract
There are >2500 different genetically determined developmental disorders (DD), which, as a group, show very high levels of both locus and allelic heterogeneity. This has led to the wide-spread use of evidence-based filtering of genome-wide sequence data as a diagnostic tool in DD. Determining whether the association of a filtered variant at a specific locus is a plausible explanation of the phenotype in the proband is crucial and commonly requires extensive manual literature review by both clinical scientists and clinicians. Access to a database of weighted clinical features extracted from rigorously curated literature would increase the efficiency of this process and facilitate the development of robust phenotypic similarity metrics. However, given the large and rapidly increasing volume of published information, conventional biocuration approaches are becoming impractical. Here, we present a scalable, automated method for the extraction of categorical phenotypic descriptors from the full-text literature. Papers identified through literature review were downloaded and parsed using the Cadmus custom retrieval package. Human Phenotype Ontology terms were extracted using MetaMap, with 76-84% precision and 65-73% recall. Mean terms per paper increased from 9 in title + abstract, to 68 using full text. We demonstrate that these literature-derived disease models plausibly reflect true disease expressivity more accurately than widely used manually curated models, through comparison with prospectively gathered data from the Deciphering Developmental Disorders study. The area under the curve for receiver operating characteristic (ROC) curves increased by 5-10% through the use of literature-derived models. This work shows that scalable automated literature curation increases performance and adds weight to the need for this strategy to be integrated into informatic variant analysis pipelines. Database URL: https://doi.org/10.1093/database/baac038.
Affiliation(s)
- T.M Yates
- MRC Human Genetics Unit, Western General Hospital, Institute of Genetics and Cancer, The University of Edinburgh, Crewe Road South, Edinburgh EH4 2XU, UK
- Transforming Genetic Medicine Initiative, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - A Lain
- Institute for Adaptive and Neural Computation, Informatics Forum, The University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
| | - J Campbell
- MRC Human Genetics Unit, Western General Hospital, Institute of Genetics and Cancer, The University of Edinburgh, Crewe Road South, Edinburgh EH4 2XU, UK
- Simons Initiative for the Developing Brain, The University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XF, UK
| | - D R FitzPatrick
- MRC Human Genetics Unit, Western General Hospital, Institute of Genetics and Cancer, The University of Edinburgh, Crewe Road South, Edinburgh EH4 2XU, UK
- Transforming Genetic Medicine Initiative, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simons Initiative for the Developing Brain, The University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XF, UK
| | - T I Simpson
- Institute for Adaptive and Neural Computation, Informatics Forum, The University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
- Simons Initiative for the Developing Brain, The University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XF, UK
|
35
|
Fang M, Su Z, Abolhassani H, Itan Y, Jin X, Hammarström L. VIPPID: a gene-specific single nucleotide variant pathogenicity prediction tool for primary immunodeficiency diseases. Brief Bioinform 2022; 23:6590436. [PMID: 35598327 PMCID: PMC9487673 DOI: 10.1093/bib/bbac176] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/08/2022] [Revised: 04/05/2022] [Accepted: 04/18/2022] [Indexed: 01/04/2023] Open
Abstract
Distinguishing pathogenic variants from non-pathogenic ones remains a major challenge in clinical genetic testing of primary immunodeficiency (PID) patients. Most of the existing mutation pathogenicity prediction tools treat all mutations as homogeneous entities, ignoring the differences in characteristics of different genes, and use the same model for genes in different diseases. In this study, we developed a single nucleotide variant (SNV) pathogenicity prediction tool, Variant Impact Predictor for PIDs (VIPPID; https://mylab.shinyapps.io/VIPPID/), which was tailored for PIDs genes and used a specific model for each of the most prevalent PID known genes. It employed a Conditional Inference Forest model and utilized information of 85 features of SNVs and scores from 20 existing prediction tools. Evaluation of VIPPID showed that it had superior performance (area under the curve = 0.91) over non-specific conventional tools. In addition, we also showed that the gene-specific model outperformed the non-gene-specific models. Our study demonstrated that disease-specific and gene-specific models can improve SNV pathogenicity prediction performance. This observation supports the notion that each feature of mutations in the model can be potentially used, in a new algorithm, to investigate the characteristics and function of the encoded proteins.
Affiliation(s)
- Mingyan Fang
- BGI-Shenzhen, Shenzhen 518083, China
- Division of Clinical Immunology at the Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden
- BGI-Singapore, Singapore 138567, Singapore
| | - Zheng Su
- School of Biotechnology and Biomolecular Sciences, Faculty of Science, The University of New South Wales, Sydney, New South Wales, Australia
- GenieUs Genomics, 19A Boundary St, Darlinghurst NSW 2010, Australia
| | - Hassan Abolhassani
- Division of Clinical Immunology at the Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden
- Department of Biosciences and Nutrition, NEO, Karolinska Institutet, SE14183 Huddinge, Sweden
| | - Yuval Itan
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Xin Jin
- BGI-Shenzhen, Shenzhen 518083, China
- BGI-Singapore, Singapore 138567, Singapore
| | - Lennart Hammarström
- BGI-Shenzhen, Shenzhen 518083, China
- Division of Clinical Immunology at the Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden
- Department of Biosciences and Nutrition, NEO, Karolinska Institutet, SE14183 Huddinge, Sweden
| |
|
36
|
Yuan X, Zhang P. Revisiting benchmark study for response to methodological critiques of 'Evaluation of phenotype-driven gene prioritization methods for Mendelian diseases'. Brief Bioinform 2022; 23:6580907. [PMID: 35514206 DOI: 10.1093/bib/bbac181] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2022] [Revised: 03/23/2022] [Accepted: 04/22/2022] [Indexed: 01/20/2023] Open
Abstract
Evaluation of phenotype-driven gene prioritization approaches for Mendelian diseases could facilitate software development and method selection for workflow configuration and clinical practice. In our original article, the performance of 10 well-recognized causal-gene prioritization methods was benchmarked using 305 cases from the Deciphering Developmental Disorders (DDD) project and 209 in-house cases via a relatively unbiased methodology. The evaluation results showed that LIRICAL and AMELIE were two of the best methods in our benchmark experiments, and that a possible integrative approach combining these two methods may enhance diagnostic efficiency. However, some methodological critiques were raised by the authors of Exomiser and PhenIX, so we revisited our benchmarking studies to answer their comments in this letter.
|
37
|
Schuler BA, Nelson ET, Koziura M, Cogan JD, Hamid R, Phillips JA. Lessons learned: next-generation sequencing applied to undiagnosed genetic diseases. J Clin Invest 2022; 132:e154942. [PMID: 35362483 PMCID: PMC8970663 DOI: 10.1172/jci154942] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Rare genetic disorders, when considered together, are relatively common. Despite advancements in genetics and genomics technologies as well as increased understanding of genomic function and dysfunction, many genetic diseases continue to be difficult to diagnose. The goal of this Review is to increase the familiarity of genetic testing strategies for non-genetics providers. As genetic testing is increasingly used in primary care, many subspecialty clinics, and various inpatient settings, it is important that non-genetics providers have a fundamental understanding of the strengths and weaknesses of various genetic testing strategies as well as develop an ability to interpret genetic testing results. We provide background on commonly used genetic testing approaches, give examples of phenotypes in which the various genetic testing approaches are used, describe types of genetic and genomic variations, cover challenges in variant identification, provide examples in which next-generation sequencing (NGS) failed to uncover the variant responsible for a disease, and discuss opportunities for continued improvement in the application of NGS clinically. As genetic testing becomes increasingly a part of all areas of medicine, familiarity with genetic testing approaches and result interpretation is vital to decrease the burden of undiagnosed disease.
Affiliation(s)
- Bryce A. Schuler
- Division of Medical Genetics and Genomics and
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Erica T. Nelson
- Division of Medical Genetics and Genomics and
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Mary Koziura
- Division of Medical Genetics and Genomics and
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Joy D. Cogan
- Division of Medical Genetics and Genomics and
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Rizwan Hamid
- Division of Medical Genetics and Genomics and
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - John A. Phillips
- Division of Medical Genetics and Genomics and
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| |
|
38
|
Slavotinek A, Prasad H, Yip T, Rego S, Hoban H, Kvale M. Predicting genes from phenotypes using human phenotype ontology (HPO) terms. Hum Genet 2022; 141:1749-1760. [PMID: 35357580 DOI: 10.1007/s00439-022-02449-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2021] [Accepted: 03/16/2022] [Indexed: 11/28/2022]
Abstract
The interpretation of genomic variants following whole exome sequencing (WES) can be aided using human phenotype ontology (HPO) terms to standardize clinical features and predict causative genes. We performed WES on 453 patients diagnosed prior to 18 years of age and identified 114 pathogenic (P) or likely pathogenic (LP) variants in 112 patients. We utilized PhenoDB to extract HPO terms from provider notes and then used Phen2Gene to generate a gene score and gene ranking from each list of HPO terms. We assigned Phen2Gene gene rankings to 6 rank classes, with class 1 covering raw gene rankings of 1 to 10 and class 2 covering rankings from 11 to 50 out of a total of 17,126 possible gene rankings. Phen2Gene ranked causative genes into rank class 1 or 2 in 27.7% of cases and the genes in rank class 1 were all associated with well-characterized phenotypes. We found significant associations between the gene score and the number of years since the gene was first published, the number of HPO terms with a hierarchical depth greater than or equal to 11, and the number of Online Mendelian Inheritance in Man terms associated with the phenotype and gene. We conclude that genes associated with recognizable phenotypes and terms deep in the HPO hierarchy have the best chance of producing a high gene score and ranking in class 1 to 2 using Phen2Gene software with HPO terms. Clinicians and laboratory staff should consider these results when HPO terms are employed to prioritize candidate genes.
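The rank-class binning described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `rank_class` and `fraction_in_top_classes` are hypothetical, and because the abstract gives bounds only for classes 1 and 2, ranks above 50 are returned as `None` rather than guessed.

```python
from typing import Iterable, Optional

# Total number of possible gene rankings, as stated in the abstract.
TOTAL_RANKS = 17_126


def rank_class(rank: int) -> Optional[int]:
    """Map a raw gene rank to rank class 1 (1-10) or 2 (11-50).

    Classes 3 to 6 exist in the study, but their bounds are not given
    in the abstract, so ranks above 50 return None here.
    """
    if not 1 <= rank <= TOTAL_RANKS:
        raise ValueError(f"rank must be between 1 and {TOTAL_RANKS}")
    if rank <= 10:
        return 1
    if rank <= 50:
        return 2
    return None


def fraction_in_top_classes(ranks: Iterable[int]) -> float:
    """Fraction of cases whose causative gene falls in rank class 1 or 2."""
    ranks = list(ranks)
    hits = sum(1 for r in ranks if rank_class(r) in (1, 2))
    return hits / len(ranks)
```

Applied to the causative-gene rank of each solved case, `fraction_in_top_classes` yields the kind of summary statistic the abstract reports (27.7% of cases in class 1 or 2).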
Affiliation(s)
- Anne Slavotinek
- Division of Genetics, Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA.
| | - Hannah Prasad
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Tiffany Yip
- Division of Genetics, Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | - Shannon Rego
- Division of Genetics, Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | - Hannah Hoban
- Division of Genetics, Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | - Mark Kvale
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| |
|
39
|
Yuan X, Wang J, Dai B, Sun Y, Zhang K, Chen F, Peng Q, Huang Y, Zhang X, Chen J, Xu X, Chuan J, Mu W, Li H, Fang P, Gong Q, Zhang P. Evaluation of phenotype-driven gene prioritization methods for Mendelian diseases. Brief Bioinform 2022; 23:6521702. [PMID: 35134823 PMCID: PMC8921623 DOI: 10.1093/bib/bbac019] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2021] [Revised: 01/10/2022] [Accepted: 01/13/2022] [Indexed: 12/31/2022] Open
Abstract
Identifying disease-causing genes from the next-generation sequencing (NGS) data of patients with Mendelian disorders is challenging. To improve this situation, researchers have developed many phenotype-driven gene prioritization methods that use a patient’s genotype and phenotype information, or phenotype information only, as input to rank candidate pathogenic genes. Evaluations of these ranking methods help practitioners choose an appropriate tool for their workflows, but retrospective benchmarks are underpowered to provide statistically significant results when attempting to differentiate the methods. In this research, the performance of ten recognized causal-gene prioritization methods was benchmarked using 305 cases from the Deciphering Developmental Disorders (DDD) project and 209 in-house cases via a relatively unbiased methodology. The evaluation results show that methods using Human Phenotype Ontology (HPO) terms and Variant Call Format (VCF) files as input achieved better overall performance than those using phenotypic data alone. In addition, LIRICAL and AMELIE, two of the best methods in our benchmark experiments, complement each other in cases with the causal genes ranked highly, suggesting a possible integrative approach to further enhance diagnostic efficiency. Our benchmarking provides valuable reference information for computer-assisted rapid diagnosis of Mendelian diseases and sheds some light on the potential direction of future improvement of disease-causing gene prioritization methods.
Affiliation(s)
- Xiao Yuan
- Changsha KingMed Center for Clinical Laboratory, Changsha, China; Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, China; Genetalks Biotech. Co., Ltd., Changsha, China
| | - Jing Wang
- Changsha KingMed Center for Clinical Laboratory, Changsha, China
| | - Bing Dai
- Changsha KingMed Center for Clinical Laboratory, Changsha, China
| | - Yanfang Sun
- Changsha KingMed Center for Clinical Laboratory, Changsha, China
| | - Keke Zhang
- Changsha KingMed Center for Clinical Laboratory, Changsha, China
| | - Fangfang Chen
- Changsha KingMed Center for Clinical Laboratory, Changsha, China
| | - Qian Peng
- Changsha KingMed Center for Clinical Laboratory, Changsha, China
| | - Yixuan Huang
- Beijing Geneworks Technology Co., Ltd., Beijing, China
| | - Xinlei Zhang
- Reproductive & Genetics Hospital of Citic & Xiangya, Changsha, China
| | - Junru Chen
- Genetalks Biotech. Co., Ltd., Changsha, China
| | - Xilin Xu
- Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, China
| | - Jun Chuan
- Changsha KingMed Center for Clinical Laboratory, Changsha, China; Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, China
| | - Wenbo Mu
- Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, China
| | - Huiyuan Li
- Changsha KingMed Center for Clinical Laboratory, Changsha, China; Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, China
| | - Ping Fang
- Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, China
| | - Qiang Gong
- Changsha KingMed Center for Clinical Laboratory, Changsha, China; Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, China
| | - Peng Zhang
- Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
| |
|
40
|
Sharo AG, Hu Z, Sunyaev SR, Brenner SE. StrVCTVRE: A supervised learning method to predict the pathogenicity of human genome structural variants. Am J Hum Genet 2022; 109:195-209. [PMID: 35032432 PMCID: PMC8874149 DOI: 10.1016/j.ajhg.2021.12.007] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2021] [Accepted: 12/09/2021] [Indexed: 12/12/2022] Open
Abstract
Whole-genome sequencing resolves many clinical cases where standard diagnostic methods have failed. However, at least half of these cases remain unresolved after whole-genome sequencing. Structural variants (SVs; genomic variants larger than 50 base pairs) of uncertain significance are the genetic cause of a portion of these unresolved cases. As sequencing methods using long or linked reads become more accessible and SV detection algorithms improve, clinicians and researchers are gaining access to thousands of reliable SVs of unknown disease relevance. Methods to predict the pathogenicity of these SVs are required to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE to distinguish pathogenic SVs from benign SVs that overlap exons. In a random forest classifier, we integrated features that capture gene importance, coding region, conservation, expression, and exon structure. We found that features such as expression and conservation are important but are absent from SV classification guidelines. We leveraged multiple resources to construct a size-matched training set of rare, putatively benign and pathogenic SVs. StrVCTVRE performs accurately across a wide SV size range on independent test sets, which will allow clinicians and researchers to eliminate about half of SVs from consideration while retaining a 90% sensitivity. We anticipate clinicians and researchers will use StrVCTVRE to prioritize SVs in probands where no SV is immediately compelling, empowering deeper investigation into novel SVs to resolve cases and understand new mechanisms of disease. StrVCTVRE runs rapidly and is publicly available.
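The trade-off stated in this abstract (discarding about half of SVs while retaining 90% sensitivity) amounts to choosing a score cutoff at a target sensitivity. The sketch below illustrates that idea only; it is not StrVCTVRE's code or API. The function names are hypothetical, and it assumes that higher scores mean more likely pathogenic.

```python
import math
from typing import Sequence


def threshold_for_sensitivity(pathogenic_scores: Sequence[float],
                              target: float = 0.90) -> float:
    """Highest cutoff that keeps at least `target` of pathogenic scores.

    Assumes higher score = more likely pathogenic, and that a variant is
    retained when its score is greater than or equal to the cutoff.
    """
    if not pathogenic_scores:
        raise ValueError("need at least one pathogenic score")
    ordered = sorted(pathogenic_scores, reverse=True)
    needed = math.ceil(target * len(ordered))  # variants that must be kept
    return ordered[needed - 1]


def sensitivity(pathogenic_scores: Sequence[float], cutoff: float) -> float:
    """Fraction of known pathogenic variants retained at this cutoff."""
    kept = sum(1 for s in pathogenic_scores if s >= cutoff)
    return kept / len(pathogenic_scores)


def fraction_eliminated(all_scores: Sequence[float], cutoff: float) -> float:
    """Fraction of all scored variants that fall below the cutoff."""
    dropped = sum(1 for s in all_scores if s < cutoff)
    return dropped / len(all_scores)
```

In practice the cutoff would be chosen on a labeled set of pathogenic variants and then applied to the unlabeled SVs of a proband; `fraction_eliminated` reports how much of the candidate list that cutoff removes.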
Affiliation(s)
- Andrew G Sharo
- Biophysics Graduate Group, University of California, Berkeley, Berkeley, CA 94720, USA; Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
| | - Zhiqiang Hu
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA; Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Shamil R Sunyaev
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Steven E Brenner
- Biophysics Graduate Group, University of California, Berkeley, Berkeley, CA 94720, USA; Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA; Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
| |
|
41
|
Türkyilmaz A, Sager SG, Topcu B, Kaplan AT, Günbey HP, Akin Y. Novel SH3PXD2B variant identified by whole-exome sequencing in a Turkish newborn with Frank-Ter Haar Syndrome. Clin Dysmorphol 2022; 31:45-49. [PMID: 34538861 DOI: 10.1097/mcd.0000000000000389] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Affiliation(s)
- Ayberk Türkyilmaz
- Department of Medical Genetics, Karadeniz Technical University Faculty of Medicine, Trabzon
| | | | | | | | | | - Yasemin Akin
- Department of Pediatrics, Kartal Dr. Lutfi Kirdar City Hospital, Istanbul, Turkey
| |
|
42
|
Iwanicka-Pronicka K, Trubicka J, Szymanska E, Ciara E, Rokicki D, Pollak A, Pronicki M. Sensorineural hearing loss in GSD type I patients. A newly recognized symptomatic association of potential clinical significance and unclear pathomechanism. Int J Pediatr Otorhinolaryngol 2021; 151:110970. [PMID: 34775139 DOI: 10.1016/j.ijporl.2021.110970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/04/2020] [Revised: 10/15/2021] [Accepted: 11/08/2021] [Indexed: 10/19/2022]
Abstract
OBJECTIVE Glycogen storage disease (GSD) type I is an inborn error of carbohydrate metabolism characterized by inability to convert glucose-6-phosphate to glucose. It presents with serious liver and metabolic complications and, in type Ib, with severe infections due to neutropenia. So far, sensorineural hearing impairment has not been reported in these patients. Bilateral, sensorineural hearing impairment was diagnosed in four unrelated GSDI patients. Congenital origin of hearing loss and descending audiometric curves warranted the need for future investigations. METHODS Hearing status was assessed in the entire group of 40 children with GSD type I. Then, molecular testing by massively parallel sequencing was performed in the four probands and their parents to find a possible genetic background of auditory dysfunction in these patients. RESULTS Pathogenic variants in G6PC and SLC37A4 related to the phenotypes of GSDI subtype Ia and subtype Ib were detected, each in two probands, respectively. No change in the genes involved in auditory pathway dysfunction was found. CONCLUSIONS Sensorineural hearing loss appears to be associated with GSDI in approximately one out of ten cases. Careful assessment and monitoring of auditory functions of patients with GSDI is recommended.
Affiliation(s)
- Katarzyna Iwanicka-Pronicka
- Department of Audiology and Phoniatrics, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland; Department of Medical Genetics, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland.
| | - Joanna Trubicka
- Department of Medical Genetics, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland; Department of Pathology, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland
| | - Edyta Szymanska
- Department of Gastroenterology, Hepatology, Feeding Disorders and Pediatrics, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland; Department of Pediatrics, Nutrition and Metabolic Diseases, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland
| | - Elżbieta Ciara
- Department of Medical Genetics, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland
| | - Dariusz Rokicki
- Department of Pediatrics, Nutrition and Metabolic Diseases, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland
| | - Agnieszka Pollak
- Department of Medical Genetics, Medical University of Warsaw, A. Pawinskiego 3c, 02-106, Warsaw, Poland
| | - Maciej Pronicki
- Department of Pathology, The Children's Memorial Health Institute, Al. Dzieci Polskich 20; 04-730, Warsaw, Poland
| |
|
43
|
Yang J, Shu L, Duan H, Li H. A Visual Phenotype-Based Differential Diagnosis Process for Rare Diseases. Interdiscip Sci 2021; 14:331-348. [PMID: 34751921 DOI: 10.1007/s12539-021-00490-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2021] [Revised: 10/23/2021] [Accepted: 10/28/2021] [Indexed: 02/01/2023]
Abstract
PURPOSE Phenotype-based rapid diagnosis can make up for the time-consuming genetic sequencing diagnosis of rare diseases. However, the collected phenotypes of patients can sometimes be inaccurate or incomplete, which limits the accuracy of diagnostic results. To solve this problem, we try to design a phenotype-based differential diagnosis process for rare diseases to achieve rapid and accurate diagnosis of rare diseases. METHODS The core of the differential diagnosis of rare diseases is to optimize the phenotype information of a specific patient and the visualized comparative analysis of diseases. To recommend additional phenotypes, replace the fuzzy phenotypes and filter the unexplained phenotypes for patients, we constructed a phenotype hierarchical network and a disease-phenotype differential network and calculated the phenotype co-occurrence relationship. In addition, we designed a visual comparative analysis method to explore the correlation and difference of disease phenotypes. RESULTS The evaluation based on the published 10 rare disease cases demonstrated that after the optimization of patient phenotype information through our differential diagnosis, the target disease often got a better ranking and recommendation score than before. We have deployed this scheme on the RDmap project ( http://rdmap.nbscn.org ). CONCLUSION Compared to genetic and molecular analysis, phenotype-based diagnosis is faster, cheaper, and easier. The differential diagnosis process we designed can optimize the phenotype information of patients and better locate the target disease. It can also help to make screening decisions before genetic testing.
Affiliation(s)
- Jian Yang
- The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binsheng Road 3333#, Hangzhou, 310052, Zhejiang, China; The College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang, China
| | - Liqi Shu
- Rhode Island Hospital, Warren Alpert Medical School of Brown University, Rhode Island, USA
| | - Huilong Duan
- The College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang, China
| | - Haomin Li
- The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binsheng Road 3333#, Hangzhou, 310052, Zhejiang, China.
| |
|
44
|
Liu H, Hou L, Xu S, Li H, Chen X, Gao J, Wang Z, Han B, Liu X, Wan S. Discovering Cerebral Ischemic Stroke Associated Genes Based on Network Representation Learning. Front Genet 2021; 12:728333. [PMID: 34539754 PMCID: PMC8442767 DOI: 10.3389/fgene.2021.728333] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2021] [Accepted: 07/26/2021] [Indexed: 11/13/2022] Open
Abstract
Cerebral ischemic stroke (IS) is a complex disease caused by multiple factors including vascular risk factors, genetic factors, and environment factors, which accentuates the difficulty in discovering corresponding disease-related genes. Identifying the genes associated with IS is critical for understanding the biological mechanism of IS, which would be significantly beneficial to the diagnosis and clinical treatment of cerebral IS. However, existing methods to predict IS-related genes are mainly based on the hypothesis of guilt-by-association (GBA). These methods cannot capture the global structure information of the whole protein-protein interaction (PPI) network. Inspired by the success of network representation learning (NRL) in the field of network analysis, we apply NRL to the discovery of disease-related genes and launch the framework to identify the disease-related genes of cerebral IS. The utilized framework contains three main parts: capturing the topological information of the PPI network with NRL, denoising the gene feature with the participation of a stacked autoencoder (SAE), and optimizing a support vector machine (SVM) classifier to identify IS-related genes. Superior to the existing methods on IS-related gene prediction, our framework presents more accurate results. The case study also shows that the proposed method can identify IS-related genes.
Affiliation(s)
- Haijie Liu
- Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China
| | - Liping Hou
- Department of Clinical Laboratory, General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin, China
| | - Shanhu Xu
- Affiliated Zhejiang Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - He Li
- Department of Automation, College of Information Science and Engineering, Tianjin Tianshi College, Tianjin, China
| | - Xiuju Chen
- Department of Neurology, Tianjin Nankai Hospital, Tianjin, China
| | - Juan Gao
- Department of Neurology, Baoding No. 1 Central Hospital, Baoding, China
| | - Ziwen Wang
- Graduate School of Chengde Medical College, Chengde, China
| | - Bo Han
- Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China
| | - Xiaoli Liu
- Affiliated Zhejiang Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Shu Wan
- Affiliated Zhejiang Hospital, Zhejiang University School of Medicine, Hangzhou, China
| |
|
45
|
Umlai UKI, Bangarusamy DK, Estivill X, Jithesh PV. Genome sequencing data analysis for rare disease gene discovery. Brief Bioinform 2021; 23:6366880. [PMID: 34498682 DOI: 10.1093/bib/bbab363] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2021] [Revised: 07/24/2021] [Accepted: 08/17/2021] [Indexed: 12/14/2022] Open
Abstract
Rare diseases affect a small proportion of the general population, variously defined as fewer than 200 000 individuals (US) or fewer than 1 in 2000 individuals (Europe). Although individually rare, they collectively comprise approximately 7000 different disorders, the majority of genetic origin, and affect roughly 300 million people globally. Most of the patients and their families undergo a long and frustrating diagnostic odyssey. However, advances in the field of genomics have started to facilitate the process of diagnosis, though it is hindered by the difficulty of genome data analysis and interpretation. A major impediment to diagnosis is understanding the diverse approaches, tools and datasets available for variant prioritization, the most important step in the analysis of millions of variants to select a few potential variants. Here we present a review of the latest methodological developments and spectrum of tools available for rare disease genetic variant discovery and recommend appropriate data interpretation methods for variant prioritization. We have categorized the resources based on various steps of the variant interpretation workflow, starting from data processing, variant calling, annotation, filtration and finally prioritization, with a special emphasis on the last two steps. The methods discussed here pertain to elucidating the genetic basis of disease in individual patient cases via trio- or family-based analysis of the genome data. We advocate using a combination of tools and datasets and following multiple iterative approaches to elucidate the potential causative variant.
Affiliation(s)
- Umm-Kulthum Ismail Umlai
- Division of Genomics & Translational Biomedicine, College of Health & Life Sciences, Hamad Bin Khalifa University, B-147, Penrose House, PO Box 34110, Education City, Doha, Qatar
| | - Dhinoth Kumar Bangarusamy
- Division of Genomics & Translational Biomedicine, College of Health & Life Sciences, Hamad Bin Khalifa University, B-147, Penrose House, PO Box 34110, Education City, Doha, Qatar
| | - Xavier Estivill
- Quantitative Genomics Laboratories (qGenomics), Barcelona, Catalonia, Spain
| | - Puthen Veettil Jithesh
- Division of Genomics & Translational Biomedicine, College of Health & Life Sciences, Hamad Bin Khalifa University, B-147, Penrose House, PO Box 34110, Education City, Doha, Qatar
| |
|
46
|
Holmgren SD, Boyles RR, Cronk RD, Duncan CG, Kwok RK, Lunn RM, Osborn KC, Thessen AE, Schmitt CP. Catalyzing Knowledge-Driven Discovery in Environmental Health Sciences through a Community-Driven Harmonized Language. Int J Environ Res Public Health 2021; 18:8985. [PMID: 34501574 PMCID: PMC8430534 DOI: 10.3390/ijerph18178985] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/07/2021] [Revised: 08/13/2021] [Accepted: 08/19/2021] [Indexed: 01/10/2023]
Abstract
Harmonized language is critical for helping researchers to find data, collecting scientific data to facilitate comparison, and performing pooled and meta-analyses. Using standard terms to link data to knowledge systems facilitates knowledge-driven analysis, allows for the use of biomedical knowledge bases for scientific interpretation and hypothesis generation, and increasingly supports artificial intelligence (AI) and machine learning. Due to the breadth of environmental health sciences (EHS) research and the continuous evolution in scientific methods, the gaps in standard terminologies, vocabularies, ontologies, and related tools hamper the capabilities to address large-scale, complex EHS research questions that require the integration of disparate data and knowledge sources. The results of prior workshops to advance a harmonized environmental health language demonstrate that future efforts should be sustained and grounded in scientific need. We describe a community initiative whose mission was to advance integrative environmental health sciences research via the development and adoption of a harmonized language. The products, outcomes, and recommendations developed and endorsed by this community are expected to enhance data collection and management efforts for NIEHS and the EHS community, making data more findable and interoperable. This initiative will provide a community of practice space to exchange information and expertise, be a coordination hub for identifying and prioritizing activities, and a collaboration platform for the development and adoption of semantic solutions. We encourage anyone interested in advancing this mission to engage in this community.
Affiliation(s)
- Stephanie D. Holmgren
- Office of Data Science, National Institute of Environmental Health Sciences (NIEHS), Durham, NC 27709, USA;
| | | | | | - Christopher G. Duncan
- Genes, Environment, and Health Branch, Division of Extramural Research and Training, NIEHS, Durham, NC 27709, USA;
| | - Richard K. Kwok
- Epidemiology Branch, Division of Intramural Research, NIEHS, Durham, NC 27709, USA;
- Office of the Director, NIEHS, Bethesda, MD 20892, USA
| | - Ruth M. Lunn
- Integrative Health Assessment Branch, Division of the National Toxicology Program, NIEHS, Durham, NC 27709, USA;
| | | | - Anne E. Thessen
- Environmental and Molecular Toxicology Department, Oregon State University, Corvallis, OR 97331, USA;
| | - Charles P. Schmitt
- Office of Data Science, National Institute of Environmental Health Sciences (NIEHS), Durham, NC 27709, USA;
| |
|
47
|
Kafkas Ş, Althubaiti S, Gkoutos GV, Hoehndorf R, Schofield PN. Linking common human diseases to their phenotypes; development of a resource for human phenomics. J Biomed Semantics 2021; 12:17. [PMID: 34425897 PMCID: PMC8383460 DOI: 10.1186/s13326-021-00249-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2021] [Accepted: 07/30/2021] [Indexed: 11/11/2022] Open
Abstract
Background: In recent years a large volume of clinical genomics data has become available due to rapid advances in sequencing technologies. Efficient exploitation of this genomics data requires linkage to patient phenotype profiles. Current resources providing disease-phenotype associations are not comprehensive, and they often do not have broad coverage of the disease terminologies, particularly ICD-10, which is still the primary terminology used in clinical settings.
Methods: We developed two approaches to gather disease-phenotype associations. First, we used a text mining method that utilizes semantic relations in phenotype ontologies, and applies statistical methods to extract associations between diseases in ICD-10 and phenotype ontology classes from the literature. Second, we developed a semi-automatic way to collect ICD-10–phenotype associations from existing resources containing known relationships.
Results: We generated four datasets. Two of them are independent datasets linking diseases to their phenotypes based on text mining and semi-automatic strategies. The remaining two datasets are generated from these datasets and cover a subset of ICD-10 classes of common diseases contained in UK Biobank. We extensively validated our text mined and semi-automatically curated datasets by: comparing them against an expert-curated validation dataset containing disease–phenotype associations, measuring their similarity to disease–phenotype associations found in public databases, and assessing how well they could be used to recover gene–disease associations using phenotype similarity.
Conclusion: We find that our text mining method can produce phenotype annotations of diseases that are correct but often too general to have significant information content, or too specific to accurately reflect the typical manifestations of the sporadic disease. On the other hand, the datasets generated from integrating multiple knowledgebases are more complete (i.e., cover more of the required phenotype annotations for a given disease). We make all data freely available at 10.5281/zenodo.4726713.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13326-021-00249-x.
Affiliation(s)
- Şenay Kafkas
  - Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955, Saudi Arabia
- Sara Althubaiti
  - Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955, Saudi Arabia
- Georgios V Gkoutos
  - Health Data Research UK, Midlands site, Edgbaston, Birmingham, B15 2TT, United Kingdom
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom
- Robert Hoehndorf
  - Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955, Saudi Arabia
- Paul N Schofield
  - Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, Cambridge, CB2 3EG, United Kingdom
48
Clinical Phenotypic Spectrum of 4095 Individuals with Down Syndrome from Text Mining of Electronic Health Records. Genes (Basel) 2021; 12:1159. [PMID: 34440331 PMCID: PMC8393657 DOI: 10.3390/genes12081159] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/21/2021] [Revised: 07/25/2021] [Accepted: 07/26/2021] [Indexed: 12/30/2022] Open
Abstract
Human genetic disorders, such as Down syndrome, have a wide variety of clinical phenotypic presentations, and characterizing each nuanced phenotype and subtype can be difficult. In this study, we examined the electronic health records of 4095 individuals with Down syndrome at the Children’s Hospital of Philadelphia to create a method to characterize the phenotypic spectrum digitally. We extracted Human Phenotype Ontology (HPO) terms from quality-filtered patient notes using MetaMap, a natural language processing (NLP) tool. We catalogued the most common HPO terms related to Down syndrome patients and compared the terms with those from a baseline population. We characterized the top 100 HPO terms by their frequencies at different ages of clinical visits and highlighted selected terms that have time-dependent distributions. We also discovered phenotypic terms that have not been significantly associated with Down syndrome, such as “Proptosis”, “Downslanted palpebral fissures”, and “Microtia”. In summary, our study demonstrated that the clinical phenotypic spectrum of individuals with Mendelian diseases can be characterized through NLP-based digital phenotyping on population-scale electronic health records (EHRs).
49
Janiec A, Halat-Wolska P, Obrycki Ł, Ciara E, Wójcik M, Płudowski P, Wierzbicka A, Kowalska E, Książyk JB, Kułaga Z, Pronicka E, Litwin M. Long-term outcome of the survivors of infantile hypercalcaemia with CYP24A1 and SLC34A1 mutations. Nephrol Dial Transplant 2021; 36:1484-1492. [PMID: 33099630 PMCID: PMC8311581 DOI: 10.1093/ndt/gfaa178] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/04/2020] [Accepted: 05/25/2020] [Indexed: 11/12/2022] Open
Abstract
Background: Infantile hypercalcaemia (IH) is a vitamin D3 metabolism disorder. The molecular basis for IH is biallelic mutations in the CYP24A1 or SLC34A1 gene. These changes lead to catabolism disorders (CYP24A1 mutations) or excessive generation of 1,25-dihydroxyvitamin D3 [1,25(OH)2D3] (SLC34A1 mutations). The incidence rate of IH in children and the risk level for developing end-stage renal disease (ESRD) are still unknown. The aim of this study was to analyse the long-term outcome of adolescents and young adults who suffered from IH in infancy.
Design: Forty-two children (23 girls; average age 10.7 ± 6.3 years) and 26 adults (14 women; average age 24.2 ± 4.4 years) with a personal history of hypercalcaemia with elevated 1,25(OH)2D3 levels were included in the analysis. In all patients, a genetic analysis of possible IH mutations was conducted, as well as laboratory tests and renal ultrasonography.
Results: IH was confirmed in 20 of the studied patients (10 females). CYP24A1 mutations were found in 16 patients (8 females) and SLC34A1 mutations in 4 patients (2 females). The long-term outcome was assessed in 18 patients with an average age of 23.8 years (age range 2–34). The average glomerular filtration rate (GFR) was 72 mL/min/1.73 m² (range 15–105). Two patients with a CYP24A1 mutation developed ESRD and underwent renal transplantation. A GFR <90 mL/min/1.73 m² was found in 14 patients (77%), whereas a GFR <60 mL/min/1.73 m² was seen in 5 patients (28%), including 2 adults after renal transplantation. Three of 18 patients still had serum calcium levels >2.6 mmol/L. Renal ultrasound revealed nephrocalcinosis in 16 of 18 (88%) patients; however, mild hypercalciuria was detected in only one subject.
Conclusions: Subjects who suffered from IH have a greater risk of progressive chronic kidney disease and nephrocalcinosis. This indicates that all survivors of IH should be closely monitored, with early implementation of preventive measures, e.g. inhibition of the synthesis of active vitamin D3 metabolites.
Affiliation(s)
- Agnieszka Janiec
  - Department of Paediatrics, Nutrition and Metabolic Diseases, Children's Memorial Health Institute, Warsaw, Poland
- Paulina Halat-Wolska
  - Department of Medical Genetics, Children's Memorial Health Institute, Warsaw, Poland
- Łukasz Obrycki
  - Department of Nephrology, Kidney Transplantation and Arterial Hypertension, Children's Memorial Health Institute, Warsaw, Poland
- Elżbieta Ciara
  - Department of Medical Genetics, Children's Memorial Health Institute, Warsaw, Poland
- Marek Wójcik
  - Department of Biochemistry, Radioimmunology and Experimental Medicine, Children's Memorial Health Institute, Warsaw, Poland
- Paweł Płudowski
  - Department of Biochemistry, Radioimmunology and Experimental Medicine, Children's Memorial Health Institute, Warsaw, Poland
- Aldona Wierzbicka
  - Department of Biochemistry, Radioimmunology and Experimental Medicine, Children's Memorial Health Institute, Warsaw, Poland
- Ewa Kowalska
  - Department of Biochemistry, Radioimmunology and Experimental Medicine, Children's Memorial Health Institute, Warsaw, Poland
- Janusz B Książyk
  - Department of Paediatrics, Nutrition and Metabolic Diseases, Children's Memorial Health Institute, Warsaw, Poland
- Zbigniew Kułaga
  - Department of Public Health, Children's Memorial Health Institute, Warsaw, Poland
- Ewa Pronicka
  - Department of Paediatrics, Nutrition and Metabolic Diseases, Children's Memorial Health Institute, Warsaw, Poland
  - Department of Medical Genetics, Children's Memorial Health Institute, Warsaw, Poland
- Mieczysław Litwin
  - Department of Nephrology, Kidney Transplantation and Arterial Hypertension, Children's Memorial Health Institute, Warsaw, Poland
50
Obara-Moszyńska M, Budny B, Kałużna M, Zawadzka K, Jamsheer A, Rohde A, Ruchała M, Ziemnicka K, Niedziela M. CDON gene contributes to pituitary stalk interruption syndrome associated with unilateral facial and abducens nerve palsy. J Appl Genet 2021; 62:621-629. [PMID: 34235642 PMCID: PMC8571149 DOI: 10.1007/s13353-021-00649-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/29/2020] [Revised: 06/21/2021] [Accepted: 06/28/2021] [Indexed: 11/06/2022]
Abstract
The relationship between congenital defects of the brain and facial anomalies has been established. The Hedgehog signaling pathway plays a fundamental role in normal craniofacial development in humans. Mutations in the sonic hedgehog (SHH) signaling gene CDON have been recently reported in patients with holoprosencephaly and with pituitary stalk interruption syndrome (PSIS). This study aimed to characterize an 18-year-old patient presenting with PSIS, multiple pituitary hormone deficiency, and congenital unilateral facial and abducens nerve palsy. Additionally, bilateral sensorineural hearing loss, predominantly on the right side, was diagnosed. From the second year of life, growth deceleration was observed, and from the age of eight, anterior pituitary hormone deficiencies were gradually confirmed and treated with hormone substitution. MRI showed the characteristic triad of PSIS (anterior pituitary hypoplasia, interrupted pituitary stalk, and ectopic posterior lobe). We performed a comprehensive genomic screening, including microarrays for structural rearrangements and whole-exome sequencing for a monogenic defect. A novel heterozygous missense variant in the CDON gene (c.1814G>T; p.Gly605Val) was identified. The variant was inherited from the mother, who, besides short stature, did not show any disease symptoms. The variant was absent in control databases and in 100 healthy subjects originating from the same population. We report a novel variant in the CDON gene associated with PSIS and congenital cranial nerve palsy. The variant showed autosomal dominant inheritance with incomplete penetrance, in concordance with previous studies reporting CDON defects.
Affiliation(s)
- Monika Obara-Moszyńska
  - Department of Pediatric Endocrinology and Rheumatology, Poznan University of Medical Sciences, 27/33 Szpitalna Str, 60-572, Poznan, Poland
- Bartłomiej Budny
  - Department of Endocrinology, Metabolism and Internal Medicine, Poznan University of Medical Sciences, 49 Przybyszewskiego Str., 60-355, Poznan, Poland
- Małgorzata Kałużna
  - Department of Endocrinology, Metabolism and Internal Medicine, Poznan University of Medical Sciences, 49 Przybyszewskiego Str., 60-355, Poznan, Poland
- Katarzyna Zawadzka
  - MNM Diagnostics Sp. z o.o., 64 Macieja Rataja Str., 61-695, Poznan, Poland
- Aleksander Jamsheer
  - Department of Medical Genetics, Poznan University of Medical Sciences, 8 Rokietnicka Str, 60-806, Poznan, Poland
- Anna Rohde
  - Department of Pediatric Endocrinology and Rheumatology, Poznan University of Medical Sciences, 27/33 Szpitalna Str, 60-572, Poznan, Poland
- Marek Ruchała
  - Department of Endocrinology, Metabolism and Internal Medicine, Poznan University of Medical Sciences, 49 Przybyszewskiego Str., 60-355, Poznan, Poland
- Katarzyna Ziemnicka
  - Department of Endocrinology, Metabolism and Internal Medicine, Poznan University of Medical Sciences, 49 Przybyszewskiego Str., 60-355, Poznan, Poland
- Marek Niedziela
  - Department of Pediatric Endocrinology and Rheumatology, Poznan University of Medical Sciences, 27/33 Szpitalna Str, 60-572, Poznan, Poland