1
Raycheva R, Kostadinov K, Mitova E, Bogoeva N, Iskrov G, Stefanov G, Stefanov R. Challenges in mapping European rare disease databases, relevant for ML-based screening technologies in terms of organizational, FAIR and legal principles: scoping review. Front Public Health 2023; 11:1214766. [PMID: 37780450 PMCID: PMC10540868 DOI: 10.3389/fpubh.2023.1214766] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/05/2023] [Accepted: 08/30/2023] [Indexed: 10/03/2023] Open
Abstract
Background Given the increased availability of data sources such as hospital information systems, electronic health records, and health-related registries, a novel approach is required to develop artificial intelligence-based decision support that can assist clinicians in their diagnostic decision-making and shorten rare disease patients' diagnostic odyssey. The aim is to identify key challenges in the process of mapping European rare disease databases, relevant to ML-based screening technologies in terms of organizational, FAIR and legal principles. Methods A scoping review was conducted based on the PRISMA-ScR checklist. The primary article search was conducted in three electronic databases (MEDLINE/PubMed, Scopus, and Web of Science) and a secondary search was performed in Google Scholar and on the organizations' websites. Each step of this review was carried out independently by two researchers. A charting form for relevant study analysis was developed and used to categorize data and identify data items in three domains - organizational, FAIR and legal. Results At the end of the screening process, 73 studies were eligible for review based on inclusion and exclusion criteria, with more than 60% (n = 46) of the research published in the last 5 years and originating only from EU/EEA countries. Over the ten-year period (2013-2022), there is a clear cyclical trend in the publications, with a peak in the reporting of challenges every four years. Within this trend, the following dynamic was identified: except for 2016, organizational challenges dominated the articles published up to 2018; legal challenges were the most frequently discussed topic from 2018 to 2022.
The following distribution of the data items by domains was observed - (1) organizational (n = 36): data accessibility and sharing (20.2%); long-term sustainability (18.2%); governance, planning and design (17.2%); lack of harmonization and standardization (17.2%); quality of data collection (16.2%); and privacy risks and small sample size (11.1%); (2) FAIR (n = 15): findable (17.9%); accessible sustainability (25.0%); interoperable (39.3%); and reusable (17.9%); and (3) legal (n = 33): data protection by all means (34.4%); data management and ownership (22.9%); research under GDPR and member state law (20.8%); trust and transparency (13.5%); and digitalization of health (8.3%). We observed a specific pattern repeated in all domains during the process of data charting and data item identification - in addition to the outlined challenges, good practices, guidelines, and recommendations were also discussed. The proportion of publications addressing only good practices, guidelines, and recommendations for overcoming challenges when mapping RD databases in at least one domain was calculated to be 47.9% (n = 35). Conclusion Despite the opportunities provided by innovation - automation, electronic health records, hospital-based information systems, biobanks, rare disease registries and European Reference Networks - the results of the current scoping review demonstrate a diversity of the challenges that must still be addressed, with immediate actions on ensuring better governance of rare disease registries, implementing FAIR principles, and enhancing the EU legal framework.
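The two study-level proportions quoted in this abstract can be recomputed from the reported counts. The following is only a small arithmetic check; the helper name `share` is hypothetical, and the counts (73, 46, 35) are taken from the abstract as written.

```python
# Recompute study-level proportions from the counts reported in the abstract.
TOTAL_STUDIES = 73  # eligible studies, as reported


def share(n: int, total: int = TOTAL_STUDIES) -> float:
    """Return n/total as a percentage, rounded to one decimal place."""
    return round(100 * n / total, 1)


recent = share(46)         # studies published in the last 5 years
good_practice = share(35)  # studies on good practices, guidelines, recommendations

print(f"recent: {recent}%, good practice: {good_practice}%")
```

The first value is consistent with the "more than 60%" statement, and the second matches the 47.9% given in the text.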
Affiliation(s)
- Ralitsa Raycheva
- Department of Social Medicine and Public Health, Faculty of Public Health, Medical University of Plovdiv, Plovdiv, Bulgaria
- Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Kostadin Kostadinov
- Department of Social Medicine and Public Health, Faculty of Public Health, Medical University of Plovdiv, Plovdiv, Bulgaria
- Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Elena Mitova
- Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Nataliya Bogoeva
- Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Georgi Iskrov
- Department of Social Medicine and Public Health, Faculty of Public Health, Medical University of Plovdiv, Plovdiv, Bulgaria
- Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Georgi Stefanov
- Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Rumen Stefanov
- Department of Social Medicine and Public Health, Faculty of Public Health, Medical University of Plovdiv, Plovdiv, Bulgaria
- Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
2
Dong H, Suárez-Paniagua V, Zhang H, Wang M, Casey A, Davidson E, Chen J, Alex B, Whiteley W, Wu H. Ontology-driven and weakly supervised rare disease identification from clinical notes. BMC Med Inform Decis Mak 2023; 23:86. [PMID: 37147628 PMCID: PMC10162001 DOI: 10.1186/s12911-023-02181-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 04/21/2023] [Indexed: 05/07/2023] Open
Abstract
BACKGROUND Computational text phenotyping is the practice of identifying patients with certain disorders and traits from clinical notes. Rare diseases are challenging to identify because few cases are available for machine learning and data annotation requires domain experts. METHODS We propose a method using ontologies and weak supervision, with recent pre-trained contextual representations from Bi-directional Transformers (e.g. BERT). The ontology-driven framework includes two steps: (i) Text-to-UMLS, extracting phenotypes by contextually linking mentions to concepts in Unified Medical Language System (UMLS), with a Named Entity Recognition and Linking (NER+L) tool, SemEHR, and weak supervision with customised rules and contextual mention representation; (ii) UMLS-to-ORDO, matching UMLS concepts to rare diseases in Orphanet Rare Disease Ontology (ORDO). The weakly supervised approach is proposed to learn a phenotype confirmation model to improve Text-to-UMLS linking, without annotated data from domain experts. We evaluated the approach on three clinical datasets, MIMIC-III discharge summaries, MIMIC-III radiology reports, and NHS Tayside brain imaging reports from two institutions in the US and the UK, with annotations. RESULTS The improvements in the precision were pronounced (by over 30% to 50% absolute score for Text-to-UMLS linking), with almost no loss of recall compared to the existing NER+L tool, SemEHR. Results on radiology reports from MIMIC-III and NHS Tayside were consistent with the discharge summaries. The overall pipeline processing clinical notes can extract rare disease cases, mostly uncaptured in structured data (manually assigned ICD codes). CONCLUSION The study provides empirical evidence for the task by applying a weakly supervised NLP pipeline on clinical notes.
The proposed weakly supervised deep learning approach requires no human annotation except for validation and testing, by leveraging ontologies, NER+L tools, and contextual representations. The study also demonstrates that Natural Language Processing (NLP) can complement traditional ICD-based approaches to better estimate rare diseases in clinical notes. We discuss the usefulness and limitations of the weak supervision approach and propose directions for future studies.
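The two-step structure described in this abstract (Text-to-UMLS, then UMLS-to-ORDO) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the lookup tables, the placeholder identifiers (`CUI_ALPHA`, `ORDO_ALPHA`), the substring matcher that stands in for a NER+L tool such as SemEHR, and the mention-length rule that stands in for the trained phenotype confirmation model. It is not the authors' implementation.

```python
from dataclasses import dataclass

# Hypothetical toy lookup tables with placeholder identifiers (not real codes).
MENTION_TO_UMLS = {
    "disease alpha": "CUI_ALPHA",
    "da": "CUI_ALPHA",        # ambiguous abbreviation
    "common cold": "CUI_COLD",
}
UMLS_TO_ORDO = {"CUI_ALPHA": "ORDO_ALPHA"}  # only rare diseases have an entry


@dataclass
class Candidate:
    mention: str
    cui: str


def text_to_umls(note: str) -> list[Candidate]:
    """Step (i), naive stand-in: link every known mention found in the note."""
    text = note.lower()
    return [Candidate(m, cui) for m, cui in MENTION_TO_UMLS.items() if m in text]


def weak_rule(candidate: Candidate, min_len: int = 4) -> bool:
    """Assumed rule: very short mentions are treated as unreliable links."""
    return len(candidate.mention) >= min_len


def umls_to_ordo(candidates: list[Candidate]) -> set[str]:
    """Step (ii): keep only concepts that map to a rare disease entry."""
    return {UMLS_TO_ORDO[c.cui] for c in candidates if c.cui in UMLS_TO_ORDO}


def rare_diseases(note: str) -> set[str]:
    confirmed = [c for c in text_to_umls(note) if weak_rule(c)]
    return umls_to_ordo(confirmed)


print(rare_diseases("Seen today with a common cold; history of Disease Alpha."))
print(rare_diseases("Seen today, otherwise well."))
```

In the second note the naive matcher finds "da" inside "today", which is the kind of false link that a confirmation step is meant to remove. In the approach described above, such rules would supply weak labels for training a contextual model rather than act as the final filter.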
Affiliation(s)
- Hang Dong
- Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, United Kingdom.
- Health Data Research UK, London, United Kingdom.
- Department of Computer Science, University of Oxford, Oxford, United Kingdom.
- Víctor Suárez-Paniagua
- Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Health Data Research UK, London, United Kingdom
- Huayu Zhang
- Advanced Care Research Centre, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Minhong Wang
- Institute of Health Informatics, University College London, London, United Kingdom
- Arlene Casey
- Advanced Care Research Centre, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Emma Davidson
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Jiaoyan Chen
- Department of Computer Science, The University of Manchester, Manchester, United Kingdom
- Beatrice Alex
- Edinburgh Futures Institute, University of Edinburgh, Edinburgh, United Kingdom
- William Whiteley
- Health Data Research UK, London, United Kingdom
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Honghan Wu
- Health Data Research UK, London, United Kingdom.
- Institute of Health Informatics, University College London, London, United Kingdom.
3
Yang J, Shu L, Duan H, Li H. A Robust Phenotype-driven Likelihood Ratio Analysis Approach Assisting Interpretable Clinical Diagnosis of Rare Diseases. J Biomed Inform 2023; 142:104372. [PMID: 37105510 DOI: 10.1016/j.jbi.2023.104372] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/12/2022] [Revised: 02/20/2023] [Accepted: 04/20/2023] [Indexed: 04/29/2023]
Abstract
Phenotype-based prioritization of candidate genes and diseases has become a well-established approach for multi-omics diagnostics of rare diseases. Most current algorithms exploit semantic analysis and probabilistic statistics based on Human Phenotype Ontology and are commonly superior to naive search methods. However, these algorithms are mostly less interpretable and do not perform well in real clinical scenarios due to noise and imprecision of query terms, and the fact that individuals may not display all phenotypes of the disease they belong to. We present a Phenotype-driven Likelihood Ratio analysis approach (PheLR) assisting interpretable clinical diagnosis of rare diseases. With a likelihood ratio paradigm, PheLR estimates the posterior probability of candidate diseases and how much a phenotypic feature contributes to the prioritization result. Benchmarked using simulated and realistic patients, PheLR shows significant advantages over current approaches and is robust to noise and inaccuracy. To facilitate clinical practice and visualized differential diagnosis, PheLR is implemented as an online web tool (http://phelr.nbscn.org).
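For orientation, the general likelihood-ratio paradigm mentioned in this abstract can be written out as follows. This is the textbook form, assuming the phenotypic features h_1, ..., h_n are conditionally independent given the disease status; the notation is introduced here for illustration and is not claimed to be PheLR's exact formulation.

```latex
\mathrm{LR}_i = \frac{P(h_i \mid D)}{P(h_i \mid \neg D)}, \qquad
\text{pretest odds}(D) = \frac{P(D)}{1 - P(D)}

\text{posttest odds}(D) = \text{pretest odds}(D) \times \prod_{i=1}^{n} \mathrm{LR}_i, \qquad
P(D \mid h_1, \dots, h_n) = \frac{\text{posttest odds}(D)}{1 + \text{posttest odds}(D)}
```

Because the posttest odds are a product, each feature's contribution can be read off as log LR_i, which is what makes a ranking under this paradigm interpretable feature by feature.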
Affiliation(s)
- Jian Yang
- Clinical Data Center, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China; The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
- Liqi Shu
- Rhode Island Hospital, Warren Alpert Medical School of Brown University, Rhode Island, USA
- Huilong Duan
- The College of Biomedical Engineering and Instrument Science, Zhejiang University, Zhejiang, China
- Haomin Li
- Clinical Data Center, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Zhejiang, China.
4
Yuan C, Zhang W, Wang J, Huang C, Shu B, Liang Q, Huang T, Wang J, Shi Q, Tang D, Wang Y. Chinese Medicine Phenomics (Chinmedphenomics): Personalized, Precise and Promising. Phenomics (Cham, Switzerland) 2022; 2:383-388. [PMID: 36939806 PMCID: PMC9712866 DOI: 10.1007/s43657-022-00074-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/27/2021] [Revised: 08/15/2022] [Accepted: 08/23/2022] [Indexed: 11/07/2022]
Abstract
The systematic nature of phenomics and Traditional Chinese Medicine (TCM) enables these two disciplines to interlink with each other. This article discusses the similarities in theory and application between TCM and phenomics and illustrates their respective advantages in the diagnosis and treatment of diseases, which together point toward a new discipline. Chinese medicine phenomics (Chinmedphenomics) is built on classic TCM combined with phenomics technology. Its development requires a mega cohort with TCM syndromes and the characteristics of precision medicine, as well as multi-disciplinary cooperation. The discipline is personalized, precise and promising, and provides unique scientific insights into understanding human health.
Affiliation(s)
- Chunchun Yuan
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Weiqiang Zhang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200433 China
- Jing Wang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Academic Research Center of Shixiaoshan’ Traumatology, Shanghai, 200032 China
- Famous Traditional Chinese Medicine Office, Shanghai, 200032 China
- Chen Huang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Bing Shu
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Academic Research Center of Shixiaoshan’ Traumatology, Shanghai, 200032 China
- Famous Traditional Chinese Medicine Office, Shanghai, 200032 China
- Qianqian Liang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Academic Research Center of Shixiaoshan’ Traumatology, Shanghai, 200032 China
- Famous Traditional Chinese Medicine Office, Shanghai, 200032 China
- Tingrui Huang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Jiucun Wang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200433 China
- Qi Shi
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Academic Research Center of Shixiaoshan’ Traumatology, Shanghai, 200032 China
- Famous Traditional Chinese Medicine Office, Shanghai, 200032 China
- Dezhi Tang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Academic Research Center of Shixiaoshan’ Traumatology, Shanghai, 200032 China
- Famous Traditional Chinese Medicine Office, Shanghai, 200032 China
- Yongjun Wang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Institute of Spine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200032 China
- Key Laboratory of the Ministry of Education about Theory and Treatment of Muscles and Bones, Shanghai, 200032 China
- Academic Research Center of Shixiaoshan’ Traumatology, Shanghai, 200032 China
- Famous Traditional Chinese Medicine Office, Shanghai, 200032 China
5
[Rare-disease data standards]. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2022; 65:1126-1132. [PMID: 36149471 DOI: 10.1007/s00103-022-03591-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/03/2022] [Accepted: 09/01/2022] [Indexed: 11/02/2022]
Abstract
The use of standardized data formats (data standards) in healthcare supports four main goals: (1) exchange of data, (2) integration of computer systems and tools, (3) data storage and archiving, and (4) support of federated databases. Standards are especially important for rare-disease research and clinical care. In this review, we introduce healthcare standards and present a selection of standards that are commonly used in the field of rare diseases. The Human Phenotype Ontology (HPO) is the most commonly used standard for annotating phenotypic abnormalities and supporting phenotype-driven analysis of diagnostic exome and genome sequencing. Numerous standards for diseases are available that support a range of needs. Online Mendelian Inheritance in Man (OMIM) and the Orphanet Rare Disease Ontology (ORDO) are the most important standards developed specifically for rare diseases. The Mondo Disease Ontology (Mondo) is a new disease ontology that aims to integrate data from a comprehensive range of current nosologies. New standards and schemas such as the Medical Action Ontology (MAxO) and the Global Alliance for Genomics and Health (GA4GH) phenopacket are being introduced to extend the scope of standards that support rare disease research. In order to provide optimal care for patients with rare diseases in different healthcare settings, it will be necessary to better integrate standards for rare disease with electronic healthcare resources such as the Fast Healthcare Interoperability Resources (FHIR) standard for healthcare data exchange.
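To make the idea of a standardized phenotype record concrete, the sketch below builds a small JSON document whose field names loosely follow the GA4GH phenopacket layout (`id`, `subject`, `phenotypicFeatures`, `metaData`). The helper `make_phenopacket`, the record and subject identifiers, and the choice of example term are assumptions for illustration, and the output is not validated against the official schema, which requires further metadata.

```python
import json
import re

# HPO term identifiers have the form "HP:" followed by seven digits.
HPO_ID = re.compile(r"^HP:\d{7}$")


def make_phenopacket(packet_id: str, subject_id: str,
                     hpo_terms: list[tuple[str, str]]) -> dict:
    """Build a minimal phenopacket-like record from (HPO id, label) pairs."""
    for term_id, _ in hpo_terms:
        if not HPO_ID.match(term_id):
            raise ValueError(f"not an HPO identifier: {term_id}")
    return {
        "id": packet_id,
        "subject": {"id": subject_id},
        "phenotypicFeatures": [
            {"type": {"id": term_id, "label": label}}
            for term_id, label in hpo_terms
        ],
        # Assumed minimal metadata; a schema-valid record needs more fields.
        "metaData": {"phenopacketSchemaVersion": "2.0"},
    }


packet = make_phenopacket("example-packet-1", "patient-1",
                          [("HP:0001250", "Seizure")])
print(json.dumps(packet, indent=2))
```

Because phenotypes are carried as ontology identifiers rather than free text, a record like this can be exchanged between tools without re-annotation, which is the interoperability argument made above.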
6
Dhombres F, Morgan P, Chaudhari BP, Filges I, Sparks TN, Lapunzina P, Roscioli T, Agarwal U, Aggarwal S, Beneteau C, Cacheiro P, Carmody LC, Collardeau‐Frachon S, Dempsey EA, Dufke A, Duyzend MH, el Ghosh M, Giordano JL, Glad R, Grinfelde I, Iliescu DG, Ladewig MS, Munoz‐Torres MC, Pollazzon M, Radio FC, Rodo C, Silva RG, Smedley D, Sundaramurthi JC, Toro S, Valenzuela I, Vasilevsky NA, Wapner RJ, Zemet R, Haendel MA, Robinson PN. Prenatal phenotyping: A community effort to enhance the Human Phenotype Ontology. American Journal of Medical Genetics. Part C, Seminars in Medical Genetics 2022; 190:231-242. [PMID: 35872606 PMCID: PMC9588534 DOI: 10.1002/ajmg.c.31989] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/14/2022] [Accepted: 07/01/2022] [Indexed: 01/07/2023]
Abstract
Technological advances in both genome sequencing and prenatal imaging are increasing our ability to accurately recognize and diagnose Mendelian conditions prenatally. Phenotype-driven early genetic diagnosis of fetal genetic disease can help to strategize treatment options and clinical preventive measures during the perinatal period, to plan in utero therapies, and to inform parental decision-making. Fetal phenotypes of genetic diseases are often unique and at present are not well understood; more comprehensive knowledge about prenatal phenotypes and computational resources have an enormous potential to improve diagnostics and translational research. The Human Phenotype Ontology (HPO) has been widely used to support diagnostics and translational research in human genetics. To better support prenatal usage, the HPO consortium conducted a series of workshops with a group of domain experts in a variety of medical specialties, diagnostic techniques, as well as diseases and phenotypes related to prenatal medicine, including perinatal pathology, musculoskeletal anomalies, neurology, medical genetics, hydrops fetalis, craniofacial malformations, cardiology, neonatal-perinatal medicine, fetal medicine, placental pathology, prenatal imaging, and bioinformatics. We expanded the representation of prenatal phenotypes in HPO by adding 95 new phenotype terms under the Abnormality of prenatal development or birth (HP:0001197) grouping term, and revised definitions, synonyms, and disease annotations for most of the 152 terms that existed before the beginning of this effort. The expansion of prenatal phenotypes in HPO will support phenotype-driven prenatal exome and genome sequencing for precision genetic diagnostics of rare diseases to support prenatal care.
Affiliation(s)
- Ferdinand Dhombres
- Sorbonne University, GRC26, INSERM, Limics, Armand Trousseau Hospital, Fetal Medicine Department, APHP, Paris, France
- Patricia Morgan
- American College of Medical Genetics and Genomics, Newborn Screening Translational Research Network, Bethesda, Maryland, USA
- Bimal P. Chaudhari
- Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, Ohio, USA
- Isabel Filges
- University Hospital Basel and University of Basel, Medical Genetics, Basel, Switzerland
- Teresa N. Sparks
- Department of Obstetrics, Gynecology, & Reproductive Sciences, University of California, San Francisco, San Francisco, California, USA
- Pablo Lapunzina
- CIBERER and Hospital Universitario La Paz, INGEMM‐Institute of Medical and Molecular Genetics, Madrid, Spain
- Tony Roscioli
- Neuroscience Research Australia (NeuRA), University of New South Wales, Sydney, New South Wales, Australia
- Umber Agarwal
- Department of Maternal and Fetal Medicine, Liverpool Women's NHS Foundation Trust, Liverpool, UK
- Shagun Aggarwal
- Department of Medical Genetics, Nizam's Institute of Medical Sciences, Hyderabad, Telangana, India
- Claire Beneteau
- Service de Génétique Médicale, UF 9321 de Fœtopathologie et Génétique, CHU de Nantes, Nantes, France
- Pilar Cacheiro
- William Harvey Research Institute, Queen Mary University of London, London, UK
- Leigh C. Carmody
- Department of Genomic Medicine, The Jackson Laboratory, Farmington, Connecticut, USA
- Esther A. Dempsey
- St George's University of London, Molecular and Clinical Sciences Research Institute, London, UK
- Andreas Dufke
- University of Tübingen, Institute of Medical Genetics and Applied Genomics, Tübingen, Germany
- Jessica L. Giordano
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, New York, USA
- Ragnhild Glad
- Department of Obstetrics and Gynecology, University Hospital of North Norway, Tromsø, Norway
- Ieva Grinfelde
- Department of Medical Genetics and Prenatal diagnosis, Children's University Hospital, Riga, Latvia
- Dominic G. Iliescu
- Department of Obstetrics and Gynecology, University of Medicine and Pharmacy Craiova, Craiova, Dolj, Romania
- Markus S. Ladewig
- Department of Ophthalmology, Klinikum Saarbrücken, Saarbrücken, Saarland, Germany
- Monica C. Munoz‐Torres
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Marzia Pollazzon
- Azienda USL‐IRCCS di Reggio Emilia, Medical Genetics Unit, Reggio Emilia, Italy
- Carlota Rodo
- Vall d'Hebron Hospital Campus, Maternal & Fetal Medicine, Barcelona, Spain
- Raquel Gouveia Silva
- Hospital Santa Maria, Serviço de Genética, Departamento de Pediatria, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Centro Académico de Medicina de Lisboa, Lisboa, Portugal
- Damian Smedley
- William Harvey Research Institute, Queen Mary University of London, London, UK
- Sabrina Toro
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Irene Valenzuela
- Hospital Vall d'Hebron, Clinical and Molecular Genetics Area, Barcelona, Spain
- Nicole A. Vasilevsky
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Ronald J. Wapner
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, New York, USA
- Roni Zemet
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Melissa A Haendel
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Peter N. Robinson
- Department of Genomic Medicine, The Jackson Laboratory, Farmington, Connecticut, USA
7
Steinhaus R, Proft S, Seelow E, Schalau T, Robinson PN, Seelow D. Deep phenotyping: symptom annotation made simple with SAMS. Nucleic Acids Res 2022; 50:W677-W681. [PMID: 35524573 PMCID: PMC9252818 DOI: 10.1093/nar/gkac329] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/14/2022] [Revised: 04/16/2022] [Accepted: 04/24/2022] [Indexed: 11/14/2022] Open
Abstract
Precision medicine needs precise phenotypes. The Human Phenotype Ontology (HPO) uses clinical signs instead of diagnoses and has become the standard annotation for patients’ phenotypes when describing single gene disorders. Use of the HPO beyond human genetics is, however, still limited. With SAMS (Symptom Annotation Made Simple), we want to bring sign-based phenotyping to routine clinical care, to hospital patients as well as to outpatients. Our web-based application provides access to three widely used annotation systems: HPO, OMIM, Orphanet. Whilst data can be stored in our database, phenotypes can also be imported and exported as Global Alliance for Genomics and Health (GA4GH) Phenopackets without using the database. The web interface can easily be integrated into local databases, e.g. clinical information systems. SAMS lets users share their data with others, empowering patients to record their own signs and symptoms (or those of their children) and thus provide their doctors with additional information. We think that our approach will lead to better characterised patients, which is helpful not only for finding disease mutations but also for better understanding the pathophysiology of diseases and for recruiting patients for studies and clinical trials. SAMS is freely available at https://www.genecascade.org/SAMS/.
Affiliation(s)
- Robin Steinhaus
- Exploratory Diagnostic Sciences, Berliner Institut für Gesundheitsforschung, Berlin 10117, Germany; Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 13353, Germany
- Sebastian Proft
- Exploratory Diagnostic Sciences, Berliner Institut für Gesundheitsforschung, Berlin 10117, Germany; Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 13353, Germany
- Evelyn Seelow
- Medizinische Klinik mit Schwerpunkt Nephrologie und Internistische Intensivmedizin, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Tobias Schalau
- Exploratory Diagnostic Sciences, Berliner Institut für Gesundheitsforschung, Berlin 10117, Germany; FB Mathematik und Informatik, Freie Universität Berlin, Berlin 14195, Germany
- Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06030, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06030, USA
- Dominik Seelow
- Exploratory Diagnostic Sciences, Berliner Institut für Gesundheitsforschung, Berlin 10117, Germany; Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 13353, Germany
8
Dupras C, Bunnik EM. Toward a Framework for Assessing Privacy Risks in Multi-Omic Research and Databases. The American Journal of Bioethics: AJOB 2021; 21:46-64. [PMID: 33433298 DOI: 10.1080/15265161.2020.1863516] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/12/2023]
Abstract
While the accumulation and increased circulation of genomic data have captured much attention over the past decade, privacy risks raised by the diversification and integration of omics have been largely overlooked. In this paper, we propose the outline of a framework for assessing privacy risks in multi-omic research and databases. Following a comparison of privacy risks associated with genomic and epigenomic data, we dissect ten privacy risk-impacting omic data properties that affect either the risk of re-identification of research participants, or the sensitivity of the information potentially conveyed by biological data. We then propose a three-step approach for the assessment of privacy risks in the multi-omic era. Thus, we lay grounds for a data property-based, 'pan-omic' approach that moves away from genetic exceptionalism. We conclude by inviting our peers to refine these theoretical foundations, put them to the test in their respective fields, and translate our approach into practical guidance.
9
Robinson PN, Ravanmehr V, Jacobsen JOB, Danis D, Zhang XA, Carmody LC, Gargano MA, Thaxton CL, Karlebach G, Reese J, Holtgrewe M, Köhler S, McMurry JA, Haendel MA, Smedley D. Interpretable Clinical Genomics with a Likelihood Ratio Paradigm. Am J Hum Genet 2020; 107:403-417. [PMID: 32755546 PMCID: PMC7477017 DOI: 10.1016/j.ajhg.2020.06.021] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 01/25/2020] [Accepted: 06/26/2020] [Indexed: 10/23/2022] Open
Abstract
Human Phenotype Ontology (HPO)-based analysis has become standard for genomic diagnostics of rare diseases. Current algorithms use a variety of semantic and statistical approaches to prioritize the typically long lists of genes with candidate pathogenic variants. These algorithms do not provide robust estimates of the strength of the predictions beyond the placement in a ranked list, nor do they provide measures of how much any individual phenotypic observation has contributed to the prioritization result. However, given that the overall success rate of genomic diagnostics is only around 25%-50% or less in many cohorts, a good ranking cannot be taken to imply that the gene or disease at rank one is necessarily a good candidate. Here, we present an approach to genomic diagnostics that exploits the likelihood ratio (LR) framework to provide an estimate of (1) the posttest probability of candidate diagnoses, (2) the LR for each observed HPO phenotype, and (3) the predicted pathogenicity of observed genotypes. LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) placed the correct diagnosis within the first three ranks in 92.9% of 384 case reports comprising 262 Mendelian diseases, and the correct diagnosis had a mean posttest probability of 67.3%. Simulations show that LIRICAL is robust to many typically encountered forms of genomic and phenomic noise. In summary, LIRICAL provides accurate, clinically interpretable results for phenotype-driven genomic diagnostics.
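The likelihood ratio arithmetic described in this abstract can be illustrated with a minimal sketch. It assumes the textbook odds form of Bayes' rule and conditionally independent findings; the function name `posttest_probability` and the example numbers are hypothetical and are not taken from LIRICAL itself.

```python
from math import prod


def posttest_probability(pretest_prob, likelihood_ratios):
    """Combine a pretest probability with per-finding likelihood ratios.

    Assumption (not from the paper): findings are treated as conditionally
    independent, so their likelihood ratios simply multiply.
    """
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must be strictly between 0 and 1")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * prod(likelihood_ratios)
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical example: a 1% pretest probability and two observed
# phenotypes with likelihood ratios of 10 and 20.
print(round(posttest_probability(0.01, [10.0, 20.0]), 3))
```

A likelihood ratio above 1 raises the posttest probability and one below 1 lowers it, which is what makes the contribution of each individual observation inspectable.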
Affiliation(s)
- Peter N Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA
- Vida Ravanmehr
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Julius O B Jacobsen
  - William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
- Daniel Danis
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Leigh C Carmody
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Michael A Gargano
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Courtney L Thaxton
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Guy Karlebach
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Justin Reese
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Manuel Holtgrewe
  - Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Sebastian Köhler
  - Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Damian Smedley
  - William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
|
10
|
Akdemir D, Knox R, Isidro y Sánchez J. Combining Partially Overlapping Multi-Omics Data in Databases Using Relationship Matrices. Frontiers in Plant Science 2020; 11:947. [PMID: 32765543 PMCID: PMC7381228 DOI: 10.3389/fpls.2020.00947] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2020] [Accepted: 06/10/2020] [Indexed: 05/08/2023]
Abstract
Private and public breeding programs, as well as companies and universities, have developed different genomics technologies that have resulted in the generation of unprecedented amounts of sequence data, which bring new challenges in terms of data management, query, and analysis. The magnitude and complexity of these datasets bring new challenges but also an opportunity to use the data available as a whole. Detailed phenotype data, combined with increasing amounts of genomic data, have an enormous potential to accelerate the identification of key traits to improve our understanding of quantitative genetics. Data harmonization enables cross-national and international comparative research, facilitating the extraction of new scientific knowledge. In this paper, we address the complex issue of combining high dimensional and unbalanced omics data. More specifically, we propose a covariance-based method for combining partial datasets in the genotype to phenotype spectrum. This method can be used to combine partially overlapping relationship/covariance matrices. Here, we show with applications that our approach might be advantageous to feature imputation based approaches; we demonstrate how this method can be used in genomic prediction using heterogeneous marker data and also how to combine the data from multiple phenotypic experiments to make inferences about previously unobserved trait relationships. Our results demonstrate that it is possible to harmonize datasets to improve available information across gene-banks, data repositories, or other data resources.
Affiliation(s)
- Deniz Akdemir
  - Agriculture & Food Science Centre, Animal and Crop Science Division, University College Dublin, Dublin, Ireland
- Ron Knox
  - SCRDC-CRDSW, Swift Current Research and Developmental Centre, Swift Current, SK, Canada
- Julio Isidro y Sánchez
  - Agriculture & Food Science Centre, Animal and Crop Science Division, University College Dublin, Dublin, Ireland; Centro de Biotecnología y Genómica de Plantas (CBGP, UPM – INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, Madrid, Spain
|
11
|
Köhler S, Carmody L, Vasilevsky N, Jacobsen JOB, Danis D, Gourdine JP, Gargano M, Harris NL, Matentzoglu N, McMurry JA, Osumi-Sutherland D, Cipriani V, Balhoff JP, Conlin T, Blau H, Baynam G, Palmer R, Gratian D, Dawkins H, Segal M, Jansen AC, Muaz A, Chang WH, Bergerson J, Laulederkind SJF, Yüksel Z, Beltran S, Freeman AF, Sergouniotis PI, Durkin D, Storm AL, Hanauer M, Brudno M, Bello SM, Sincan M, Rageth K, Wheeler MT, Oegema R, Lourghi H, Della Rocca MG, Thompson R, Castellanos F, Priest J, Cunningham-Rundles C, Hegde A, Lovering RC, Hajek C, Olry A, Notarangelo L, Similuk M, Zhang XA, Gómez-Andrés D, Lochmüller H, Dollfus H, Rosenzweig S, Marwaha S, Rath A, Sullivan K, Smith C, Milner JD, Leroux D, Boerkoel CF, Klion A, Carter MC, Groza T, Smedley D, Haendel MA, Mungall C, Robinson PN. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res 2019; 47:D1018-D1027. [PMID: 30476213 PMCID: PMC6324074 DOI: 10.1093/nar/gky1105] [Citation(s) in RCA: 412] [Impact Index Per Article: 103.0] [Received: 09/17/2018] [Accepted: 10/24/2018] [Indexed: 12/12/2022] Open
Abstract
The Human Phenotype Ontology (HPO)—a standardized vocabulary of phenotypic abnormalities associated with 7000+ diseases—is used by thousands of researchers, clinicians, informaticians and electronic health record systems around the world. Its detailed descriptions of clinical abnormalities and computable disease definitions have made HPO the de facto standard for deep phenotyping in the field of rare disease. The HPO’s interoperability with other ontologies has enabled it to be used to improve diagnostic accuracy by incorporating model organism data. It also plays a key role in the popular Exomiser tool, which identifies potential disease-causing variants from whole-exome or whole-genome sequencing data. Since the HPO was first introduced in 2008, its users have become both more numerous and more diverse. To meet these emerging needs, the project has added new content, language translations, mappings and computational tooling, as well as integrations with external community data. The HPO continues to collaborate with clinical adopters to improve specific areas of the ontology and extend standardized disease descriptions. The newly redesigned HPO website (www.human-phenotype-ontology.org) simplifies browsing terms and exploring clinical features, diseases, and human genes.
Affiliation(s)
- Sebastian Köhler
  - Charité Centrum für Therapieforschung, Charité-Universitätsmedizin Berlin Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin 10117, Germany; Einstein Center Digital Future, Berlin 10117, Germany; Monarch Initiative, monarchinitiative.org
- Leigh Carmody
  - Monarch Initiative, monarchinitiative.org; The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Nicole Vasilevsky
  - Monarch Initiative, monarchinitiative.org; Oregon Health & Science University, Portland, OR 97217, USA
- Julius O B Jacobsen
  - Monarch Initiative, monarchinitiative.org; Genomics England, Queen Mary University of London, Dawson Hall, Charterhouse Square, London EC1M 6BQ, UK
- Daniel Danis
  - Monarch Initiative, monarchinitiative.org; The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Jean-Philippe Gourdine
  - Monarch Initiative, monarchinitiative.org; Oregon Health & Science University, Portland, OR 97217, USA
- Michael Gargano
  - Monarch Initiative, monarchinitiative.org; The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Nomi L Harris
  - Monarch Initiative, monarchinitiative.org; Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Nicolas Matentzoglu
  - Monarch Initiative, monarchinitiative.org; European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK
- Julie A McMurry
  - Monarch Initiative, monarchinitiative.org; Linus Pauling institute, Oregon State University, Corvallis, OR, USA
- David Osumi-Sutherland
  - Monarch Initiative, monarchinitiative.org; European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK
- Valentina Cipriani
  - Monarch Initiative, monarchinitiative.org; William Harvey Research Institute, Queen Mary University College of London; UCL Genetics Institute, University College of London; UCL Institute of Ophthalmology, University College of London
- James P Balhoff
  - Monarch Initiative, monarchinitiative.org; Renaissance Computing Institute, University of North Carolina at Chapel Hill
- Tom Conlin
  - Monarch Initiative, monarchinitiative.org; Linus Pauling institute, Oregon State University, Corvallis, OR, USA
- Hannah Blau
  - Monarch Initiative, monarchinitiative.org; The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Gareth Baynam
  - Western Australian Register of Developmental Anomalies and Genetic Services of Western Australia, Department of Health, Government of Western Australia, WA, Australia; School of Paediatrics and Telethon Kids Institute, University of Western Australia, Perth, WA, Australia; Institute for Immunology and Infectious Diseases, Murdoch University, Perth, WA, Australia; Spatial Sciences, Department of Science and Engineering, Curtin University, Perth, WA, Australia; The Office of Population Health Genomics, Department of Health, Government of Western Australia, Perth, WA, Australia
- Richard Palmer
  - Spatial Sciences, Department of Science and Engineering, Curtin University, Perth, WA, Australia
- Dylan Gratian
  - Western Australian Register of Developmental Anomalies and Genetic Services of Western Australia, Department of Health, Government of Western Australia, WA, Australia
- Hugh Dawkins
  - The Office of Population Health Genomics, Department of Health, Government of Western Australia, Perth, WA, Australia
- Anna C Jansen
  - Neurogenetics Research Group, Vrije Universiteit Brussel, Brussels, Belgium; Pediatric Neurology Unit, Department of Pediatrics, UZ Brussel, Brussels, Belgium
- Ahmed Muaz
  - Monarch Initiative, monarchinitiative.org; Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW 2010, Australia
- Willie H Chang
  - Centre for Computational Medicine, Hospital for Sick Children and Department of Computer Science, University of Toronto, Toronto, Canada
- Jenna Bergerson
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Stanley J F Laulederkind
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin & Marquette University, 8701 Watertown Plank Road Milwaukee, WI 53226, USA
- Sergi Beltran
  - CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, Barcelona 08028, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Alexandra F Freeman
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Daniel Durkin
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Andrea L Storm
  - ICF, Rockville, MD, USA; National Center for Advancing Translational Sciences, Office of Rare Diseases Research, National Institutes of Health, Bethesda, MD, USA
- Marc Hanauer
  - INSERM, US14-Orphanet, Plateforme Maladies Rares, 75014 Paris, France
- Michael Brudno
  - Centre for Computational Medicine, Hospital for Sick Children and Department of Computer Science, University of Toronto, Toronto, Canada
- Murat Sincan
  - Sanford Imagenetics, Sanford Health, Sioux Falls, SD, USA
- Kayli Rageth
  - Sanford Imagenetics, Sanford Health, Sioux Falls, SD, USA
- Matthew T Wheeler
  - Center for Undiagnosed Diseases, Stanford University School of Medicine, Stanford, CA, USA
- Renske Oegema
  - Department of Genetics, University Medical Center Utrecht, the Netherlands
- Halima Lourghi
  - INSERM, US14-Orphanet, Plateforme Maladies Rares, 75014 Paris, France
- Maria G Della Rocca
  - ICF, Rockville, MD, USA; National Center for Advancing Translational Sciences, Office of Rare Diseases Research, National Institutes of Health, Bethesda, MD, USA
- Rachel Thompson
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
- James Priest
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA
- Ayushi Hegde
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Ruth C Lovering
  - Institute of Cardiovascular Science, University College London, UK
- Annie Olry
  - INSERM, US14-Orphanet, Plateforme Maladies Rares, 75014 Paris, France
- Luigi Notarangelo
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Morgan Similuk
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Xingmin A Zhang
  - Monarch Initiative, monarchinitiative.org; The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- David Gómez-Andrés
  - Child Neurology Unit. Hospital Universitari Vall d'Hebron, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
- Hanns Lochmüller
  - CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, Barcelona 08028, Spain; Department of Neuropediatrics and Muscle Disorders, Medical Center-University of Freiburg, Faculty of Medicine, Freiburg, Germany; Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, Canada; Division of Neurology, Department of Medicine, The Ottawa Hospital, Ottawa, Canada
- Hélène Dollfus
  - Centre for Rare Eye Diseases CARGO, SENSGENE FSMR Network, Strasbourg University Hospital, Strasbourg, France
- Sergio Rosenzweig
  - Immunology Service, Department of Laboratory Medicine, NIH Clinical Center, Bethesda, MD, USA
- Shruti Marwaha
  - Center for Undiagnosed Diseases, Stanford University School of Medicine, Stanford, CA, USA
- Ana Rath
  - INSERM, US14-Orphanet, Plateforme Maladies Rares, 75014 Paris, France
- Kathleen Sullivan
  - Department of Pediatrics, Division of Allergy Immunology, The Children's Hospital of Philadelphia, University of Pennsylvania Perelman School of Medicine, 3615 Civic Center Boulevard, Philadelphia, PA 19104, USA
- Joshua D Milner
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Dorothée Leroux
  - Centre for Rare Eye Diseases CARGO, SENSGENE FSMR Network, Strasbourg University Hospital, Strasbourg, France
- Amy Klion
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Melody C Carter
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Tudor Groza
  - Monarch Initiative, monarchinitiative.org; Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW 2010, Australia
- Damian Smedley
  - Monarch Initiative, monarchinitiative.org; Genomics England, Queen Mary University of London, Dawson Hall, Charterhouse Square, London EC1M 6BQ, UK
- Melissa A Haendel
  - Monarch Initiative, monarchinitiative.org; Oregon Health & Science University, Portland, OR 97217, USA; Linus Pauling institute, Oregon State University, Corvallis, OR, USA
- Chris Mungall
  - Monarch Initiative, monarchinitiative.org; Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Peter N Robinson
  - Monarch Initiative, monarchinitiative.org; The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA
|
12
|
Chang WH, Mashouri P, Lozano AX, Johnstone B, Husić M, Olry A, Maiella S, Balci TB, Sawyer SL, Robinson PN, Rath A, Brudno M. Phenotate: crowdsourcing phenotype annotations as exercises in undergraduate classes. Genet Med 2020; 22:1391-1400. [PMID: 32366968 DOI: 10.1038/s41436-020-0812-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/21/2019] [Revised: 04/09/2020] [Accepted: 04/10/2020] [Indexed: 11/10/2022] Open
Abstract
PURPOSE Computational documentation of genetic disorders is highly reliant on structured data for differential diagnosis, pathogenic variant identification, and patient matchmaking. However, most information on rare diseases (RDs) exists in freeform text, such as academic literature. To increase availability of structured RD data, we developed a crowdsourcing approach for collecting phenotype information using student assignments. METHODS We developed Phenotate, a web application for crowdsourcing disease phenotype annotations through assignments for undergraduate genetics students. Using student-collected data, we generated composite annotations for each disease through a machine learning approach. These annotations were compared with those from clinical practitioners and gold standard curated data. RESULTS Deploying Phenotate in five undergraduate genetics courses, we collected annotations for 22 diseases. Student-sourced annotations showed strong similarity to gold standards, with F-measures ranging from 0.584 to 0.868. Furthermore, clinicians used Phenotate annotations to identify diseases with comparable accuracy to other annotation sources and gold standards. For six disorders, no gold standards were available, allowing us to create some of the first structured annotations for them, while students demonstrated ability to research RDs. CONCLUSION Phenotate enables crowdsourcing RD phenotypic annotations through educational assignments. Presented as an intuitive web-based tool, it offers pedagogical benefits and augments the computable RD knowledgebase.
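The F-measures reported in this abstract compare one set of phenotype annotations against a gold standard. The sketch below shows the standard balanced F-measure on sets of term identifiers; exact-match comparison, the function name `f_measure` and the placeholder term names are assumptions for illustration, since the abstract does not state how matches were scored.

```python
def f_measure(predicted, gold):
    """Balanced F-measure (harmonic mean of precision and recall).

    Assumption: a predicted term counts as correct only if the identical
    term appears in the gold standard (no credit for ontology neighbours).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical placeholder terms, not real HPO identifiers.
student_terms = {"term_a", "term_b", "term_c"}
gold_terms = {"term_b", "term_c", "term_d", "term_e"}
print(round(f_measure(student_terms, gold_terms), 3))
```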
Affiliation(s)
- Willie H Chang
  - Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada; Department of Computer Science, Princeton University, Princeton, NJ, USA
- Pouria Mashouri
  - Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
- Alexander X Lozano
  - Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada; Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Department of Materials Science & Engineering, Stanford University, Stanford, CA, USA
- Brittney Johnstone
  - Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada; Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Mia Husić
  - Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
- Annie Olry
  - Orphanet, Institut national de la santé et de la recherche médicale, Paris, France
- Sylvie Maiella
  - Orphanet, Institut national de la santé et de la recherche médicale, Paris, France
- Tugce B Balci
  - Medical Genetics Program of Southwestern Ontario, London Health Sciences Centre, London, ON, Canada
- Sarah L Sawyer
  - Department of Genetics, Children's Hospital of Eastern Ontario and Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, ON, Canada
- Peter N Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA
- Ana Rath
  - Orphanet, Institut national de la santé et de la recherche médicale, Paris, France
- Michael Brudno
  - Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, ON, Canada; University Health Network, Toronto, ON, Canada
|
13
|
Supplementation of the ESID registry working definitions for the clinical diagnosis of inborn errors of immunity with encoded human phenotype ontology (HPO) terms. The Journal of Allergy and Clinical Immunology: In Practice 2020; 8:1778. [DOI: 10.1016/j.jaip.2020.02.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/09/2020] [Accepted: 02/11/2020] [Indexed: 11/22/2022]
|
14
|
Waagmeester A, Stupp G, Burgstaller-Muehlbacher S, Good BM, Griffith M, Griffith OL, Hanspers K, Hermjakob H, Hudson TS, Hybiske K, Keating SM, Manske M, Mayers M, Mietchen D, Mitraka E, Pico AR, Putman T, Riutta A, Queralt-Rosinach N, Schriml LM, Shafee T, Slenter D, Stephan R, Thornton K, Tsueng G, Tu R, Ul-Hasan S, Willighagen E, Wu C, Su AI. Wikidata as a knowledge graph for the life sciences. eLife 2020; 9:e52614. [PMID: 32180547 PMCID: PMC7077981 DOI: 10.7554/elife.52614] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 10/09/2019] [Accepted: 02/28/2020] [Indexed: 12/22/2022] Open
Abstract
Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.
Affiliation(s)
- Gregory Stupp
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Sebastian Burgstaller-Muehlbacher
  - Center for Integrative Bioinformatics Vienna, Max Perutz Laboratories, University of Vienna and Medical University of Vienna, Vienna, Austria
- Benjamin M Good
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Malachi Griffith
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, United States
- Obi L Griffith
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, United States
- Kristina Hanspers
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
- Toby S Hudson
  - School of Chemistry, The University of Sydney, Sydney, Australia
- Kevin Hybiske
  - Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle, United States
- Sarah M Keating
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
- Magnus Manske
  - Wellcome Trust Sanger Institute, Cambridge, United Kingdom
- Michael Mayers
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Daniel Mietchen
  - School of Data Science, University of Virginia, Charlottesville, United States
- Elvira Mitraka
  - University of Maryland School of Medicine, Baltimore, United States
- Alexander R Pico
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
- Timothy Putman
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Anders Riutta
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
- Nuria Queralt-Rosinach
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Lynn M Schriml
  - University of Maryland School of Medicine, Baltimore, United States
- Thomas Shafee
  - Department of Animal Plant and Soil Sciences, La Trobe University, Melbourne, Australia
- Denise Slenter
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Ginger Tsueng
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Roger Tu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Sabah Ul-Hasan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Egon Willighagen
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
|
15
|
Finke MT, Filice RW, Kahn CE. Integrating ontologies of human diseases, phenotypes, and radiological diagnosis. J Am Med Inform Assoc 2019; 26:149-154. [PMID: 30624645 DOI: 10.1093/jamia/ocy161] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/21/2018] [Accepted: 11/13/2018] [Indexed: 11/12/2022] Open
Abstract
Mappings between ontologies enable reuse and interoperability of biomedical knowledge. The Radiology Gamuts Ontology (RGO), an ontology of 16 918 diseases, interventions, and imaging observations, provides a resource for differential diagnosis and automated textual report understanding in radiology. An automated process with subsequent manual review was used to identify exact and partial matches of RGO entities to the Disease Ontology (DO) and the Human Phenotype Ontology (HPO). Exact mappings identified equivalent concepts; partial mappings identified subclass and superclass relationships. A total of 7913 distinct RGO entities (46.8%) were mapped to one or both of the two target ontologies. Integration of RGO's causal knowledge resulted in 9605 axioms that expressed direct causal relationships between DO diseases and HPO phenotypic abnormalities, and allowed one to formulate queries about causal relations using the abstraction properties in those two ontologies. The mappings can be used to support automated diagnostic reasoning, data mining, and knowledge discovery.
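As a quick arithmetic check on the coverage figure in this abstract, the short sketch below recomputes the share of mapped entities from the two counts given there. The helper name `coverage_percent` is hypothetical and only illustrates the calculation.

```python
def coverage_percent(mapped_entities: int, total_entities: int) -> float:
    """Share of ontology entities that received at least one mapping, in percent."""
    if total_entities <= 0:
        raise ValueError("total_entities must be positive")
    return 100.0 * mapped_entities / total_entities


# Counts taken from the abstract: 7913 mapped out of 16 918 RGO entities.
print(f"{coverage_percent(7913, 16918):.1f}%")  # prints "46.8%"
```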
Affiliation(s)
- Michael T Finke
  - Pacific Northwest University of Health Sciences, Yakima, WA, USA
- Ross W Filice
  - Department of Radiology, MedStar Georgetown University Hospital, Washington, DC, USA
- Charles E Kahn
  - Department of Radiology and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
|
16
|
Sergouniotis PI, Maxime E, Leroux D, Olry A, Thompson R, Rath A, Robinson PN, Dollfus H. An ontological foundation for ocular phenotypes and rare eye diseases. Orphanet J Rare Dis 2019; 14:8. [PMID: 30626441 PMCID: PMC6327432 DOI: 10.1186/s13023-018-0980-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/24/2018] [Accepted: 12/14/2018] [Indexed: 12/03/2022] Open
Abstract
Background The optical accessibility of the eye and technological advances in ophthalmic diagnostics have put ophthalmology at the forefront of data-driven medicine. The focus of this study is rare eye disorders, a group of conditions whose clinical heterogeneity and geographic dispersion make data-driven, evidence-based practice particularly challenging. Inter-institutional collaboration and information sharing is crucial but the lack of standardised terminology poses an important barrier. Ontologies are computational tools that include sets of vocabulary terms arranged in hierarchical structures. They can be used to provide robust terminology standards and to enhance data interoperability. Here, we discuss the development of the ophthalmology-related component of two well-established biomedical ontologies, the Human Phenotype Ontology (HPO; includes signs, symptoms and investigation findings) and the Orphanet Rare Disease Ontology (ORDO; includes rare disease nomenclature/nosology). Methods A variety of approaches were used including automated matching to existing resources and extensive manual curation. To achieve the latter, a study group including clinicians, patient representatives and ontology developers from 17 countries was formed. A broad range of terms was discussed and validated during a dedicated workshop attended by 60 members of the group. Results A comprehensive, structured and well-defined set of terms has been agreed on including 1106 terms relating to ocular phenotypes (HPO) and 1202 terms relating to rare eye disease nomenclature (ORDO). These terms and their relevant annotations can be accessed in http://www.human-phenotype-ontology.org/ and http://www.orpha.net/; comments, corrections, suggestions and requests for new terms can be made through these websites. This is an ongoing, community-driven endeavour and both HPO and ORDO are regularly updated. 
Conclusions To our knowledge, this is the first effort of such scale to provide terminology standards for the rare eye disease community. We hope that this work will not only improve coding and standardise information exchange in clinical care and research, but will also catalyse the transition to an evidence-based precision ophthalmology paradigm.
Affiliation(s)
- Emmanuel Maxime
- Orphanet, INSERM (Institut National de la Santé et de la Recherche Médicale), Paris, France
- Dorothée Leroux
- Centre for Rare Eye Diseases CARGO, SENSGENE FSMR Network, Strasbourg University Hospital, Strasbourg, France
- Annie Olry
- Orphanet, INSERM (Institut National de la Santé et de la Recherche Médicale), Paris, France
- Ana Rath
- Orphanet, INSERM (Institut National de la Santé et de la Recherche Médicale), Paris, France
- Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Hélène Dollfus
- Centre for Rare Eye Diseases CARGO, SENSGENE FSMR Network, Strasbourg University Hospital, Strasbourg, France
- Laboratoire de Génétique Médicale, Faculté de Médecine de Strasbourg, INSERM U1112, 11 rue Humann, 67 085, Strasbourg, France
17
Abstract
Inherited metabolic disorders (IMDs) are debilitating inherited diseases, with phenotypic, biochemical and genetic heterogeneity, frequently leading to prolonged diagnostic odysseys. Mitochondrial disorders represent one of the most severe classes of IMDs, wherein defects in >350 genes lead to multi-system disease. Diagnostic rates have improved considerably following the adoption of next-generation sequencing (NGS) technologies, but are still far from perfect. Phenomic annotation is an emerging concept which is being utilised to enhance interpretation of NGS results. To test whether phenomic correlations have utility in mitochondrial disease and IMDs, we created a gene-to-phenotype interaction network with searchable elements, for Leigh syndrome, a frequently observed paediatric mitochondrial disorder. The Leigh Map comprises data on 92 genes and 275 phenotypes standardised in human phenotype ontology terms, with 80% predictive accuracy. This commentary highlights the usefulness of the Leigh Map and similar resources and the challenges associated with integrating phenomic technologies into clinical practice.
Affiliation(s)
- Joyeeta Rahman
- UCL Great Ormond Street Institute of Child Health, London, UK
- Shamima Rahman
- UCL Great Ormond Street Institute of Child Health, London, UK
18
Jia J, Wang R, An Z, Guo Y, Ni X, Shi T. RDAD: A Machine Learning System to Support Phenotype-Based Rare Disease Diagnosis. Front Genet 2018; 9:587. [PMID: 30564269 PMCID: PMC6288202 DOI: 10.3389/fgene.2018.00587] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 08/12/2018] [Accepted: 11/15/2018] [Indexed: 01/21/2023] Open
Abstract
DNA sequencing has allowed for the discovery of the genetic cause for a considerable number of diseases, paving the way for new disease diagnostics. However, due to the lack of clinical samples and records, the molecular cause for rare diseases is always hard to identify, significantly limiting the number of rare Mendelian diseases diagnosed through sequencing technologies. Clinical phenotype information therefore becomes a major resource to diagnose rare diseases. In this article, we adopted both a phenotypic similarity method and a machine learning method to build four diagnostic models to support rare disease diagnosis. All the diagnostic models were validated using the real medical records from RAMEDIS. Each model provides a list of the top 10 candidate diseases as the prediction outcome and the results showed that all models had a high diagnostic precision (≥98%) with the highest recall reaching up to 95% while the models with machine learning methods showed the best performance. To promote effective diagnosis for rare disease in clinical application, we developed the phenotype-based Rare Disease Auxiliary Diagnosis system (RDAD) to assist clinicians in diagnosing rare diseases with the above four diagnostic models. The system is freely accessible through http://www.unimd.org/RDAD/.
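As a rough illustration of what a phenotypic similarity method that returns a ranked list of candidate diseases might look like, the sketch below scores each disease by the Jaccard overlap between a patient's phenotype terms and the disease's phenotype profile. This is not the RDAD implementation: the abstract does not specify its similarity measure, so Jaccard similarity is an assumption, and the function names, the disease labels and the terms `P1` to `P6` are hypothetical placeholders rather than real annotations.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two phenotype term sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(
    patient_terms: Set[str],
    disease_profiles: Dict[str, Set[str]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to top_k (disease, score) pairs, highest similarity first."""
    scored = [
        (disease, jaccard(patient_terms, terms))
        for disease, terms in disease_profiles.items()
    ]
    # Sort by descending score, then by name so that ties are deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Toy, made-up profiles: P1..P6 stand in for phenotype terms (hypothetical data).
TOY_PROFILES = {
    "disease_A": {"P1", "P2", "P3"},
    "disease_B": {"P2", "P4"},
    "disease_C": {"P5", "P6"},
}

ranking = rank_candidates({"P1", "P2"}, TOY_PROFILES, top_k=2)
```

With the default `top_k=10` the function mirrors the top-10 candidate list described in the abstract; a real system would typically replace the plain set overlap with an ontology-aware measure and validate it against clinical records.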
Affiliation(s)
- Jinmeng Jia
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- Ruiyuan Wang
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- Zhongxin An
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- Yongli Guo
- Beijing Key Laboratory for Pediatric Diseases of Otolaryngology, Head and Neck Surgery, The Ministry of Education Key Laboratory of Major Diseases in Children, Beijing Pediatric Research Institute, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
- Xi Ni
- Beijing Key Laboratory for Pediatric Diseases of Otolaryngology, Head and Neck Surgery, The Ministry of Education Key Laboratory of Major Diseases in Children, Beijing Pediatric Research Institute, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
- Tieliu Shi
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- National Center for International Research of Biological Targeting Diagnosis and Therapy/Guangxi Key Laboratory of Biological Targeting Diagnosis and Therapy Research/Collaborative Innovation Center for Targeting Tumor Diagnosis and Therapy, Guangxi Medical University, Nanning, Guangxi, China
19
Thompson R, Abicht A, Beeson D, Engel AG, Eymard B, Maxime E, Lochmüller H. A nomenclature and classification for the congenital myasthenic syndromes: preparing for FAIR data in the genomic era. Orphanet J Rare Dis 2018; 13:211. [PMID: 30477555 PMCID: PMC6260762 DOI: 10.1186/s13023-018-0955-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/29/2018] [Accepted: 11/14/2018] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Congenital myasthenic syndromes (CMS) are a heterogeneous group of inherited neuromuscular disorders sharing the common feature of fatigable weakness due to defective neuromuscular transmission. Despite rapidly increasing knowledge about the genetic origins, specific features and potential treatments for the known CMS entities, the lack of standardized classification at the most granular level has hindered the implementation of computer-based systems for knowledge capture and reuse. Where individual clinical or genetic entities do not exist in disease coding systems, they are often invisible in clinical records and inadequately annotated in information systems, and features that apply to one disease but not another cannot be adequately differentiated. RESULTS We created a detailed classification of all CMS disease entities suitable for use in clinical and genetic databases and decision support systems. To avoid conflict with existing coding systems as well as with expert-defined group-level classifications, we developed a collaboration with the Orphanet nomenclature for rare diseases, creating a clinically understandable name for each entity and placing it within a logical hierarchy that paves the way towards computer-aided clinical systems and improved knowledge bases for CMS that can adequately differentiate between types and ascribe relevant expert knowledge to each. CONCLUSIONS We suggest that data science approaches can be used effectively in the clinical domain in a way that does not disrupt preexisting expert classification and that enhances the utility of existing coding systems. Our classification provides a comprehensive view of the individual CMS entities in a manner that supports differential diagnosis and understanding of the range and heterogeneity of the disease but that also enables robust computational coding and hierarchy for machine-readability. 
It can be extended as required in the light of future scientific advances, but already provides the starting point for the creation of FAIR (Findable, Accessible, Interoperable and Reusable) knowledge bases of data on the congenital myasthenic syndromes.
Affiliation(s)
- Rachel Thompson
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
- David Beeson
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU UK
- Emmanuel Maxime
- INSERM US14 - Orphanet, Plateforme Maladies Rares, 75014 Paris, France
- Hanns Lochmüller
- Children’s Hospital of Eastern Ontario (CHEO) Research Institute, University of Ottawa, Ottawa, ON K1H 8L1 Canada
- Department of Neuropediatrics and Muscle Disorders, Medical Center – University of Freiburg, Faculty of Medicine, Freiburg, Germany
- Centro Nacional de Análisis Genómico (CNAG-CRG), Center for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), Barcelona, Spain