1
|
Objectivizing issues in the diagnosis of complex rare diseases: lessons learned from testing existing diagnosis support systems on ciliopathies. BMC Med Inform Decis Mak 2024; 24:134. [PMID: 38789985 PMCID: PMC11127295 DOI: 10.1186/s12911-024-02538-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 05/17/2024] [Indexed: 05/26/2024] Open
Abstract
BACKGROUND There are approximately 8,000 different rare diseases that affect roughly 400 million people worldwide. Many of them suffer from delayed diagnosis. Ciliopathies are rare monogenic disorders characterized by a significant phenotypic and genetic heterogeneity that raises an important challenge for clinical diagnosis. Diagnosis support systems (DSS) applied to electronic health record (EHR) data may help identify undiagnosed patients, which is of paramount importance to improve patients' care. Our objective was to evaluate three online-accessible rare disease DSSs using phenotypes derived from EHRs for the diagnosis of ciliopathies. METHODS Two datasets of ciliopathy cases, either proven or suspected, and two datasets of controls were used to evaluate the DSSs. Patient phenotypes were automatically extracted from their EHRs and converted to Human Phenotype Ontology terms. We tested the ability of the DSSs to diagnose cases in contrast to controls based on Orphanet ontology. RESULTS A total of 79 cases and 38 controls were selected. Performances of the DSSs on ciliopathy real world data (best DSS with area under the ROC curve = 0.72) were not as good as published performances on the test set used in the DSS development phase. None of these systems obtained results which could be described as "expert-level". Patients with multisystemic symptoms were generally easier to diagnose than patients with isolated symptoms. Diseases easily confused with ciliopathy generally affected multiple organs and had overlapping phenotypes. Four challenges need to be considered to improve the performances: to make the DSSs interoperable with EHR systems, to validate the performances in real-life settings, to deal with data quality, and to leverage methods and resources for rare and complex diseases. 
CONCLUSION Our study provides insights into the complexities of diagnosing highly heterogeneous rare diseases and offers lessons derived from evaluating existing DSSs in real-world settings. These insights are not only beneficial for ciliopathy diagnosis but also relevant to improving DSSs for various complex rare disorders, by guiding the development of more clinically relevant rare disease DSSs that could support early diagnosis and ultimately make more patients eligible for treatment.
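The case-versus-control evaluation reported above rests on the area under the ROC curve. The following is a minimal sketch of that metric only, assuming each patient receives a single numeric DSS score for "ciliopathy"; the function name and the scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical DSS scores (illustrative values, not study data).
cases = [0.9, 0.7, 0.4]
controls = [0.8, 0.3, 0.2]
auc = roc_auc(cases, controls)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level separation of cases from controls, which is the baseline against which the reported 0.72 should be read.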
|
2
|
Aspirin-exacerbated respiratory disease is associated with variants in filaggrin, epithelial integrity, and cellular interactions. THE JOURNAL OF ALLERGY AND CLINICAL IMMUNOLOGY. GLOBAL 2024; 3:100205. [PMID: 38317805 PMCID: PMC10838899 DOI: 10.1016/j.jacig.2024.100205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 11/15/2023] [Accepted: 12/01/2023] [Indexed: 02/07/2024]
Abstract
Background Previous studies have determined that up to 6% of patients with aspirin-exacerbated respiratory disease (AERD) have a family history of AERD, indicating a possible link with genetic polymorphisms. However, whole exome sequencing (WES) studies of such associations are currently lacking. Objectives We sought to examine whether WES can identify pathogenic variants associated with AERD. Methods Diagnoses of AERD were confirmed in patients with nasal polyps and asthma. WES was performed using an Illumina sequencing platform. Human Phenotype Ontology terms were used to define the patients' phenotypes. Exomiser was used to annotate, filter, and prioritize possible disease-causing genetic variants. Results Of 39 patients with AERD, 41% reported a family history of asthma and 5% reported a family history of AERD. Pathogenic exome variants in the filaggrin gene (FLG) were found in 2 patients (5%). Other variants not known to be pathogenic were detected in an additional 16 patients (41%) in genes related to epithelial integrity and cellular interactions, including genes encoding desmoglein 3 (DSG3), dynein axonemal heavy chain 9 (DNAH9), collagen type VII alpha 1 chain (COL7A1), collagen type XVII alpha 1 chain (COL17A1), chromodomain helicase DNA binding protein-7 (CHD7), TSC complex subunit 2/tuberous sclerosis-2 protein (TSC2), P-selectin (SELP), and platelet-derived growth factor receptor-alpha (PDGFRA). Conclusion WES identified a monogenic susceptibility to AERD in 5% of patients with FLG pathogenic variants. Other variants not previously identified as pathogenic were found in genes relevant to epithelial integrity and cellular interactions and may further reveal genetic factors that contribute to this condition.
|
3
|
Improving reporting standards for phenotyping algorithm in biomedical research: 5 fundamental dimensions. J Am Med Inform Assoc 2024; 31:1036-1041. [PMID: 38269642 PMCID: PMC10990558 DOI: 10.1093/jamia/ocae005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/12/2023] [Accepted: 01/08/2024] [Indexed: 01/26/2024] Open
Abstract
INTRODUCTION Phenotyping algorithms enable the interpretation of complex health data and definition of clinically relevant phenotypes; they have become crucial in biomedical research. However, the lack of standardization and transparency inhibits the cross-comparison of findings among different studies, limits large scale meta-analyses, confuses the research community, and prevents the reuse of algorithms, which results in duplication of efforts and the waste of valuable resources. RECOMMENDATIONS Here, we propose five independent fundamental dimensions of phenotyping algorithms (complexity, performance, efficiency, implementability, and maintenance) through which researchers can describe, measure, and deploy any algorithm efficiently and effectively. These dimensions must be considered in the context of explicit use cases and transparent methods to ensure that they do not reflect unexpected biases or exacerbate inequities.
|
4
|
Improving prenatal diagnosis through standards and aggregation. Prenat Diagn 2024; 44:454-464. [PMID: 38242839 PMCID: PMC11006584 DOI: 10.1002/pd.6522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 12/17/2023] [Accepted: 12/22/2023] [Indexed: 01/21/2024]
Abstract
Advances in sequencing and imaging technologies enable enhanced assessment in the prenatal space, with a goal to diagnose and predict the natural history of disease, to direct targeted therapies, and to implement clinical management, including transfer of care, election of supportive care, and selection of surgical interventions. The current lack of standardization and aggregation stymies variant interpretation and gene discovery, which hinders the provision of prenatal precision medicine, leaving clinicians and patients without an accurate diagnosis. With large amounts of data generated, it is imperative to establish standards for data collection, processing, and aggregation. Aggregated and homogeneously processed genetic and phenotypic data permits dissection of the genomic architecture of prenatal presentations of disease and provides a dataset on which data analysis algorithms can be tuned to the prenatal space. Here we discuss the importance of generating aggregate data sets and how the prenatal space is driving the development of interoperable standards and phenotype-driven tools.
|
5
|
Multi-gene panel sequencing in highly consanguineous families and patients with congenital forms of skeletal dysplasias. Clin Genet 2024. [PMID: 38378010 DOI: 10.1111/cge.14509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 02/05/2024] [Accepted: 02/07/2024] [Indexed: 02/22/2024]
Abstract
Skeletal dysplasias (SKDs) are a heterogeneous group of more than 750 genetic disorders characterized by abnormal development, growth, and maintenance of bones or cartilage in the human skeleton. SKDs are often caused by variants in early patterning genes; in many cases they are part of multiple malformation syndromes and occur in combination with non-skeletal phenotypes. The aim of this study was to investigate the underlying genetic cause of congenital SKDs in highly consanguineous Pakistani families, as well as in sporadic and familial SKD cases from India, using multigene panel sequencing analysis. We therefore performed panel sequencing of 386 bone-related genes in 7 highly consanguineous families from Pakistan and 27 cases from India affected with SKDs. In the highly consanguineous families, we were able to identify the underlying genetic cause in five out of seven families, resulting in a diagnostic yield of 71%. In the sporadic and familial SKD cases, we identified 12 causative variants, corresponding to a diagnostic yield of 44%. The genetic heterogeneity in our cohorts was very high and we were able to detect various types of variants, including missense, nonsense, and frameshift variants, across multiple genes known to cause different types of SKDs. In conclusion, panel sequencing proved to be a highly effective way to decipher the genetic basis of SKDs in highly consanguineous families as well as sporadic and/or familial cases from South Asia. Furthermore, our findings expand the allelic spectrum of skeletal dysplasias.
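The two diagnostic yields quoted above follow directly from the reported counts. A short sketch of the arithmetic, assuming yield is simply solved cases divided by cases tested (the helper name is hypothetical):

```python
def diagnostic_yield(solved: int, tested: int) -> float:
    """Percentage of tested families or cases with an identified genetic cause."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * solved / tested


# Counts as reported in the abstract.
consanguineous_yield = diagnostic_yield(5, 7)    # five of seven families
sporadic_familial_yield = diagnostic_yield(12, 27)  # 12 of 27 cases
print(round(consanguineous_yield), round(sporadic_familial_yield))
```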
|
6
|
Refined preferences of prioritizers improve intelligent diagnosis for Mendelian diseases. Sci Rep 2024; 14:2845. [PMID: 38310124 PMCID: PMC10838329 DOI: 10.1038/s41598-024-53461-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 01/31/2024] [Indexed: 02/05/2024] Open
Abstract
Phenotype-guided gene prioritizers have proved a highly efficient approach to identifying causal genes for Mendelian diseases. In our previous study, we preliminarily evaluated the performance of ten prioritizers. However, all the selected software was run with default settings and in singleton mode. With a large-scale family dataset from the Deciphering Developmental Disorders (DDD) project (N = 305) and an in-house trio cohort (N = 152), the four optimal performers in our prior study (Exomiser, PhenIX, AMELIE, and LIRICAL) were further assessed through parameter optimization and/or the utilization of trio mode. The in-depth assessment revealed high diagnostic yields of the four prioritizers with refined preferences, each alone or together: (1) 83.3-91.8% of the causal genes were presented among the first ten candidates in the final ranking lists of the four tools; (2) over 97.7% of the causal genes were successfully captured within the top 50 by any of the four tools. Exomiser did best in directly hitting the target (ranking the causal gene at the very top) while LIRICAL displayed a predominant overall detection capability. In addition, cases affected by low-penetrance and high-frequency pathogenic variants were found to be misjudged during the automated prioritization process. The discovery of these limitations sheds light on specific directions for future enhancement of causal-gene ranking tools.
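The "top ten" and "top 50" figures above are top-k hit rates. A minimal sketch of how such a rate can be computed from ranked gene lists, assuming one known causal gene per case; the case identifiers and gene names are placeholders, not data from the study.

```python
from typing import Dict, List


def top_k_rate(rankings: Dict[str, List[str]], causal: Dict[str, str], k: int) -> float:
    """Fraction of cases whose causal gene appears in the first k ranked candidates."""
    if not causal:
        raise ValueError("no cases supplied")
    hits = 0
    for case_id, gene in causal.items():
        ranked = rankings.get(case_id, [])
        if gene in ranked[:k]:
            hits += 1
    return hits / len(causal)


# Placeholder ranked outputs of a prioritizer (best candidate first).
rankings = {
    "case1": ["GENE_A", "GENE_B", "GENE_C"],
    "case2": ["GENE_D", "GENE_E"],
    "case3": ["GENE_F"],
}
# Placeholder ground truth; case3's causal gene was never ranked.
causal = {"case1": "GENE_A", "case2": "GENE_E", "case3": "GENE_Z"}

top1 = top_k_rate(rankings, causal, 1)
top10 = top_k_rate(rankings, causal, 10)
```

Reporting several values of k, as the abstract does, separates tools that rank the causal gene first from tools that reliably keep it somewhere in a short candidate list.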
|
7
|
RDmaster: A novel phenotype-oriented dialogue system supporting differential diagnosis of rare disease. Comput Biol Med 2024; 169:107924. [PMID: 38181610 DOI: 10.1016/j.compbiomed.2024.107924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 12/18/2023] [Accepted: 01/01/2024] [Indexed: 01/07/2024]
Abstract
BACKGROUND Clinicians often lack the necessary expertise to differentially diagnose multiple underlying rare diseases (RDs) due to their complex and overlapping clinical features, leading to misdiagnoses and delayed treatments. The aim of this study is to develop a novel electronic differential diagnostic support system for RDs. METHOD By integrating two Bayesian diagnostic methods, a candidate list with enhanced clinical interpretability was generated for the subsequent Q&A-based differential diagnosis (DDX). To achieve an efficient Q&A dialogue strategy, we introduce a novel metric named the adaptive information gain and Gini index (AIGGI) to evaluate the expected gain of interrogated phenotypes within real-time diagnostic states. RESULTS This DDX tool called RDmaster has been implemented as a web-based platform (http://rdmaster.nbscn.org/). A diagnostic trial involving 238 published RD patients revealed that RDmaster outperformed existing RD diagnostic tools, as well as ChatGPT, and was shown to enhance the diagnostic accuracy through its Q&A system. CONCLUSIONS The RDmaster offers an effective multi-omics differential diagnostic technique and outperforms existing tools and popular large language models, particularly enhancing differential diagnosis in collecting diagnostically beneficial phenotypes.
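The abstract does not define AIGGI, so the sketch below does not attempt to reproduce it. It illustrates only the plain expected information gain of asking whether one phenotype is present, given a current probability distribution over candidate diseases. All names and probabilities are assumptions made for illustration.

```python
import math
from typing import Dict


def entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def expected_information_gain(prior: Dict[str, float], p_present: Dict[str, float]) -> float:
    """Expected entropy reduction from asking a yes/no phenotype question.

    prior:     current P(disease) over the candidate list (sums to 1)
    p_present: assumed P(phenotype present | disease) for each candidate
    """
    gain = entropy(prior)
    for answer_is_yes in (True, False):
        likelihood = {
            d: (p_present[d] if answer_is_yes else 1.0 - p_present[d]) for d in prior
        }
        p_answer = sum(prior[d] * likelihood[d] for d in prior)
        if p_answer > 0:
            posterior = {d: prior[d] * likelihood[d] / p_answer for d in prior}
            gain -= p_answer * entropy(posterior)
    return gain


# Two equally likely hypothetical diseases.
prior = {"disease_A": 0.5, "disease_B": 0.5}
# A phenotype seen only in disease_A separates them completely (1 bit).
discriminating = expected_information_gain(prior, {"disease_A": 1.0, "disease_B": 0.0})
# A phenotype equally common in both tells us nothing (0 bits).
uninformative = expected_information_gain(prior, {"disease_A": 0.5, "disease_B": 0.5})
```

A dialogue strategy of this kind would ask next about the phenotype with the largest expected gain; how RDmaster adapts that idea and combines it with a Gini index is described in the paper, not here.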
|
8
|
Using multi-scale genomics to associate poorly annotated genes with rare diseases. Genome Med 2024; 16:4. [PMID: 38178268 PMCID: PMC10765705 DOI: 10.1186/s13073-023-01276-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Accepted: 12/15/2023] [Indexed: 01/06/2024] Open
Abstract
BACKGROUND Next-generation sequencing (NGS) has significantly transformed the landscape of identifying disease-causing genes associated with genetic disorders. However, a substantial portion of sequenced patients remains undiagnosed. This may be attributed not only to the challenges posed by harder-to-detect variants, such as non-coding and structural variations but also to the existence of variants in genes not previously associated with the patient's clinical phenotype. This study introduces EvORanker, an algorithm that integrates unbiased data from 1,028 eukaryotic genomes to link mutated genes to clinical phenotypes. METHODS EvORanker utilizes clinical data, multi-scale phylogenetic profiling, and other omics data to prioritize disease-associated genes. It was evaluated on solved exomes and simulated genomes, compared with existing methods, and applied to 6260 knockout genes with mouse phenotypes lacking human associations. Additionally, EvORanker was made accessible as a user-friendly web tool. RESULTS In the analyzed exomic cohort, EvORanker accurately identified the "true" disease gene as the top candidate in 69% of cases and within the top 5 candidates in 95% of cases, consistent with results from the simulated dataset. Notably, EvORanker outperformed existing methods, particularly for poorly annotated genes. In the case of the 6260 knockout genes with mouse phenotypes, EvORanker linked 41% of these genes to observed human disease phenotypes. Furthermore, in two unsolved cases, EvORanker successfully identified DLGAP2 and LPCAT3 as disease candidates for previously uncharacterized genetic syndromes. CONCLUSIONS We highlight clade-based phylogenetic profiling as a powerful systematic approach for prioritizing potential disease genes. Our study showcases the efficacy of EvORanker in associating poorly annotated genes to disease phenotypes observed in patients. The EvORanker server is freely available at https://ccanavati.shinyapps.io/EvORanker/ .
|
9
|
Development and clinical validation of real-time artificial intelligence diagnostic companion for fetal ultrasound examination. ULTRASOUND IN OBSTETRICS & GYNECOLOGY : THE OFFICIAL JOURNAL OF THE INTERNATIONAL SOCIETY OF ULTRASOUND IN OBSTETRICS AND GYNECOLOGY 2023; 62:353-360. [PMID: 37161503 DOI: 10.1002/uog.26242] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/28/2022] [Revised: 02/13/2023] [Accepted: 03/20/2023] [Indexed: 05/11/2023]
Abstract
OBJECTIVE Prenatal diagnosis of a rare disease on ultrasound relies on a physician's ability to remember an intractable amount of knowledge. We developed a real-time decision support system (DSS) that suggests, at each step of the examination, the next phenotypic feature to assess, optimizing the diagnostic pathway to the smallest number of possible diagnoses. The objective of this study was to evaluate the performance of this real-time DSS using clinical data. METHODS This validation study was conducted on a database of 549 perinatal phenotypes collected from two referral centers (one in France and one in the UK). Inclusion criteria were: at least one anomaly was visible on fetal ultrasound after 11 weeks' gestation; the anomaly was confirmed postnatally; an associated rare disease was confirmed or ruled out based on postnatal/postmortem investigation, including physical examination, genetic testing and imaging; and, when confirmed, the syndrome was known by the DSS software. The cases were assessed retrospectively by the software, using either the full phenotype as a single input, or a stepwise input of phenotypic features, as prompted by the software, mimicking its use in a real-life clinical setting. Adjudication of discordant cases, in which there was disagreement between the DSS output and the postnatally confirmed ('ascertained') diagnosis, was performed by a panel of external experts. The proportion of ascertained diagnoses within the software's top-10 differential diagnoses output was evaluated, as well as the sensitivity and specificity of the software to select correctly as its best guess a syndromic or isolated condition. RESULTS The dataset covered 110/408 (27%) diagnoses within the software's database, yielding a cumulative prevalence of 83%. For syndromic cases, the ascertained diagnosis was within the top-10 list in 93% and 83% of cases using the full-phenotype and stepwise input, respectively, after adjudication. 
The full-phenotype and stepwise approaches were associated, respectively, with a specificity of 94% and 96% and a sensitivity of 99% and 84%. The stepwise approach required an average of 13 queries to reach the final set of diagnoses. CONCLUSIONS The DSS showed high performance when applied to real-world data. This validation study suggests that such software can improve perinatal care, efficiently providing complex and otherwise overlooked knowledge to care-providers involved in ultrasound-based prenatal diagnosis. © 2023 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.
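The sensitivity and specificity figures above concern the software's best guess of syndromic versus isolated condition. A minimal sketch of those two measures, assuming "syndromic" is treated as the positive class and that both classes occur in the data; the labels below are invented for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity), with True meaning 'syndromic'.

    Assumes at least one truly syndromic and one truly isolated case.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # syndromic, called syndromic
    fn = sum(1 for t, p in pairs if t and not p)      # syndromic, called isolated
    tn = sum(1 for t, p in pairs if not t and not p)  # isolated, called isolated
    fp = sum(1 for t, p in pairs if not t and p)      # isolated, called syndromic
    return tp / (tp + fn), tn / (tn + fp)


# Invented labels: three syndromic and two isolated cases.
truth = [True, True, True, False, False]
predicted = [True, True, False, False, True]
sens, spec = sensitivity_specificity(truth, predicted)
```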
|
10
|
PhenoScore quantifies phenotypic variation for rare genetic diseases by combining facial analysis with other clinical features using a machine-learning framework. Nat Genet 2023; 55:1598-1607. [PMID: 37550531 DOI: 10.1038/s41588-023-01469-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/06/2022] [Accepted: 07/05/2023] [Indexed: 08/09/2023]
Abstract
Several molecular and phenotypic algorithms exist that establish genotype-phenotype correlations, including facial recognition tools. However, no unified framework that investigates both facial data and other phenotypic data directly from individuals exists. We developed PhenoScore: an open-source, artificial intelligence-based phenomics framework, combining facial recognition technology with Human Phenotype Ontology data analysis to quantify phenotypic similarity. Here we show PhenoScore's ability to recognize distinct phenotypic entities by establishing recognizable phenotypes for 37 of 40 investigated syndromes against clinical features observed in individuals with other neurodevelopmental disorders and show it is an improvement on existing approaches. PhenoScore provides predictions for individuals with variants of unknown significance and enables sophisticated genotype-phenotype studies by testing hypotheses on possible phenotypic (sub)groups. PhenoScore confirmed previously known phenotypic subgroups caused by variants in the same gene for SATB1, SETBP1 and DEAF1 and provides objective clinical evidence for two distinct ADNP-related phenotypes, already established functionally.
|
11
|
Individualised human phenotype ontology gene panels improve clinical whole exome and genome sequencing analytical efficacy in a cohort of developmental and epileptic encephalopathies. Mol Genet Genomic Med 2023; 11:e2167. [PMID: 36967109 PMCID: PMC10337286 DOI: 10.1002/mgg3.2167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 02/21/2023] [Accepted: 03/01/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND The majority of genetic epilepsies remain unsolved in terms of specific genotype. Phenotype-based genomic analyses have shown potential to strengthen genomic analysis in various ways, including improving analytical efficacy. METHODS We have tested a standardised phenotyping method termed 'Phenomodels' for integrating deep-phenotyping information with our in-house developed clinical whole exome/genome sequencing analytical pipeline. Phenomodels includes a user-friendly epilepsy phenotyping template and an objective measure for selecting which template terms to include in individualised Human Phenotype Ontology (HPO) gene panels. In a pilot study of 38 previously solved cases of developmental and epileptic encephalopathies, we compared the sensitivity and specificity of the individualised HPO gene panels with the clinical epilepsy gene panel. RESULTS The Phenomodels template showed high sensitivity for capturing relevant phenotypic information, where 37/38 individuals' HPO gene panels included the causative gene. The HPO gene panels also had far fewer variants to assess than the epilepsy gene panel. CONCLUSION We have demonstrated a viable approach for incorporating standardised phenotype information into clinical genomic analyses, which may enable more efficient analysis.
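An individualised HPO gene panel can be thought of as the union of genes annotated to the HPO terms selected for a patient. The sketch below illustrates that idea and the "causative gene included" check behind the 37/38 figure. It is not the Phenomodels implementation: the term-to-gene mapping uses placeholder gene names, and the selection rule for terms is left out.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_panel(selected_terms: Iterable[str], term_to_genes: Dict[str, Set[str]]) -> Set[str]:
    """Union of the genes annotated to each selected HPO term."""
    panel: Set[str] = set()
    for term in selected_terms:
        panel |= term_to_genes.get(term, set())
    return panel


def panel_sensitivity(cases: List[Tuple[List[str], str]], term_to_genes: Dict[str, Set[str]]) -> float:
    """Fraction of solved cases whose causative gene falls inside their own panel."""
    hits = sum(
        1 for terms, causative in cases if causative in build_panel(terms, term_to_genes)
    )
    return hits / len(cases)


# Toy annotation table with placeholder genes (not real HPO annotations).
# HP:0001250 is "Seizure"; HP:0001263 is "Global developmental delay".
TERM_TO_GENES = {
    "HP:0001250": {"GENE_A", "GENE_B"},
    "HP:0001263": {"GENE_B", "GENE_C"},
}
# (selected HPO terms, known causative gene) for three invented cases.
CASES = [
    (["HP:0001250"], "GENE_A"),
    (["HP:0001250", "HP:0001263"], "GENE_C"),
    (["HP:0001263"], "GENE_A"),  # causative gene is outside this panel
]
sensitivity = panel_sensitivity(CASES, TERM_TO_GENES)
```

Because each panel contains only genes linked to the patient's own terms, it is typically much smaller than a fixed disease panel, which is consistent with the reduced number of variants to assess reported above.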
|
12
|
Using knowledge graphs to infer gene expression in plants. Front Artif Intell 2023; 6:1201002. [PMID: 37384147 PMCID: PMC10298150 DOI: 10.3389/frai.2023.1201002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Accepted: 05/23/2023] [Indexed: 06/30/2023] Open
Abstract
Introduction Climate change is already affecting ecosystems around the world and forcing us to adapt to meet societal needs. The speed with which climate change is progressing necessitates a massive scaling up of the number of species with understood genotype-environment-phenotype (G×E×P) dynamics in order to increase ecosystem and agriculture resilience. An important part of predicting phenotype is understanding the complex gene regulatory networks present in organisms. Previous work has demonstrated that knowledge about one species can be applied to another using ontologically-supported knowledge bases that exploit homologous structures and homologous genes. These types of structures that can apply knowledge about one species to another have the potential to enable the massive scaling up that is needed through in silico experimentation. Methods We developed one such structure, a knowledge graph (KG) using information from Planteome and the EMBL-EBI Expression Atlas that connects gene expression, molecular interactions, functions, and pathways to homology-based gene annotations. Our preliminary analysis uses data from gene expression studies in Arabidopsis thaliana and Populus trichocarpa plants exposed to drought conditions. Results A graph query identified 16 pairs of homologous genes in these two taxa, some of which show opposite patterns of gene expression in response to drought. As expected, analysis of the upstream cis-regulatory region of these genes revealed that homologs with similar expression behavior had conserved cis-regulatory regions and potential interaction with similar trans-elements, unlike homologs that changed their expression in opposite ways. Discussion This suggests that even though the homologous pairs share common ancestry and functional roles, predicting expression and phenotype through homology inference needs careful consideration of integrating cis and trans-regulatory components in the curated and inferred knowledge graph.
|
13
|
Buffering of genetic dominance by allele-specific protein complex assembly. SCIENCE ADVANCES 2023; 9:eadf9845. [PMID: 37256959 PMCID: PMC10413657 DOI: 10.1126/sciadv.adf9845] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/02/2023] [Accepted: 04/24/2023] [Indexed: 06/02/2023]
Abstract
Protein complex assembly often occurs while subunits are being translated, resulting in complexes whose subunits were translated from the same mRNA in an allele-specific manner. It has thus been hypothesized that such cotranslational assembly may counter the assembly-mediated dominant-negative effect, whereby co-assembly of mutant and wild-type subunits "poisons" complex activity. Here, we show that cotranslationally assembling subunits are much less likely to be associated with autosomal dominant relative to recessive disorders, and that subunits with dominant-negative disease mutations are significantly depleted in cotranslational assembly compared to those associated with loss-of-function mutations. We also find that complexes with known dominant-negative effects tend to expose their interfaces late during translation, lessening the likelihood of cotranslational assembly. Finally, by combining complex properties with other features, we trained a computational model for predicting proteins likely to be associated with non-loss-of-function disease mechanisms, which we believe will be of considerable utility for protein variant interpretation.
|
14
|
Democratising or disrupting diagnosis? Ethical issues raised by the use of AI tools for rare disease diagnosis. SSM. QUALITATIVE RESEARCH IN HEALTH 2023; 3:100240. [PMID: 37426704 PMCID: PMC10323712 DOI: 10.1016/j.ssmqr.2023.100240] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/12/2022] [Revised: 02/13/2023] [Accepted: 02/13/2023] [Indexed: 07/11/2023]
Abstract
Computational phenotyping (CP) technology uses facial recognition algorithms to classify and potentially diagnose rare genetic disorders on the basis of digitised facial images. This AI technology has a number of research as well as clinical applications, such as supporting diagnostic decision-making. Using the example of CP, we examine stakeholders' views of the benefits and costs of using AI as a diagnostic tool within the clinic. Through a series of in-depth interviews (n = 20) with clinicians, clinical researchers, data scientists, industry and support group representatives, we report stakeholder views regarding the adoption of this technology in a clinical setting. While most interviewees were supportive of employing CP as a diagnostic tool in some capacity, we observed ambivalence around the potential for artificial intelligence to overcome diagnostic uncertainty in a clinical context. Thus, while there was widespread agreement amongst interviewees concerning the public benefits of AI-assisted diagnosis, namely, its potential to increase diagnostic yield and enable faster, more objective and accurate diagnoses by upskilling non-specialists and thereby enabling access to diagnosis that is potentially lacking, interviewees also raised concerns about ensuring algorithmic reliability, expunging algorithmic bias, and the possibility that the use of AI could deskill the specialist clinical workforce. We conclude that, prior to widespread clinical implementation, on-going reflection is needed regarding the trade-offs required to determine acceptable levels of bias, and that diagnostic AI tools should only be employed as an assistive technology within the dysmorphology clinic.
|
15
|
A Robust Phenotype-driven Likelihood Ratio Analysis Approach Assisting Interpretable Clinical Diagnosis of Rare Diseases. J Biomed Inform 2023; 142:104372. [PMID: 37105510 DOI: 10.1016/j.jbi.2023.104372] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/12/2022] [Revised: 02/20/2023] [Accepted: 04/20/2023] [Indexed: 04/29/2023]
Abstract
Phenotype-based prioritization of candidate genes and diseases has become a well-established approach for multi-omics diagnostics of rare diseases. Most current algorithms exploit semantic analysis and probabilistic statistics based on Human Phenotype Ontology and are commonly superior to naive search methods. However, these algorithms are mostly less interpretable and do not perform well in real clinical scenarios due to noise and imprecision of query terms, and the fact that individuals may not display all phenotypes of the disease they have. We present a Phenotype-driven Likelihood Ratio analysis approach (PheLR) assisting interpretable clinical diagnosis of rare diseases. With a likelihood ratio paradigm, PheLR estimates the posterior probability of candidate diseases and how much a phenotypic feature contributes to the prioritization result. Benchmarked using simulated and realistic patients, PheLR shows significant advantages over current approaches and is robust to noise and inaccuracy. To facilitate clinical practice and visualized differential diagnosis, PheLR is implemented as an online web tool (http://phelr.nbscn.org).
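The likelihood ratio paradigm mentioned above can be illustrated with the textbook relation posterior odds = prior odds × product of per-feature likelihood ratios. The sketch below shows only that general relation, assuming conditionally independent features; it is not PheLR's estimator, and the prior and ratios are invented.

```python
import math
from typing import Dict, Tuple


def posterior_from_lrs(prior: float, lrs: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Combine a pretest probability with per-phenotype likelihood ratios.

    Returns the posterior probability and each feature's log10(LR), which
    indicates how strongly that feature pushed the result up (>0) or down (<0).
    Assumes features are conditionally independent given the disease.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * math.prod(lrs.values())
    posterior = posterior_odds / (1.0 + posterior_odds)
    contributions = {feature: math.log10(lr) for feature, lr in lrs.items()}
    return posterior, contributions


# Invented example: 1% pretest probability, two supporting phenotypes.
posterior, contributions = posterior_from_lrs(
    0.01, {"phenotype_1": 10.0, "phenotype_2": 9.9}
)
```

Reporting the per-feature log ratios alongside the posterior is what makes this style of analysis interpretable: a clinician can see which observed phenotype drove a candidate disease up or down the list.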
|
16
|
High-Resolution and Multidimensional Phenotypes Can Complement Genomics Data to Diagnose Diseases in the Neonatal Population. PHENOMICS (CHAM, SWITZERLAND) 2023; 3:204-215. [PMID: 37197647 PMCID: PMC10110825 DOI: 10.1007/s43657-022-00071-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 08/03/2022] [Accepted: 08/08/2022] [Indexed: 05/19/2023]
Abstract
Advances in genomic medicine have greatly improved our understanding of human diseases. However, the phenome is not well understood. High-resolution and multidimensional phenotypes have shed light on the mechanisms underlying neonatal diseases in greater detail and have the potential to optimize clinical strategies. In this review, we first highlight the value of analyzing traditional phenotypes using a data science approach in the neonatal population. We then discuss recent research on high-resolution, multidimensional, and structured phenotypes in neonatal critical diseases. Finally, we briefly introduce current technologies available for the analysis of multidimensional data and the value that can be provided by integrating these data into clinical practice. In summary, a time series of multidimensional phenome data can improve our understanding of disease mechanisms and diagnostic decision-making, stratify patients, and provide clinicians with optimized strategies for therapeutic intervention; however, the available technologies for collecting multidimensional data and the best platform for connecting multiple modalities should be considered.
|
17
|
Abstract
Developing personalized diagnostic strategies and targeted treatments requires a deep understanding of disease biology and the ability to dissect the relationship between molecular and genetic factors and their phenotypic consequences. However, such knowledge is fragmented across publications, non-standardized repositories, and evolving ontologies describing various scales of biological organization between genotypes and clinical phenotypes. Here, we present PrimeKG, a multimodal knowledge graph for precision medicine analyses. PrimeKG integrates 20 high-quality resources to describe 17,080 diseases with 4,050,249 relationships representing ten major biological scales, including disease-associated protein perturbations, biological processes and pathways, anatomical and phenotypic scales, and the entire range of approved drugs with their therapeutic action, considerably expanding previous efforts in disease-rooted knowledge graphs. PrimeKG contains an abundance of 'indications', 'contraindications', and 'off-label use' drug-disease edges that are lacking in other knowledge graphs and can support AI analyses of how drugs affect disease-associated networks. We supplement PrimeKG's graph structure with language descriptions of clinical guidelines to enable multimodal analyses and provide instructions for continual updates of PrimeKG as new data become available.
|
18
|
Computational model for disease research. Brief Bioinform 2023; 24:6987819. [PMID: 36642407 DOI: 10.1093/bib/bbac615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/17/2023] Open
|
19
|
Collaborative Platforms and Matchmaking Algorithms for Research and Education, Establishment, and Optimization of Consortia. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2023; 1424:125-133. [PMID: 37486486 DOI: 10.1007/978-3-031-31982-2_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2023]
Abstract
Matchmaking plays an important role in the rational allocation of resources in several fields, ranging from market operation to people's daily lives. Matchmakers have evolved through artificial intelligence technologies and are being introduced in numerous areas of industry, research, and academia to solve decision problems, design research innovation, and build robust and efficient networks. The goal of this report is to describe the collaborative platforms and matchmaking algorithms for research and education, as well as the establishment and optimization of consortia.
|
20
|
PhenoExam: gene set analyses through integration of different phenotype databases. BMC Bioinformatics 2022; 23:567. [PMID: 36587217 PMCID: PMC9805686 DOI: 10.1186/s12859-022-05122-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 12/22/2022] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Gene set enrichment analysis (detecting phenotypic terms that emerge as significant in a set of genes) plays an important role in bioinformatics focused on diseases of genetic basis. To facilitate phenotype-oriented gene set analysis, we developed PhenoExam, a freely available R package for tool developers and a web interface for users, which performs: (1) phenotype and disease enrichment analysis on a gene set; (2) measures statistically significant phenotype similarities between gene sets and (3) detects significant differential phenotypes or disease terms across different databases. RESULTS PhenoExam generates sensitive and accurate phenotype enrichment analyses. It is also effective in segregating gene sets or Mendelian diseases with very similar phenotypes. We tested the tool with two similar diseases (Parkinson and dystonia), to show phenotype-level similarities but also potentially interesting differences. Moreover, we used PhenoExam to validate computationally predicted new genes potentially associated with epilepsy. CONCLUSIONS We developed PhenoExam, a freely available R package and Web application, which performs phenotype enrichment and disease enrichment analysis on gene set G, measures statistically significant phenotype similarities between pairs of gene sets G and G' and detects statistically significant exclusive phenotypes or disease terms, across different databases. We proved with simulations and real cases that it is useful to distinguish between gene sets or diseases with very similar phenotypes. Github R package URL is https://github.com/alexcis95/PhenoExam . Shiny App URL is https://alejandrocisterna.shinyapps.io/phenoexamweb/ .
|
21
|
Phenotype-aware prioritisation of rare Mendelian disease variants. Trends Genet 2022; 38:1271-1283. [PMID: 35934592 PMCID: PMC9950798 DOI: 10.1016/j.tig.2022.07.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/31/2022] [Revised: 06/06/2022] [Accepted: 07/05/2022] [Indexed: 01/24/2023]
Abstract
A molecular diagnosis from the analysis of sequencing data in rare Mendelian diseases has a huge impact on the management of patients and their families. Numerous patient phenotype-aware variant prioritisation (VP) tools have been developed to help automate this process and shorten the diagnostic odyssey, but performance statistics on real patient data are limited. Here we identify, assess, and compare the performance of all up-to-date, freely available, and programmatically accessible tools using a whole-exome, retinal disease dataset from 134 individuals with a molecular diagnosis. All tools were able to identify around two-thirds of the genetic diagnoses as the top-ranked candidate, with LIRICAL performing best overall. Finally, we discuss the challenges that must be overcome, given that most cases remain undiagnosed after current state-of-the-art practices.
|
22
|
Wiedemann-Steiner Syndrome: Case Report and Review of Literature. CHILDREN (BASEL, SWITZERLAND) 2022; 9:children9101545. [PMID: 36291481 PMCID: PMC9600770 DOI: 10.3390/children9101545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2022] [Revised: 10/04/2022] [Accepted: 10/08/2022] [Indexed: 11/07/2022]
Abstract
Wiedemann–Steiner syndrome (WDSTS) is an autosomal dominant disorder with a broad and variable phenotypic spectrum characterized by intellectual disability, prenatal and postnatal growth retardation, hypertrichosis, characteristic facial features, behavioral problems, and congenital anomalies involving different systems. Here, we report a five-year-old boy who was diagnosed with WDSTS based on the results of Trio-based whole-exome sequencing and an assessment of his clinical features. He had intellectual disability, short stature, hirsutism, and atypical facial features, including a low hairline, down-slanting palpebral fissures, hypertelorism, long eyelashes, broad and arching eyebrows, synophrys, a bulbous nose, a broad nasal tip, and dental/oral anomalies. However, not all individuals with WDSTS exhibit the classic phenotype, so the spectrum of the disorder can vary widely from relatively atypical facial features to multiple systemic symptoms. Here, we summarize the clinical and molecular spectrum, diagnosis and differential diagnosis, long-term management, and care planning of WDSTS to improve the awareness of both pediatricians and clinical geneticists and to promote the diagnosis and treatment of the disease.
|
23
|
Early-Onset Osteoporosis: Rare Monogenic Forms Elucidate the Complexity of Disease Pathogenesis Beyond Type I Collagen. J Bone Miner Res 2022; 37:1623-1641. [PMID: 35949115 PMCID: PMC9542053 DOI: 10.1002/jbmr.4668] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/12/2022] [Revised: 07/22/2022] [Accepted: 08/01/2022] [Indexed: 12/05/2022]
Abstract
Early-onset osteoporosis (EOOP), characterized by low bone mineral density (BMD) and fractures, affects children, premenopausal women and men aged <50 years. EOOP may be secondary to a chronic illness, long-term medication, nutritional deficiencies, etc. If no such cause is identified, EOOP is regarded primary and may then be related to rare variants in genes playing a pivotal role in bone homeostasis. If the cause remains unknown, EOOP is considered idiopathic. The scope of this review is to guide through clinical and genetic diagnostics of EOOP, summarize the present knowledge on rare monogenic forms of EOOP, and describe how analysis of bone biopsy samples can lead to a better understanding of the disease pathogenesis. The diagnostic pathway of EOOP is often complicated and extensive assessments may be needed to reliably exclude secondary causes. Due to the genetic heterogeneity and overlapping features in the various genetic forms of EOOP and other bone fragility disorders, the genetic diagnosis usually requires the use of next-generation sequencing to investigate several genes simultaneously. Recent discoveries have elucidated the complexity of disease pathogenesis both regarding genetic architecture and bone tissue-level pathology. Two rare monogenic forms of EOOP are due to defects in genes partaking in the canonical WNT pathway: LRP5 and WNT1. Variants in the genes encoding plastin-3 (PLS3) and sphingomyelin synthase 2 (SGMS2) have also been found in children and young adults with skeletal fragility. The molecular mechanisms leading from gene defects to clinical manifestations are often not fully understood. Detailed analysis of patient-derived transiliac bone biopsies gives valuable information to understand disease pathogenesis, distinguishes EOOP from other bone fragility disorders, and guides in patient management, but is not widely available in clinical settings. 
Despite the great advances in this field, EOOP remains an insufficiently explored entity and further research is needed to optimize diagnostic and therapeutic approaches. © 2022 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).
|
24
|
Phenotype-driven approaches to enhance variant prioritization and diagnosis of rare disease. Hum Mutat 2022; 43:1071-1081. [PMID: 35391505 PMCID: PMC9288531 DOI: 10.1002/humu.24380] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 10/13/2021] [Revised: 01/25/2022] [Accepted: 04/03/2022] [Indexed: 11/20/2022]
Abstract
Rare disease diagnostics and disease gene discovery have been revolutionized by whole-exome and genome sequencing but identifying the causative variant(s) from the millions in each individual remains challenging. The use of deep phenotyping of patients and reference genotype-phenotype knowledge, alongside variant data such as allele frequency, segregation, and predicted pathogenicity, has proved an effective strategy to tackle this issue. Here we review the numerous tools that have been developed to automate this approach and demonstrate the power of such an approach on several thousand diagnosed cases from the 100,000 Genomes Project. Finally, we discuss the challenges that need to be overcome if we are going to improve detection rates and help the majority of patients that still remain without a molecular diagnosis after state-of-the-art genomic interpretation.
|
25
|
Clinical Spectrum of Hereditary Hypophosphatemic Rickets With Hypercalciuria (HHRH). J Bone Miner Res 2022; 37:1580-1591. [PMID: 35689455 DOI: 10.1002/jbmr.4630] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/30/2021] [Revised: 05/19/2022] [Accepted: 06/04/2022] [Indexed: 11/11/2022]
Abstract
Hereditary hypophosphatemic rickets with hypercalciuria (HHRH) represents an FGF23-independent disease caused by biallelic variants in the solute carrier family 34 member 3 (SLC34A3) gene. HHRH is characterized by chronic hypophosphatemia and an increased risk for nephrocalcinosis and rickets/osteomalacia, muscular weakness, and secondary limb deformity. Biochemical changes, but no relevant skeletal changes, have been reported for heterozygous SLC34A3 carriers. Therefore, we assessed the characteristics of individuals with biallelic and monoallelic SLC34A3 variants. In 8 index patients and 5 family members, genetic analysis was performed using a custom gene panel. The skeletal assessment comprised biochemical parameters, areal bone mineral density (aBMD), and bone microarchitecture. Pathogenic SLC34A3 variants were revealed in 7 of 13 individuals (2 homozygous, 5 heterozygous), whereas 3 of 13 carried monoallelic variants of unknown significance. Whereas both homozygous individuals had nephrocalcinosis, only one displayed a skeletal phenotype consistent with HHRH. Reduced to low-normal phosphate levels, decreased tubular reabsorption of phosphate (TRP), and high-normal to elevated values of 1,25-(OH)2-D3 accompanied by normal cFGF23 levels were revealed independently of mutational status. Interestingly, individuals with nephrocalcinosis showed significantly increased calcium excretion and 1,25-(OH)2-D3 levels but normal phosphate reabsorption. Furthermore, aBMD Z-score <-2.0 was revealed in 4 of 8 heterozygous carriers, and HR-pQCT analysis showed a moderate decrease in structural parameters. Our findings also highlight the clinical relevance of monoallelic SLC34A3 variants, including their potential skeletal impairment. Calcium excretion and 1,25-(OH)2-D3 levels, but not TRP, were associated with nephrocalcinosis. 
Future studies should investigate the effects of distinct SLC34A3 variants and optimize treatment and monitoring regimens to prevent nephrocalcinosis and skeletal deterioration. © 2022 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).
|
26
|
PheNominal: an EHR-integrated web application for structured deep phenotyping at the point of care. BMC Med Inform Decis Mak 2022; 22:198. [PMID: 35902925 PMCID: PMC9335954 DOI: 10.1186/s12911-022-01927-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/13/2022] [Accepted: 07/06/2022] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND Clinical phenotype information greatly facilitates genetic diagnostic interpretations pipelines in disease. While post-hoc extraction using natural language processing on unstructured clinical notes continues to improve, there is a need to improve point-of-care collection of patient phenotypes. Therefore, we developed "PheNominal", a point-of-care web application, embedded within Epic electronic health record (EHR) workflows, to permit capture of standardized phenotype data. METHODS Using bi-directional web services available within commercial EHRs, we developed a lightweight web application that allows users to rapidly browse and identify relevant terms from the Human Phenotype Ontology (HPO). Selected terms are saved discretely within the patient's EHR, permitting reuse both in clinical notes as well as in downstream diagnostic and research pipelines. RESULTS In the 16 months since implementation, PheNominal was used to capture discrete phenotype data for over 1500 individuals and 11,000 HPO terms during clinic and inpatient encounters for a genetic diagnostic consultation service within a quaternary-care pediatric academic medical center. An average of 7 HPO terms were captured per patient. Compared to a manual workflow, the average time to enter terms for a patient was reduced from 15 to 5 min per patient, and there were fewer annotation errors. CONCLUSIONS Modern EHRs support integration of external applications using application programming interfaces. We describe a practical application of these interfaces to facilitate deep phenotype capture in a discrete, structured format within a busy clinical workflow. Future versions will include a vendor-agnostic implementation using FHIR. We describe pilot efforts to integrate structured phenotyping through controlled dictionaries into diagnostic and research pipelines, reducing manual effort for phenotype documentation and reducing errors in data entry.
|
27
|
Creation and evaluation of full-text literature-derived, feature-weighted disease models of genetically determined developmental disorders. Database (Oxford) 2022; 2022:baac038. [PMID: 35670729 PMCID: PMC9216525 DOI: 10.1093/database/baac038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Revised: 03/26/2022] [Accepted: 05/25/2022] [Indexed: 11/24/2022]
Abstract
There are >2500 different genetically determined developmental disorders (DD), which, as a group, show very high levels of both locus and allelic heterogeneity. This has led to the wide-spread use of evidence-based filtering of genome-wide sequence data as a diagnostic tool in DD. Determining whether the association of a filtered variant at a specific locus is a plausible explanation of the phenotype in the proband is crucial and commonly requires extensive manual literature review by both clinical scientists and clinicians. Access to a database of weighted clinical features extracted from rigorously curated literature would increase the efficiency of this process and facilitate the development of robust phenotypic similarity metrics. However, given the large and rapidly increasing volume of published information, conventional biocuration approaches are becoming impractical. Here, we present a scalable, automated method for the extraction of categorical phenotypic descriptors from the full-text literature. Papers identified through literature review were downloaded and parsed using the Cadmus custom retrieval package. Human Phenotype Ontology terms were extracted using MetaMap, with 76-84% precision and 65-73% recall. Mean terms per paper increased from 9 in title + abstract, to 68 using full text. We demonstrate that these literature-derived disease models plausibly reflect true disease expressivity more accurately than widely used manually curated models, through comparison with prospectively gathered data from the Deciphering Developmental Disorders study. The area under the curve for receiver operating characteristic (ROC) curves increased by 5-10% through the use of literature-derived models. This work shows that scalable automated literature curation increases performance and adds weight to the need for this strategy to be integrated into informatic variant analysis pipelines. Database URL: https://doi.org/10.1093/database/baac038.
|
28
|
VIPPID: a gene-specific single nucleotide variant pathogenicity prediction tool for primary immunodeficiency diseases. Brief Bioinform 2022; 23:6590436. [PMID: 35598327 PMCID: PMC9487673 DOI: 10.1093/bib/bbac176] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/08/2022] [Revised: 04/05/2022] [Accepted: 04/18/2022] [Indexed: 01/04/2023] Open
Abstract
Distinguishing pathogenic variants from non-pathogenic ones remains a major challenge in clinical genetic testing of primary immunodeficiency (PID) patients. Most of the existing mutation pathogenicity prediction tools treat all mutations as homogeneous entities, ignoring the differences in characteristics of different genes, and use the same model for genes in different diseases. In this study, we developed a single nucleotide variant (SNV) pathogenicity prediction tool, Variant Impact Predictor for PIDs (VIPPID; https://mylab.shinyapps.io/VIPPID/), which was tailored for PIDs genes and used a specific model for each of the most prevalent PID known genes. It employed a Conditional Inference Forest model and utilized information of 85 features of SNVs and scores from 20 existing prediction tools. Evaluation of VIPPID showed that it had superior performance (area under the curve = 0.91) over non-specific conventional tools. In addition, we also showed that the gene-specific model outperformed the non-gene-specific models. Our study demonstrated that disease-specific and gene-specific models can improve SNV pathogenicity prediction performance. This observation supports the notion that each feature of mutations in the model can be potentially used, in a new algorithm, to investigate the characteristics and function of the encoded proteins.
|
29
|
Revisiting benchmark study for response to methodological critiques of 'Evaluation of phenotype-driven gene prioritization methods for Mendelian diseases'. Brief Bioinform 2022; 23:6580907. [PMID: 35514206 DOI: 10.1093/bib/bbac181] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/08/2022] [Revised: 03/23/2022] [Accepted: 04/22/2022] [Indexed: 01/20/2023] Open
Abstract
Evaluation of phenotype-driven gene prioritization approaches for Mendelian diseases could facilitate the software development and method selection for the workflow configuration and clinical practice. In our original article, the performance of 10 well-recognized causal-gene prioritization methods was benchmarked using 305 cases from the deciphering developmental disorders (DDD) project and 209 in-house cases via a relatively unbiased methodology. The evaluation results showed that LIRICAL and AMELIE were two of the best methods in our benchmark experiments, and the possible integrative approach of these two methods may enhance the diagnostic efficiency. However, some methodological critiques were raised by the authors of Exomiser and PhenIX, so we revisited our benchmarking studies to answer their comments in this letter.
|
30
|
Abstract
Rare genetic disorders, when considered together, are relatively common. Despite advancements in genetics and genomics technologies as well as increased understanding of genomic function and dysfunction, many genetic diseases continue to be difficult to diagnose. The goal of this Review is to increase the familiarity of genetic testing strategies for non-genetics providers. As genetic testing is increasingly used in primary care, many subspecialty clinics, and various inpatient settings, it is important that non-genetics providers have a fundamental understanding of the strengths and weaknesses of various genetic testing strategies as well as develop an ability to interpret genetic testing results. We provide background on commonly used genetic testing approaches, give examples of phenotypes in which the various genetic testing approaches are used, describe types of genetic and genomic variations, cover challenges in variant identification, provide examples in which next-generation sequencing (NGS) failed to uncover the variant responsible for a disease, and discuss opportunities for continued improvement in the application of NGS clinically. As genetic testing becomes increasingly a part of all areas of medicine, familiarity with genetic testing approaches and result interpretation is vital to decrease the burden of undiagnosed disease.
|
31
|
Predicting genes from phenotypes using human phenotype ontology (HPO) terms. Hum Genet 2022; 141:1749-1760. [PMID: 35357580 DOI: 10.1007/s00439-022-02449-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Accepted: 03/16/2022] [Indexed: 11/28/2022]
Abstract
The interpretation of genomic variants following whole exome sequencing (WES) can be aided using human phenotype ontology (HPO) terms to standardize clinical features and predict causative genes. We performed WES on 453 patients diagnosed prior to 18 years of age and identified 114 pathogenic (P) or likely pathogenic (LP) variants in 112 patients. We utilized PhenoDB to extract HPO terms from provider notes and then used Phen2Gene to generate a gene score and gene ranking from each list of HPO terms. We assigned Phen2Gene gene rankings to 6 rank classes, with class 1 covering raw gene rankings of 1 to 10 and class 2 covering rankings from 11 to 50 out of a total of 17,126 possible gene rankings. Phen2Gene ranked causative genes into rank class 1 or 2 in 27.7% of cases and the genes in rank class 1 were all associated with well-characterized phenotypes. We found significant associations between the gene score and the number of years since the gene was first published, the number of HPO terms with a hierarchical depth greater than or equal to 11, and the number of Online Mendelian Inheritance in Man terms associated with the phenotype and gene. We conclude that genes associated with recognizable phenotypes and terms deep in the HPO hierarchy have the best chance of producing a high gene score and ranking in class 1 to 2 using Phen2Gene software with HPO terms. Clinicians and laboratory staff should consider these results when HPO terms are employed to prioritize candidate genes.
|
32
|
Evaluation of phenotype-driven gene prioritization methods for Mendelian diseases. Brief Bioinform 2022; 23:6521702. [PMID: 35134823 PMCID: PMC8921623 DOI: 10.1093/bib/bbac019] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/15/2021] [Revised: 01/10/2022] [Accepted: 01/13/2022] [Indexed: 12/31/2022] Open
Abstract
Identifying disease-causing genes from the next-generation sequencing (NGS) data of patients with Mendelian disorders is challenging. To improve this situation, researchers have developed many phenotype-driven gene prioritization methods that rank candidate pathogenic genes using either a patient's genotype and phenotype information or phenotype information only as input. Evaluations of these ranking methods help practitioners choose an appropriate tool for their workflows, but retrospective benchmarks are underpowered to provide statistically significant results in their attempt to differentiate. In this research, the performance of ten recognized causal-gene prioritization methods was benchmarked using 305 cases from the Deciphering Developmental Disorders (DDD) project and 209 in-house cases via a relatively unbiased methodology. The evaluation results show that methods using Human Phenotype Ontology (HPO) terms and Variant Call Format (VCF) files as input achieved better overall performance than those using phenotypic data alone. In addition, LIRICAL and AMELIE, two of the best methods in our benchmark experiments, complement each other in cases with the causal genes ranked highly, suggesting a possible integrative approach to further enhance the diagnostic efficiency. Our benchmarking provides valuable reference information for computer-assisted rapid diagnosis in Mendelian diseases and sheds some light on the potential direction of future improvement in disease-causing gene prioritization methods.
|
33
|
StrVCTVRE: A supervised learning method to predict the pathogenicity of human genome structural variants. Am J Hum Genet 2022; 109:195-209. [PMID: 35032432 PMCID: PMC8874149 DOI: 10.1016/j.ajhg.2021.12.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/31/2021] [Accepted: 12/09/2021] [Indexed: 12/12/2022] Open
Abstract
Whole-genome sequencing resolves many clinical cases where standard diagnostic methods have failed. However, at least half of these cases remain unresolved after whole-genome sequencing. Structural variants (SVs; genomic variants larger than 50 base pairs) of uncertain significance are the genetic cause of a portion of these unresolved cases. As sequencing methods using long or linked reads become more accessible and SV detection algorithms improve, clinicians and researchers are gaining access to thousands of reliable SVs of unknown disease relevance. Methods to predict the pathogenicity of these SVs are required to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE to distinguish pathogenic SVs from benign SVs that overlap exons. In a random forest classifier, we integrated features that capture gene importance, coding region, conservation, expression, and exon structure. We found that features such as expression and conservation are important but are absent from SV classification guidelines. We leveraged multiple resources to construct a size-matched training set of rare, putatively benign and pathogenic SVs. StrVCTVRE performs accurately across a wide SV size range on independent test sets, which will allow clinicians and researchers to eliminate about half of SVs from consideration while retaining a 90% sensitivity. We anticipate clinicians and researchers will use StrVCTVRE to prioritize SVs in probands where no SV is immediately compelling, empowering deeper investigation into novel SVs to resolve cases and understand new mechanisms of disease. StrVCTVRE runs rapidly and is publicly available.
|
34
|
Novel SH3PXD2B variant identified by whole-exome sequencing in a Turkish newborn with Frank-Ter Haar Syndrome. Clin Dysmorphol 2022; 31:45-49. [PMID: 34538861 DOI: 10.1097/mcd.0000000000000389] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/26/2022]
|
35
|
Sensorineural hearing loss in GSD type I patients. A newly recognized symptomatic association of potential clinical significance and unclear pathomechanism. Int J Pediatr Otorhinolaryngol 2021; 151:110970. [PMID: 34775139 DOI: 10.1016/j.ijporl.2021.110970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2020] [Revised: 10/15/2021] [Accepted: 11/08/2021] [Indexed: 10/19/2022]
Abstract
OBJECTIVE Glycogen storage disease (GSD) type I is an inborn error of carbohydrate metabolism characterized by an inability to convert glucose-6-phosphate to glucose. It presents with serious liver and metabolic complications and, in type Ib, with severe infections due to neutropenia. So far, sensorineural hearing impairment has not been reported in these patients. Bilateral sensorineural hearing impairment was diagnosed in four unrelated GSDI patients. The congenital origin of the hearing loss and the descending audiometric curves warranted further investigation. METHODS Hearing status was assessed in the entire group of 40 children with GSD type I. Molecular testing by massively parallel sequencing was then performed in the four probands and their parents to find a possible genetic background of auditory dysfunction in these patients. RESULTS Pathogenic variants in G6PC and SLC37A4, related to the phenotypes of GSDI subtype Ia and subtype Ib respectively, were detected, each in two probands. No change in the genes involved in auditory pathway dysfunction was found. CONCLUSIONS Sensorineural hearing loss appears to be associated with GSDI in approximately one out of ten cases. Careful assessment and monitoring of auditory functions of patients with GSDI is recommended.
|
36
|
A Visual Phenotype-Based Differential Diagnosis Process for Rare Diseases. Interdiscip Sci 2021; 14:331-348. [PMID: 34751921 DOI: 10.1007/s12539-021-00490-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/14/2021] [Revised: 10/23/2021] [Accepted: 10/28/2021] [Indexed: 02/01/2023]
Abstract
PURPOSE Phenotype-based rapid diagnosis can make up for the time-consuming genetic sequencing diagnosis of rare diseases. However, the collected phenotypes of patients can sometimes be inaccurate or incomplete, which limits the accuracy of diagnostic results. To solve this problem, we try to design a phenotype-based differential diagnosis process for rare diseases to achieve rapid and accurate diagnosis of rare diseases. METHODS The core of the differential diagnosis of rare diseases is to optimize the phenotype information of a specific patient and the visualized comparative analysis of diseases. To recommend additional phenotypes, replace the fuzzy phenotypes and filter the unexplained phenotypes for patients, we constructed a phenotype hierarchical network and a disease-phenotype differential network and calculated the phenotype co-occurrence relationship. In addition, we designed a visual comparative analysis method to explore the correlation and difference of disease phenotypes. RESULTS The evaluation based on the published 10 rare disease cases demonstrated that after the optimization of patient phenotype information through our differential diagnosis, the target disease often got a better ranking and recommendation score than before. We have deployed this scheme on the RDmap project ( http://rdmap.nbscn.org ). CONCLUSION Compared to genetic and molecular analysis, phenotype-based diagnosis is faster, cheaper, and easier. The differential diagnosis process we designed can optimize the phenotype information of patients and better locate the target disease. It can also help to make screening decisions before genetic testing.
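The phenotype co-occurrence idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the RDmap implementation: the disease names, phenotype labels, and the scoring rule (summing pair counts) are assumptions chosen only to show how co-occurrence counts could drive a recommendation of additional phenotypes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical disease-to-phenotype annotations (toy data, not taken from RDmap).
DISEASE_PHENOTYPES = {
    "disease_A": {"retinal dystrophy", "polydactyly", "obesity"},
    "disease_B": {"retinal dystrophy", "polydactyly", "renal cysts"},
    "disease_C": {"renal cysts", "hepatic fibrosis"},
}


def cooccurrence_counts(disease_phenotypes):
    """Count, for each unordered phenotype pair, the diseases annotated with both."""
    counts = Counter()
    for phenotypes in disease_phenotypes.values():
        for a, b in combinations(sorted(phenotypes), 2):
            counts[(a, b)] += 1
    return counts


def recommend(patient_phenotypes, counts, top_n=3):
    """Rank phenotypes not yet recorded by their total co-occurrence with recorded ones."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a in patient_phenotypes and b not in patient_phenotypes:
            scores[b] += n
        elif b in patient_phenotypes and a not in patient_phenotypes:
            scores[a] += n
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [phenotype for phenotype, _ in ranked[:top_n]]


counts = cooccurrence_counts(DISEASE_PHENOTYPES)
suggestions = recommend({"retinal dystrophy"}, counts)
```

With this toy data, a patient recorded only with "retinal dystrophy" would be prompted to check for "polydactyly" first, because that pair is shared by two diseases.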
|
37
|
Discovering Cerebral Ischemic Stroke Associated Genes Based on Network Representation Learning. Front Genet 2021; 12:728333. [PMID: 34539754 PMCID: PMC8442767 DOI: 10.3389/fgene.2021.728333] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/21/2021] [Accepted: 07/26/2021] [Indexed: 11/13/2022] Open
Abstract
Cerebral ischemic stroke (IS) is a complex disease caused by multiple factors including vascular risk factors, genetic factors, and environmental factors, which accentuates the difficulty in discovering corresponding disease-related genes. Identifying the genes associated with IS is critical for understanding the biological mechanism of IS, which would be significantly beneficial to the diagnosis and clinical treatment of cerebral IS. However, existing methods to predict IS-related genes are mainly based on the hypothesis of guilt-by-association (GBA). These methods cannot capture the global structure information of the whole protein-protein interaction (PPI) network. Inspired by the success of network representation learning (NRL) in the field of network analysis, we apply NRL to the discovery of disease-related genes and introduce a framework to identify the disease-related genes of cerebral IS. The framework contains three main parts: capturing the topological information of the PPI network with NRL, denoising the gene features with a stacked autoencoder (SAE), and optimizing a support vector machine (SVM) classifier to identify IS-related genes. Our framework yields more accurate results than the existing methods for IS-related gene prediction. The case study also shows that the proposed method can identify IS-related genes.
|
38
|
Genome sequencing data analysis for rare disease gene discovery. Brief Bioinform 2021; 23:6366880. [PMID: 34498682 DOI: 10.1093/bib/bbab363] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/02/2021] [Revised: 07/24/2021] [Accepted: 08/17/2021] [Indexed: 12/14/2022] Open
Abstract
Rare diseases occur in a small proportion of the general population, variously defined as affecting fewer than 200 000 individuals (US) or fewer than 1 in 2000 individuals (Europe). Although rare, they collectively amount to approximately 7000 different disorders, the majority having a genetic origin, and affect roughly 300 million people globally. Most of the patients and their families undergo a long and frustrating diagnostic odyssey. However, advances in the field of genomics have started to facilitate the process of diagnosis, though it is hindered by the difficulty in genome data analysis and interpretation. A major impediment in diagnosis is in the understanding of the diverse approaches, tools and datasets available for variant prioritization, the most important step in the analysis of millions of variants to select a few potential variants. Here we present a review of the latest methodological developments and spectrum of tools available for rare disease genetic variant discovery and recommend appropriate data interpretation methods for variant prioritization. We have categorized the resources based on various steps of the variant interpretation workflow, starting from data processing, variant calling, annotation, filtration and finally prioritization, with a special emphasis on the last two steps. The methods discussed here pertain to elucidating the genetic basis of disease in individual patient cases via trio- or family-based analysis of the genome data. We advocate the use of a combination of tools and datasets and to follow multiple iterative approaches to elucidate the potential causative variant.
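The last two workflow steps named in this abstract, filtration and prioritization, can be sketched as follows. This is only an illustration under stated assumptions: the `Variant` fields, the gene names, the 0.001 allele-frequency cutoff, and the ranking rule are hypothetical and are not taken from the reviewed tools.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    population_af: float     # allele frequency in a reference population (hypothetical values)
    predicted_impact: float  # 0..1 deleteriousness score from an annotation tool (hypothetical)


def filter_variants(variants, max_af=0.001):
    """Filtration: keep variants rare enough to be plausible for a rare disease.

    The default cutoff is an assumed example value, not a recommended threshold.
    """
    return [v for v in variants if v.population_af <= max_af]


def prioritize(variants):
    """Prioritization: highest predicted impact first, rarer variants breaking ties."""
    return sorted(variants, key=lambda v: (-v.predicted_impact, v.population_af))


candidates = [
    Variant("GENE1", population_af=0.2, predicted_impact=0.9),     # too common, filtered out
    Variant("GENE2", population_af=0.0001, predicted_impact=0.4),
    Variant("GENE3", population_af=0.0, predicted_impact=0.95),
]
ranked = prioritize(filter_variants(candidates))
```

Keeping the two steps as separate functions mirrors the workflow order in the abstract and makes it easy to rerun the ranking with different filters, in the spirit of the iterative approach the authors advocate.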
|
39
|
Catalyzing Knowledge-Driven Discovery in Environmental Health Sciences through a Community-Driven Harmonized Language. Int J Environ Res Public Health 2021; 18:8985. [PMID: 34501574 PMCID: PMC8430534 DOI: 10.3390/ijerph18178985] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/07/2021] [Revised: 08/13/2021] [Accepted: 08/19/2021] [Indexed: 01/10/2023]
Abstract
Harmonized language is critical for helping researchers to find data, collecting scientific data to facilitate comparison, and performing pooled and meta-analyses. Using standard terms to link data to knowledge systems facilitates knowledge-driven analysis, allows for the use of biomedical knowledge bases for scientific interpretation and hypothesis generation, and increasingly supports artificial intelligence (AI) and machine learning. Due to the breadth of environmental health sciences (EHS) research and the continuous evolution in scientific methods, the gaps in standard terminologies, vocabularies, ontologies, and related tools hamper the capabilities to address large-scale, complex EHS research questions that require the integration of disparate data and knowledge sources. The results of prior workshops to advance a harmonized environmental health language demonstrate that future efforts should be sustained and grounded in scientific need. We describe a community initiative whose mission was to advance integrative environmental health sciences research via the development and adoption of a harmonized language. The products, outcomes, and recommendations developed and endorsed by this community are expected to enhance data collection and management efforts for NIEHS and the EHS community, making data more findable and interoperable. This initiative will provide a community of practice space to exchange information and expertise, be a coordination hub for identifying and prioritizing activities, and a collaboration platform for the development and adoption of semantic solutions. We encourage anyone interested in advancing this mission to engage in this community.
|
40
|
Linking common human diseases to their phenotypes; development of a resource for human phenomics. J Biomed Semantics 2021; 12:17. [PMID: 34425897 PMCID: PMC8383460 DOI: 10.1186/s13326-021-00249-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/18/2021] [Accepted: 07/30/2021] [Indexed: 11/11/2022] Open
Abstract
Background In recent years a large volume of clinical genomics data has become available due to rapid advances in sequencing technologies. Efficient exploitation of this genomics data requires linkage to patient phenotype profiles. Current resources providing disease-phenotype associations are not comprehensive, and they often do not have broad coverage of the disease terminologies, particularly ICD-10, which is still the primary terminology used in clinical settings. Methods We developed two approaches to gather disease-phenotype associations. First, we used a text mining method that utilizes semantic relations in phenotype ontologies, and applies statistical methods to extract associations between diseases in ICD-10 and phenotype ontology classes from the literature. Second, we developed a semi-automatic way to collect ICD-10–phenotype associations from existing resources containing known relationships. Results We generated four datasets. Two of them are independent datasets linking diseases to their phenotypes based on text mining and semi-automatic strategies. The remaining two datasets are generated from these datasets and cover a subset of ICD-10 classes of common diseases contained in UK Biobank. We extensively validated our text mined and semi-automatically curated datasets by: comparing them against an expert-curated validation dataset containing disease–phenotype associations, measuring their similarity to disease–phenotype associations found in public databases, and assessing how well they could be used to recover gene–disease associations using phenotype similarity. Conclusion We find that our text mining method can produce phenotype annotations of diseases that are correct but often too general to have significant information content, or too specific to accurately reflect the typical manifestations of the sporadic disease. 
On the other hand, the datasets generated from integrating multiple knowledgebases are more complete (i.e., cover more of the required phenotype annotations for a given disease). We make all data freely available at 10.5281/zenodo.4726713. Supplementary Information The online version contains supplementary material available at (10.1186/s13326-021-00249-x).
|
41
|
Clinical Phenotypic Spectrum of 4095 Individuals with Down Syndrome from Text Mining of Electronic Health Records. Genes (Basel) 2021; 12:genes12081159. [PMID: 34440331 PMCID: PMC8393657 DOI: 10.3390/genes12081159] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/21/2021] [Revised: 07/25/2021] [Accepted: 07/26/2021] [Indexed: 12/30/2022] Open
Abstract
Human genetic disorders, such as Down syndrome, have a wide variety of clinical phenotypic presentations, and characterizing each nuanced phenotype and subtype can be difficult. In this study, we examined the electronic health records of 4095 individuals with Down syndrome at the Children’s Hospital of Philadelphia to create a method to characterize the phenotypic spectrum digitally. We extracted Human Phenotype Ontology (HPO) terms from quality-filtered patient notes using a natural language processing (NLP) tool, MetaMap. We catalogued the most common HPO terms related to Down syndrome patients and compared the terms with those from a baseline population. We characterized the top 100 HPO terms by their frequencies at different ages of clinical visits and highlighted selected terms that have time-dependent distributions. We also discovered phenotypic terms that have not been significantly associated with Down syndrome, such as “Proptosis”, “Downslanted palpebral fissures”, and “Microtia”. In summary, our study demonstrated that the clinical phenotypic spectrum of individuals with Mendelian diseases can be characterized through NLP-based digital phenotyping on population-scale electronic health records (EHRs).
|
42
|
Long-term outcome of the survivors of infantile hypercalcaemia with CYP24A1 and SLC34A1 mutations. Nephrol Dial Transplant 2021; 36:1484-1492. [PMID: 33099630 PMCID: PMC8311581 DOI: 10.1093/ndt/gfaa178] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/04/2020] [Accepted: 05/25/2020] [Indexed: 11/12/2022] Open
Abstract
Background Infantile hypercalcaemia (IH) is a vitamin D3 metabolism disorder. The molecular basis for IH is biallelic mutations in the CYP24A1 or SLC34A1 gene. These changes lead to catabolism disorders (CYP24A1 mutations) or excessive generation of 1,25-dihydroxyvitamin D3 [1,25(OH)2D3] (SLC34A1 mutations). The incidence rate of IH in children and the risk level for developing end-stage renal disease (ESRD) are still unknown. The aim of this study was to analyse the long-term outcome of adolescents and young adults who suffered from IH in infancy. Design Forty-two children (23 girls; average age 10.7 ± 6.3 years) and 26 adults (14 women; average age 24.2 ± 4.4 years) with a personal history of hypercalcaemia with elevated 1,25(OH)2D3 levels were included in the analysis. In all patients, a genetic analysis of possible IH mutations was conducted, as well as laboratory tests and renal ultrasonography. Results IH was confirmed in 20 studied patients (10 females). CYP24A1 mutations were found in 16 patients (8 females) and SLC34A1 in 4 patients (2 females). The long-term outcome was assessed in 18 patients with an average age of 23.8 years (age range 2–34). The average glomerular filtration rate (GFR) was 72 mL/min/1.73 m2 (range 15–105). Two patients with a CYP24A1 mutation developed ESRD and underwent renal transplantation. A GFR <90 mL/min/1.73 m2 was found in 14 patients (77%), whereas a GFR <60 mL/min/1.73 m2 was seen in 5 patients (28%), including 2 adults after renal transplantation. Three of 18 patients still had serum calcium levels >2.6 mmol/L. A renal ultrasound revealed nephrocalcinosis in 16 of 18 (88%) patients; however, mild hypercalciuria was detected in only one subject. Conclusions Subjects who suffered from IH have a greater risk of progressive chronic kidney disease and nephrocalcinosis. This indicates that all survivors of IH should be closely monitored, with early implementation of preventive measures, e.g. inhibition of active metabolites of vitamin D3 synthesis.
|
43
|
CDON gene contributes to pituitary stalk interruption syndrome associated with unilateral facial and abducens nerve palsy. J Appl Genet 2021; 62:621-629. [PMID: 34235642 PMCID: PMC8571149 DOI: 10.1007/s13353-021-00649-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2020] [Revised: 06/21/2021] [Accepted: 06/28/2021] [Indexed: 11/06/2022]
Abstract
The relationship between congenital defects of the brain and facial anomalies has been proven. The Hedgehog signaling pathway plays a fundamental role in normal craniofacial development in humans. Mutations in the sonic hedgehog (SHH) signaling gene CDON have been recently reported in patients with holoprosencephaly and with pituitary stalk interruption syndrome (PSIS). This study aimed to characterize an 18-year-old patient presenting with PSIS, multiple pituitary hormone deficiency, and congenital unilateral facial and abducens nerve palsy. Additionally, bilateral sensorineural hearing loss, predominantly on the right side, was diagnosed. From the second year of life, growth deceleration was observed, and from the age of eight, anterior pituitary hormone deficiencies were gradually confirmed and substituted. On MRI, the characteristic triad of PSIS (anterior pituitary hypoplasia, interrupted pituitary stalk and ectopic posterior lobe) was found. We performed a comprehensive genomic screening, including microarrays for structural rearrangements and whole-exome sequencing for a monogenic defect. A novel heterozygous missense variant in the CDON gene (c.1814G > T; p.Gly605Val) was identified. The variant was inherited from the mother, who, besides short stature, did not show any disease symptoms. The variant was absent in control databases and 100 healthy subjects originating from the same population. We report a novel variant in the CDON gene associated with PSIS and congenital cranial nerve palsy. The variant revealed autosomal dominant inheritance with incomplete penetrance in concordance with previous studies reporting CDON defects.
|
44
|
AnnotSV and knotAnnotSV: a web server for human structural variations annotations, ranking and analysis. Nucleic Acids Res 2021; 49:W21-W28. [PMID: 34023905 PMCID: PMC8262758 DOI: 10.1093/nar/gkab402] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 03/11/2021] [Revised: 04/16/2021] [Accepted: 04/29/2021] [Indexed: 11/13/2022] Open
Abstract
With the dramatic increase of pangenomic analysis, human geneticists have generated large amounts of genomic data, including millions of small variants (SNV/indel) but also thousands of structural variations (SV), mainly from next-generation sequencing and array-based techniques. While the identification of the complete SV repertoire of a patient is becoming possible, the interpretation of each SV remains challenging. To help identify human pathogenic SV, we have developed a web server dedicated to their annotation and ranking (AnnotSV) as well as their visualization and interpretation (knotAnnotSV), freely available at the following address: https://www.lbgi.fr/AnnotSV/. A large number of annotations from >20 sources are integrated in our web server, including, among others, genes, haploinsufficiency, triplosensitivity, regulatory elements, known pathogenic or benign genomic regions, and phenotypic data. An ACMG/ClinGen compliant prioritization module allows the scoring and the ranking of SV into 5 SV classes from pathogenic to benign. Finally, the visualization interface displays the annotated SV in an interactive way, including popups, search fields, filtering options, advanced colouring to highlight pathogenic SV and hyperlinks to the UCSC genome browser or other public databases. This web server is designed for diagnostic and research analysis by providing important resources to the user.
|
45
|
Digital/computational phenotyping: What are the differences in the science and the ethics? Big Data Soc 2021; 8:20539517211062885. [PMID: 37790725 PMCID: PMC10544038 DOI: 10.1177/20539517211062885] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/05/2023]
Abstract
The concept of 'digital phenotyping' was originally developed by researchers in the mental health field, but it has travelled to other disciplines and areas. This commentary draws upon our experiences of working in two scientific projects that are based at the University of Oxford's Big Data Institute - The RADAR-AD project and The Minerva Initiative - which are developing algorithmic phenotyping technologies. We describe and analyse the concepts of digital biomarkers and computational phenotyping that underlie these projects, explain how they are linked to other research in digital phenotyping and compare and contrast some of their epistemological and ethical implications. In particular, we argue that the phenotyping paradigm in both projects is grounded on an assumption of 'objectivity' that is articulated in different ways depending on the role that is given to the computational/digital tools. Using the concept of 'affordance', we show how specific functionalities relate to potential uses and social implications of these technologies and argue that it is important to distinguish among them as the concept of digital phenotyping is increasingly being used with a variety of meanings.
|
46
|
AMELIE speeds Mendelian diagnosis by matching patient phenotype and genotype to primary literature. Sci Transl Med 2021; 12:12/544/eaau9113. [PMID: 32434849 DOI: 10.1126/scitranslmed.aau9113] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 07/30/2018] [Revised: 08/14/2019] [Accepted: 04/22/2020] [Indexed: 12/21/2022]
Abstract
The diagnosis of Mendelian disorders requires labor-intensive literature research. Trained clinicians can spend hours looking for the right publication(s) supporting a single gene that best explains a patient's disease. AMELIE (Automatic Mendelian Literature Evaluation) greatly accelerates this process. AMELIE parses all 29 million PubMed abstracts and downloads and further parses hundreds of thousands of full-text articles in search of information supporting the causality and associated phenotypes of most published genetic variants. AMELIE then prioritizes patient candidate variants for their likelihood of explaining any patient's given set of phenotypes. Diagnosis of singleton patients (without relatives' exomes) is the most time-consuming scenario, and AMELIE ranked the causative gene at the very top for 66% of 215 diagnosed singleton Mendelian patients from the Deciphering Developmental Disorders project. Evaluating only the top 11 AMELIE-scored genes of 127 (median) candidate genes per patient resulted in a rapid diagnosis in more than 90% of cases. AMELIE-based evaluation of all cases was 3 to 19 times more efficient than hand-curated database-based approaches. We replicated these results on a retrospective cohort of clinical cases from Stanford Children's Health and the Manton Center for Orphan Disease Research. An analysis web portal with our most recent update, programmatic interface, and code is available at AMELIE.stanford.edu.
|
47
|
CLN8 Mutations Presenting with a Phenotypic Continuum of Neuronal Ceroid Lipofuscinosis-Literature Review and Case Report. Genes (Basel) 2021; 12:genes12070956. [PMID: 34201538 PMCID: PMC8307369 DOI: 10.3390/genes12070956] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/25/2021] [Revised: 06/18/2021] [Accepted: 06/21/2021] [Indexed: 11/16/2022] Open
Abstract
CLN8 is a ubiquitously expressed membrane-spanning protein that localizes primarily in the ER, with partial localization in the ER-Golgi intermediate compartment. Mutations in CLN8 cause late-infantile neuronal ceroid lipofuscinosis (LINCL). We describe a female pediatric patient with LINCL. She exhibited a typical phenotype associated with LINCL, except that she did not present spontaneous myoclonus, her symptoms emerged more slowly, and she developed focal sensory visual seizures. In addition, whole-exome sequencing identified a novel homozygous variant in CLN8, c.531G>T, resulting in p.Trp177Cys. Ultrastructural examination featured abundant lipofuscin deposits within mucosal cells, macrophages, and monocytes. We report a novel CLN8 mutation as a cause for NCL8 in a girl with developmental delay and epilepsy, cerebellar syndrome, visual loss, and progressive cognitive and motor regression. This case, together with an analysis of the available literature, emphasizes the existence of a continuous spectrum of CLN8-associated phenotypes rather than a sharp distinction between them.
|
48
|
Relevant genetic variants are common in women with pregnancy and lactation-associated osteoporosis (PLO) and predispose to more severe clinical manifestations. Bone 2021; 147:115911. [PMID: 33716164 DOI: 10.1016/j.bone.2021.115911] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/26/2020] [Revised: 01/19/2021] [Accepted: 03/08/2021] [Indexed: 12/17/2022]
Abstract
Pregnancy and lactation-associated osteoporosis (PLO) is a rare skeletal disorder characterized by early-onset osteoporosis typically manifestating with vertebral compression fractures or transient osteoporosis of the hip. We hypothesized that genetic variants may play a role in the development of PLO. This study aimed to analyze the presence of genetic variants and a potential association with the clinical presentation in PLO. 42 women with PLO were included from 2013 to 2019 in a multicenter study in Germany. All cases underwent comprehensive genetic analysis based on a custom-designed gene panel including genes relevant for skeletal disorders. The skeletal status was assessed using dual-energy X-ray absorptiometry (DXA). Subgroups were further analyzed by serum bone turnover markers (n = 31) and high-resolution peripheral computed tomography (HR-pQCT; n = 23). We detected relevant genetic variants in 21 women (50%), with LRP5, WNT1 and COL1A1/A2 being the most commonly involved genes. The mean number of vertebral compression fractures was 3.3 ± 3.4 per case with a significantly higher occurrence in the subgroup with genetic variants (4.8 ± 3.7 vs. 1.8 ± 2.3, p = 0.02). Among the total cohort, DXA Z-scores were significantly lower at the lumbar spine compared to the femoral neck (p = 0.002). HR-pQCT revealed a pronounced reduction of trabecular and cortical thickness, while trabecular number was within the reference range. Eighteen women (43%) received a bone-specific therapy (primarily teriparatide). Overall, a steep increase in bone mass (+37.7%) was observed after 3 years. In conclusion, pregnancy and lactation represent skeletal risk factors, which may unmask hereditary bone disorders leading to PLO. These cases were affected more severely. Nevertheless, a timely diagnosis and adequate treatment can ensure a substantial recovery potential even without specific therapy. 
Patients with genetically induced low bone turnover (e.g., LRP5, WNT1) may especially benefit from osteo-anabolic medication.
|
49
|
Modeling seizures in the Human Phenotype Ontology according to contemporary ILAE concepts makes big phenotypic data tractable. Epilepsia 2021; 62:1293-1305. [PMID: 33949685 PMCID: PMC8272408 DOI: 10.1111/epi.16908] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/16/2020] [Revised: 02/19/2021] [Accepted: 04/01/2021] [Indexed: 01/08/2023]
Abstract
Objective: The clinical features of epilepsy determine how it is defined, which in turn guides management. Therefore, consideration of the fundamental clinical entities that comprise an epilepsy is essential in the study of causes, trajectories, and treatment responses. The Human Phenotype Ontology (HPO) is used widely in clinical and research genetics for concise communication and modeling of clinical features, allowing extracted data to be harmonized using logical inference. We sought to redesign the HPO seizure subontology to improve its consistency with current epileptological concepts, supporting the use of large clinical data sets in high-throughput clinical and research genomics. Methods: We created a new HPO seizure subontology based on the 2017 International League Against Epilepsy (ILAE) Operational Classification of Seizure Types, and integrated concepts of status epilepticus, febrile, reflex, and neonatal seizures at different levels of detail. We compared the HPO seizure subontology prior to, and following, our revision, according to the information that could be inferred about the seizures of 791 individuals from three independent cohorts: 2 previously published and 150 newly recruited individuals. Each cohort’s data were provided in a different format and harmonized using the two versions of the HPO. Results: The new seizure subontology increased the number of descriptive concepts for seizures 5-fold. The number of seizure descriptors that could be annotated to the cohort increased by 40% and the total amount of information about individuals’ seizures increased by 38%. The most important qualitative difference was the relationship of focal to bilateral tonic-clonic seizure to generalized-onset and focal-onset seizures.
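The harmonization by logical inference described in this abstract can be illustrated with a minimal sketch: a record annotated with a specific term is also implied to carry every ancestor of that term, so cohorts coded at different levels of detail become comparable. The hierarchy below is a toy example with illustrative labels; it is an assumption for demonstration and not the actual HPO seizure subontology.

```python
# Toy is-a hierarchy (child -> parents); labels are illustrative, not the real HPO graph.
PARENTS = {
    "Focal motor seizure": ["Focal-onset seizure"],
    "Focal-onset seizure": ["Seizure"],
    "Generalized-onset seizure": ["Seizure"],
    "Seizure": [],
}


def with_ancestors(term, parents):
    """Return the term together with every ancestor reachable through is-a links."""
    stack, seen = [term], set()
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def harmonize(cohort, parents):
    """Expand each individual's recorded terms to the full set of implied terms."""
    return {
        person: set().union(*(with_ancestors(t, parents) for t in terms))
        for person, terms in cohort.items()
    }


# Two hypothetical individuals coded at different levels of detail.
cohort = {"p1": ["Focal motor seizure"], "p2": ["Generalized-onset seizure"]}
harmonized = harmonize(cohort, PARENTS)
shared = harmonized["p1"] & harmonized["p2"]
```

After expansion, both individuals share the general term "Seizure" even though neither record states it explicitly, which is the sense in which differently formatted data can be compared.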
|
50
|
Clinical Phenotype and Relevance of LRP5 and LRP6 Variants in Patients With Early-Onset Osteoporosis (EOOP). J Bone Miner Res 2021; 36:271-282. [PMID: 33118644 DOI: 10.1002/jbmr.4197] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 07/03/2020] [Revised: 10/08/2020] [Accepted: 10/13/2020] [Indexed: 02/06/2023]
Abstract
Reduced bone mineral density (BMD; ie, Z-score ≤-2.0) occurring at a young age (ie, premenopausal women and men <50 years) in the absence of secondary osteoporosis is considered early-onset osteoporosis (EOOP). Mutations affecting the WNT signaling pathway are of special interest because of their key role in bone mass regulation. Here, we analyzed the effects of relevant LRP5 and LRP6 variants on the clinical phenotype, bone turnover, BMD, and bone microarchitecture. After exclusion of secondary osteoporosis, EOOP patients (n = 372) were genotyped by gene panel sequencing, and segregation analysis of variants in LRP5/LRP6 was performed. The clinical assessment included the evaluation of bone turnover parameters, BMD by dual-energy X-ray absorptiometry, and microarchitecture via high-resolution peripheral quantitative computed tomography (HR-pQCT). In 50 individuals (31 EOOP index patients, 19 family members), relevant variants affecting LRP5 or LRP6 were detected (42 LRP5 and 8 LRP6 variants), including 10 novel variants. Seventeen variants were classified as disease causing, 14 were variants of unknown significance, and 19 were BMD-associated single-nucleotide polymorphisms (SNPs). One patient harbored compound heterozygous LRP5 mutations causing osteoporosis-pseudoglioma syndrome. Fractures were reported in 37 of 50 individuals, consisting of vertebral (18 of 50) and peripheral (29 of 50) fractures. Low bone formation was revealed in all individuals. A Z-score ≤-2.0 was detected in 31 of 50 individuals, and values at the spine were significantly lower than those at the hip (-2.1 ± 1.3 versus -1.6 ± 0.8; p = .003). HR-pQCT analysis (n = 34) showed impaired microarchitecture in trabecular and cortical compartments. Significant differences regarding the clinical phenotype were detectable between index patients and family members but not between different variant classes. 
Relevant variants in LRP5 and LRP6 contribute to EOOP in a substantial number of individuals, leading to a high number of fractures, low bone formation, reduced Z-scores, and impaired microarchitecture. This detailed skeletal characterization improves the interpretation of known and novel LRP5 and LRP6 variants. © 2020 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).
|