1
Wang M, Liu G, Ni Z, Yang Q, Li X, Bi Z. Acute kidney injury comorbidity analysis based on international classification of diseases-10 codes. BMC Med Inform Decis Mak 2024; 24:35. [PMID: 38310256 PMCID: PMC10837944 DOI: 10.1186/s12911-024-02435-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 01/22/2024] [Indexed: 02/05/2024] Open
Abstract
OBJECTIVE Acute kidney injury (AKI) is a clinical syndrome that occurs as a result of a dramatic decline in kidney function caused by a variety of etiological factors. Its main biomarkers, serum creatinine and urine output, are not effective in diagnosing early AKI. For this reason, this study provides insight into this syndrome by exploring the comorbidities of AKI, which may facilitate the early diagnosis of AKI. In addition, organ crosstalk in AKI was systematically explored based on comorbidities to obtain clinically reliable results. METHODS We collected data from the Medical Information Mart for Intensive Care-IV database on patients aged ≥ 18 years in intensive care units (ICU) who were diagnosed with AKI using the criteria proposed by Kidney Disease: Improving Global Outcomes. The Apriori algorithm was used to mine association rules on the diagnoses of 55,486 AKI and non-AKI patients in the ICU. The comorbidities of AKI mined were validated through the Electronic Intensive Care Unit database, the Colombian Open Health Database, and medical literature, after which comorbidity results were visualized using a disease network. Finally, organ diseases were identified and classified from comorbidities to investigate renal crosstalk with other distant organs in AKI. RESULTS We found 579 AKI comorbidities, and the main ones were disorders of lipoprotein metabolism, essential hypertension, and disorders of fluid, electrolyte, and acid-base balance. Of the 579 comorbidities, 554 were verifiable and 25 were new and not previously reported. In addition, crosstalk between the kidneys and distant non-renal organs including the liver, heart, brain, lungs, and gut was observed in AKI with the strongest heart-kidney crosstalk, followed by lung-kidney crosstalk. CONCLUSION The comorbidities mined in this study using association rules are scientific and may be used for the early diagnosis of AKI and the construction of AKI predictive models. Furthermore, the organ crosstalk results obtained through comorbidities may provide supporting information for the management of short- and long-term treatment practices for organ dysfunction.
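The association-rule measures that Apriori-style mining relies on can be illustrated with a minimal sketch. This is not the authors' pipeline: it only scores single-code rules of the form "code -> AKI" by support, confidence, and lift, and the toy stays, the ICD-10 codes, the thresholds, and the function names are all hypothetical.

```python
# Hypothetical toy data: each set holds the ICD-10 codes recorded for one ICU stay.
stays = [
    {"N17", "I10", "E78"},
    {"N17", "I10"},
    {"N17", "E87"},
    {"I10", "E78"},
    {"J18"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every code in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rules_for(target, transactions, min_support=0.1, min_confidence=0.6):
    """Return (antecedent, support, confidence, lift) for rules `antecedent -> target`."""
    codes = sorted(set().union(*transactions) - {target})
    target_support = support({target}, transactions)
    rules = []
    for code in codes:
        joint = support({code, target}, transactions)
        if joint < min_support:
            continue  # the pair is too rare to report
        confidence = joint / support({code}, transactions)
        if confidence < min_confidence:
            continue  # the code does not predict the target often enough
        lift = confidence / target_support  # > 1 means positive association
        rules.append((code, joint, confidence, lift))
    return rules

# "N17" (acute kidney failure) stands in for the AKI diagnosis in this toy example.
aki_rules = rules_for("N17", stays)
for code, sup, conf, lift in aki_rules:
    print(f"{code} -> N17: support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A full Apriori implementation would also grow multi-code antecedents level by level, pruning any candidate whose subsets fall below the support threshold.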
Affiliation(s)
- Menglu Wang
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, 511436, China
- Guangjian Liu
  - Shenzhen Dymind Biotechnology Co., Ltd, Shenzhen, 518000, China
- Zhennan Ni
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, 511436, China
- Qianjun Yang
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, 511436, China
- Xiaojun Li
  - Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China
- Zhisheng Bi
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, 511436, China
  - Department of Emergency Medicine, the Second Affiliated Hospital, Guangzhou Medical University, Guangzhou, 510260, China
2
Glen AK, Ma C, Mendoza L, Womack F, Wood EC, Sinha M, Acevedo L, Kvarfordt LG, Peene RC, Liu S, Hoffman AS, Roach JC, Deutsch EW, Ramsey SA, Koslicki D. ARAX: a graph-based modular reasoning tool for translational biomedicine. Bioinformatics 2023; 39:7031241. [PMID: 36752514 PMCID: PMC10027432 DOI: 10.1093/bioinformatics/btad082] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/15/2022] [Revised: 12/17/2022] [Accepted: 02/07/2023] [Indexed: 04/12/2023] Open
Abstract
MOTIVATION With the rapidly growing volume of knowledge and data in biomedical databases, improved methods for knowledge-graph-based computational reasoning are needed in order to answer translational questions. Previous efforts to solve such challenging computational reasoning problems have contributed tools and approaches, but progress has been hindered by the lack of an expressive analysis workflow language for translational reasoning and by the lack of a reasoning engine (supporting that language) that federates semantically integrated knowledge-bases. RESULTS We introduce ARAX, a new reasoning system for translational biomedicine that provides a web browser user interface and an application programming interface (API). ARAX enables users to encode translational biomedical questions and to integrate knowledge across sources to answer the user's query and facilitate exploration of results. For ARAX, we developed new approaches to query planning, knowledge-gathering, reasoning and result ranking and dynamically integrate knowledge providers for answering biomedical questions. To illustrate ARAX's application and utility in specific disease contexts, we present several use-case examples. AVAILABILITY AND IMPLEMENTATION The source code and technical documentation for building the ARAX server-side software and its built-in knowledge database are freely available online (https://github.com/RTXteam/RTX). We provide a hosted ARAX service with a web browser interface at arax.rtx.ai and a web API endpoint at arax.rtx.ai/api/arax/v1.3/ui/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Luis Mendoza
  - Institute for Systems Biology, Seattle, WA 98109, USA
- Finn Womack
  - Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
- E C Wood
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Meghamala Sinha
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Liliana Acevedo
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Lindsey G Kvarfordt
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Ross C Peene
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Shaopeng Liu
  - Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
- Andrew S Hoffman
  - Interdisciplinary Hub for Digitalization and Society, Radboud University, Nijmegen 6500GL, The Netherlands
- Jared C Roach
  - Institute for Systems Biology, Seattle, WA 98109, USA
3
Liu C, Ta CN, Havrilla JM, Nestor JG, Spotnitz ME, Geneslaw AS, Hu Y, Chung WK, Wang K, Weng C. OARD: Open annotations for rare diseases and their phenotypes based on real-world data. Am J Hum Genet 2022; 109:1591-1604. [PMID: 35998640 PMCID: PMC9502051 DOI: 10.1016/j.ajhg.2022.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 08/01/2022] [Indexed: 11/23/2022] Open
Abstract
Diagnosis for rare genetic diseases often relies on phenotype-driven methods, which hinge on the accuracy and completeness of the rare disease phenotypes in the underlying annotation knowledgebase. Existing knowledgebases are often manually curated with additional annotations found in published case reports. Despite their potential, real-world data such as electronic health records (EHRs) have not been fully exploited to derive rare disease annotations. Here, we present open annotation for rare diseases (OARD), a real-world-data-derived resource with annotation for rare-disease-related phenotypes. This resource is derived from the EHRs of two academic health institutions containing more than 10 million individuals spanning wide age ranges and different disease subgroups. By leveraging ontology mapping and advanced natural-language-processing (NLP) methods, OARD automatically and efficiently extracts concepts for both rare diseases and their phenotypic traits from billing codes and lab tests as well as over 100 million clinical narratives. The rare disease prevalence derived by OARD is highly correlated with that annotated in the original rare disease knowledgebase. By performing association analysis, we identified more than 1 million novel disease-phenotype association pairs that were previously missed by human annotation, and >60% were confirmed true associations via manual review of a list of sampled pairs. Compared to the manually curated annotation, OARD is 100% data driven and its pipeline can be shared across different institutions. By supporting privacy-preserving sharing of aggregated summary statistics, such as term frequencies and disease-phenotype associations, it fills an important gap to facilitate data-driven research in the rare disease community.
Affiliation(s)
- Cong Liu
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Casey N Ta
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Jim M Havrilla
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Jordan G Nestor
  - Division of Nephrology, Department of Medicine, Columbia University, New York, NY 10032, USA
- Matthew E Spotnitz
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Andrew S Geneslaw
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Yu Hu
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Wendy K Chung
  - Department of Pediatrics, Columbia University Irving Medical Center, New York, NY 10032, USA
- Kai Wang
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
4
Kashatnikova DA, Khadzhieva MB, Kolobkov DS, Belopolskaya OB, Smelaya TV, Gracheva AS, Kalinina EV, Larin SS, Kuzovlev AN, Salnikova LE. Pneumonia and Related Conditions in Critically Ill Patients—Insights from Basic and Experimental Studies. Int J Mol Sci 2022; 23:ijms23179896. [PMID: 36077293 PMCID: PMC9456259 DOI: 10.3390/ijms23179896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 08/22/2022] [Accepted: 08/25/2022] [Indexed: 11/17/2022] Open
Abstract
Pneumonia is an acute infectious disease with high morbidity and mortality rates. Pneumonia’s development, severity and outcome depend on age, comorbidities and the host immune response. In this study, we combined theoretical and experimental investigations to characterize pneumonia and its comorbidities as well as to assess the host immune response measured by TREC/KREC levels in patients with pneumonia. The theoretical study was carried out using the Columbia Open Health Data (COHD) resource, which provides access to clinical concept prevalence and co-occurrence from electronic health records. The experimental study included TREC/KREC assays in young adults (18–40 years) with community-acquired (CAP) (n = 164) or nosocomial (NP) (n = 99) pneumonia and healthy controls (n = 170). Co-occurring rates between pneumonia, sepsis, acute respiratory distress syndrome (ARDS) and some other related conditions common in intensive care units were the top among 4170, 3382 and 963 comorbidities in pneumonia, sepsis and ARDS, respectively. CAP patients had higher TREC levels, while NP patients had lower TREC/KREC levels compared to controls. Low TREC and KREC levels were predictive for the development of NP, ARDS, sepsis and lethal outcome (AUC for TREC in the range 0.71–0.82, AUC for KREC in the range 0.67–0.74). TREC/KREC analysis can be considered as a potential prognostic test in patients with pneumonia.
Affiliation(s)
- Darya A. Kashatnikova
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
- Maryam B. Khadzhieva
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
  - The Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow 107031, Russia
  - The Laboratory of Molecular Immunology, Rogachev National Research Center of Pediatric Hematology, Oncology and Immunology, Moscow 117198, Russia
- Dmitry S. Kolobkov
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
- Olesya B. Belopolskaya
  - The Resource Center “Bio-Bank Center”, Research Park of St. Petersburg State University, St. Petersburg 199034, Russia
- Tamara V. Smelaya
  - The Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow 107031, Russia
- Alesya S. Gracheva
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
  - The Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow 107031, Russia
- Ekaterina V. Kalinina
  - The Laboratory of Molecular Immunology, Rogachev National Research Center of Pediatric Hematology, Oncology and Immunology, Moscow 117198, Russia
- Sergey S. Larin
  - The Laboratory of Molecular Immunology, Rogachev National Research Center of Pediatric Hematology, Oncology and Immunology, Moscow 117198, Russia
- Artem N. Kuzovlev
  - The Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow 107031, Russia
- Lyubov E. Salnikova
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
  - The Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow 107031, Russia
  - The Laboratory of Molecular Immunology, Rogachev National Research Center of Pediatric Hematology, Oncology and Immunology, Moscow 117198, Russia
5
Fecho K, Ahalt SC, Appold S, Arunachalam S, Pfaff E, Stillwell L, Valencia A, Xu H, Peden DB. Development and Application of an Open Tool for Sharing and Analyzing Integrated Clinical and Environmental Exposures Data: Asthma Use Case. JMIR Form Res 2022; 6:e32357. [PMID: 35363149 PMCID: PMC9015759 DOI: 10.2196/32357] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/23/2021] [Revised: 12/21/2021] [Accepted: 12/22/2021] [Indexed: 11/25/2022] Open
Abstract
Background The Integrated Clinical and Environmental Exposures Service (ICEES) serves as an open-source, disease-agnostic, regulatory-compliant framework and approach for openly exposing and exploring clinical data that have been integrated at the patient level with a variety of environmental exposures data. ICEES is equipped with tools to support basic statistical exploration of the integrated data in a completely open manner. Objective This study aims to further develop and apply ICEES as a novel tool for openly exposing and exploring integrated clinical and environmental data. We focus on an asthma use case. Methods We queried the ICEES open application programming interface (OpenAPI) using a functionality that supports chi-square tests between feature variables and a primary outcome measure, with a Bonferroni correction for multiple comparisons (α=.001). We focused on 2 primary outcomes that are indicative of asthma exacerbations: annual emergency department (ED) or inpatient visits for respiratory issues; and annual prescriptions for prednisone. Results Of the 157,410 patients within the asthma cohort, 26,332 (16.73%) had 1 or more annual ED or inpatient visits for respiratory issues, and 17,056 (10.84%) had 1 or more annual prescriptions for prednisone. We found that close proximity to a major roadway or highway, exposure to high levels of particulate matter ≤2.5 μm (PM2.5) or ozone, female sex, Caucasian race, low residential density, lack of health insurance, and low household income were significantly associated with asthma exacerbations (P<.001). Asthma exacerbations did not vary by rural versus urban residence. Moreover, the results were largely consistent across outcome measures. Conclusions Our results demonstrate that the open-source ICEES can be used to replicate and extend published findings on factors that influence asthma exacerbations. As a disease-agnostic, open-source approach for integrating, exposing, and exploring patient-level clinical and environmental exposures data, we believe that ICEES will have broad adoption by other institutions and application in environmental health and other biomedical fields.
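The kind of test described in the Methods, a chi-square test per feature with a Bonferroni-adjusted threshold, can be sketched as follows. This does not call the ICEES OpenAPI: the counts, feature names, and the family-wise `alpha` are hypothetical, and the p-value uses the identity that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = exposed / not exposed, columns = exacerbation / none.
tables = {
    "near_major_roadway": (300, 700, 200, 800),
    "rural_residence": (250, 750, 245, 755),
}

alpha = 0.05                      # assumed family-wise error rate
threshold = alpha / len(tables)   # Bonferroni: divide by the number of tests

significant = {}
for name, cells in tables.items():
    chi2, p_value = chi_square_2x2(*cells)
    significant[name] = p_value < threshold
    print(f"{name}: chi2={chi2:.2f} p={p_value:.3g} significant={significant[name]}")
```

Bonferroni is conservative when many features are tested, which is one reason a strict threshold such as the one quoted in the abstract is plausible for a screen across many variables.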
Affiliation(s)
- Karamarie Fecho
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Stanley C Ahalt
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Stephen Appold
  - Kenan-Flagler Business School, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Saravanan Arunachalam
  - Institute for the Environment, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Emily Pfaff
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Lisa Stillwell
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Alejandro Valencia
  - Institute for the Environment, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Hao Xu
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- David B Peden
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  - Division of Allergy & Immunology, Department of Pediatrics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  - Center for Environmental Medicine, Asthma and Lung Biology, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
6
An artificial intelligence framework integrating longitudinal electronic health records with real-world data enables continuous pan-cancer prognostication. NATURE CANCER 2022; 2:709-722. [PMID: 35121948 DOI: 10.1038/s43018-021-00236-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/14/2020] [Accepted: 06/14/2021] [Indexed: 12/11/2022]
Abstract
Despite widespread adoption of electronic health records (EHRs), most hospitals are not ready to implement data science research in the clinical pipelines. Here, we develop MEDomics, a continuously learning infrastructure through which multimodal health data are systematically organized and data quality is assessed with the goal of applying artificial intelligence for individual prognosis. Using this framework, currently composed of thousands of individuals with cancer and millions of data points over a decade of data recording, we demonstrate prognostic utility of this framework in oncology. As proof of concept, we report an analysis using this infrastructure, which identified the Framingham risk score to be robustly associated with mortality among individuals with early-stage and advanced-stage cancer, a potentially actionable finding from a real-world cohort of individuals with cancer. Finally, we show how natural language processing (NLP) of medical notes could be used to continuously update estimates of prognosis as a given individual's disease course unfolds.
7
Artificial Intelligence in Clinical Immunology. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_83] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
8
Buchlak QD, Esmaili N, Bennett C, Farrokhi F. Natural Language Processing Applications in the Clinical Neurosciences: A Machine Learning Augmented Systematic Review. ACTA NEUROCHIRURGICA. SUPPLEMENT 2022; 134:277-289. [PMID: 34862552 DOI: 10.1007/978-3-030-85292-4_32] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
Natural language processing (NLP), a domain of artificial intelligence (AI) that models human language, has been used in medicine to automate diagnostics, detect adverse events, support decision making and predict clinical outcomes. However, applications to the clinical neurosciences appear to be limited. NLP has matured with the implementation of deep transformer models (e.g., XLNet, BERT, T5, and RoBERTa) and transfer learning. The objectives of this study were to (1) systematically review NLP applications in the clinical neurosciences, and (2) explore NLP analysis to facilitate literature synthesis, providing clear examples to demonstrate the potential capabilities of these technologies for a clinical audience. Our NLP analysis consisted of keyword identification, text summarization and document classification. A total of 48 articles met inclusion criteria. NLP has been applied in the clinical neurosciences to facilitate literature synthesis, data extraction, patient identification, automated clinical reporting and outcome prediction. The number of publications applying NLP has increased rapidly over the past five years. Document classifiers trained to differentiate included and excluded articles demonstrated moderate performance (XLNet AUC = 0.66, BERT AUC = 0.59, RoBERTa AUC = 0.62). The T5 transformer model generated acceptable abstract summaries. The application of NLP has the potential to enhance research and practice in the clinical neurosciences.
Affiliation(s)
- Quinlan D Buchlak
  - School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
- Nazanin Esmaili
  - School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
  - Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, Australia
- Christine Bennett
  - School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
- Farrokh Farrokhi
  - Neuroscience Institute, Virginia Mason Medical Center, Seattle, WA, USA
9
Nguyen TM, Bharti S, Yue Z, Willey CD, Chen JY. Statistical Enrichment Analysis of Samples: A General-Purpose Tool to Annotate Metadata Neighborhoods of Biological Samples. Front Big Data 2021; 4:725276. [PMID: 34604741 PMCID: PMC8481385 DOI: 10.3389/fdata.2021.725276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Accepted: 09/06/2021] [Indexed: 11/13/2022] Open
Abstract
Unsupervised learning techniques, such as clustering and embedding, have become increasingly popular for clustering biomedical samples from high-dimensional biomedical data. Extracting clinical data or sample meta-data shared in common among biomedical samples of a given biological condition remains a major challenge. Here, we describe a powerful analytical method called Statistical Enrichment Analysis of Samples (SEAS) for interpreting clustered or embedded sample data from omics studies. The method derives its power by focusing on sample sets, i.e., groups of biological samples that were constructed for various purposes, e.g., manual curation of samples sharing specific characteristics or automated clusters generated by embedding sample omic profiles from multi-dimensional omics space. The samples in the sample set share common clinical measurements, which we refer to as "clinotypes," such as age group, gender, treatment status, or survival days. We demonstrate how SEAS yields insights into biological data sets using glioblastoma (GBM) samples. Notably, when analyzing the combined The Cancer Genome Atlas (TCGA)-patient-derived xenograft (PDX) data, SEAS allows approximating the different clinical outcomes of radiotherapy-treated PDX samples, which has not been solved by other tools. The result shows that SEAS may support clinical decisions. The SEAS tool is freely available as a software package at https://aimed-lab.shinyapps.io/SEAS/.
Affiliation(s)
- Thanh M Nguyen
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Samuel Bharti
  - Centre for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, India
- Zongliang Yue
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Christopher D Willey
  - Department of Radiation Oncology, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Jake Y Chen
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
10
Abstract
A huge array of data in nephrology is collected through patient registries, large epidemiological studies, electronic health records, administrative claims, clinical trial repositories, mobile health devices and molecular databases. Application of these big data, particularly using machine-learning algorithms, provides a unique opportunity to obtain novel insights into kidney diseases, facilitate personalized medicine and improve patient care. Efforts to make large volumes of data freely accessible to the scientific community, increased awareness of the importance of data sharing and the availability of advanced computing algorithms will facilitate the use of big data in nephrology. However, challenges exist in accessing, harmonizing and integrating datasets in different formats from disparate sources, improving data quality and ensuring that data are secure and the rights and privacy of patients and research participants are protected. In addition, the optimism for data-driven breakthroughs in medicine is tempered by scepticism about the accuracy of calibration and prediction from in silico techniques. Machine-learning algorithms designed to study kidney health and diseases must be able to handle the nuances of this specialty, must adapt as medical practice continually evolves, and must have global and prospective applicability for external and future datasets.
11
Lee J, Kim JH, Liu C, Hripcsak G, Natarajan K, Ta C, Weng C. Columbia Open Health Data for COVID-19 Research: Database Analysis. J Med Internet Res 2021; 23:e31122. [PMID: 34543225 PMCID: PMC8485985 DOI: 10.2196/31122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2021] [Revised: 08/03/2021] [Accepted: 08/03/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND COVID-19 has threatened the health of tens of millions of people all over the world. Massive research efforts have been made in response to the COVID-19 pandemic. Utilization of clinical data can accelerate these research efforts to combat the pandemic since important characteristics of the patients are often found by examining the clinical data. Publicly accessible clinical data on COVID-19, however, remain limited despite the immediate need. OBJECTIVE To provide shareable clinical data to catalyze COVID-19 research, we present Columbia Open Health Data for COVID-19 Research (COHD-COVID), a publicly accessible database providing clinical concept prevalence, clinical concept co-occurrence, and clinical symptom prevalence for hospitalized patients with COVID-19. COHD-COVID also provides data on hospitalized patients with influenza and general hospitalized patients as comparator cohorts. METHODS The data used in COHD-COVID were obtained from NewYork-Presbyterian/Columbia University Irving Medical Center's electronic health records database. Condition, drug, and procedure concepts were obtained from the visits of identified patients from the cohorts. Rare concepts were excluded, and the true concept counts were perturbed using Poisson randomization to protect patient privacy. Concept prevalence, concept prevalence ratio, concept co-occurrence, and symptom prevalence were calculated using the obtained concepts. RESULTS Concept prevalence and concept prevalence ratio analyses showed the clinical characteristics of the COVID-19 cohorts, confirming the well-known characteristics of COVID-19 (eg, acute lower respiratory tract infection and cough). The concepts related to the well-known characteristics of COVID-19 recorded high prevalence and high prevalence ratio in the COVID-19 cohort compared to the hospitalized influenza cohort and general hospitalized cohort. Concept co-occurrence analyses showed potential associations between specific concepts. In case of acute lower respiratory tract infection in the COVID-19 cohort, a high co-occurrence ratio was obtained with COVID-19-related concepts and commonly used drugs (eg, disease due to coronavirus and acetaminophen). Symptom prevalence analysis indicated symptom-level characteristics of the cohorts and confirmed that well-known symptoms of COVID-19 (eg, fever, cough, and dyspnea) showed higher prevalence than the hospitalized influenza cohort and the general hospitalized cohort. CONCLUSIONS We present COHD-COVID, a publicly accessible database providing useful clinical data for hospitalized patients with COVID-19, hospitalized patients with influenza, and general hospitalized patients. We expect COHD-COVID to provide researchers and clinicians quantitative measures of COVID-19-related clinical features to better understand and combat the pandemic.
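The privacy step described in the Methods, dropping rare concepts and then perturbing the remaining counts with Poisson randomization, can be sketched roughly as below. The abstract does not give the rarity cutoff or the exact procedure, so `min_count`, the cohort size, the concept names, and the choice to draw each released count from a Poisson distribution centered on the true count are all assumptions made for illustration.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (adequate for modest lam)."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def perturb_counts(true_counts, min_count=10, seed=0):
    """Drop concepts rarer than `min_count`, then replace each count with a Poisson draw."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return {
        concept: poisson_draw(count, rng)
        for concept, count in true_counts.items()
        if count >= min_count
    }

# Hypothetical true counts of patients per concept in a cohort of 1,000 patients.
cohort_size = 1000
true_counts = {"cough": 120, "fever": 45, "rare_condition": 3}

released = perturb_counts(true_counts)
prevalence = {concept: count / cohort_size for concept, count in released.items()}
print(released, prevalence)
```

Because a Poisson draw has mean equal to the true count, aggregate prevalence stays approximately right while any single released number no longer reveals the exact count.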
Affiliation(s)
- Junghwan Lee
  - Columbia University, New York, NY, United States
- Jae Hyun Kim
  - Columbia University, New York, NY, United States
- Cong Liu
  - Columbia University, New York, NY, United States
- Casey Ta
  - Columbia University, New York, NY, United States
- Chunhua Weng
  - Columbia University, New York, NY, United States
12
|
Datta A, Flynn NR, Barnette DA, Woeltje KF, Miller GP, Swamidass SJ. Machine learning liver-injuring drug interactions with non-steroidal anti-inflammatory drugs (NSAIDs) from a retrospective electronic health record (EHR) cohort. PLoS Comput Biol 2021; 17:e1009053. [PMID: 34228716 PMCID: PMC8284671 DOI: 10.1371/journal.pcbi.1009053] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 12/18/2020] [Revised: 07/16/2021] [Accepted: 05/08/2021] [Indexed: 01/14/2023] Open
Abstract
Drug-drug interactions account for up to 30% of adverse drug reactions. Increasing prevalence of electronic health records (EHRs) offers a unique opportunity to build machine learning algorithms to identify drug-drug interactions that drive adverse events. In this study, we investigated hospitalization data to study drug interactions with non-steroidal anti-inflammatory drugs (NSAIDs) that result in drug-induced liver injury (DILI). We propose a logistic regression-based machine learning algorithm that unearths several known interactions from an EHR dataset of about 400,000 hospitalizations. Our proposed modeling framework is successful in detecting 87.5% of the positive controls, which are defined as drugs known to interact with diclofenac to cause an increased risk of DILI, and correctly ranks the aggregate risk of DILI for eight commonly prescribed NSAIDs. We found that our modeling framework is particularly successful in inferring associations of drug-drug interactions from relatively small EHR datasets. Furthermore, we have identified a novel and potentially hepatotoxic interaction that might occur during concomitant use of meloxicam and esomeprazole, which are commonly prescribed together to allay NSAID-induced gastrointestinal (GI) bleeding. Empirically, we validate our approach against prior methods for signal detection on EHR datasets; our proposed approach outperforms all the compared methods across most metrics, such as area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC).
Affiliation(s)
- Arghya Datta
  - Department of Computer Science and Engineering, Washington University in Saint Louis, Saint Louis, Missouri, United States of America
- Noah R. Flynn
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, Missouri, United States of America
- Dustyn A. Barnette
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
- Keith F. Woeltje
  - Department of Internal Medicine, Washington University School of Medicine, Saint Louis, Missouri, United States of America
  - Center for Clinical Excellence at BJC HealthCare, Saint Louis, Missouri, United States of America
- Grover P. Miller
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
- S. Joshua Swamidass
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, Missouri, United States of America
|
13
|
Lee J, Liu C, Kim JH, Butler A, Shang N, Pang C, Natarajan K, Ryan P, Ta C, Weng C. Comparative effectiveness of medical concept embedding for feature engineering in phenotyping. JAMIA Open 2021; 4:ooab028. [PMID: 34142015 PMCID: PMC8206403 DOI: 10.1093/jamiaopen/ooab028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/17/2020] [Revised: 02/23/2021] [Accepted: 05/03/2021] [Indexed: 01/20/2023] Open
Abstract
Objective Feature engineering is a major bottleneck in phenotyping. Properly learned medical concept embeddings (MCEs) capture the semantics of medical concepts and are thus useful for retrieving relevant medical features in phenotyping tasks. We compared the effectiveness of MCEs learned from knowledge graphs and electronic healthcare records (EHR) data in retrieving relevant medical features for phenotyping tasks. Materials and Methods We implemented 5 embedding methods including node2vec, singular value decomposition (SVD), LINE, skip-gram, and GloVe with 2 data sources: (1) knowledge graphs obtained from the observational medical outcomes partnership (OMOP) common data model; and (2) patient-level data obtained from the OMOP compatible electronic health records (EHR) from Columbia University Irving Medical Center (CUIMC). We used phenotypes with their relevant concepts developed and validated by the electronic medical records and genomics (eMERGE) network to evaluate the performance of learned MCEs in retrieving phenotype-relevant concepts. Hits@k% in retrieving phenotype-relevant concepts based on a single and multiple seed concept(s) was used to evaluate MCEs. Results Among all MCEs, MCEs learned by using node2vec with knowledge graphs showed the best performance. Among MCEs based on knowledge graphs and among MCEs based on EHR data, those learned by using node2vec and those learned by using GloVe, respectively, outperformed the other MCEs. Conclusion MCEs enable scalable feature engineering, thereby facilitating phenotyping. Based on current phenotyping practices, MCEs learned by using knowledge graphs constructed by hierarchical relationships among medical concepts outperformed MCEs learned by using EHR data.
Affiliation(s)
- Junghwan Lee
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Cong Liu
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Jae Hyun Kim
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Alex Butler
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Ning Shang
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Chao Pang
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Karthik Natarajan
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Patrick Ryan
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Casey Ta
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
|
14
|
Havrilla JM, Liu C, Dong X, Weng C, Wang K. PhenCards: a data resource linking human phenotype information to biomedical knowledge. Genome Med 2021; 13:91. [PMID: 34034817 PMCID: PMC8147460 DOI: 10.1186/s13073-021-00909-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/03/2021] [Accepted: 05/13/2021] [Indexed: 02/07/2023] Open
Abstract
We present PhenCards (https://phencards.org), a database and web server intended as a one-stop shop for previously disconnected biomedical knowledge related to human clinical phenotypes. Users can query human phenotype terms or clinical notes. PhenCards obtains relevant disease/phenotype prevalence and co-occurrence, drug, procedural, pathway, literature, grant, and collaborator data. PhenCards recommends the most probable genetic diseases and candidate genes based on phenotype terms from clinical notes. PhenCards facilitates exploration of phenotypes, e.g., which drugs cause or are prescribed for patient symptoms, which genes likely cause specific symptoms, and which comorbidities co-occur with phenotypes.
Affiliation(s)
- James M Havrilla
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Cong Liu
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Xiangchen Dong
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Kai Wang
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
|
15
|
Gene expression barcode values reveal a potential link between Parkinson's disease and gastric cancer. Aging (Albany NY) 2021; 13:6171-6181. [PMID: 33596182 PMCID: PMC7950232 DOI: 10.18632/aging.202623] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/12/2020] [Accepted: 01/22/2021] [Indexed: 12/11/2022]
Abstract
Gastric cancer is a disease that develops from the lining of the stomach, whereas Parkinson’s disease is a long-term degenerative disorder of the central nervous system that mainly affects the motor system. Although these two diseases seem to be distinct from each other, increasing evidence suggests that they might be linked. To explore the linkage between these two diseases, differentially expressed genes between patients and their normal controls were identified using the barcode algorithm. This algorithm transforms actual gene expression values into barcode values composed of 1’s (expressed genes) and 0’s (silenced genes). Once the overlapping differentially expressed genes were identified, their biological relevance was investigated. Thus, using the gene expression profiles and bioinformatics methods, we demonstrate that Parkinson’s disease and gastric cancer are indeed linked. This research may serve as a pilot study and stimulate more research into the relationship between gastric cancer and Parkinson’s disease from the perspective of gene profiles and their functions.
|
16
|
Fecho K, Pfaff E, Xu H, Champion J, Cox S, Stillwell L, Peden DB, Bizon C, Krishnamurthy A, Tropsha A, Ahalt SC. A novel approach for exposing and sharing clinical data: the Translator Integrated Clinical and Environmental Exposures Service. J Am Med Inform Assoc 2019; 26:1064-1073. [PMID: 31077269 DOI: 10.1093/jamia/ocz042] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/25/2018] [Revised: 03/12/2019] [Accepted: 03/25/2019] [Indexed: 12/12/2022] Open
Abstract
OBJECTIVE This study aimed to develop a novel, regulatory-compliant approach for openly exposing integrated clinical and environmental exposures data: the Integrated Clinical and Environmental Exposures Service (ICEES). MATERIALS AND METHODS The driving clinical use case for research and development of ICEES was asthma, which is a common disease influenced by hundreds of genes and a plethora of environmental exposures, including exposures to airborne pollutants. We developed a pipeline for integrating clinical data on patients with asthma-like conditions with data on environmental exposures derived from multiple public data sources. The data were integrated at the patient and visit level and used to create de-identified, binned, "integrated feature tables," which were then placed behind an OpenAPI. RESULTS Our preliminary evaluation results demonstrate a relationship between exposure to high levels of particulate matter ≤2.5 µm in diameter (PM2.5) and the frequency of emergency department or inpatient visits for respiratory issues. For example, 16.73% of patients with average daily exposure to PM2.5 >9.62 µg/m³ experienced 2 or more emergency department or inpatient visits for respiratory issues in 2010, compared with 7.93% of patients with lower exposures (n = 23 093). DISCUSSION The results validated our overall approach for openly exposing and sharing integrated clinical and environmental exposures data. We plan to iteratively refine and expand ICEES by including additional years of data, feature variables, and disease cohorts. CONCLUSIONS We believe that ICEES will serve as a regulatory-compliant model and approach for promoting open access to and sharing of integrated clinical and environmental exposures data.
Affiliation(s)
- Karamarie Fecho
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Emily Pfaff
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Hao Xu
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- James Champion
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Steve Cox
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Lisa Stillwell
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- David B Peden
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Division of Allergy, Immunology and Rheumatology, Center for Environmental Medicine, Asthma & Lung Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Department of Pediatrics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Chris Bizon
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Ashok Krishnamurthy
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Alexander Tropsha
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Stanley C Ahalt
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
17
|
Morales DR, Conover MM, You SC, Pratt N, Kostka K, Duarte-Salles T, Fernández-Bertolín S, Aragón M, DuVall SL, Lynch K, Falconer T, van Bochove K, Sung C, Matheny ME, Lambert CG, Nyberg F, Alshammari TM, Williams AE, Park RW, Weaver J, Sena AG, Schuemie MJ, Rijnbeek PR, Williams RD, Lane JCE, Prats-Uribe A, Zhang L, Areia C, Krumholz HM, Prieto-Alhambra D, Ryan PB, Hripcsak G, Suchard MA. Renin-angiotensin system blockers and susceptibility to COVID-19: an international, open science, cohort analysis. Lancet Digit Health 2021; 3:e98-e114. [PMID: 33342753 PMCID: PMC7834915 DOI: 10.1016/s2589-7500(20)30289-2] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 08/31/2020] [Revised: 10/29/2020] [Accepted: 11/13/2020] [Indexed: 12/17/2022]
Abstract
BACKGROUND Angiotensin-converting enzyme inhibitors (ACEIs) and angiotensin receptor blockers (ARBs) have been postulated to affect susceptibility to COVID-19. Observational studies so far have lacked rigorous ascertainment adjustment and international generalisability. We aimed to determine whether use of ACEIs or ARBs is associated with an increased susceptibility to COVID-19 in patients with hypertension. METHODS In this international, open science, cohort analysis, we used electronic health records from Spain (Information Systems for Research in Primary Care [SIDIAP]) and the USA (Columbia University Irving Medical Center data warehouse [CUIMC] and Department of Veterans Affairs Observational Medical Outcomes Partnership [VA-OMOP]) to identify patients aged 18 years or older with at least one prescription for ACEIs and ARBs (target cohort) or calcium channel blockers (CCBs) and thiazide or thiazide-like diuretics (THZs; comparator cohort) between Nov 1, 2019, and Jan 31, 2020. Users were defined separately as receiving either monotherapy with these four drug classes, or monotherapy or combination therapy (combination use) with other antihypertensive medications. We assessed four outcomes: COVID-19 diagnosis; hospital admission with COVID-19; hospital admission with pneumonia; and hospital admission with pneumonia, acute respiratory distress syndrome, acute kidney injury, or sepsis. We built large-scale propensity score methods derived through a data-driven approach and negative control experiments across ten pairwise comparisons, with results meta-analysed to generate 1280 study effects. For each study effect, we did negative control outcome experiments using a possible 123 controls identified through a data-rich algorithm. This process used a set of predefined baseline patient characteristics to provide the most accurate prediction of treatment and balance among patient cohorts across characteristics. 
The study is registered with the EU Post-Authorisation Studies register, EUPAS35296. FINDINGS Among 1 355 349 antihypertensive users (363 785 ACEI or ARB monotherapy users, 248 915 CCB or THZ monotherapy users, 711 799 ACEI or ARB combination users, and 473 076 CCB or THZ combination users) included in analyses, no association was observed between COVID-19 diagnosis and exposure to ACEI or ARB monotherapy versus CCB or THZ monotherapy (calibrated hazard ratio [HR] 0·98, 95% CI 0·84-1·14) or combination use exposure (1·01, 0·90-1·15). ACEIs alone similarly showed no relative risk difference when compared with CCB or THZ monotherapy (HR 0·91, 95% CI 0·68-1·21; with heterogeneity of >40%) or combination use (0·95, 0·83-1·07). Directly comparing ACEIs with ARBs demonstrated a moderately lower risk with ACEIs, which was significant with combination use (HR 0·88, 95% CI 0·79-0·99) and non-significant for monotherapy (0·85, 0·69-1·05). We observed no significant difference between drug classes for risk of hospital admission with COVID-19, hospital admission with pneumonia, or hospital admission with pneumonia, acute respiratory distress syndrome, acute kidney injury, or sepsis across all comparisons. INTERPRETATION No clinically significant increased risk of COVID-19 diagnosis or hospital admission-related outcomes associated with ACEI or ARB use was observed, suggesting users should not discontinue or change their treatment to decrease their risk of COVID-19. FUNDING Wellcome Trust, UK National Institute for Health Research, US National Institutes of Health, US Department of Veterans Affairs, Janssen Research & Development, IQVIA, South Korean Ministry of Health and Welfare Republic, Australian National Health and Medical Research Council, and European Health Data and Evidence Network.
Affiliation(s)
- Daniel R Morales
  - Division of Population Health and Genomics, University of Dundee, Dundee, UK
- Mitchell M Conover
  - Observational Health Data Analytics, Janssen Research & Development, Titusville, NJ, USA
- Seng Chan You
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea
- Nicole Pratt
  - Quality Use of Medicines and Pharmacy Research Centre, Clinical and Health Sciences, University of South Australia, Adelaide, SA, Australia
- Talita Duarte-Salles
  - Fundació Institut Universitari per a la Recerca a l'Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain
- Sergio Fernández-Bertolín
  - Fundació Institut Universitari per a la Recerca a l'Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain
- Maria Aragón
  - Fundació Institut Universitari per a la Recerca a l'Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain
- Scott L DuVall
  - Department of Veterans Affairs, Salt Lake City, UT, USA; University of Utah School of Medicine, Salt Lake City, UT, USA
- Kristine Lynch
  - Department of Veterans Affairs, Salt Lake City, UT, USA; University of Utah School of Medicine, Salt Lake City, UT, USA
- Thomas Falconer
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Cynthia Sung
  - Health Services and Systems Research, Duke-NUS Medical School, Singapore
- Michael E Matheny
  - Geriatric Research Education and Clinical Care Center, Tennessee Valley Healthcare System VA, Nashville, TN, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Christophe G Lambert
  - Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, NM, USA
- Fredrik Nyberg
  - School of Public Health and Community Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Thamir M Alshammari
  - Medication Safety Research Chair, King Saud University, Riyadh, Saudi Arabia
- Rae Woong Park
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea
- James Weaver
  - Observational Health Data Analytics, Janssen Research & Development, Titusville, NJ, USA
- Anthony G Sena
  - Observational Health Data Analytics, Janssen Research & Development, Titusville, NJ, USA; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, Netherlands
- Martijn J Schuemie
  - Observational Health Data Analytics, Janssen Research & Development, Titusville, NJ, USA
- Peter R Rijnbeek
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, Netherlands
- Ross D Williams
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, Netherlands
- Jennifer C E Lane
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Albert Prats-Uribe
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Lin Zhang
  - School of Public Health, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Melbourne School of Public Health, The University of Melbourne, VIC, Australia
- Carlos Areia
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Harlan M Krumholz
  - Section of Cardiovascular Medicine, Department of Medicine, Yale University, New Haven, CT, USA
- Daniel Prieto-Alhambra
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Patrick B Ryan
  - Division of Population Health and Genomics, University of Dundee, Dundee, UK; Department of Biomedical Informatics, Columbia University, New York, NY, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Marc A Suchard
  - Department of Biostatistics, Fielding School of Public Health, and Department of Computational Medicine, David Geffen School of Medicine at UCLA, University of California, Los Angeles, Los Angeles, CA, USA
|
18
|
Artificial Intelligence in Clinical Immunology. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_83-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
19
|
Denecke K. Biomedical Standards and Open Health Data. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11527-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022] Open
|
20
|
Fecho K, Haaland P, Krishnamurthy A, Lan B, Ramsey SA, Schmitt PL, Sharma P, Sinha M, Xu H. An approach for open multivariate analysis of integrated clinical and environmental exposures data. Informatics in Medicine Unlocked 2021; 26. [PMID: 35875189 PMCID: PMC9302917 DOI: 10.1016/j.imu.2021.100733] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/04/2023] Open
Abstract
The Integrated Clinical and Environmental Exposures Service (ICEES) provides regulatory-compliant open access to sensitive patient data that have been integrated with public exposures data. ICEES was designed initially to support dynamic cohort creation and bivariate contingency tests. The objective of the present study was to develop an open approach to support multivariate analyses using existing ICEES functionalities and abiding by all regulatory constraints. We first developed an open approach for generating a multivariate table that maintains contingencies between clinical and environmental variables using programmatic calls to the open ICEES application programming interface. We then applied the approach to data on a large cohort (N = 22,365) of patients with asthma or related conditions and generated an eight-feature table. Due to regulatory constraints, data loss was incurred with the incorporation of each successive feature variable, from a starting sample size of N = 22,365 to a final sample size of N = 4,556 (20.4%), but data loss was < 10% until the addition of the final two feature variables. We then applied a generalized linear model to the subsequent dataset and focused on the impact of seven select feature variables on asthma exacerbations, defined as annual emergency department or inpatient visits for respiratory issues. We identified five feature variables (sex, race, obesity, prednisone, and airborne particulate exposure) as significant predictors of asthma exacerbations. We discuss the advantages and disadvantages of ICEES open multivariate analysis and conclude that, despite limitations, ICEES can provide a valuable resource for open multivariate analysis and can serve as an exemplar for regulatory-compliant informatic solutions to open patient data, with capabilities to explore the impact of environmental exposures on health outcomes.
|
21
|
Abstract
PURPOSE OF REVIEW Healthcare has already been impacted by the fourth industrial revolution, exemplified by tip-of-the-spear technologies such as artificial intelligence and quantum computing. Yet there is much to be accomplished, as systems remain suboptimal and full interoperability of digital records has not been realized. Given the footprint of technology in healthcare, the field of clinical immunology will certainly see improvements related to these tools. RECENT FINDINGS Biomedical informatics spans the gamut of technology in biomedicine. Within this distinct field, advances are being made that allow for the engineering of systems to automate disease detection, create computable phenotypes, and improve record portability. Within clinical immunology, technologies are emerging along these lines and are expected to continue to do so. SUMMARY This review highlights advancements in digital health, including learning health systems, electronic phenotyping, artificial intelligence, and the use of registries. Technological advancements for improving the diagnosis and care of patients with primary immunodeficiency diseases are also highlighted.
|
22
|
Fang J, Pieper AA, Nussinov R, Lee G, Bekris L, Leverenz JB, Cummings J, Cheng F. Harnessing endophenotypes and network medicine for Alzheimer's drug repurposing. Med Res Rev 2020; 40:2386-2426. [PMID: 32656864 PMCID: PMC7561446 DOI: 10.1002/med.21709] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 11/26/2019] [Revised: 06/23/2020] [Accepted: 06/27/2020] [Indexed: 12/16/2022]
Abstract
Following two decades of more than 400 clinical trials centered on the "one drug, one target, one disease" paradigm, there is still no effective disease-modifying therapy for Alzheimer's disease (AD). The inherent complexity of AD may challenge this reductionist strategy. Recent observations and advances in network medicine further indicate that AD likely shares common underlying mechanisms and intermediate pathophenotypes, or endophenotypes, with other diseases. In this review, we consider AD pathobiology, disease comorbidity, pleiotropy, and therapeutic development, and construct relevant endophenotype networks to guide future therapeutic development. Specifically, we discuss six main endophenotype hypotheses in AD: amyloidosis, tauopathy, neuroinflammation, mitochondrial dysfunction, vascular dysfunction, and lysosomal dysfunction. We further consider how this endophenotype network framework can provide advances in computational and experimental strategies for drug repurposing and identification of new candidate therapeutic strategies for patients suffering from or at risk for AD. We highlight new opportunities for endophenotype-informed drug discovery in AD by exploiting multi-omics data. Integration of genomics, transcriptomics, radiomics, pharmacogenomics, and interactomics (protein-protein interactions) is essential for successful drug discovery. We describe experimental technologies for AD drug discovery including human induced pluripotent stem cells, transgenic mouse/rat models, and population-based retrospective case-control studies that may be integrated with multi-omics in a network medicine methodology. In summary, endophenotype-based network medicine methodologies will promote AD therapeutic development that will optimize the usefulness of available data and support deep phenotyping of patient heterogeneity for personalized medicine in AD.
Affiliation(s)
- Jiansong Fang
  - Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou, Guangdong 510006, China
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Andrew A Pieper
  - Harrington Discovery Institute, University Hospital Case Medical Center; Department of Psychiatry, Case Western Reserve University, Geriatric Research Education and Clinical Centers, Louis Stokes Cleveland VAMC, Cleveland, OH 44106, USA
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, MD 21702, USA
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Garam Lee
  - Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, NV 89106, USA
- Lynn Bekris
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
- James B. Leverenz
  - Lou Ruvo Center for Brain Health, Neurological Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Jeffrey Cummings
  - Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, NV 89106, USA
  - Department of Brain Health, School of Integrated Health Sciences, UNLV, Las Vegas, NV 89154, USA
- Feixiong Cheng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio 44106, USA
23
Li X, Liu G, Chen W, Bi Z, Liang H. Network analysis of autistic disease comorbidities in Chinese children based on ICD-10 codes. BMC Med Inform Decis Mak 2020; 20:268. [PMID: 33069223 PMCID: PMC7568351 DOI: 10.1186/s12911-020-01282-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/15/2019] [Accepted: 10/05/2020] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Autism is a lifelong disability associated with several comorbidities that confound diagnosis and treatment. A better understanding of these comorbidities would facilitate diagnosis and improve treatments. Our aim was to improve the detection of comorbid diseases associated with autism. METHODS We used an FP-growth algorithm to retrospectively infer disease associations using 1488 patients with autism treated at the Guangzhou Women and Children's Medical Center. The disease network was established using Cytoscape 3.7. The rules were internally validated by 10-fold cross-validation. All rules were further verified using the Columbia Open Health Data (COHD) and by literature search. RESULTS We found 148 comorbid diseases including intellectual disability, developmental speech disorder, and epilepsy. The network comprised 76 nodes and 178 directed links. Of these links, 158 were confirmed by literature search and 105 were validated by COHD. Furthermore, we identified 14 links not previously reported. CONCLUSION We demonstrate that the FP-growth algorithm can detect comorbid disease patterns, including novel ones, in patients with autism.
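For readers unfamiliar with association rules, the sketch below shows how the three usual rule metrics (support, confidence, lift) are computed for a single comorbidity rule. It is not the authors' pipeline: it evaluates one rule by brute force rather than running FP-growth, and the patient records, the choice of ICD-10 codes, and the function names `support` and `rule_metrics` are all hypothetical and chosen only for illustration.

```python
# Toy data (hypothetical): each set holds the ICD-10 codes of one patient.
patients = [
    {"F84.0", "F79", "G40"},
    {"F84.0", "F79"},
    {"F84.0", "F80"},
    {"F84.0", "F79", "F80"},
    {"J45", "F79"},
    {"J45"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every code in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    ante = support(antecedent, transactions)
    cons = support(consequent, transactions)
    confidence = both / ante   # P(consequent | antecedent)
    lift = confidence / cons   # > 1 means co-occurrence above chance
    return {"support": both, "confidence": confidence, "lift": lift}


# Rule: childhood autism (F84.0) -> unspecified intellectual disability (F79)
metrics = rule_metrics({"F84.0"}, {"F79"}, patients)
print(metrics)
```

In this toy cohort, three of the four patients coded F84.0 also carry F79, so the rule's confidence is 3/4. A frequent-pattern miner such as FP-growth would enumerate all itemsets above a support threshold before rules like this one are scored.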
Affiliation(s)
- Xiaojun Li
  - Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China
- Guangjian Liu
  - Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China
- Wenxiong Chen
  - Department of Neurology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China
- Zhisheng Bi
  - School of Basic Medical Sciences, Guangzhou Medical University, Guangzhou, 511436, China
- Huiying Liang
  - Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, 510623, China
24
Ensari I, Pichon A, Lipsky-Gorman S, Bakken S, Elhadad N. Augmenting the Clinical Data Sources for Enigmatic Diseases: A Cross-Sectional Study of Self-Tracking Data and Clinical Documentation in Endometriosis. Appl Clin Inform 2020; 11:769-784. [PMID: 33207385 PMCID: PMC7673957 DOI: 10.1055/s-0040-1718755] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/17/2020] [Accepted: 07/14/2020] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Self-tracking through mobile health technology can augment the electronic health record (EHR) as an additional data source by providing direct patient input. This can be particularly useful in the context of enigmatic diseases and further promote patient engagement. OBJECTIVES This study aimed to investigate the additional information that can be gained through direct patient input on poorly understood diseases, beyond what is already documented in the EHR. METHODS This was an observational study including two samples with a clinically confirmed endometriosis diagnosis. We analyzed data from 6,925 women with endometriosis using a research app for tracking endometriosis to assess prevalence of self-reported pain problems, between- and within-person variability in pain over time, endometriosis-affected tasks of daily function, and self-management strategies. We analyzed data from 4,389 patients identified through a large metropolitan hospital EHR to compare pain problems with the self-tracking app and to identify unique data elements that can be contributed via patient self-tracking. RESULTS Pelvic pain was the most prevalent problem in the self-tracking sample (57.3%), followed by gastrointestinal-related (55.9%) and lower back (49.2%) pain. Unique problems that were captured by self-tracking included pain in ovaries (43.7%) and uterus (37.2%). Pain experience was highly variable both across and within participants over time. Within-person variation accounted for 58% of the total variance in pain scores, and was large in magnitude, based on the ratio of within- to between-person variability (0.92) and the intraclass correlation (0.42). Work was the most affected daily function task (49%), and there was significant within- and between-person variability in self-management effectiveness. Prevalence rates in the EHR were significantly lower, with abdominal pain being the most prevalent (36.5%). 
CONCLUSION For enigmatic diseases, patient self-tracking as an additional data source complementary to EHR can enable learning from the patient to more accurately and comprehensively evaluate patient health history and status.
Affiliation(s)
- Ipek Ensari
  - Data Science Institute, Columbia University, New York, New York, United States
- Adrienne Pichon
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York, United States
- Sharon Lipsky-Gorman
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York, United States
- Suzanne Bakken
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York, United States
  - Columbia School of Nursing, Columbia University, New York, New York, United States
- Noémie Elhadad
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York, United States
25
Yue X, Wang Z, Huang J, Parthasarathy S, Moosavinasab S, Huang Y, Lin SM, Zhang W, Zhang P, Sun H. Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics 2020; 36:1241-1251. [PMID: 31584634 PMCID: PMC7703771 DOI: 10.1093/bioinformatics/btz718] [Citation(s) in RCA: 94] [Impact Index Per Article: 23.5] [Received: 04/09/2019] [Revised: 08/25/2019] [Accepted: 09/26/2019] [Indexed: 01/12/2023] Open
Abstract
MOTIVATION Graph embedding learning, which aims to automatically learn low-dimensional node representations, has drawn increasing attention in recent years. To date, most recent graph embedding methods have been evaluated on social and information networks and have not been comprehensively studied on biomedical networks under systematic experiments and analyses. On the other hand, for a variety of biomedical network analysis tasks, traditional techniques such as matrix factorization (which can be seen as a type of graph embedding method) have shown promising results, and hence there is a need to systematically evaluate the more recent graph embedding methods (e.g. random walk-based and neural network-based) in terms of their usability and potential to further the state-of-the-art. RESULTS We select 11 representative graph embedding methods and conduct a systematic comparison on 3 important biomedical link prediction tasks: drug-disease association (DDA) prediction, drug-drug interaction (DDI) prediction, and protein-protein interaction (PPI) prediction; and 2 node classification tasks: medical term semantic type classification and protein function prediction. Our experimental results demonstrate that the recent graph embedding methods achieve promising results and deserve more attention in future biomedical graph analysis. Compared with three state-of-the-art methods for DDA, DDI, and protein function prediction, the recent graph embedding methods achieve competitive performance without using any biological features, and the learned embeddings can be treated as complementary representations for the biological features. By summarizing the experimental results, we provide general guidelines for properly selecting graph embedding methods and setting their hyper-parameters for different biomedical tasks.
AVAILABILITY AND IMPLEMENTATION As part of our contributions in the paper, we develop an easy-to-use Python package with detailed instructions, BioNEV, available at: https://github.com/xiangyue9607/BioNEV, including all source code and datasets, to facilitate studying various graph embedding methods on biomedical tasks. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiang Yue
  - Department of Computer Science and Engineering, OH, USA
- Zhen Wang
  - Department of Computer Science and Engineering, OH, USA
- Jingong Huang
  - Department of Electrical and Computer Engineering, The Ohio State University, Columbus, OH, USA
- Soheil Moosavinasab
  - Research Information Solutions and Innovation, The Research Institute at Nationwide Children’s Hospital, Columbus, OH, USA
- Yungui Huang
  - Research Information Solutions and Innovation, The Research Institute at Nationwide Children’s Hospital, Columbus, OH, USA
- Simon M Lin
  - Research Information Solutions and Innovation, The Research Institute at Nationwide Children’s Hospital, Columbus, OH, USA
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan, Hubei, China
- Ping Zhang
  - Department of Computer Science and Engineering, OH, USA
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Huan Sun
  - Department of Computer Science and Engineering, OH, USA
26
Fecho K, Ahalt SC, Arunachalam S, Champion J, Chute CG, Davis S, Gersing K, Glusman G, Hadlock J, Lee J, Pfaff E, Robinson M, Sid E, Ta C, Xu H, Zhu R, Zhu Q, Peden DB. Sex, obesity, diabetes, and exposure to particulate matter among patients with severe asthma: Scientific insights from a comparative analysis of open clinical data sources during a five-day hackathon. J Biomed Inform 2019; 100:103325. [PMID: 31676459 PMCID: PMC6953386 DOI: 10.1016/j.jbi.2019.103325] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 05/22/2019] [Revised: 09/06/2019] [Accepted: 10/28/2019] [Indexed: 12/14/2022]
Abstract
This special communication describes activities, products, and lessons learned from a recent hackathon that was funded by the National Center for Advancing Translational Sciences via the Biomedical Data Translator program ('Translator'). Specifically, Translator team members self-organized and worked together to conceptualize and execute, over a five-day period, a multi-institutional clinical research study that aimed to examine, using open clinical data sources, relationships between sex, obesity, diabetes, and exposure to airborne fine particulate matter among patients with severe asthma. The goal was to develop a proof of concept that this new model of collaboration and data sharing could effectively produce meaningful scientific results and generate new scientific hypotheses. Three Translator Clinical Knowledge Sources, each of which provides open access (via Application Programming Interfaces) to data derived from the electronic health record systems of major academic institutions, served as the source of study data. Jupyter Python notebooks, shared in GitHub repositories, were used to call the knowledge sources and analyze and integrate the results. The results replicated established or suspected relationships between sex, obesity, diabetes, exposure to airborne fine particulate matter, and severe asthma. In addition, the results demonstrated specific differences across the three Translator Clinical Knowledge Sources, suggesting cohort- and/or environment-specific factors related to the services themselves or the catchment area from which each service derives patient data. Collectively, this special communication demonstrates the power and utility of intense, team-oriented hackathons and offers general technical, organizational, and scientific lessons learned.
Affiliation(s)
- Karamarie Fecho
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Stanley C Ahalt
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Saravanan Arunachalam
  - Institute for the Environment, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- James Champion
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Sarah Davis
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Kenneth Gersing
  - National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD, USA
- Jewel Lee
  - Institute for Systems Biology, Seattle, WA, USA
- Emily Pfaff
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Eric Sid
  - National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD, USA
- Casey Ta
  - Columbia University, New York, NY, USA
- Hao Xu
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Richard Zhu
  - Johns Hopkins University, Baltimore, MD, USA
- Qian Zhu
  - National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD, USA
- David B Peden
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Center for Environmental Medicine, Asthma & Lung Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Division of Allergy, Immunology and Rheumatology, Department of Pediatrics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
27
Paik H, Kan MJ, Rappoport N, Hadley D, Sirota M, Chen B, Manber U, Cho SB, Butte AJ. Tracing diagnosis trajectories over millions of patients reveal an unexpected risk in schizophrenia. Sci Data 2019; 6:201. [PMID: 31615985 PMCID: PMC6794302 DOI: 10.1038/s41597-019-0220-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/03/2019] [Accepted: 09/27/2019] [Indexed: 02/07/2023] Open
Abstract
The identification of novel disease associations using big-data for patient care has had limited success. In this study, we created a longitudinal disease network of traced readmissions (disease trajectories), merging data from over 10.4 million inpatients through the Healthcare Cost and Utilization Project, which allowed the representation of disease progression mapping over 300 diseases. From these disease trajectories, we discovered an interesting association between schizophrenia and rhabdomyolysis, a rare muscle disease (incidence < 1E-04) (relative risk, 2.21 [1.80-2.71, confidence interval = 0.95], P-value 9.54E-15). We validated this association by using independent electronic medical records from over 830,000 patients at the University of California, San Francisco (UCSF) medical center. A case review of 29 rhabdomyolysis incidents in schizophrenia patients at UCSF demonstrated that 62% are idiopathic, without the use of any drug known to lead to this adverse event, suggesting a warning to physicians to watch for this unexpected risk of schizophrenia. Large-scale analysis of disease trajectories can help physicians understand potential sequential events in their patients.
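As a reading aid for the relative risk and interval quoted above, the sketch below computes a relative risk with a 95% confidence interval from a 2x2 table using the standard log-scale (Katz) approximation. The abstract does not say which interval method the authors used, so this is an assumption, and the counts passed in are hypothetical rather than taken from the study. The function name `relative_risk` is also illustrative.

```python
import math


def relative_risk(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total, z=1.96):
    """Relative risk with a log-scale (Katz) confidence interval.

    z=1.96 corresponds to a 95% interval under a normal approximation.
    """
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 events among 10,000 patients with the index
# diagnosis versus 150 events among 100,000 patients without it.
rr, lower, upper = relative_risk(30, 10_000, 150, 100_000)
print(f"RR = {rr:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval whose lower bound stays above 1, as reported in the abstract, indicates that the elevated risk is unlikely to be explained by sampling variation alone, although it says nothing about confounding.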
Affiliation(s)
- Hyojung Paik
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, 550 16th Street, San Francisco, CA, 9414, USA
  - Department of Pediatrics, University of California, San Francisco, 550 16th Street, San Francisco, CA, 94143, USA
  - Korea Institute of Science and Technology Information, Center for Supercomputing Application, Division of Supercomputing, Daejeon, 34141, South Korea
  - National Institute of Health, Division of Bio-Medical Informatics, Center for Genome Science, OHTAC, 187 Osongsaengmyeong2(i)-ro, Gangoe-myeon, Cheongwon-gun, ChoongchungBuk-do, South Korea
- Matthew J Kan
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, 550 16th Street, San Francisco, CA, 9414, USA
  - Department of Pediatrics, University of California, San Francisco, 550 16th Street, San Francisco, CA, 94143, USA
- Nadav Rappoport
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, 550 16th Street, San Francisco, CA, 9414, USA
  - Department of Pediatrics, University of California, San Francisco, 550 16th Street, San Francisco, CA, 94143, USA
- Dexter Hadley
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, 550 16th Street, San Francisco, CA, 9414, USA
  - Department of Pediatrics, University of California, San Francisco, 550 16th Street, San Francisco, CA, 94143, USA
- Marina Sirota
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, 550 16th Street, San Francisco, CA, 9414, USA
  - Department of Pediatrics, University of California, San Francisco, 550 16th Street, San Francisco, CA, 94143, USA
- Bin Chen
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, 550 16th Street, San Francisco, CA, 9414, USA
  - Department of Pediatrics, University of California, San Francisco, 550 16th Street, San Francisco, CA, 94143, USA
- Udi Manber
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, 550 16th Street, San Francisco, CA, 9414, USA
  - Department of Medicine, University of California, San Francisco, 505 Parnassus Avenue, San Francisco, CA, 94143, USA
- Seong Beom Cho
  - National Institute of Health, Division of Bio-Medical Informatics, Center for Genome Science, OHTAC, 187 Osongsaengmyeong2(i)-ro, Gangoe-myeon, Cheongwon-gun, ChoongchungBuk-do, South Korea
- Atul J Butte
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, 550 16th Street, San Francisco, CA, 9414, USA
  - Department of Pediatrics, University of California, San Francisco, 550 16th Street, San Francisco, CA, 94143, USA
28
Masoudi-Sobhanzadeh Y, Omidi Y, Amanlou M, Masoudi-Nejad A. Drug databases and their contributions to drug repurposing. Genomics 2019; 112:1087-1095. [PMID: 31226485 DOI: 10.1016/j.ygeno.2019.06.021] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 03/12/2019] [Revised: 05/23/2019] [Accepted: 06/17/2019] [Indexed: 12/31/2022]
Abstract
Drug repurposing is an attractive area of drug discovery because it reduces time and cost. It is also considered an appropriate method for finding medications for orphan and rare diseases. Hence, many researchers have proposed novel methods based on databases that contain different kinds of information. A suitable organization of data that facilitates repurposing applications and provides a tool or a web service can therefore be beneficial. In this review, we categorize drug databases and discuss their advantages and disadvantages. To the best of our knowledge, the importance and potential of databases in drug repurposing are yet to be emphasized. The available databases can be divided into several groups based on data content, and different classes can be applied to find new applications for existing drugs. Furthermore, we offer some suggestions for making databases more effective and popular in this field.
Affiliation(s)
- Yosef Masoudi-Sobhanzadeh
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Yadollah Omidi
  - Research Center for Pharmaceutical Nanotechnology and Department of Pharmaceutics, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
- Massoud Amanlou
  - Drug Design and Development Research Center, The Institute of Pharmaceutical Sciences (TIPS), Tehran University of Medical Sciences, Tehran 14176-53955, Iran
- Ali Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
29
Ahalt SC, Chute CG, Fecho K, Glusman G, Hadlock J, Taylor CO, Pfaff ER, Robinson PN, Solbrig H, Ta C, Tatonetti N, Weng C. Clinical Data: Sources and Types, Regulatory Constraints, Applications. Clin Transl Sci 2019; 12:329-333. [PMID: 31074176 PMCID: PMC6617834 DOI: 10.1111/cts.12638] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/24/2019] [Accepted: 03/27/2019] [Indexed: 12/30/2022] Open
Affiliation(s)
- Stanley C Ahalt
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Karamarie Fecho
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Emily R Pfaff
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Casey Ta
  - Columbia University, New York, New York, USA