Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Shivade C, Raghavan P, Fosler-Lussier E, Embi PJ, Elhadad N, Johnson SB, Lai AM. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc 2013;21:221-30. [PMID: 24201027 PMCID: PMC3932460 DOI: 10.1136/amiajnl-2013-001935] [Citation(s) in RCA: 278] [Impact Index Per Article: 25.3] [Indexed: 01/11/2023] Open

Cited by Other Article(s)

Ding S, Zhang S, Hu X, Zou N. Identify and mitigate bias in electronic phenotyping: A comprehensive study from computational perspective. J Biomed Inform 2024:104671. [PMID: 38876452 DOI: 10.1016/j.jbi.2024.104671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 05/26/2024] [Accepted: 06/05/2024] [Indexed: 06/16/2024]

Yan S, Melnick K, He X, Lyu T, Moor RSF, Still MEH, Mitchell DA, Shenkman EA, Wang H, Guo Y, Bian J, Ghiaseddin AP. Developing a computable phenotype for glioblastoma. Neuro Oncol 2024;26:1163-1170. [PMID: 38141226 PMCID: PMC11145437 DOI: 10.1093/neuonc/noad249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Indexed: 12/25/2023] Open

Affiliation(s)

Sandra Yan Department of Neurosurgery, College of Medicine, University of Florida, Gainesville, Florida, USA
Kaitlyn Melnick Department of Neurosurgery, College of Medicine, University of Florida, Gainesville, Florida, USA
Xing He Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
Tianchen Lyu Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
Rachel S F Moor Department of Neurosurgery, College of Medicine, University of Florida, Gainesville, Florida, USA
Megan E H Still Department of Neurosurgery, College of Medicine, University of Florida, Gainesville, Florida, USA
Duane A Mitchell Department of Neurosurgery, College of Medicine, University of Florida, Gainesville, Florida, USA
Elizabeth A Shenkman Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
Han Wang Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
Yi Guo Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
Jiang Bian Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
Ashley P Ghiaseddin Department of Neurosurgery, College of Medicine, University of Florida, Gainesville, Florida, USA

Cheung KS. Big data approach in the field of gastric and colorectal cancer research. J Gastroenterol Hepatol 2024;39:1027-1032. [PMID: 38413187 DOI: 10.1111/jgh.16527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2024] [Accepted: 02/07/2024] [Indexed: 02/29/2024]

De Clercq L, Himmelreich JCL, Harskamp RE. Quality of heart failure registration in primary care: observations from 1 million electronic health records in the Amsterdam Metropolitan Area. Diagnosis (Berl) 2024;0:dx-2024-0009. [PMID: 38741552 DOI: 10.1515/dx-2024-0009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 04/22/2024] [Indexed: 05/16/2024]

Li Y, Yang AY, Marelli A, Li Y. MixEHR-SurG: A joint proportional hazard and guided topic model for inferring mortality-associated topics from electronic health records. J Biomed Inform 2024;153:104638. [PMID: 38631461 DOI: 10.1016/j.jbi.2024.104638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/07/2024] [Accepted: 04/03/2024] [Indexed: 04/19/2024]

Abstract

Survival models can help medical practitioners to evaluate the prognostic importance of clinical variables to patient outcomes such as mortality or hospital readmission and subsequently design personalized treatment regimes. Electronic Health Records (EHRs) hold the promise for large-scale survival analysis based on systematically recorded clinical features for each patient. However, existing survival models either do not scale to high dimensional and multi-modal EHR data or are difficult to interpret. In this study, we present a supervised topic model called MixEHR-SurG to simultaneously integrate heterogeneous EHR data and model survival hazard. Our contributions are threefold: (1) integrating EHR topic inference with Cox proportional hazards likelihood; (2) integrating patient-specific topic hyperparameters using the PheCode concepts such that each topic can be identified with exactly one PheCode-associated phenotype; (3) multi-modal survival topic inference. This leads to a highly interpretable survival topic model that can infer PheCode-specific phenotype topics associated with patient mortality. We evaluated MixEHR-SurG using a simulated dataset and two real-world EHR datasets: the Quebec Congenital Heart Disease (CHD) data consisting of 8211 subjects with 75,187 outpatient claim records of 1767 unique ICD codes; the MIMIC-III consisting of 1458 subjects with multi-modal EHR records. Compared to the baselines, MixEHR-SurG achieved a superior dynamic AUROC for mortality prediction, with a mean AUROC score of 0.89 in the simulation dataset and a mean AUROC of 0.645 on the CHD dataset. Qualitatively, MixEHR-SurG associates severe cardiac conditions with high mortality risk among the CHD patients after the first heart failure hospitalization and critical brain injuries with increased mortality among the MIMIC-III patients after their ICU discharge.
Together, the integration of the Cox proportional hazards model and EHR topic inference in MixEHR-SurG not only leads to competitive mortality prediction but also meaningful phenotype topics for in-depth survival analysis. The software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-SurG.

Martins C, Neves B, Teixeira AS, Froes M, Sarmento P, Machado J, Magalhães CA, Silva NA, Silva MJ, Leite F. Identifying subgroups in heart failure patients with multimorbidity by clustering and network analysis. BMC Med Inform Decis Mak 2024;24:95. [PMID: 38622703 PMCID: PMC11020914 DOI: 10.1186/s12911-024-02497-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 04/03/2024] [Indexed: 04/17/2024] Open

Vanderbleek JJ, Owensby JK, McAnnally A, England BR, Chen L, Curtis JR, Yun H. Classifying Multimorbidity Using Drug Concepts via the Rx-Risk Comorbidity Index: Methods and Comparative Cross-Sectional Study. Arthritis Care Res (Hoboken) 2024;76:559-569. [PMID: 37986017 DOI: 10.1002/acr.25273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 06/26/2023] [Accepted: 11/14/2023] [Indexed: 11/22/2023]

Abstract

OBJECTIVE

The study objective was to update a method to identify comorbid conditions using only medication information in circumstances in which diagnosis codes may be undercaptured, such as in single-specialty electronic health records (EHRs), and to compare the distribution of comorbidities across Rx-Risk versus other traditional comorbidity indices.

METHODS

Using First Databank, RxNorm, and its web-based clients, RxNav and RxClass, we mapped Drug Concept Unique Identifiers (RxCUIs), National Drug Codes (NDCs), and Anatomical Therapeutic Chemical (ATC) codes to Rx-Risk, a medication-focused comorbidity index. In established rheumatoid arthritis (RA) and osteoarthritis (OA) cohorts within the Rheumatology Informatics System for Effectiveness registry, we then compared Rx-Risk with other comorbidity indices, including the Charlson Comorbidity Index, Rheumatic Disease Comorbidity Index (RDCI), and Elixhauser.

RESULTS

We identified 965 unique ingredient RxCUIs representing the 46 Rx-Risk comorbidity categories. After excluding dosage form and ingredient related RxCUIs, 80,911 unique associated RxCUIs were mapped to the index. Additionally, 187,024 unique NDCs and 354 ATC codes were obtained and mapped to the index categories. When compared to traditional comorbidity indices in the RA cohort, the median score for Rx-Risk (median 6.00 [25th percentile 2, 75th percentile 9]) was much greater than for Charlson (median 0 [25th percentile 0, 75th percentile 0]), RDCI (median 0 [25th percentile 0, 75th percentile 0]), and Elixhauser (median 1 [25th percentile 1, 75th percentile 1]). Analyses of the OA cohort yielded similar results. For patients with a Charlson score of 0 (85% of total), both the RDCI and Elixhauser were close to 1, but the Rx-Risk score ranged from 0 to 16 or more.

CONCLUSION

The misclassification and under-ascertainment of comorbidities in single-specialty EHRs can largely be overcome by using a medication-focused comorbidity index.

Mizuno S, Wagata M, Nagaie S, Ishikuro M, Obara T, Tamiya G, Kuriyama S, Tanaka H, Yaegashi N, Yamamoto M, Sugawara J, Ogishima S. Development of phenotyping algorithms for hypertensive disorders of pregnancy (HDP) and their application in more than 22,000 pregnant women. Sci Rep 2024;14:6292. [PMID: 38491024 PMCID: PMC10943000 DOI: 10.1038/s41598-024-55914-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 02/28/2024] [Indexed: 03/18/2024] Open

Yusuf A, Boyne DJ, O'Sullivan DE, Brenner DR, Cheung WY, Mirza I, Jarada TN. Text analysis framework for identifying mutations among non-small cell lung cancer patients from laboratory data. BMC Med Res Methodol 2024;24:63. [PMID: 38468224 PMCID: PMC10926579 DOI: 10.1186/s12874-024-02192-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 02/25/2024] [Indexed: 03/13/2024] Open

Abstract

BACKGROUND

Laboratory data can provide great value to support research aimed at reducing the incidence, prolonging survival and enhancing outcomes of cancer. Data is characterized by the information it carries and the format it holds. Data captured in Alberta's biomarker laboratory repository is free text, cluttered and rogue. Such a data format limits its utility and prohibits broader adoption and research development. Text analysis for information extraction of unstructured data can change this and lead to more complete analyses. Previous work on extracting relevant information from free text, unstructured data employed Natural Language Processing (NLP), Machine Learning (ML), rule-based Information Extraction (IE) methods, or a hybrid combination of them.

METHODS

In our study, text analysis was performed on Alberta Precision Laboratories data which consisted of 95,854 entries from the Southern Alberta Dataset (SAD) and 6944 entries from the Northern Alberta Dataset (NAD). The data covers all of Alberta and is completely population-based. Our proposed framework is built around rule-based IE methods. It incorporates topics such as Syntax and Lexical analyses to achieve deterministic extraction of data from biomarker laboratory data (i.e., Epidermal Growth Factor Receptor (EGFR) test results). Lexical analysis comprises data cleaning and pre-processing, Rich Text Format text conversion into readable plain text format, and normalization and tokenization of text. The framework then passes the text into the Syntax analysis stage which includes the rule-based method of extracting relevant data. Rule-based patterns of the test result are identified, and a Context Free Grammar then generates the rules of information extraction. Finally, the results are linked with the Alberta Cancer Registry to support real-world cancer research studies.
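The two-stage pipeline described above (a lexical stage that normalizes and tokenizes, then a rule-based stage that reads a result off the token stream) can be illustrated with a minimal sketch. This is not the authors' framework: the report wording, the function names `lex` and `extract_egfr`, and the keyword rule are all hypothetical, and a simple token-window rule stands in for the Context Free Grammar mentioned in the abstract.

```python
import re
from typing import List, Optional

# Hypothetical token pattern; the real laboratory text formats are not
# described in the abstract.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def lex(raw: str) -> List[str]:
    """Lexical stage: normalize case and split free text into tokens."""
    return TOKEN_RE.findall(raw.upper())


def extract_egfr(raw: str, window_size: int = 7) -> Optional[str]:
    """Rule stage: look for a result keyword shortly after the token EGFR.

    Returns "positive", "negative", or None when no rule matches.
    """
    tokens = lex(raw)
    if "EGFR" not in tokens:
        return None
    start = tokens.index("EGFR")
    window = tokens[start + 1 : start + 1 + window_size]
    for i, tok in enumerate(window):
        if tok == "NEGATIVE":
            return "negative"
        if tok == "POSITIVE":
            return "positive"
        if tok == "DETECTED":
            # "NOT DETECTED" is a negative result; bare "DETECTED" is positive.
            negated = i > 0 and window[i - 1] == "NOT"
            return "negative" if negated else "positive"
    return None


if __name__ == "__main__":
    print(extract_egfr("EGFR mutation: Not detected."))
    print(extract_egfr("Result - EGFR exon 19 deletion detected"))
```

Because every rule is explicit, the same input always yields the same output, which is the sense in which such extraction is deterministic.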

RESULTS

Of the original 5512 entries in the SAD dataset and 5017 entries in the NAD dataset which were filtered for EGFR, the framework yielded 5129 and 3388 extracted EGFR test results from the SAD and NAD datasets, respectively. An accuracy of 97.5% was achieved on a random sample of 362 tests.

CONCLUSIONS

We presented a text analysis framework to extract specific information from unstructured clinical data. Our proposed framework has shown that it can successfully extract relevant information from EGFR test results.

Ammar S, Borghoff K, El Mikati IK, Mustafa RA, Noureddine L. Using ICD9/10 codes for identifying ADPKD patients, a validation study. J Nephrol 2024;37:523-525. [PMID: 37907678 DOI: 10.1007/s40620-023-01780-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 09/03/2023] [Indexed: 11/02/2023]

Gao J, Bonzel CL, Hong C, Varghese P, Zakir K, Gronsbell J. Semi-supervised ROC analysis for reliable and streamlined evaluation of phenotyping algorithms. J Am Med Inform Assoc 2024;31:640-650. [PMID: 38128118 PMCID: PMC10873838 DOI: 10.1093/jamia/ocad226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 09/22/2023] [Accepted: 11/20/2023] [Indexed: 12/23/2023] Open

Boeker M, Zöller D, Blasini R, Macho P, Helfer S, Behrens M, Prokosch HU, Gulden C. Effectiveness of IT-supported patient recruitment: study protocol for an interrupted time series study at ten German university hospitals. Trials 2024;25:125. [PMID: 38365848 PMCID: PMC10870691 DOI: 10.1186/s13063-024-07918-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2021] [Accepted: 01/09/2024] [Indexed: 02/18/2024] Open

Abstract

BACKGROUND

As part of the German Medical Informatics Initiative, the MIRACUM project establishes data integration centers across ten German university hospitals. The embedded MIRACUM Use Case "Alerting in Care - IT Support for Patient Recruitment" aims to support recruitment into clinical trials by automatically querying the repositories for patients satisfying eligibility criteria and presenting them as screening candidates. The objective of this study is to investigate whether the developed recruitment tool has a positive effect on study recruitment within a multi-center environment by increasing the number of participants. Its secondary objective is the measurement of organizational burden and user satisfaction of the provided IT solution.

METHODS

The study uses an Interrupted Time Series Design with a duration of 15 months. All trials start in the control phase of randomized length with regular recruitment and change to the intervention phase with additional IT support. The intervention consists of the application of a recruitment-support system which uses patient data collected in general care for screening according to specific criteria. The inclusion and exclusion criteria of all selected trials are translated into a machine-readable format using the OHDSI ATLAS tool. All patient data from the data integration centers is regularly checked against these criteria. The primary outcome is the number of participants recruited per trial and week standardized by the targeted number of participants per week and the expected recruitment duration of the specific trial. Secondary outcomes are usability, usefulness, and efficacy of the recruitment support. A sample size calculation based on a simple parallel-group assumption shows that an effect size of d=0.57 can be detected at a significance level of 5% with a power of 80% given a total of 100 trials (10 per site). Data describing the included trials and the recruitment process is collected at each site. The primary analysis will be conducted using linear mixed models with the actual recruitment number per week and trial standardized by the expected recruitment number per week and trial as the dependent variable.
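The reported effect size can be roughly cross-checked with the standard normal-approximation formula for a two-sided, two-group comparison, d = (z_{1-α/2} + z_{power}) · sqrt(2 / n). The sketch below assumes the 100 trials are treated as two groups of 50, which is an interpretation of "simple parallel group assumption" and is not stated in the abstract; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect_size(n_per_group: int, alpha: float = 0.05,
                           power: float = 0.80) -> float:
    """Smallest standardized effect (Cohen's d) detectable in a two-sided,
    two-group comparison, using the normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


if __name__ == "__main__":
    # Assumption: 100 trials split evenly into two groups of 50.
    print(round(detectable_effect_size(50), 2))  # about 0.56
```

The approximation gives about 0.56, close to the reported 0.57; an exact calculation based on the t distribution would be expected to come out slightly larger.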

DISCUSSION

The application of an IT-supported recruitment solution developed in the MIRACUM consortium leads to an increased number of recruited participants in studies at German university hospitals. It supports employees engaged in the recruitment of trial participants and is easy to integrate in their daily work.

He X, Wei R, Huang Y, Chen Z, Lyu T, Bost S, Tong J, Li L, Zhou Y, Guo J, Tang H, Wang F, DeKosky S, Xu H, Chen Y, Zhang R, Xu J, Guo Y, Wu Y, Bian J. Develop and Validate a Computable Phenotype for the Identification of Alzheimer's Disease Patients Using Electronic Health Record Data. medRxiv 2024:2024.02.06.24302389. [PMID: 38370766 PMCID: PMC10871460 DOI: 10.1101/2024.02.06.24302389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]

Dong G, Bate A, Haguinet F, Westman G, Dürlich L, Hviid A, Sessa M. Optimizing Signal Management in a Vaccine Adverse Event Reporting System: A Proof-of-Concept with COVID-19 Vaccines Using Signs, Symptoms, and Natural Language Processing. Drug Saf 2024;47:173-182. [PMID: 38062261 PMCID: PMC10821983 DOI: 10.1007/s40264-023-01381-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/14/2023] [Indexed: 01/28/2024]

Abstract

INTRODUCTION

The Vaccine Adverse Event Reporting System (VAERS) has already been challenged by an extreme increase in the number of individual case safety reports (ICSRs) after the market introduction of coronavirus disease 2019 (COVID-19) vaccines. Evidence from scientific literature suggests that when there is an extreme increase in the number of ICSRs recorded in spontaneous reporting databases (such as the VAERS), an accompanying increase in the number of disproportionality signals (sometimes referred to as 'statistical alerts') generated is expected.

OBJECTIVES

The objective of this study was to develop a natural language processing (NLP)-based approach to optimize signal management by excluding disproportionality signals related to listed adverse events following immunization (AEFIs). COVID-19 vaccines were used as a proof-of-concept.

METHODS

The VAERS was used as a data source, and the Finding Associated Concepts with Text Analysis (FACTA+) was used to extract signs and symptoms of listed AEFIs from MEDLINE for COVID-19 vaccines. Disproportionality analyses were conducted according to guidelines and recommendations provided by the US Centers for Disease Control and Prevention. By using signs and symptoms of listed AEFIs, we computed the proportion of disproportionality signals dismissed for COVID-19 vaccines using this approach. Nine NLP techniques, including Generative Pre-Trained Transformer 3.5 (GPT-3.5), were used to automatically retrieve Medical Dictionary for Regulatory Activities Preferred Terms (MedDRA PTs) from signs and symptoms extracted from FACTA+.

RESULTS

Overall, 17% of disproportionality signals for COVID-19 vaccines were dismissed as they reported signs and symptoms of listed AEFIs. Eight of nine NLP techniques used to automatically retrieve MedDRA PTs from signs and symptoms extracted from FACTA+ showed suboptimal performance. GPT-3.5 achieved an accuracy of 78% in correctly assigning MedDRA PTs.

CONCLUSION

Our approach reduced the need for manual exclusion of disproportionality signals related to listed AEFIs and may lead to better optimization of time and resources in signal management.

Mollalo A, Hamidi B, Lenert L, Alekseyenko AV. Application of Spatial Analysis for Electronic Health Records: Characterizing Patient Phenotypes and Emerging Trends. Research Square 2024:rs.3.rs-3443865. [PMID: 37886509 PMCID: PMC10602163 DOI: 10.21203/rs.3.rs-3443865/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]

Abstract

Background

Electronic health records (EHR) commonly contain patient addresses that provide valuable data for geocoding and spatial analysis, enabling more comprehensive descriptions of individual patients for clinical purposes. Despite the widespread use of EHR in clinical decision support and interventions, no systematic review has examined the extent to which spatial analysis is used to characterize patient phenotypes.

Objective

This study reviews advanced spatial analyses that employed individual-level health data from EHR within the US to characterize patient phenotypes.

Methods

We systematically evaluated English-language peer-reviewed articles from PubMed/MEDLINE, Scopus, Web of Science, and Google Scholar databases from inception to August 20, 2023, without imposing constraints on time, study design, or specific health domains.

Results

Only 49 articles met the eligibility criteria. These articles utilized diverse spatial methods, with a predominant focus on clustering techniques, while spatiotemporal analysis (frequentist and Bayesian) and modeling were relatively underexplored. A noteworthy surge (n = 42, 85.7%) in publications was observed post-2017. The publications investigated a variety of adult and pediatric clinical areas, including infectious disease, endocrinology, and cardiology, using phenotypes defined over a range of data domains, such as demographics, diagnoses, and visits. The primary health outcomes investigated were asthma, hypertension, and diabetes. Notably, patient phenotypes involving genomics, imaging, and notes were rarely utilized.

Conclusions

This review underscores the growing interest in spatial analysis of EHR-derived data and highlights knowledge gaps in clinical health, phenotype domains, and spatial methodologies. Additionally, this review proposes guidelines for harnessing the potential of spatial analysis to enhance the context of individual patients for future clinical decision support.

Schopow N, Osterhoff G, Baur D. Applications of the Natural Language Processing Tool ChatGPT in Clinical Practice: Comparative Study and Augmented Systematic Review. JMIR Med Inform 2023;11:e48933. [PMID: 38015610 DOI: 10.2196/48933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 06/20/2023] [Accepted: 08/25/2023] [Indexed: 11/29/2023] Open

Abstract

BACKGROUND

This research integrates a comparative analysis of the performance of human researchers and OpenAI's ChatGPT in systematic review tasks and describes an assessment of the application of natural language processing (NLP) models in clinical practice through a review of 5 studies.

OBJECTIVE

This study aimed to evaluate the reliability between ChatGPT and human researchers in extracting key information from clinical articles, and to investigate the practical use of NLP in clinical settings as evidenced by selected studies.

METHODS

The study design comprised a systematic review of clinical articles executed independently by human researchers and ChatGPT. The level of agreement between and within raters for parameter extraction was assessed using the Fleiss and Cohen κ statistics.
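For readers unfamiliar with the agreement statistic, Cohen's κ for two raters corrects the observed agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below is a generic implementation with invented labels; it does not reproduce the study's data or its Fleiss κ analysis for more than two raters.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented study-design labels for six articles (illustration only).
    human = ["RCT", "cohort", "RCT", "case", "cohort", "RCT"]
    model = ["RCT", "cohort", "cohort", "case", "cohort", "RCT"]
    print(round(cohen_kappa(human, model), 3))  # 17/23, about 0.739
```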

RESULTS

The comparative analysis revealed a high degree of concordance between ChatGPT and human researchers for most parameters, with less agreement for study design, clinical task, and clinical implementation. The review identified 5 significant studies that demonstrated the diverse applications of NLP in clinical settings. These studies' findings highlight the potential of NLP to improve clinical efficiency and patient outcomes in various contexts, from enhancing allergy detection and classification to improving quality metrics in psychotherapy treatments for veterans with posttraumatic stress disorder.

CONCLUSIONS

Our findings underscore the potential of NLP models, including ChatGPT, in performing systematic reviews and other clinical tasks. Despite certain limitations, NLP models present a promising avenue for enhancing health care efficiency and accuracy. Future studies must focus on broadening the range of clinical applications and exploring the ethical considerations of implementing NLP applications in health care settings.

Meier R, Grischott T, Rachamin Y, Jäger L, Senn O, Rosemann T, Burgstaller JM, Markun S. Importance of different electronic medical record components for chronic disease identification in a Swiss primary care database: a cross-sectional study. Swiss Med Wkly 2023;153:40107. [PMID: 37854021 DOI: 10.57187/smw.2023.40107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2023] Open

Abstract

BACKGROUND

Primary care databases collect electronic medical records with routine data from primary care patients. The identification of chronic diseases in primary care databases often integrates information from various electronic medical record components (EMR-Cs) used by primary care providers. This study aimed to estimate the prevalence of selected chronic conditions using a large Swiss primary care database and to examine the importance of different EMR-Cs for case identification.

METHODS

Cross-sectional study with 120,608 patients of 128 general practitioners in the Swiss FIRE ("Family Medicine Research using Electronic Medical Records") primary care database in 2019. Sufficient criteria on three individual EMR-Cs, namely medication, clinical or laboratory parameters and reasons for encounters, were combined by logical disjunction into definitions of 49 chronic conditions; then prevalence estimates and measures of importance of the individual EMR-Cs for case identification were calculated.
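Combining sufficient criteria "by logical disjunction" means a patient counts as a case if any one of the three EMR-C criteria is met. A minimal sketch of that idea follows, together with two simple importance measures (share of cases found via a component, and share found solely via it). The class, field, and function names and the toy records are hypothetical; the study's actual criteria and importance measures are not given in the abstract.

```python
from dataclasses import dataclass
from typing import List

COMPONENTS = ("medication", "clinical_or_lab", "reason_for_encounter")


@dataclass(frozen=True)
class CriteriaHits:
    """Whether each EMR component's sufficient criterion is met for one
    patient and one condition (hypothetical structure)."""
    medication: bool
    clinical_or_lab: bool
    reason_for_encounter: bool

    def is_case(self) -> bool:
        # Logical disjunction: any single component is sufficient.
        return self.medication or self.clinical_or_lab or self.reason_for_encounter


def share_identified_via(hits: List[CriteriaHits], component: str) -> float:
    """Share of cases for which the given component's criterion is met."""
    cases = [h for h in hits if h.is_case()]
    return sum(getattr(h, component) for h in cases) / len(cases) if cases else 0.0


def share_identified_solely_via(hits: List[CriteriaHits], component: str) -> float:
    """Share of cases met by the given component and by no other."""
    cases = [h for h in hits if h.is_case()]
    solely = [h for h in cases
              if all(getattr(h, c) == (c == component) for c in COMPONENTS)]
    return len(solely) / len(cases) if cases else 0.0


if __name__ == "__main__":
    toy = [
        CriteriaHits(True, False, False),
        CriteriaHits(True, True, False),
        CriteriaHits(False, False, True),
        CriteriaHits(False, False, False),  # not a case
        CriteriaHits(True, False, False),
    ]
    print(share_identified_via(toy, "medication"))         # 0.75
    print(share_identified_solely_via(toy, "medication"))  # 0.5
```

Because the components overlap, the per-component shares can sum to more than 100%, which is consistent with the percentages reported in the results.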

RESULTS

A total of 185,535 cases (i.e. patients with a specific chronic condition) were identified. Prevalence estimates were 27.5% (95% CI: 27.3-27.8%) for hypertension, 13.5% (13.3-13.7%) for dyslipidaemia and 6.6% (6.4-6.7%) for diabetes mellitus. Of all cases, 87.1% (87.0-87.3%) were identified via medication, 22.1% (21.9-22.3%) via clinical or laboratory parameters and 19.3% (19.1-19.5%) via reasons for encounters. The majority (65.4%) of cases were identifiable solely through medication. Of the two other EMR-Cs, clinical or laboratory parameters was most important for identifying cases of chronic kidney disease, anorexia/bulimia nervosa and obesity whereas reasons for encounters was crucial for identifying many low-prevalence diseases as well as cancer, heart disease and osteoarthritis.

CONCLUSIONS

The EMR-C medication was most important for chronic disease identification overall, but identification varied strongly by disease. The analysis of the importance of different EMR-Cs for estimating prevalence revealed strengths and weaknesses of the disease definitions used within the FIRE primary care database. Although prioritising specificity over sensitivity in the EMR-C criteria may have led to underestimation of most prevalences, their sex- and age-specific patterns were consistent with published figures for Swiss general practice.

Dhingra LS, Shen M, Mangla A, Khera R. Cardiovascular Care Innovation through Data-Driven Discoveries in the Electronic Health Record. Am J Cardiol 2023;203:136-148. [PMID: 37499593 PMCID: PMC10865722 DOI: 10.1016/j.amjcard.2023.06.104] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/28/2023] [Revised: 05/24/2023] [Accepted: 06/29/2023] [Indexed: 07/29/2023]

Sathe NA, Xian S, Mabrey FL, Crosslin DR, Mooney SD, Morrell ED, Lybarger K, Yetisgen M, Jarvik GP, Bhatraju PK, Wurfel MM. Evaluating construct validity of computable acute respiratory distress syndrome definitions in adults hospitalized with COVID-19: an electronic health records based approach. BMC Pulm Med 2023;23:292. [PMID: 37559024 PMCID: PMC10413524 DOI: 10.1186/s12890-023-02560-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 07/11/2023] [Indexed: 08/11/2023] Open

Abstract

BACKGROUND

Evolving ARDS epidemiology and management during COVID-19 have prompted calls to reexamine the construct validity of Berlin criteria, which have been rarely evaluated in real-world data. We developed a Berlin ARDS definition (EHR-Berlin) computable in electronic health records (EHR) to (1) assess its construct validity, and (2) assess how expanding its criteria affected validity.

METHODS

We performed a retrospective cohort study at two tertiary care hospitals with one EHR, among adults hospitalized with COVID-19 February 2020-March 2021. We assessed five candidate definitions for ARDS: the EHR-Berlin definition modeled on Berlin criteria, and four alternatives informed by recent proposals to expand criteria and include patients on high-flow oxygen (EHR-Alternative 1), relax imaging criteria (EHR-Alternatives 2-3), and extend timing windows (EHR-Alternative 4). We evaluated two aspects of construct validity for the EHR-Berlin definition: (1) criterion validity: agreement with manual ARDS classification by experts, available in 175 patients; (2) predictive validity: relationships with hospital mortality, assessed by Pearson r and by area under the receiver operating characteristic curve (AUROC). We assessed predictive validity and timing of identification of EHR-Berlin definition compared to alternative definitions.
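As a reminder of what the AUROC measures here, it is the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, with ties counted as one half. For a yes/no ARDS definition this reduces to the mean of sensitivity and specificity. The sketch below uses invented data and a simple pairwise computation; it is not the study's analysis code.

```python
from typing import Sequence


def auroc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUROC as the probability that a positive outranks a negative,
    counting ties as one half (pairwise, O(n*m) comparison)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative outcome")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented example: 1 = classified as ARDS / died, 0 = otherwise.
    ards = [1, 1, 0, 0, 1, 0, 0, 0]
    died = [1, 0, 0, 0, 1, 1, 0, 0]
    print(round(auroc(ards, died), 3))  # 11/15, about 0.733
```

In the toy data, sensitivity is 2/3 and specificity is 4/5, and their mean equals the pairwise AUROC of 11/15.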

RESULTS

Among 765 patients, mean (SD) age was 57 (18) years and 471 (62%) were male. The EHR-Berlin definition classified 171 (22%) patients as ARDS, which had high agreement with manual classification (kappa 0.85), and was associated with mortality (Pearson r = 0.39; AUROC 0.72, 95% CI 0.68, 0.77). In comparison, EHR-Alternative 1 classified 219 (29%) patients as ARDS, maintained similar relationships to mortality (r = 0.40; AUROC 0.74, 95% CI 0.70, 0.79, DeLong test P = 0.14), and identified patients earlier in their hospitalization (median 13 vs. 15 h from admission, Wilcoxon signed-rank test P < 0.001). EHR-Alternative 3, which removed imaging criteria, had similar correlation (r = 0.41) but better discrimination for mortality (AUROC 0.76, 95% CI 0.72, 0.80; P = 0.036), and identified patients at a median of 2 h from admission (P < 0.001).

CONCLUSIONS

The EHR-Berlin definition can enable ARDS identification with high criterion validity, supporting large-scale study and surveillance. There are opportunities to expand the Berlin criteria that preserve predictive validity and facilitate earlier identification.
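The agreement statistic reported in this abstract (kappa) can be illustrated with a minimal sketch. The function name `cohen_kappa` and the two label lists are assumptions made up for illustration; they are not study data and this is not the authors' code.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels (1 = ARDS, 0 = no ARDS); invented for this example only.
ehr_labels = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
expert_labels = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(ehr_labels, expert_labels)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over simple percent agreement when one class is much more common than the other.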

Smith G, Miller A, Marra DE, Wu Y, Bian J, Maraganore DM, Anton S. Evaluation of a Computable Phenotype for Successful Cognitive Aging. Mayo Clin Proc Innov Qual Outcomes 2023;7:212-221. [PMID: 37304063 PMCID: PMC10250575 DOI: 10.1016/j.mayocpiqo.2023.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/13/2023] Open

Abstract

Objective

To establish, apply, and evaluate a computable phenotype for the recruitment of individuals with successful cognitive aging.

Participants and Methods

Interviews with 10 aging experts identified electronic health record (EHR)-available variables representing successful aging among individuals aged 85 years and older. On the basis of the identified variables, we developed a rule-based computable phenotype algorithm composed of 17 eligibility criteria. Starting September 1, 2019, we applied the computable phenotype algorithm to all living persons aged 85 years and older at the University of Florida Health, which identified 24,024 individuals. This sample comprised 13,841 (58%) women, 13,906 (58%) Whites, and 16,557 (69%) non-Hispanics. A priori permission to be contacted for research had been obtained for 11,898 individuals, of whom 470 responded to study announcements and 333 consented to evaluation. Then, we contacted those who consented to evaluate whether their cognitive and functional status clinically met our successful cognitive aging criteria of a modified Telephone Interview for Cognitive Status score of more than 27 and Geriatric Depression Scale of less than 6. The study was completed on December 31, 2022.

Results

Of the 45% of living persons aged 85 years and older included in the University of Florida Health EHR database identified by the computable phenotype as successfully aged, approximately 4% responded to study announcements and 333 consented, of whom 218 (65%) met successful cognitive aging criteria through direct evaluation.

Conclusion

The study evaluated a computable phenotype algorithm for the recruitment of individuals for a successful aging study using large-scale EHRs. Our study provides proof of concept of using big data and informatics as aids for the recruitment of individuals for prospective cohort studies.

Noaeen M, Amini S, Bhasker S, Ghezelsefli Z, Ahmed A, Jafarinezhad O, Abad ZSH. Unlocking the Power of EHRs: Harnessing Unstructured Data for Machine Learning-based Outcome Predictions. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023;2023:1-4. [PMID: 38083058 DOI: 10.1109/embc40787.2023.10340232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/18/2023]

Gendrin A, Souliotis L, Loudon-Griffiths J, Aggarwal R, Amoako D, Desouza G, Dimitrievska S, Metcalfe P, Louvet E, Sahni H. Identifying Patient Populations in Texts Describing Drug Approvals Through Deep Learning-Based Information Extraction: Development of a Natural Language Processing Algorithm. JMIR Form Res 2023;7:e44876. [PMID: 37347514 DOI: 10.2196/44876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2022] [Revised: 03/30/2023] [Accepted: 04/17/2023] [Indexed: 06/23/2023] Open

Abstract

BACKGROUND

New drug treatments are regularly approved, and it is challenging to remain up-to-date in this rapidly changing environment. Fast and accurate visualization is important to allow a global understanding of the drug market. Automation of this information extraction provides a helpful starting point for the subject matter expert, helps to mitigate human errors, and saves time.

OBJECTIVE

We aimed to semiautomate disease population extraction from the free text of oncology drug approval descriptions from the BioMedTracker database for 6 selected drug targets. More specifically, we intended to extract (1) line of therapy, (2) stage of cancer of the patient population described in the approval, and (3) the clinical trials that provide evidence for the approval. We aimed to use these results in downstream applications, aiding the searchability of relevant content against related drug project sources.

METHODS

We fine-tuned a state-of-the-art deep learning model, Bidirectional Encoder Representations from Transformers, for each of the 3 desired outputs. We independently applied rule-based text mining approaches. We compared the performances of deep learning and rule-based approaches and selected the best method, which was then applied to new entries. The results were manually curated by a subject matter expert and then used to train new models.

RESULTS

The training data set is currently small (433 entries) and will enlarge over time when new approval descriptions become available or if a choice is made to take another drug target into account. The deep learning models achieved 61% and 56% 5-fold cross-validated accuracies for line of therapy and stage of cancer, respectively, which were treated as classification tasks. Trial identification is treated as a named entity recognition task, and the 5-fold cross-validated F₁-score is currently 87%. Although the scores of the classification tasks could seem low, the models comprise 5 classes each, and such scores are a marked improvement when compared to random classification. Moreover, we expect improved performance as the input data set grows, since deep learning models need to be trained on a large enough amount of data to be able to learn the task they are taught. The rule-based approach achieved 60% and 74% 5-fold cross-validated accuracies for line of therapy and stage of cancer, respectively. No attempt was made to define a rule-based approach for trial identification.

CONCLUSIONS

We developed a natural language processing algorithm that is currently assisting subject matter experts in disease population extraction, which supports health authority approvals. This algorithm achieves semiautomation, enabling subject matter experts to leverage the results for deeper analysis and to accelerate information retrieval in a crowded clinical environment such as oncology.

Oommen C, Howlett-Prieto Q, Carrithers MD, Hier DB. Inter-rater agreement for the annotation of neurologic signs and symptoms in electronic health records. Front Digit Health 2023;5:1075771. [PMID: 37383943 PMCID: PMC10294690 DOI: 10.3389/fdgth.2023.1075771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2022] [Accepted: 05/26/2023] [Indexed: 06/30/2023] Open

Alsaleh MM, Allery F, Choi JW, Hama T, McQuillin A, Wu H, Thygesen JH. Prediction of disease comorbidity using explainable artificial intelligence and machine learning techniques: A systematic review. Int J Med Inform 2023;175:105088. [PMID: 37156169 DOI: 10.1016/j.ijmedinf.2023.105088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Revised: 03/23/2023] [Accepted: 05/01/2023] [Indexed: 05/10/2023]

Abstract

OBJECTIVE

Disease comorbidity is a major challenge in healthcare affecting the patient's quality of life and costs. AI-based prediction of comorbidities can overcome this issue by improving precision medicine and providing holistic care. The objective of this systematic literature review was to identify and summarise existing machine learning (ML) methods for comorbidity prediction and evaluate the interpretability and explainability of the models.

MATERIALS AND METHODS

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework was used to identify articles in three databases: Ovid Medline, Web of Science and PubMed. The literature search covered a broad range of terms for the prediction of disease comorbidity and ML, including traditional predictive modelling.

RESULTS

Of 829 unique articles, 58 full-text papers were assessed for eligibility. A final set of 22 articles with 61 ML models was included in this review. Of the identified ML models, 33 models achieved relatively high accuracy (80-95%) and AUC (0.80-0.89). Overall, 72% of studies had high or unclear concerns regarding the risk of bias.

DISCUSSION

This systematic review is the first to examine the use of ML and explainable artificial intelligence (XAI) methods for comorbidity prediction. The chosen studies focused on a limited scope of comorbidities ranging from 1 to 34 (mean = 6), and no novel comorbidities were found due to limited phenotypic and genetic data. The lack of standard evaluation for XAI hinders fair comparisons.

CONCLUSION

A broad range of ML methods has been used to predict the comorbidities of various disorders. With further development of explainable ML capacity in the field of comorbidity prediction, there is a significant possibility of identifying unmet health needs by highlighting comorbidities in patient groups that were not previously recognised to be at risk for particular comorbidities.

Daniali M, Galer PD, Lewis-Smith D, Parthasarathy S, Kim E, Salvucci DD, Miller JM, Haag S, Helbig I. Enriching representation learning using 53 million patient notes through human phenotype ontology embedding. Artif Intell Med 2023;139:102523. [PMID: 37100502 PMCID: PMC10782859 DOI: 10.1016/j.artmed.2023.102523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Revised: 02/17/2023] [Accepted: 02/23/2023] [Indexed: 03/04/2023]

Affiliation(s)

Maryam Daniali Department of Computer Science, Drexel University, Philadelphia, PA, USA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
Peter D Galer Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
David Lewis-Smith Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Translational and Clinical Research Institute, Newcastle University, Newcastle-upon-Tyne, UK; Department of Clinical Neurosciences, Royal Victoria Infirmary, Newcastle-upon-Tyne, UK
Shridhar Parthasarathy Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA
Edward Kim Department of Computer Science, Drexel University, Philadelphia, PA, USA
Dario D Salvucci Department of Computer Science, Drexel University, Philadelphia, PA, USA
Jeffrey M Miller Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
Scott Haag Department of Computer Science, Drexel University, Philadelphia, PA, USA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
Ingo Helbig Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Neurology, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA.

Callahan TJ, Stefanksi AL, Ostendorf DM, Wyrwa JM, Davies SJD, Hripcsak G, Hunter LE, Kahn MG. Characterizing Patient Representations for Computational Phenotyping. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2023;2022:319-328. [PMID: 37128436 PMCID: PMC10148332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]

Samaras A, Bekiaridou A, Papazoglou AS, Moysidis DV, Tsoumakas G, Bamidis P, Tsigkas G, Lazaros G, Kassimis G, Fragakis N, Vassilikos V, Zarifis I, Tziakas DN, Tsioufis K, Davlouros P, Giannakoulas G. Artificial intelligence-based mining of electronic health record data to accelerate the digital transformation of the national cardiovascular ecosystem: design protocol of the CardioMining study. BMJ Open 2023;13:e068698. [PMID: 37012018 PMCID: PMC10083759 DOI: 10.1136/bmjopen-2022-068698] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 04/04/2023] Open

Abstract

INTRODUCTION

Mining of electronic health record (EHR) data is increasingly being implemented all over the world but mainly focuses on structured data. The capabilities of artificial intelligence (AI) could reverse the underusage of unstructured EHR data and enhance the quality of medical research and clinical care. This study aims to develop an AI-based model to transform unstructured EHR data into an organised, interpretable dataset and form a national dataset of cardiac patients.

METHODS AND ANALYSIS

CardioMining is a retrospective, multicentre study based on large, longitudinal data obtained from unstructured EHRs of the largest tertiary hospitals in Greece. Demographics, hospital administrative data, medical history, medications, laboratory examinations, imaging reports, therapeutic interventions, in-hospital management and postdischarge instructions will be collected, coupled with structured prognostic data from the National Institute of Health. The target number of included patients is 100 000. Natural language processing techniques will facilitate data mining from the unstructured EHRs. The accuracy of the automated model will be compared with the manual data extraction by study investigators. Machine learning tools will provide data analytics. CardioMining aims to cultivate the digital transformation of the national cardiovascular system and fill the gap in medical recording and big data analysis using validated AI techniques.

ETHICS AND DISSEMINATION

This study will be conducted in keeping with the International Conference on Harmonisation Good Clinical Practice guidelines, the Declaration of Helsinki, the Data Protection Code of the European Data Protection Authority and the European General Data Protection Regulation. The Research Ethics Committee of the Aristotle University of Thessaloniki and Scientific and Ethics Council of the AHEPA University Hospital have approved this study. Study findings will be disseminated through peer-reviewed medical journals and international conferences. International collaborations with other cardiovascular registries will be attempted.

TRIAL REGISTRATION NUMBER

NCT05176769.

Affiliation(s)

Athanasios Samaras 1st Department of Cardiology, University General Hospital of Thessaloniki AHEPA, Thessaloniki, Greece
Alexandra Bekiaridou 1st Department of Cardiology, University General Hospital of Thessaloniki AHEPA, Thessaloniki, Greece; Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, New York, New York, USA
Andreas S Papazoglou 1st Department of Cardiology, University General Hospital of Thessaloniki AHEPA, Thessaloniki, Greece
Dimitrios V Moysidis 1st Department of Cardiology, University General Hospital of Thessaloniki AHEPA, Thessaloniki, Greece
Grigorios Tsoumakas School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Panagiotis Bamidis Medical Physics and Digital Innovation Laboratory, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
Grigorios Tsigkas Department of Cardiology, University Hospital of Patras, Rio Patras, Greece
George Lazaros 1st Cardiology Department, "Hippokration" General Hospital, University of Athens Medical School, Athens, Greece
George Kassimis 1st Department of Cardiology, University General Hospital of Thessaloniki AHEPA, Thessaloniki, Greece; 2nd Cardiology Department, Hippokrateion General Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
Nikolaos Fragakis 2nd Cardiology Department, Hippokrateion General Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
Vassilios Vassilikos 3rd Cardiology Department, Hippokrateion General Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
Ioannis Zarifis Department of Cardiology, "George Papanikolaou" General Hospital, Thessaloniki, Greece
Dimitrios N Tziakas Department of Cardiology, Democritus University of Thrace, University Hospital of Alexandroupolis, Alexandroupolis, Greece
Konstantinos Tsioufis 1st Cardiology Department, "Hippokration" General Hospital, University of Athens Medical School, Athens, Greece
Periklis Davlouros Department of Cardiology, University Hospital of Patras, Rio Patras, Greece
George Giannakoulas 1st Department of Cardiology, University General Hospital of Thessaloniki AHEPA, Thessaloniki, Greece

Sharperson C, Hajibonabi F, Hanna TN, Gerard RL, Gilyard S, Johnson JO. Are disparities in emergency department imaging exacerbated during high-volume periods? Clin Imaging 2023;96:9-14. [PMID: 36731373 DOI: 10.1016/j.clinimag.2023.01.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2022] [Revised: 01/05/2023] [Accepted: 01/09/2023] [Indexed: 01/17/2023]

He T, Belouali A, Patricoski J, Lehmann H, Ball R, Anagnostou V, Kreimeyer K, Botsis T. Trends and opportunities in computable clinical phenotyping: A scoping review. J Biomed Inform 2023;140:104335. [PMID: 36933631 DOI: 10.1016/j.jbi.2023.104335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2022] [Revised: 03/07/2023] [Accepted: 03/09/2023] [Indexed: 03/18/2023]

Abstract

Identifying patient cohorts meeting the criteria of specific phenotypes is essential in biomedicine and particularly timely in precision medicine. Many research groups deliver pipelines that automatically retrieve and analyze data elements from one or more sources to automate this task and deliver high-performing computable phenotypes. We applied a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines to conduct a thorough scoping review on computable clinical phenotyping. Five databases were searched using a query that combined the concepts of automation, clinical context, and phenotyping. Subsequently, four reviewers screened 7960 records (after removing over 4000 duplicates) and selected 139 that satisfied the inclusion criteria. This dataset was analyzed to extract information on target use cases, data-related topics, phenotyping methodologies, evaluation strategies, and portability of developed solutions. Most studies supported patient cohort selection without discussing the application to specific use cases, such as precision medicine. Electronic Health Records were the primary source in 87.1% (N = 121) of all studies, and International Classification of Diseases codes were heavily used in 55.4% (N = 77) of all studies; however, only 25.9% (N = 36) of the records described compliance with a common data model. In terms of the presented methods, traditional Machine Learning (ML) was the dominant method, often combined with natural language processing and other approaches, while external validation and portability of computable phenotypes were pursued in many cases. These findings revealed that defining target use cases precisely, moving away from sole ML strategies, and evaluating the proposed solutions in the real setting are essential opportunities for future work. There is also momentum and an emerging need for computable phenotyping to support clinical and epidemiological research and precision medicine.

Arnold CG, Sonn B, Meyers FJ, Vest A, Puls R, Zirkler E, Edelmann M, Brooks IM, Monte AA. Accessing and utilizing clinical and genomic data from an electronic health record data warehouse. TRANSLATIONAL MEDICINE COMMUNICATIONS 2023;8:7. [PMID: 38223535 PMCID: PMC10786622 DOI: 10.1186/s41231-023-00140-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/29/2022] [Accepted: 02/20/2023] [Indexed: 01/16/2024]

Wang L, Foer D, Zhang Y, Karlson EW, Bates DW, Zhou L. Post-Acute COVID-19 Respiratory Symptoms in Patients With Asthma: An Electronic Health Records-Based Study. THE JOURNAL OF ALLERGY AND CLINICAL IMMUNOLOGY. IN PRACTICE 2023;11:825-835.e3. [PMID: 36566779 PMCID: PMC9773736 DOI: 10.1016/j.jaip.2022.12.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/23/2022] [Revised: 11/27/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022]

Abstract

BACKGROUND

Post-viral respiratory symptoms are common among patients with asthma. Respiratory symptoms after acute COVID-19 are widely reported in the general population, but large-scale studies identifying symptom risk for patients with asthma are lacking.

OBJECTIVE

To identify and compare risk for post-acute COVID-19 respiratory symptoms in patients with and without asthma.

METHODS

This retrospective, observational cohort study included COVID-19-positive patients between March 4, 2020, and January 20, 2021, with up to 180 days of health care follow-up in a health care system in the Northeastern United States. Respiratory symptoms recorded in clinical notes from days 28 to 180 after COVID-19 diagnosis were extracted using natural language processing. Cohorts were stratified by hospitalization status during the acute COVID-19 period. Univariable and multivariable analyses were used to compare symptoms among patients with and without asthma adjusting for demographic and clinical confounders.

RESULTS

Among 31,084 eligible patients with COVID-19, 2863 (9.2%) had hospitalization during the acute COVID-19 period; 4049 (13.0%) had a history of asthma, accounting for 13.8% of hospitalized and 12.9% of nonhospitalized patients. In the post-acute COVID-19 period, patients with asthma had significantly higher risk of shortness of breath, cough, bronchospasm, and wheezing than patients without an asthma history. Incident respiratory symptoms of bronchospasm and wheezing were also higher in patients with asthma. Patients with asthma who had not been hospitalized during acute COVID-19 had additionally higher risk of cough, abnormal breathing, sputum changes, and a wider range of incident respiratory symptoms.

CONCLUSION

Patients with asthma may have an under-recognized burden of respiratory symptoms after COVID-19 warranting increased awareness and monitoring in this population.

Li Y, Hu H, Zheng Y, Donahoo WT, Guo Y, Xu J, Chen WH, Liu N, Shenkman EA, Bian J, Guo J. Impact of Contextual-Level Social Determinants of Health on Newer Antidiabetic Drug Adoption in Patients with Type 2 Diabetes. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2023;20:ijerph20054036. [PMID: 36901047 PMCID: PMC10001625 DOI: 10.3390/ijerph20054036] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/24/2022] [Revised: 02/17/2023] [Accepted: 02/22/2023] [Indexed: 05/14/2023]

Brandt PS, Kho A, Luo Y, Pacheco JA, Walunas TL, Hakonarson H, Hripcsak G, Liu C, Shang N, Weng C, Walton N, Carrell DS, Crane PK, Larson EB, Chute CG, Kullo IJ, Carroll R, Denny J, Ramirez A, Wei WQ, Pathak J, Wiley LK, Richesson R, Starren JB, Rasmussen LV. Characterizing variability of electronic health record-driven phenotype definitions. J Am Med Inform Assoc 2023;30:427-437. [PMID: 36474423 PMCID: PMC9933077 DOI: 10.1093/jamia/ocac235] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2022] [Revised: 10/19/2022] [Accepted: 11/23/2022] [Indexed: 12/12/2022] Open

Affiliation(s)

Pascal S Brandt Department of Biomedical and Medical Education, University of Washington, Seattle, Washington, USA
Abel Kho Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
Yuan Luo Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
Jennifer A Pacheco Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
Theresa L Walunas Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
Hakon Hakonarson Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
George Hripcsak Department of Biomedical Informatics, Columbia University, New York, New York, USA
Cong Liu Department of Biomedical Informatics, Columbia University, New York, New York, USA
Ning Shang Department of Biomedical Informatics, Columbia University, New York, New York, USA
Chunhua Weng Department of Biomedical Informatics, Columbia University, New York, New York, USA
Nephi Walton Intermountain Precision Genomics, Intermountain Healthcare, St George, Utah, USA
David S Carrell Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
Paul K Crane Department of Medicine, University of Washington, Seattle, Washington, USA
Eric B Larson Department of Medicine, University of Washington, Seattle, Washington, USA; Department of Health Services, University of Washington, Seattle, Washington, USA
Christopher G Chute Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, Maryland, USA
Iftikhar J Kullo Department of Cardiovascular Medicine, Mayo Clinic, Rochester, Minnesota, USA
Robert Carroll Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
Josh Denny All of Us Research Program, National Institutes of Health, Bethesda, Maryland, USA
Andrea Ramirez Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
Wei-Qi Wei Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
Jyoti Pathak Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
Laura K Wiley Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
Rachel Richesson Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, Michigan, USA
Justin B Starren Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
Luke V Rasmussen Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

Assess the documentation of cognitive tests and biomarkers in electronic health records via natural language processing for Alzheimer's disease and related dementias. Int J Med Inform 2023;170:104973. [PMID: 36577203 DOI: 10.1016/j.ijmedinf.2022.104973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2022] [Revised: 12/11/2022] [Accepted: 12/17/2022] [Indexed: 12/24/2022]

Abstract

BACKGROUND

Cognitive tests and biomarkers are key information for assessing the severity and tracking the progression of Alzheimer's disease (AD) and AD-related dementias (AD/ADRD), yet both are often documented only in the clinical narratives of patients' electronic health records (EHRs). In this work, we aim to (1) assess the documentation of cognitive tests and biomarkers in EHRs that can be used as real-world endpoints, and (2) identify, extract, and harmonize the different commonly used cognitive tests from clinical narratives using natural language processing (NLP) methods into categorical AD/ADRD severity.

METHODS

We developed a rule-based NLP pipeline to extract the cognitive tests and biomarkers from clinical narratives in AD/ADRD patients' EHRs. We aggregated the extracted results to the patient level and harmonized the cognitive test scores into severity categories using cutoffs determined based on both relevant literature and domain knowledge of AD/ADRD clinicians.

RESULTS

We identified an AD/ADRD cohort of 48,912 patients from the University of Florida (UF) Health system and identified 7 measurements (6 cognitive tests and 1 biomarker) that are frequently documented in our data. Our NLP pipeline achieved an overall F1-score of 0.9059 across the 7 measurements. Among the 6 cognitive tests, we were able to harmonize 4 cognitive test scores into severity categories, and the population characteristics of patients with different severity were described. We also identified several factors related to the availability of their documentation in EHRs.

CONCLUSION

This study demonstrates that our NLP pipelines can extract cognitive tests and biomarkers of AD/ADRD accurately for downstream studies. Although the documentation of cognitive tests and biomarkers in EHRs appears to be low, real-world data (RWD) remain an important resource for AD/ADRD research. Nevertheless, a standardized approach to documenting cognitive tests and biomarkers in EHRs is also warranted.
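The general shape of a rule-based pipeline like the one this abstract describes (extract a score from narrative text, aggregate to the patient level, map to a severity category) might be sketched as follows. Everything here is an assumption for illustration: the regular expression, the function names, the choice of the Mini-Mental State Examination (MMSE) as the example test, the cutoffs, and the lowest-score aggregation rule are hypothetical and are not taken from the study.

```python
import re

# Hypothetical pattern for mentions such as "MMSE: 22/30", "MMSE score of 27", "MMSE 9".
MMSE_PATTERN = re.compile(
    r"\bMMSE\b(?:\s+score)?(?:\s+(?:of|was|is))?\s*[:=]?\s*(\d{1,2})(?!\d)(?:\s*/\s*30)?",
    re.IGNORECASE,
)


def extract_mmse_scores(note):
    """Return all plausible MMSE scores (0-30) mentioned in one clinical note."""
    scores = []
    for match in MMSE_PATTERN.finditer(note):
        value = int(match.group(1))
        if 0 <= value <= 30:  # discard values outside the valid MMSE range
            scores.append(value)
    return scores


def severity_category(score):
    """Map an MMSE score to a severity label using illustrative cutoffs only."""
    if score >= 24:
        return "no or minimal impairment"
    if score >= 18:
        return "mild"
    if score >= 10:
        return "moderate"
    return "severe"


def patient_level_severity(notes):
    """Aggregate to the patient level; using the lowest score is an assumption."""
    scores = [s for note in notes for s in extract_mmse_scores(note)]
    if not scores:
        return None  # test not documented in any note
    return severity_category(min(scores))


notes = ["MMSE: 22/30 today; prior MMSE score of 27.", "No MMSE documented."]
print(patient_level_severity(notes))
```

A real pipeline would need clinician-reviewed cutoffs for each test, handling of negation and dates, and validation against manually annotated notes before any score is used as an endpoint.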


Wu CS, Chen CH, Su CH, Chien YL, Dai HJ, Chen HH. Augmenting DSM-5 diagnostic criteria with self-attention-based BiLSTM models for psychiatric diagnosis. Artif Intell Med 2023;136:102488. [PMID: 36710066 DOI: 10.1016/j.artmed.2023.102488] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/06/2022] [Revised: 11/20/2022] [Accepted: 01/09/2023] [Indexed: 01/12/2023]

Abstract

BACKGROUND

Most previous studies make psychiatric diagnoses based on diagnostic terms. In this study we sought to augment Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5) diagnostic criteria with deep neural network models to make psychiatric diagnoses based on psychiatric notes.

METHODS

We augmented DSM-5 diagnostic criteria with self-attention-based bidirectional long short-term memory (BiLSTM) models to identify schizophrenia, bipolar, and unipolar depressive disorders. Given that the diagnostic criteria for psychiatric diagnosis include a certain symptom profile and functional impairment, we first extracted psychiatric symptoms and functional features with two approaches, including a lexicon-based approach and a dependency parsing approach. Then, we incorporated free-text discharge notes and extracted features for psychiatric diagnoses with the proposed models.

RESULTS

The micro-averaged F1 scores of the two automatic annotation approaches were greater than 0.8. BiLSTM models with self-attention outperformed the rule-based models with DSM-5 criteria in the prediction of schizophrenia and bipolar disorder, while the latter outperformed the former in predicting unipolar depressive disorder. Approaches for augmenting DSM-5 criteria with a self-attention-based BiLSTM outperformed both pure rule-based and pure deep neural network models. In terms of classification of psychiatric diagnoses, we observed that the performance for schizophrenia and bipolar disorder was acceptable.
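For readers unfamiliar with the metric, micro-averaged F1 pools true positives, false positives, and false negatives over all documents and labels before computing precision and recall. A minimal sketch, assuming each note is annotated with a set of labels (the symptom names in the usage note are made up for illustration):

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over multi-label annotations.

    gold, predicted: lists of sets of labels, one set per document.
    Counts are pooled across all documents before computing the score.
    """
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        tp += len(gold_labels & predicted_labels)   # correctly predicted labels
        fp += len(predicted_labels - gold_labels)   # predicted but not in gold
        fn += len(gold_labels - predicted_labels)   # in gold but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With gold annotations `[{"insomnia", "anhedonia"}, {"hallucination"}]` and predictions `[{"insomnia"}, {"hallucination", "mania"}]`, the pooled counts are 2 true positives, 1 false positive, and 1 false negative.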

CONCLUSION

These DSM-5-augmented deep neural network models showed good performance in identifying psychiatric diagnoses from psychiatric notes. We conclude that it is possible to establish a model that consults clinical notes to make psychiatric diagnoses comparably to physicians. Further research will be extended to outpatient notes and other psychiatric disorders.


Sinha P, Meyer NJ, Calfee CS. Biological Phenotyping in Sepsis and Acute Respiratory Distress Syndrome. Annu Rev Med 2023;74:457-471. [PMID: 36469902 PMCID: PMC10617629 DOI: 10.1146/annurev-med-043021-014005] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Indexed: 12/12/2022]

Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023;30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open

Abstract

OBJECTIVE

Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used.

MATERIALS AND METHODS

We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies.

RESULTS

Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions.

DISCUSSION

Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released.

CONCLUSION

Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.


Obeid JS, Khalifa A, Xavier B, Bou-Daher H, Rockey DC. An AI Approach for Identifying Patients With Cirrhosis. J Clin Gastroenterol 2023;57:82-88. [PMID: 34238846 PMCID: PMC8741865 DOI: 10.1097/mcg.0000000000001586] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/28/2021] [Accepted: 06/05/2021] [Indexed: 02/05/2023]

van den Bulk S, Spoelman WA, van Dijkman PRM, Numans ME, Bonten TN. Non-acute chest pain in primary care; referral rates, communication and guideline adherence: a cohort study using routinely collected health data. BMC Primary Care 2022;23:336. [PMID: 36550420 PMCID: PMC9784001 DOI: 10.1186/s12875-022-01939-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Accepted: 12/02/2022] [Indexed: 12/24/2022]

Abstract

BACKGROUND

The prevalence of coronary artery disease is increasing due to the aging population and increasing prevalence of cardiovascular risk factors. Non-acute chest pain often is the first symptom of stable coronary artery disease. To optimise care for patients with non-acute chest pain and make efficient use of available resources, we need to know more about the current incidence, referral rate and management of these patients.

METHODS

We used routinely collected health data from the STIZON data warehouse in the Netherlands between 2010 and 2016. Patients > 18 years, with no history of cardiovascular disease, seen by the general practitioner (GP) for non-acute chest pain with a suspected cardiac origin were included. Outcomes were (i) incidence of new non-acute chest pain in primary care, (ii) referral rates to the cardiologist, (iii) correspondence from the cardiologist to the GP, (iv) registration by GPs of received correspondence, and (v) pharmacological guideline adherence after newly diagnosed stable angina pectoris.

RESULTS

In total, 9029 patients were included during the study period, resulting in an incidence of new non-acute chest pain of 1.01/1000 patient-years. 2166 (24%) patients were referred to the cardiologist. In 857/2114 (41%) referred patients, correspondence from the cardiologist was not available in the GP's electronic medical record. In 753/1257 (60%) patients with available correspondence, the GP did not code the conclusion in the electronic medical record. Despite guideline recommendations, 37/255 (15%) patients with angina pectoris were prescribed neither antiplatelet therapy nor anticoagulation, 69/255 (27%) were not prescribed a statin, and 67/255 (26%) were not prescribed a beta-blocker.
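The proportions reported above can be reproduced from the stated numerators and denominators. The short sketch below only re-derives the rounded percentages and shows how an incidence per 1000 patient-years is computed; the total patient-years are not given in the abstract, so the `incidence_per_1000` helper is shown without a worked value.

```python
def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


def incidence_per_1000(new_cases, patient_years):
    """Incidence rate expressed per 1000 patient-years of follow-up."""
    return 1000 * new_cases / patient_years


# (numerator, denominator, percentage as reported in the abstract)
REPORTED = {
    "referred to cardiologist": (2166, 9029, 24),
    "correspondence not available": (857, 2114, 41),
    "conclusion not coded by GP": (753, 1257, 60),
    "no antiplatelet or anticoagulant": (37, 255, 15),
    "no statin": (69, 255, 27),
    "no beta-blocker": (67, 255, 26),
}

for label, (num, den, reported) in REPORTED.items():
    print(f"{label}: {num}/{den} = {pct(num, den)}% (reported {reported}%)")
```

The denominators are also internally consistent: the 857 referred patients without correspondence and the 1257 with correspondence add up to the 2114 referred patients used as the denominator.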

CONCLUSION

After referral, both communication from cardiologists and registration of the final diagnosis by GPs were suboptimal. Both cardiologists and GPs should make adequate communication and registration a priority, as it improves health outcomes. Secondary pharmacological prevention in patients with angina pectoris was below guideline standards. So, proactive attention needs to be given to optimise secondary prevention in this high-risk group in primary care.


Alzubi R, Alzoubi H, Katsigiannis S, West D, Ramzan N. Automated Detection of Substance-Use Status and Related Information from Clinical Text. Sensors (Basel, Switzerland) 2022;22:9609. [PMID: 36559979 PMCID: PMC9783118 DOI: 10.3390/s22249609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2022] [Revised: 11/21/2022] [Accepted: 11/25/2022] [Indexed: 06/17/2023]

Woodward AA, Urbanowicz RJ, Naj AC, Moore JH. Genetic heterogeneity: Challenges, impacts, and methods through an associative lens. Genet Epidemiol 2022;46:555-571. [PMID: 35924480 PMCID: PMC9669229 DOI: 10.1002/gepi.22497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 07/06/2022] [Accepted: 07/19/2022] [Indexed: 01/07/2023]

Tanwar A, Zhang J, Ive J, Gupta V, Guo Y. Phenotyping in clinical text with unsupervised numerical reasoning for patient stratification. Exp Biol Med (Maywood) 2022;247:2038-2052. [PMID: 36217914 PMCID: PMC9791305 DOI: 10.1177/15353702221118092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/30/2022] Open

Zou Y, Pesaranghader A, Song Z, Verma A, Buckeridge DL, Li Y. Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model. Sci Rep 2022;12:17868. [PMID: 36284225 PMCID: PMC9596500 DOI: 10.1038/s41598-022-22956-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/16/2022] [Accepted: 10/21/2022] [Indexed: 01/20/2023] Open

Conte M, Flynn A, Boisvert P, Landis-Lewis Z, Richesson R, Friedman C. Computable phenotypes for cohort identification: core content for a new class of FAIR Digital Objects. Research Ideas and Outcomes 2022. [DOI: 10.3897/rio.8.e95856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open

Abstract

Introduction

We present current work to develop and define a class of digital objects that facilitates patient cohort identification for clinical studies, such that these objects are Findable, Accessible, Interoperable, and Reusable (FAIR) (Wilkinson et al. 2016). Developing this class of FAIR Digital Objects (FDOs) builds on the work of several years to develop the Knowledge Grid (https://kgrid.org/), which facilitates the development, description and implementation of biomedical knowledge packaged in machine-readable and machine-executable formats (Flynn et al. 2018). Additionally, this work aligns with the goals of the Mobilizing Computable Biomedical Knowledge (MCBK) community (https://mobilizecbk.med.umich.edu/) (Mobilizing Computable Biomedical Knowledge 2018). In this abstract, we describe our work to develop an FDO carrying a computable phenotype.

Defining computable phenotypes

In biomedical informatics, 'phenotyping' describes a data-driven approach to identifying a group of individuals sharing observable characteristics of interest, generally related to a disease or condition, and a 'computable phenotype' (CP) is a machine-processable expression of a phenotypic pattern of these characteristics (Hripcsak and Albers 2018). For the purposes of this work, we are interested in CPs derived from data contained in electronic health record (EHR) systems. This includes both structured data, e.g. codes for diseases, diagnoses, procedures, or laboratory tests, and unstructured data, e.g. free text including patient histories, clinical observations, discharge summaries, and reports. Thus, we define computable phenotype FDOs (CP-FDOs) as a class of FDO that packages an executable EHR-derived CP together with the documentation needed to implement and use it effectively for creating cohorts of individuals with similar observable characteristics from EHR data sets.

Importance of portable and FAIR CPs

There is tremendous excitement for using real-world EHR data to discover important findings about human health and well-being. However, for discovery to happen, researchers need mechanisms like CPs to identify study cohorts for analysis. Since the early 2010s, a growing literature has explored various methods for the secondary use of EHR data for patient phenotyping to arrive at consistent study cohorts (Shivade et al. 2014, Banda et al. 2018). The heterogeneous nature of EHR data has inspired a wide variety of phenotyping methods, from those which rely solely on documented codes linked to terms in existing vocabularies to those which combine such codes with other concepts extracted from free text using natural language processing. Our current focus is on packaging CPs inside FDOs for classifying patients as having or not having a phenotype of interest. This can be done within an individual health system, or at scale across a clinical data research network. Using CPs for cohort identification can reduce the time and expense of traditional data set building and clinical trial recruitment, and expand the potential scope of a study population (Boland et al. 2013). Creating and validating CPs requires time, resources, and both clinical and technical expertise. One estimate is that it can take 6-10 months to develop and validate a CP (Shang et al. 2019). Moreover, as there is no standard data model within EHRs in the United States, many CPs are designed for performance at a single site, rather than for portability, which is understood as the ability to implement a phenotype at a different site with similar performance (Shang et al. 2019). While portability is increasingly recognized as an important element of phenotyping, and there have been recent efforts to develop more portable CPs, many of these processes still require significant technical expertise at the implementation site to adapt the phenotype for use on local data.

There may also be significant advantages to making CPs FAIR. These include transparency in cohort selection and better generalizability of results. FAIR CPs may also increase the potential for robust comparisons of data from related studies, leading to better evidence synthesis to improve delivery of care and ultimately human health.

Defining a new class of FDOs to hold and convey CPs

We believe that packaging validated CPs inside digital objects may alleviate many of the pressures mentioned above, and contribute to making both the processes and products of clinical research more FAIR. To this end, our current work focuses on packaging a validated CP inside a machine-processable FDO. The phenotype of interest identifies pediatric and adult patients with a rare disease (Oliverio et al. 2021), and has several features which make it ideal for transformation to an executable FDO. First, the phenotype utilizes standards to define the clinical characteristics of interest, and is based on a common data model; these features increase the potential for both interoperability and reuse. Additionally, because the phenotype has been validated across three sites, its portability has already been demonstrated. Finally, the full computable phenotype has been shared as a series of SQL queries, including scripts for patient identification, deriving statistics, and validation, which have been annotated with instructions for implementation at other sites. The goals of this work are:

To develop CPs as executable DOs, leveraging previous work to develop executable Knowledge Objects (KO) (Flynn et al. 2018)

To advance our understanding of how to define computable phenotypes as a class of FDO, including what is needed to meet the requirements of binding, abstraction, and encapsulation (Wittenburg et al. 2019)

Conclusion

Computable phenotypes, packaged as FDOs, may increase the potential both for the portability of a phenotype and for the reusability of data resulting from its implementation. Providing CPs as executable FDOs may also reduce barriers to portability and local implementation. In this presentation, we describe our work to develop an FDO computable phenotype from an existing validated phenotype. Lessons learned from this process will increase our understanding of both the technical requirements and how to address the necessary components of abstraction, binding, and encapsulation so that these objects can function as FAIR Digital Objects.
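To make the notion of a computable phenotype in this abstract more concrete, the sketch below expresses a toy phenotype as an executable rule over structured and unstructured EHR data. It is an illustration under loud assumptions: the `PatientRecord` fields, the rule itself, and the choice of type 2 diabetes are hypothetical and are not the rare-disease phenotype or the SQL queries the authors describe. The rule uses ICD-10 codes beginning with E11 and an HbA1c threshold of 6.5%, and its text matching ignores negation.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EHR data."""
    patient_id: str
    diagnosis_codes: list = field(default_factory=list)  # structured, e.g. ICD-10
    hba1c_values: list = field(default_factory=list)      # structured labs, in percent
    notes: list = field(default_factory=list)             # unstructured free text


def has_phenotype(record):
    """Toy rule: two diagnosis codes, or one code plus supporting evidence."""
    code_hits = sum(code.startswith("E11") for code in record.diagnosis_codes)
    lab_hit = any(value >= 6.5 for value in record.hba1c_values)
    # Naive string match; a real phenotype would handle negation and context.
    text_hit = any("type 2 diabetes" in note.lower() for note in record.notes)
    return code_hits >= 2 or (code_hits >= 1 and (lab_hit or text_hit))


def build_cohort(records):
    """Return the IDs of all patients classified as having the phenotype."""
    return [record.patient_id for record in records if has_phenotype(record)]
```

Packaging such a rule together with its documentation, validation results, and data-model bindings is, loosely, what the abstract means by carrying a CP inside an FDO.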

Culié D, Schiappa R, Contu S, Scheller B, Villarme A, Dassonville O, Poissonnet G, Bozec A, Chamorey E. Validation and Improvement of a Convolutional Neural Network to Predict the Involved Pathology in a Head and Neck Surgery Cohort. International Journal of Environmental Research and Public Health 2022;19:12200. [PMID: 36231500 PMCID: PMC9564535 DOI: 10.3390/ijerph191912200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 09/19/2022] [Accepted: 09/22/2022] [Indexed: 06/16/2023]

Development and validation of algorithms to identify patients with chronic kidney disease and related chronic diseases across the Northern Territory, Australia. BMC Nephrol 2022;23:320. [PMID: 36151531 PMCID: PMC9502610 DOI: 10.1186/s12882-022-02947-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Accepted: 09/13/2022] [Indexed: 11/15/2022] Open

Abstract

Background

Electronic health records can be used for population-wide identification and monitoring of disease. The Territory Kidney Care project developed algorithms to identify individuals with chronic kidney disease (CKD) and several commonly comorbid chronic diseases. This study aims to describe the development and validation of our algorithms for CKD, diabetes, hypertension, and cardiovascular disease. A secondary aim of the study was to describe data completeness of the Territory Kidney Care database.

Methods

The Territory Kidney Care database consolidates electronic health records from multiple health services including public hospitals (n = 6) and primary care health services (> 60) across the Northern Territory, Australia. Using the database (n = 48,569) we selected a stratified random sample of patients (n = 288), which included individuals with mild to end-stage CKD. Diagnostic accuracy of the algorithms was tested against blinded manual chart reviews. Data completeness of the database was also described.
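Testing an algorithm against chart review, as described above, amounts to building a confusion matrix and reporting sensitivity and specificity with confidence intervals. The sketch below is a generic illustration: the abstract does not say which interval method was used, so the Wilson score interval here is an assumption, and the sketch ignores any weighting that a stratified sample might call for.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    if total == 0:
        raise ValueError("total must be positive")
    p = successes / total
    denominator = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denominator
    return centre - half_width, centre + half_width


def diagnostic_accuracy(algorithm_flags, chart_review_flags):
    """Compare algorithm output with chart review (the reference standard).

    Both arguments are equal-length lists of booleans, one per patient.
    """
    pairs = list(zip(algorithm_flags, chart_review_flags))
    tp = sum(1 for a, c in pairs if a and c)
    fn = sum(1 for a, c in pairs if not a and c)
    fp = sum(1 for a, c in pairs if a and not c)
    tn = sum(1 for a, c in pairs if not a and not c)
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }
```

The function expects at least one chart-confirmed positive and one chart-confirmed negative; otherwise the corresponding denominator is zero.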

Results

For CKD defined as CKD stage 1 or higher (eGFR of any level with albuminuria or persistent eGFR < 60 mL/min/1.73 m², including renal replacement therapy), overall algorithm sensitivity was 93% (95% CI 89 to 96%) and specificity was 73% (95% CI 64 to 82%). For CKD defined as CKD stage 3a or higher (eGFR < 60 mL/min/1.73 m²), algorithm sensitivity and specificity were 93% and 97%, respectively. Among the CKD 1 to 5 staging algorithms, the CKD stage 5 algorithm was most accurate, with > 99% sensitivity and specificity. For related comorbidities, algorithm sensitivity and specificity were 75% and 97% for diabetes; 85% and 88% for hypertension; and 79% and 96% for cardiovascular disease.
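The stage definitions quoted above follow the usual eGFR bands, which can be written as a small staging function. This is a sketch of the conventional KDIGO-style thresholds, not the Territory Kidney Care algorithm: it assumes persistence of the eGFR value has already been established upstream, and the function name and parameters are hypothetical.

```python
def ckd_stage(persistent_egfr, albuminuria, on_renal_replacement=False):
    """Return a CKD stage label, or None if CKD criteria are not met.

    persistent_egfr: eGFR in mL/min/1.73 m^2, assumed already confirmed
        as persistent (the persistence check itself is not modelled here).
    albuminuria: True if albuminuria is documented.
    on_renal_replacement: True for dialysis or transplant patients.
    """
    if on_renal_replacement or persistent_egfr < 15:
        return "5"
    if persistent_egfr < 30:
        return "4"
    if persistent_egfr < 45:
        return "3b"
    if persistent_egfr < 60:
        return "3a"
    # At eGFR >= 60, CKD requires evidence of kidney damage such as albuminuria.
    if not albuminuria:
        return None
    return "2" if persistent_egfr < 90 else "1"
```

Under this sketch, "CKD stage 3a or higher" corresponds to any return value of "3a", "3b", "4", or "5", and "stage 1 or higher" to any non-None result.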

Conclusions

We developed and validated algorithms to identify CKD and related chronic diseases within electronic health records. Validation results showed that CKD algorithms have a high degree of diagnostic accuracy compared to traditional administrative codes. Our highly accurate algorithms present new opportunities in early kidney disease detection, monitoring, and epidemiological research.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12882-022-02947-9.


Chushig-Muzo D, Soguero-Ruiz C, Miguel Bohoyo PD, Mora-Jiménez I. Learning and visualizing chronic latent representations using electronic health records. BioData Min 2022;15:18. [PMID: 36064616 PMCID: PMC9446539 DOI: 10.1186/s13040-022-00303-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/14/2021] [Accepted: 07/27/2022] [Indexed: 12/03/2022] Open

Abstract

Background

Nowadays, patients with chronic diseases such as diabetes and hypertension have reached alarming numbers worldwide. These diseases increase the risk of developing acute complications and involve a substantial economic burden and demand for health resources. The widespread adoption of Electronic Health Records (EHRs) is opening great opportunities for supporting decision-making. Nevertheless, data extracted from EHRs are complex (heterogeneous, high-dimensional and usually noisy), hampering the knowledge extraction with conventional approaches.

Methods

We propose the use of the Denoising Autoencoder (DAE), a Machine Learning (ML) technique that transforms high-dimensional data into latent representations (LRs), thus addressing the main challenges with clinical data. We explore in this work how the combination of LRs with a visualization method can be used to map the patient data in a two-dimensional space, gaining knowledge about the distribution of patients with different chronic conditions. Furthermore, this representation can also be used to characterize the patient’s health status evolution, which is of paramount importance in the clinical setting.
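The core idea of a denoising autoencoder can be shown in a few dozen lines: corrupt the input, encode it into a small hidden vector, and train the network to reconstruct the clean input. The toy below is a pure-Python sketch for intuition only. The masking noise, single sigmoid hidden layer, squared-error loss, and all hyperparameters are assumptions for illustration, not the architecture or settings used in the study.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def corrupt(x, drop_prob, rng):
    """Masking noise: randomly set input features to zero."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


class DenoisingAutoencoder:
    def __init__(self, n_in, n_hidden, rng):
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        """Latent representation of x."""
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
                for row, b in zip(self.W2, self.b2)]

    def loss(self, x):
        """Mean squared reconstruction error on an uncorrupted input."""
        y = self.decode(self.encode(x))
        return sum((a - b) ** 2 for a, b in zip(y, x)) / len(x)

    def train_step(self, x_clean, x_noisy, lr):
        """One SGD step: reconstruct the clean input from the noisy one."""
        h = self.encode(x_noisy)
        y = self.decode(h)
        # Error signals at the output and hidden pre-activations.
        d_out = [(yi - xi) * yi * (1 - yi) for yi, xi in zip(y, x_clean)]
        d_hid = [hj * (1 - hj) * sum(d_out[i] * self.W2[i][j] for i in range(len(y)))
                 for j, hj in enumerate(h)]
        for i, d in enumerate(d_out):
            for j, hj in enumerate(h):
                self.W2[i][j] -= lr * d * hj
            self.b2[i] -= lr * d
        for j, d in enumerate(d_hid):
            for k, xk in enumerate(x_noisy):
                self.W1[j][k] -= lr * d * xk
            self.b1[j] -= lr * d


def train(data, n_hidden=2, epochs=300, lr=0.5, drop_prob=0.2, seed=0):
    rng = random.Random(seed)
    model = DenoisingAutoencoder(len(data[0]), n_hidden, rng)
    for _ in range(epochs):
        for x in data:
            model.train_step(x, corrupt(x, drop_prob, rng), lr)
    return model
```

After training on binary vectors of diagnosis or drug codes, `model.encode(x)` gives a two-dimensional latent representation of a patient, which is the kind of vector the abstract describes plotting in a two-dimensional space (the study combines latent representations with a separate visualization method, which this sketch omits).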

Results

To obtain clinical LRs, we considered real-world data extracted from EHRs linked to the University Hospital of Fuenlabrada in Spain. Experimental results showed the great potential of DAEs to identify patients with clinical patterns linked to hypertension, diabetes and multimorbidity. The procedure allowed us to find patients with the same main chronic disease but different clinical characteristics. Thus, we identified two kinds of diabetic patients with differences in their drug therapy (insulin-dependent and non-insulin-dependent), and also a group of women affected by hypertension and gestational diabetes. We also present a proof of concept for mapping the health status evolution of synthetic patients when considering the most significant diagnoses and drugs associated with chronic patients.

Conclusion

Our results highlighted the value of ML techniques to extract clinical knowledge, supporting the identification of patients with certain chronic conditions. Furthermore, the patient’s health status progression on the two-dimensional space might be used as a tool for clinicians aiming to characterize health conditions and identify their more relevant clinical codes.

Supplementary Information

The online version contains supplementary material available at (10.1186/s13040-022-00303-z).


Zhong C, Liao K, Chen W, Liu Q, Peng B, Huang X, Peng J, Wei Z. Hierarchical reinforcement learning for automatic disease diagnosis. Bioinformatics 2022;38:3995-4001. [PMID: 35775965 DOI: 10.1093/bioinformatics/btac408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 05/16/2022] [Accepted: 06/29/2022] [Indexed: 12/24/2022] Open

Abdulkareem M, Kenawy AA, Rauseo E, Lee AM, Sojoudi A, Amir-Khalili A, Lekadir K, Young AA, Barnes MR, Barckow P, Khanji MY, Aung N, Petersen SE. Predicting post-contrast information from contrast agent free cardiac MRI using machine learning: Challenges and methods. Front Cardiovasc Med 2022;9:894503. [PMID: 36051279 PMCID: PMC9426684 DOI: 10.3389/fcvm.2022.894503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Accepted: 06/27/2022] [Indexed: 11/29/2022] Open

Abstract

Objectives

Currently, administering contrast agents is necessary for accurately visualizing and quantifying presence, location, and extent of myocardial infarction (MI) with cardiac magnetic resonance (CMR). In this study, our objective is to investigate and analyze pre- and post-contrast CMR images with the goal of predicting post-contrast information using pre-contrast information only. We propose methods and identify challenges.

Methods

The study population consists of 272 retrospectively selected CMR studies with diagnoses of MI (n = 108) and healthy controls (n = 164). We describe a pipeline for pre-processing this dataset for analysis. After data feature engineering, 722 cine short-axis (SAX) images and segmentation mask pairs were used for experimentation. This constitutes 506, 108, and 108 pairs for the training, validation, and testing sets, respectively. We use deep learning (DL) segmentation (UNet) and classification (ResNet50) models to discover the extent and location of the scar and classify between the ischemic cases and healthy cases (i.e., cases with no regional myocardial scar) from the pre-contrast cine SAX image frames, respectively. We then capture complex data patterns that represent subtle signal and functional changes in the cine SAX images due to MI using optical flow, rate of change of myocardial area, and radiomics data. We apply this dataset to explore two supervised learning methods, namely, the support vector machines (SVM) and the decision tree (DT) methods, to develop predictive models for classifying pre-contrast cine SAX images as being a case of MI or healthy.

Results

Overall, for the UNet segmentation model, the performance based on the mean Dice score for the test set (n = 108) is 0.75 (±0.20) for the endocardium, 0.51 (±0.21) for the epicardium and 0.20 (±0.17) for the scar. For the classification task, the accuracy, F1 and precision scores of 0.68, 0.69, and 0.64, respectively, were achieved with the SVM model, and of 0.62, 0.63, and 0.72, respectively, with the DT model.
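The Dice score used to evaluate the segmentation measures the overlap between a predicted mask and a reference mask, 2|A∩B| / (|A| + |B|). A minimal sketch for binary masks stored as nested lists follows; returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient between two binary masks (nested lists of 0/1)."""
    pred = [v for row in pred_mask for v in row]
    true = [v for row in true_mask for v in row]
    if len(pred) != len(true):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, true) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in true if t)
    # Convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2 * intersection / total
```

A score of 0.20 for the scar, as reported above, therefore means that predicted and reference scar regions overlap only slightly relative to their combined size.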

Conclusion

We have presented some promising approaches involving DL, SVM, and DT methods in an attempt to accurately predict contrast information from non-contrast images. While our initial results are modest for this challenging task, this area of research still poses several open problems.


Affiliation(s)

Musa Abdulkareem: Barts Heart Centre, Barts Health National Health Service (NHS) Trust, London, United Kingdom; National Institute for Health Research (NIHR) Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom; Health Data Research UK, London, United Kingdom
Asmaa A. Kenawy: Barts Heart Centre, Barts Health National Health Service (NHS) Trust, London, United Kingdom; National Institute for Health Research (NIHR) Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
Elisa Rauseo: Barts Heart Centre, Barts Health National Health Service (NHS) Trust, London, United Kingdom; National Institute for Health Research (NIHR) Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
Aaron M. Lee: Barts Heart Centre, Barts Health National Health Service (NHS) Trust, London, United Kingdom; National Institute for Health Research (NIHR) Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
Alireza Sojoudi: Circle Cardiovascular Imaging Inc., Calgary, AB, Canada
Alborz Amir-Khalili: Circle Cardiovascular Imaging Inc., Calgary, AB, Canada
Karim Lekadir: Artificial Intelligence in Medicine Lab (BCN-AIM), Faculty of Mathematics and Computer Science, University of Barcelona, Barcelona, Spain
Alistair A. Young: Department of Biomedical Engineering, King’s College London, London, United Kingdom
Michael R. Barnes: Centre for Translational Bioinformatics, William Harvey Research Institute, Faculty of Medicine and Dentistry, Queen Mary University of London, London, United Kingdom
Philipp Barckow: Circle Cardiovascular Imaging Inc., Calgary, AB, Canada
Mohammed Y. Khanji: Barts Heart Centre, Barts Health National Health Service (NHS) Trust, London, United Kingdom; National Institute for Health Research (NIHR) Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom; Newham University Hospital, Barts Health National Health Service (NHS) Trust, London, United Kingdom
Nay Aung: Barts Heart Centre, Barts Health National Health Service (NHS) Trust, London, United Kingdom; National Institute for Health Research (NIHR) Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
Steffen E. Petersen: Barts Heart Centre, Barts Health National Health Service (NHS) Trust, London, United Kingdom; National Institute for Health Research (NIHR) Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom; Health Data Research UK, London, United Kingdom; The Alan Turing Institute, London, United Kingdom


Momtazmanesh S, Nowroozi A, Rezaei N. Artificial Intelligence in Rheumatoid Arthritis: Current Status and Future Perspectives: A State-of-the-Art Review. Rheumatol Ther 2022;9:1249-1304. [PMID: 35849321 PMCID: PMC9510088 DOI: 10.1007/s40744-022-00475-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 06/01/2022] [Accepted: 06/24/2022] [Indexed: 11/23/2022] Open

Abstract

Investigation of the potential applications of artificial intelligence (AI), including machine learning (ML) and deep learning (DL) techniques, is an exponentially growing field in medicine and healthcare. These methods can be critical in providing high-quality care to patients with chronic rheumatological diseases lacking an optimal treatment, like rheumatoid arthritis (RA), which is the second most prevalent autoimmune disease. Herein, after reviewing the basic concepts of AI, we summarize the advances in its applications in RA clinical practice and research. We provide directions for future investigations in this field after reviewing the current knowledge gaps and the technical and ethical challenges in applying AI. Automated models have been widely used to improve RA diagnosis since the early 2000s, and they have used a wide variety of techniques, e.g., support vector machines, random forests, and artificial neural networks. AI algorithms can facilitate screening and identification of susceptible groups; diagnosis using omics, imaging, clinical, and sensor data; patient detection within electronic health records (EHRs), i.e., phenotyping; treatment response assessment; monitoring of disease course; determination of prognosis; novel drug discovery; and enhancement of basic science research. They can also aid in risk assessment for incidence of comorbidities, e.g., cardiovascular diseases, in patients with RA. However, the proposed models may vary significantly in their performance and reliability. Despite the promising results achieved by AI models in enhancing early diagnosis and management of patients with RA, they are not fully ready to be incorporated into clinical practice. Future investigations are required to ensure the development of reliable and generalizable algorithms while carefully looking for any potential source of bias or misconduct.
We showed that a growing body of evidence supports the potential role of AI in revolutionizing screening, diagnosis, and management of patients with RA. However, multiple obstacles hinder clinical applications of AI models. Incorporating the machine and/or deep learning algorithms into real-world settings would be a key step in the progress of AI in medicine.
