51
Callahan TJ, Stefanski AL, Ostendorf DM, Wyrwa JM, Davies SJD, Hripcsak G, Hunter LE, Kahn MG. Characterizing Patient Representations for Computational Phenotyping. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2023; 2022:319-328. [PMID: 37128436 PMCID: PMC10148332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Patient representation learning methods create rich representations of complex data and have potential to further advance the development of computational phenotypes (CP). Currently, these methods are either applied to small predefined concept sets or all available patient data, limiting the potential for novel discovery and reducing the explainability of the resulting representations. We report on an extensive, data-driven characterization of the utility of patient representation learning methods for the purpose of CP development or automatization. We conducted ablation studies to examine the impact of patient representations, built using data from different combinations of data types and sampling windows on rare disease classification. We demonstrated that the data type and sampling window directly impact classification and clustering performance, and these results differ by rare disease group. Our results, although preliminary, exemplify the importance of and need for data-driven characterization in patient representation-based CP development pipelines.
Affiliation(s)
- Tiffany J Callahan
  - Columbia University, New York, NY, 10032, USA
  - University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Jordan M Wyrwa
  - University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
  - Children's Hospital Colorado, Aurora, CO, 80045, USA
- Lawrence E Hunter
  - University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Michael G Kahn
  - University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
52
Beason-Held LL, Kerley CI, Chaganti S, Moghekar A, Thambisetty M, Ferrucci L, Resnick SM, Landman BA. Health Conditions Associated with Alzheimer's Disease and Vascular Dementia. Ann Neurol 2023; 93:805-818. [PMID: 36571386 PMCID: PMC11973975 DOI: 10.1002/ana.26584] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/14/2022] [Revised: 12/15/2022] [Accepted: 12/17/2022] [Indexed: 12/27/2022]
Abstract
OBJECTIVE We examined medical records to determine health conditions associated with dementia at varied intervals prior to dementia diagnosis in participants from the Baltimore Longitudinal Study of Aging (BLSA). METHODS Data were available for 347 Alzheimer's disease (AD), 76 vascular dementia (VaD), and 811 control participants without dementia. Logistic regressions were performed associating International Classification of Diseases, 9th Revision (ICD-9) health codes with dementia status across all time points, at 5 and 1 year(s) prior to dementia diagnosis, and at the year of diagnosis, controlling for age, sex, and follow-up length of the medical record. RESULTS In AD, the earliest and most consistent associations across all time points included depression, erectile dysfunction, gait abnormalities, hearing loss, and nervous and musculoskeletal symptoms. Cardiomegaly, urinary incontinence, non-epithelial skin cancer, and pneumonia were not significant until 1 year before dementia diagnosis. In VaD, the earliest and most consistent associations across all time points included abnormal electrocardiogram (EKG), cardiac dysrhythmias, cerebrovascular disease, non-epithelial skin cancer, depression, and hearing loss. Atrial fibrillation, occlusion of cerebral arteries, essential tremor, and abnormal reflexes were not significant until 1 year before dementia diagnosis. INTERPRETATION These findings suggest that some health conditions are associated with future dementia beginning at least 5 years before dementia diagnosis and are consistently seen over time, while others only reach significance closer to the date of diagnosis. These results also show that there are both shared and distinctive health conditions associated with AD and VaD. These results reinforce the need for medical intervention and treatment to lessen the impact of health comorbidities in the aging population. ANN NEUROL 2023;93:805-818.
Affiliation(s)
- Lori L Beason-Held
  - National Institute on Aging Intramural Research Program, Baltimore, Maryland, USA
- Cailey I Kerley
  - Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, Tennessee, USA
- Shikha Chaganti
  - Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, Tennessee, USA
- Abhay Moghekar
  - Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Madhav Thambisetty
  - National Institute on Aging Intramural Research Program, Baltimore, Maryland, USA
- Luigi Ferrucci
  - National Institute on Aging Intramural Research Program, Baltimore, Maryland, USA
- Susan M Resnick
  - National Institute on Aging Intramural Research Program, Baltimore, Maryland, USA
- Bennett A Landman
  - Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, Tennessee, USA
53
Johnson M, Patel M, Phipps A, van der Schaar M, Boulton D, Gibbs M. The potential and pitfalls of artificial intelligence in clinical pharmacology. CPT Pharmacometrics Syst Pharmacol 2023; 12:279-284. [PMID: 36717763 PMCID: PMC10014043 DOI: 10.1002/psp4.12902] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 10/21/2022] [Revised: 11/30/2022] [Accepted: 12/05/2022] [Indexed: 02/01/2023] Open
Affiliation(s)
- Martin Johnson
  - Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology and Safety Science, R&D, AstraZeneca, Cambridge, UK
- Mishal Patel
  - Clinical Pharmacology and Quantitative Pharmacology, Artificial Intelligence & Data Analytics, R&D, AstraZeneca, Cambridge, UK
- Alex Phipps
  - Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology and Safety Science, R&D, AstraZeneca, Cambridge, UK
- Mihaela van der Schaar
  - Cambridge Centre for Artificial Intelligence in Medicine, Department of Applied Mathematics and Theoretical Physics and Department of Population Health, University of Cambridge, Cambridge, UK
- Dave Boulton
  - Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gaithersburg, Maryland, USA
- Megan Gibbs
  - Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gaithersburg, Maryland, USA
54
Brandt PS, Kho A, Luo Y, Pacheco JA, Walunas TL, Hakonarson H, Hripcsak G, Liu C, Shang N, Weng C, Walton N, Carrell DS, Crane PK, Larson EB, Chute CG, Kullo IJ, Carroll R, Denny J, Ramirez A, Wei WQ, Pathak J, Wiley LK, Richesson R, Starren JB, Rasmussen LV. Characterizing variability of electronic health record-driven phenotype definitions. J Am Med Inform Assoc 2023; 30:427-437. [PMID: 36474423 PMCID: PMC9933077 DOI: 10.1093/jamia/ocac235] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/21/2022] [Revised: 10/19/2022] [Accepted: 11/23/2022] [Indexed: 12/12/2022] Open
Abstract
OBJECTIVE The aim of this study was to analyze a publicly available sample of rule-based phenotype definitions to characterize and evaluate the variability of logical constructs used. MATERIALS AND METHODS A sample of 33 preexisting phenotype definitions used in research that are represented using Fast Healthcare Interoperability Resources and Clinical Quality Language (CQL) was analyzed using automated analysis of the computable representation of the CQL libraries. RESULTS Most of the phenotype definitions include narrative descriptions and flowcharts, while few provide pseudocode or executable artifacts. Most use 4 or fewer medical terminologies. The number of codes used ranges from 5 to 6865, and value sets from 1 to 19. We found that the most common expressions used were literal, data, and logical expressions. Aggregate and arithmetic expressions are the least common. Expression depth ranges from 4 to 27. DISCUSSION Despite the range of conditions, we found that all of the phenotype definitions consisted of logical criteria, representing both clinical and operational logic, and tabular data, consisting of codes from standard terminologies and keywords for natural language processing. The total number and variety of expressions are low, which may be to simplify implementation, or authors may limit complexity due to data availability constraints. CONCLUSIONS The phenotype definitions analyzed show significant variation in specific logical, arithmetic, and other operators but are all composed of the same high-level components, namely tabular data and logical expressions. A standard representation for phenotype definitions should support these formats and be modular to support localization and shared logic.
Affiliation(s)
- Pascal S Brandt
  - Department of Biomedical and Medical Education, University of Washington, Seattle, Washington, USA
- Abel Kho
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Yuan Luo
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Jennifer A Pacheco
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Theresa L Walunas
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Hakon Hakonarson
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Cong Liu
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Ning Shang
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Nephi Walton
  - Intermountain Precision Genomics, Intermountain Healthcare, St George, Utah, USA
- David S Carrell
  - Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
- Paul K Crane
  - Department of Medicine, University of Washington, Seattle, Washington, USA
- Eric B Larson
  - Department of Medicine, University of Washington, Seattle, Washington, USA
  - Department of Health Services, University of Washington, Seattle, Washington, USA
- Christopher G Chute
  - Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, Maryland, USA
- Iftikhar J Kullo
  - Department of Cardiovascular Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Robert Carroll
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Josh Denny
  - All of Us Research Program, National Institutes of Health, Bethesda, Maryland, USA
- Andrea Ramirez
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jyoti Pathak
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
- Laura K Wiley
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Rachel Richesson
  - Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Justin B Starren
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Luke V Rasmussen
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
55
Pacheco JA, Rasmussen LV, Wiley K, Person TN, Cronkite DJ, Sohn S, Murphy S, Gundelach JH, Gainer V, Castro VM, Liu C, Mentch F, Lingren T, Sundaresan AS, Eickelberg G, Willis V, Furmanchuk A, Patel R, Carrell DS, Deng Y, Walton N, Satterfield BA, Kullo IJ, Dikilitas O, Smith JC, Peterson JF, Shang N, Kiryluk K, Ni Y, Li Y, Nadkarni GN, Rosenthal EA, Walunas TL, Williams MS, Karlson EW, Linder JE, Luo Y, Weng C, Wei W. Evaluation of the portability of computable phenotypes with natural language processing in the eMERGE network. Sci Rep 2023; 13:1971. [PMID: 36737471 PMCID: PMC9898520 DOI: 10.1038/s41598-023-27481-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/15/2022] [Accepted: 01/03/2023] [Indexed: 02/05/2023] Open
Abstract
The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable phenotype rule-based algorithms with natural language processing (NLP) components added to improve performance of existing algorithms using electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned by: (1) challenges; (2) best practices to address challenges based on existing evidence and/or eMERGE experience; and (3) opportunities for future research. Adding NLP resulted in improved, or the same, precision and/or recall for all but one algorithm. Portability, phenotyping workflow/process, and technology were major themes. With NLP, development and validation took longer. Besides portability of NLP technology and algorithm replicability, factors to ensure success include privacy protection, technical infrastructure setup, intellectual property agreement, and efficient communication. Workflow improvements can improve communication and reduce implementation time. NLP performance varied mainly due to clinical document heterogeneity; therefore, we suggest using semi-structured notes, comprehensive documentation, and customization options. NLP portability is possible with improved phenotype algorithm performance, but careful planning and architecture of the algorithms is essential to support local customizations.
Affiliation(s)
- Ken Wiley
  - National Human Genome Research Institute, Bethesda, USA
- David J Cronkite
  - Kaiser Permanente Washington Health Research Institute, Seattle, USA
- Cong Liu
  - Columbia University, New York, USA
- Frank Mentch
  - Children's Hospital of Philadelphia, Philadelphia, USA
- Todd Lingren
  - Cincinnati Children's Hospital Medical Center, Cincinnati, USA
- David S Carrell
  - Kaiser Permanente Washington Health Research Institute, Seattle, USA
- Yu Deng
  - Northwestern University, Evanston, USA
- Yizhao Ni
  - Cincinnati Children's Hospital Medical Center, Cincinnati, USA
- Yikuan Li
  - Northwestern University, Evanston, USA
- Yuan Luo
  - Northwestern University, Evanston, USA
- WeiQi Wei
  - Vanderbilt University Medical Center, Nashville, USA
56
Chen Z, Zhang H, Yang X, Wu S, He X, Xu J, Guo J, Prosperi M, Wang F, Xu H, Chen Y, Hu H, DeKosky ST, Farrer M, Guo Y, Wu Y, Bian J. Assess the documentation of cognitive tests and biomarkers in electronic health records via natural language processing for Alzheimer's disease and related dementias. Int J Med Inform 2023; 170:104973. [PMID: 36577203 PMCID: PMC11325083 DOI: 10.1016/j.ijmedinf.2022.104973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 12/11/2022] [Accepted: 12/17/2022] [Indexed: 12/24/2022]
Abstract
BACKGROUND Cognitive tests and biomarkers are key information for assessing the severity and tracking the progression of Alzheimer's disease (AD) and AD-related dementias (AD/ADRD), yet both are often documented only in the clinical narratives of patients' electronic health records (EHRs). In this work, we aim to (1) assess the documentation of cognitive tests and biomarkers in EHRs that can be used as real-world endpoints, and (2) identify, extract, and harmonize the different commonly used cognitive tests from clinical narratives using natural language processing (NLP) methods into categorical AD/ADRD severity. METHODS We developed a rule-based NLP pipeline to extract the cognitive tests and biomarkers from clinical narratives in AD/ADRD patients' EHRs. We aggregated the extracted results to the patient level and harmonized the cognitive test scores into severity categories using cutoffs determined based on both relevant literature and domain knowledge of AD/ADRD clinicians. RESULTS We identified an AD/ADRD cohort of 48,912 patients from the University of Florida (UF) Health system and identified 7 measurements (6 cognitive tests and 1 biomarker) that are frequently documented in our data. Our NLP pipeline achieved an overall F1-score of 0.9059 across the 7 measurements. Among the 6 cognitive tests, we were able to harmonize 4 cognitive test scores into severity categories, and the population characteristics of patients with different severity were described. We also identified several factors related to the availability of their documentation in EHRs. CONCLUSION This study demonstrates that our NLP pipelines can extract cognitive tests and biomarkers of AD/ADRD accurately for downstream studies. Although the documentation of cognitive tests and biomarkers in EHRs appears to be low, real-world data (RWD) remain an important resource for AD/ADRD research. Nevertheless, a standardized approach to documenting cognitive tests and biomarkers in EHRs is also warranted.
Affiliation(s)
- Zhaoyi Chen
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Hansi Zhang
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Xi Yang
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Songzi Wu
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Xing He
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Jie Xu
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Jingchuan Guo
  - Department of Pharmaceutical Outcomes and Policy, University of Florida, Gainesville, FL, USA
- Mattia Prosperi
  - Department of Epidemiology, University of Florida, Gainesville, FL, USA
- Fei Wang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Hua Xu
  - Center for Translational AI in Medicine, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Yong Chen
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Hui Hu
  - Channing Division of Network Medicine at Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Steven T DeKosky
  - Department of Neurology, University of Florida, Gainesville, FL, USA
- Matthew Farrer
  - Department of Neurology, University of Florida, Gainesville, FL, USA
- Yi Guo
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Yonghui Wu
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Jiang Bian
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
57
Li Y, Mamouei M, Salimi-Khorshidi G, Rao S, Hassaine A, Canoy D, Lukasiewicz T, Rahimi K. Hi-BEHRT: Hierarchical Transformer-Based Model for Accurate Prediction of Clinical Events Using Multimodal Longitudinal Electronic Health Records. IEEE J Biomed Health Inform 2023; 27:1106-1117. [PMID: 36427286 PMCID: PMC7615082 DOI: 10.1109/jbhi.2022.3224727] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Indexed: 11/27/2022]
Abstract
Electronic health records (EHR) represent a holistic overview of patients' trajectories. Their increasing availability has fueled new hopes to leverage them and develop accurate risk prediction models for a wide range of diseases. Given the complex interrelationships of medical records and patient outcomes, deep learning models have shown clear merits in achieving this goal. However, a key limitation of current studies remains their capacity to process long sequences, and long sequence modelling and its application in the context of healthcare and EHR remain unexplored. Capturing the whole history of medical encounters is expected to lead to more accurate predictions, but the inclusion of records collected for decades and from multiple resources can inevitably exceed the receptive field of most existing deep learning architectures. This can result in missing crucial, long-term dependencies. To address this gap, we present Hi-BEHRT, a hierarchical Transformer-based model that can significantly expand the receptive field of Transformers and extract associations from much longer sequences. Using a multimodal large-scale linked longitudinal EHR, Hi-BEHRT exceeds the state-of-the-art deep learning models by 1% to 5% for area under the receiver operating characteristic (AUROC) curve and 1% to 8% for area under the precision recall (AUPRC) curve on average, and by 2% to 8% (AUROC) and 2% to 11% (AUPRC) for patients with long medical history, for 5-year heart failure, diabetes, chronic kidney disease, and stroke risk prediction. Additionally, because pretraining for hierarchical Transformers is not well-established, we provide an effective end-to-end contrastive pre-training strategy for Hi-BEHRT using EHR, improving its transferability on predicting clinical events with a relatively small training dataset.
Affiliation(s)
- Yikuan Li
  - Deep Medicine, Oxford Martin School, University of Oxford, OX1 2JD Oxford, U.K.
- Mohammad Mamouei
  - Deep Medicine, Oxford Martin School, University of Oxford, OX1 2JD Oxford, U.K.
- Shishir Rao
  - Deep Medicine, Oxford Martin School, University of Oxford, OX1 2JD Oxford, U.K.
- Abdelaali Hassaine
  - Deep Medicine, Oxford Martin School, University of Oxford, OX1 2JD Oxford, U.K.
- Dexter Canoy
  - Deep Medicine, Oxford Martin School, University of Oxford, OX1 2JD Oxford, U.K.
- Thomas Lukasiewicz
  - Department of Computer Science, University of Oxford, OX1 2JD Oxford, U.K.
- Kazem Rahimi
  - Deep Medicine, Oxford Martin School, University of Oxford, OX1 2JD Oxford, U.K.
58
Abstract
Advancements in high-throughput sequencing have yielded vast amounts of genomic data, which are studied using genome-wide association study (GWAS)/phenome-wide association study (PheWAS) methods to identify associations between the genotype and phenotype. The associated findings have contributed to pharmacogenomics and improved clinical decision support at the point of care in many healthcare systems. However, the accumulation of genomic data from sequencing and clinical data from electronic health records (EHRs) poses significant challenges for data scientists. Following the rise of artificial intelligence (AI) technology such as machine learning and deep learning, an increasing number of GWAS/PheWAS studies have successfully leveraged this technology to overcome the aforementioned challenges. In this review, we focus on the application of data science and AI technology in three areas, including risk prediction and identification of causal single-nucleotide polymorphisms, EHR-based phenotyping and CRISPR guide RNA design. Additionally, we highlight a few emerging AI technologies, such as transfer learning and multi-view learning, which will or have started to benefit genomic studies.
Affiliation(s)
- Jing Lin
  - NUHS Corporate Office, National University Health System, Singapore
- Kee Yuan Ngiam
  - NUHS Corporate Office, National University Health System, Singapore
  - Department of Surgery, National University of Singapore, Singapore
  - Correspondence: A/Prof Kee Yuan Ngiam, Group Chief Technology Officer, NUHS Corporate Office, National University Health System, 1E Kent Ridge Road, 119228, Singapore
59
Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023; 30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used. MATERIALS AND METHODS We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies. RESULTS Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions. DISCUSSION Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released. CONCLUSION Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.
Affiliation(s)
- Siyue Yang
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Ellen Stephenson
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Karen Tu
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Jessica Gronsbell
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
60
Azizi S, Hier DB, Wunsch II DC. Enhanced neurologic concept recognition using a named entity recognition model based on transformers. Front Digit Health 2022; 4:1065581. [PMID: 36569804 PMCID: PMC9772022 DOI: 10.3389/fdgth.2022.1065581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Accepted: 11/21/2022] [Indexed: 12/12/2022] Open
Abstract
Although deep learning has been applied to the recognition of diseases and drugs in electronic health records and the biomedical literature, relatively little study has been devoted to the utility of deep learning for the recognition of signs and symptoms. The recognition of signs and symptoms is critical to the success of deep phenotyping and precision medicine. We have developed a named entity recognition model that uses deep learning to identify text spans containing neurological signs and symptoms and then maps these text spans to the clinical concepts of a neuro-ontology. We compared a model based on convolutional neural networks to one based on bidirectional encoder representations from transformers. Models were evaluated for accuracy of text span identification on three text corpora: physician notes from an electronic health record, case histories from neurologic textbooks, and clinical synopses from an online database of genetic diseases. Both models performed best on the professionally written clinical synopses and worst on the physician-written clinical notes. Both models performed better when signs and symptoms were represented as shorter text spans. Consistent with prior studies that examined the recognition of diseases and drugs, the model based on bidirectional encoder representations from transformers outperformed the model based on convolutional neural networks for recognizing signs and symptoms. Recall for signs and symptoms ranged from 59.5% to 82.0% and precision ranged from 61.7% to 80.4%. With further advances in NLP, fully automated recognition of signs and symptoms in electronic health records and the medical literature should be feasible.
Affiliation(s)
- Sima Azizi
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
- Daniel B. Hier
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Donald C. Wunsch II
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
  - National Science Foundation, ECCS Division, Arlington, VA, United States
61
Hallinan CM, Gunn JM, Bonomo YA. Use of electronic medical records to monitor the safe and effective prescribing of medicinal cannabis: is it feasible? Aust J Prim Health 2022; 28:564-572. [PMID: 35927928 DOI: 10.1071/py22054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Accepted: 06/17/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND General practitioners are well positioned to contribute to the pharmacovigilance of medicinal cannabis via the general practice electronic medical record (EMR). The aim of this research is to interrogate de-identified patient data from the Patron primary care data repository for reports of medicinal cannabis to ascertain the feasibility of using EMRs to monitor medicinal cannabis prescribing in Australia. METHODS EMR rule-based digital phenotyping of 1 164 846 active patients from 109 practices was undertaken to investigate reports of medicinal cannabis use from September 2017 to September 2020. RESULTS Eighty patients with 170 prescriptions of medicinal cannabis were identified in the Patron repository. Reasons for prescription included anxiety, multiple sclerosis, cancer, nausea, and Crohn's disease. Nine patients showed symptoms of a possible adverse event, including depression, motor vehicle accident, gastrointestinal symptoms, and anxiety. CONCLUSIONS The recording of medicinal cannabis effects in the patient EMR provides potential for medicinal cannabis monitoring in the community. This would be especially feasible if monitoring were embedded into general practitioner workflow.
Affiliation(s)
- Christine M Hallinan
- Department of General Practice, Melbourne Medical School, Faculty of Medicine, Dentistry and Health Sciences of the University of Melbourne, Level 2, 780 Elizabeth Street, Melbourne, Vic. 3004, Australia; and Faculty of Medicine, Dentistry and Health Sciences of the University of Melbourne, Level 2, Alan Gilbert Building, Grattan Street, Parkville, Vic. 3010, Australia; and Health and Biomedical Research Information Technology Unit (HaBIC R2), Faculty of Medicine, Dentistry and Health Sciences of the University of Melbourne, Level 2, 780 Elizabeth Street, Melbourne, Vic. 3004, Australia
| | - Jane M Gunn
- Faculty of Medicine, Dentistry and Health Sciences of the University of Melbourne, Level 2, Alan Gilbert Building, Grattan Street, Parkville, Vic. 3010, Australia
| | - Yvonne A Bonomo
- Faculty of Medicine, Dentistry and Health Sciences of the University of Melbourne, Level 2, Alan Gilbert Building, Grattan Street, Parkville, Vic. 3010, Australia; and Department of Addiction Medicine, St Vincent's Hospital, Fitzroy, Vic. 3065, Australia
| |
|
62
|
Lyu T, Liang C, Liu J, Campbell B, Hung P, Shih YW, Ghumman N, Li X, on behalf of the National COVID Cohort Collaborative Consortium. Temporal Events Detector for Pregnancy Care (TED-PC): A rule-based algorithm to infer gestational age and delivery date from electronic health records of pregnant women with and without COVID-19. PLoS One 2022; 17:e0276923. [PMID: 36315520 PMCID: PMC9621451 DOI: 10.1371/journal.pone.0276923] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/22/2022] [Accepted: 10/16/2022] [Indexed: 11/07/2022] Open
Abstract
OBJECTIVE Identifying the time of SARS-CoV-2 viral infection relative to specific gestational weeks is critical for delineating the role of viral infection timing in adverse pregnancy outcomes. However, this task is difficult when working with Electronic Health Records (EHR). To support maternal health during the COVID-19 pandemic, we sought to develop and validate a clinical information extraction algorithm to detect the time of clinical events relative to gestational weeks. MATERIALS AND METHODS We used EHR from the National COVID Cohort Collaborative (N3C), in which the EHR are normalized to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). We performed EHR phenotyping, resulting in 270,897 pregnant women (June 1st, 2018 to May 31st, 2021). We developed a rule-based algorithm and performed a multi-level evaluation to test content validity, clinical validity, and performance on extreme lengths of gestation (<150 or >300 days). RESULTS The algorithm identified 296,194 pregnancies (16,659 COVID-19, 174,744 without COVID-19) in 270,897 pregnant women. For inferring gestational age, 95% of cases (n = 40) have moderate-high accuracy (Cohen's Kappa = 0.62); 100% of cases (n = 40) have moderate-high granularity of temporal information (Cohen's Kappa = 1). For inferring delivery dates, the accuracy is 100% (Cohen's Kappa = 1). The accuracy of gestational age detection for the extreme length of gestation is 93.3% (Cohen's Kappa = 1). Mothers with COVID-19 showed a higher prevalence of obesity or overweight (35.1% vs. 29.5%), diabetes (17.8% vs. 17.0%), chronic obstructive pulmonary disease (0.2% vs. 0.1%), and respiratory distress syndrome or acute respiratory failure (1.8% vs. 0.2%). DISCUSSION We explored the characteristics of pregnant women by different gestational weeks of SARS-CoV-2 infection with our algorithm. TED-PC is the first to infer the exact gestational week linked with every clinical event from EHR and detect the timing of SARS-CoV-2 infection in pregnant women. CONCLUSION The algorithm shows excellent clinical validity in inferring gestational age and delivery dates, which supports multiple EHR cohorts on N3C studying the impact of COVID-19 on pregnancy.
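The evaluation above is reported in terms of Cohen's Kappa, which measures agreement between two raters beyond what chance alone would produce: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The following Python sketch shows the standard two-rater calculation. The function name and the example labels are hypothetical, and nothing here reproduces the study's review data.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined (0/0), and returning 1.0 is a convention assumed here.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Illustrative ratings only.
reviewer_1 = ["high", "high", "low", "low"]
reviewer_2 = ["high", "low", "low", "low"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this toy example the raters agree on three of four items (p_o = 0.75) and chance agreement is p_e = 0.5, so kappa is 0.5. This illustrates why raw accuracy and kappa can diverge for the same set of ratings.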
Affiliation(s)
- Tianchu Lyu
- Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America
| | - Chen Liang
- Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America
| | - Jihong Liu
- Department of Epidemiology & Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America
| | - Berry Campbell
- Department of Obstetrics and Gynecology, School of Medicine, University of South Carolina, Columbia, South Carolina, United States of America
| | - Peiyin Hung
- Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America
| | - Yi-Wen Shih
- Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America
| | - Nadia Ghumman
- Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America
| | - Xiaoming Li
- Department of Health Promotion Education and Behaviors, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, United States of America
| | | |
|
63
|
Bashir MBA, Basna R, Zhang GQ, Backman H, Lindberg A, Ekerljung L, Axelsson M, Hedman L, Vanfleteren L, Lundbäck B, Rönmark E, Nwaru BI. Computational phenotyping of obstructive airway diseases: protocol for a systematic review. Syst Rev 2022; 11:216. [PMID: 36229872 PMCID: PMC9559879 DOI: 10.1186/s13643-022-02078-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2020] [Accepted: 09/18/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Over the last decade, computational sciences have contributed immensely to characterization of phenotypes of airway diseases, but it is difficult to compare derived phenotypes across studies, perhaps as a result of the different decisions that fed into these phenotyping exercises. We aim to perform a systematic review of studies using computational approaches to phenotype obstructive airway diseases in children and adults. METHODS AND ANALYSIS We will search PubMed, Embase, Scopus, Web of Science, and Google Scholar for papers published between 2010 and 2020. Conference proceedings, reference lists of included papers, and experts will form additional sources of literature. We will include observational epidemiological studies that used a computational approach to derive phenotypes of chronic airway diseases, whether in a general population or in a clinical setting. Two reviewers will independently screen the retrieved studies for eligibility, extract relevant data, and perform quality appraisal of included studies. A third reviewer will arbitrate any disagreements in these processes. Quality appraisal of the studies will be undertaken using the Effective Public Health Practice Project quality assessment tool. We will use summary tables to describe the included studies. We will narratively synthesize the generated evidence, providing critical assessment of the populations, variables, and computational approaches used in deriving the phenotypes across studies. CONCLUSION As progress continues to be made in the area of computational phenotyping of chronic obstructive airway diseases, this systematic review, the first on this topic, will provide the state of the art in the field and highlight important perspectives for future work. ETHICS AND DISSEMINATION No ethical approval is needed, as this work is based only on the published literature and does not involve collection of any primary or human data. SYSTEMATIC REVIEW REGISTRATION PROSPERO CRD42020164898.
Affiliation(s)
- Muwada Bashir Awad Bashir
- Krefting Research Centre, Institute of Medicine, University of Gothenburg, SE-405 30, Gothenburg, Sweden
| | - Rani Basna
- Krefting Research Centre, Institute of Medicine, University of Gothenburg, SE-405 30, Gothenburg, Sweden
| | - Guo-Qiang Zhang
- Krefting Research Centre, Institute of Medicine, University of Gothenburg, SE-405 30, Gothenburg, Sweden
| | - Helena Backman
- Section of Sustainable Health/the OLIN Unit, Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden
| | - Anne Lindberg
- Section of Medicine/the OLIN Unit, Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden
| | - Linda Ekerljung
- Krefting Research Centre, Institute of Medicine, University of Gothenburg, SE-405 30, Gothenburg, Sweden
| | - Malin Axelsson
- Department of Care Science, Faculty of Health and Society, Malmö University, Malmö, Sweden
| | - Linnea Hedman
- Department of Health Sciences, Luleå University of Technology, Luleå, Sweden
| | - Lowie Vanfleteren
- COPD Center, Sahlgrenska University Hospital, University of Gothenburg, Gothenburg, Sweden
| | - Bo Lundbäck
- Krefting Research Centre, Institute of Medicine, University of Gothenburg, SE-405 30, Gothenburg, Sweden
| | - Eva Rönmark
- Section of Sustainable Health/the OLIN Unit, Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden
| | - Bright I Nwaru
- Krefting Research Centre, Institute of Medicine, University of Gothenburg, SE-405 30, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, Institute of Medicine, University of Gothenburg, Gothenburg, Sweden
| |
|
64
|
Conte M, Flynn A, Boisvert P, Landis-Lewis Z, Richesson R, Friedman C. Computable phenotypes for cohort identification: core content for a new class of FAIR Digital Objects. RESEARCH IDEAS AND OUTCOMES 2022. [DOI: 10.3897/rio.8.e95856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Introduction
We present current work to develop and define a class of digital objects that facilitates patient cohort identification for clinical studies, such that these objects are Findable, Accessible, Interoperable, and Reusable (FAIR) (Wilkinson et al. 2016). Developing this class of FAIR Digital Objects (FDOs) builds on the work of several years to develop the Knowledge Grid (https://kgrid.org/), which facilitates the development, description and implementation of biomedical knowledge packaged in machine-readable and machine-executable formats (Flynn et al. 2018). Additionally, this work aligns with the goals of the Mobilizing Computable Biomedical Knowledge (MCBK) community (https://mobilizecbk.med.umich.edu/) (Mobilizing Computable Biomedical Knowledge 2018). In this abstract, we describe our work to develop an FDO carrying a computable phenotype.
Defining computable phenotypes
In biomedical informatics, 'phenotyping' describes a data-driven approach to identifying a group of individuals sharing observable characteristics of interest, generally related to a disease or condition, and a 'computable phenotype' (CP) is a machine-processable expression of a phenotypic pattern of these characteristics (Hripcsak and Albers 2018).
For the purposes of this work, we are interested in CPs derived from data contained in electronic health record (EHR) systems. This includes both structured data, e.g. codes for diseases, diagnoses, procedures, or laboratory tests, and unstructured data, e.g. free text including patient histories, clinical observations, discharge summaries, and reports. Thus, we define computable phenotype FDOs (CP-FDOs) as a class of FDO that packages an executable EHR-derived CP together with documentation needed to implement and use it effectively for creating cohorts of individuals with similar observable characteristics from EHR data sets.
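To make the idea of a machine-processable phenotype concrete, the following Python sketch expresses a toy computable phenotype as an executable rule over both structured codes and free text, then applies it to build a cohort. Everything in it is hypothetical: the `PatientRecord` fields, the placeholder codes, and the keyword list are illustrative assumptions and do not correspond to the validated phenotype or to any Knowledge Grid API.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class PatientRecord:
    """Hypothetical, highly simplified view of one patient's EHR data."""
    patient_id: str
    diagnosis_codes: Set[str] = field(default_factory=set)  # structured data
    notes: List[str] = field(default_factory=list)          # unstructured free text


# Placeholder phenotype definition; real definitions would draw on
# standard vocabularies and validated value sets.
PHENOTYPE_CODES = {"CODE_A", "CODE_B"}
PHENOTYPE_KEYWORDS = ("example condition",)


def has_phenotype(record: PatientRecord) -> bool:
    """Classify a patient as having or not having the phenotype."""
    code_match = bool(record.diagnosis_codes & PHENOTYPE_CODES)
    text_match = any(
        keyword in note.lower()
        for note in record.notes
        for keyword in PHENOTYPE_KEYWORDS
    )
    return code_match or text_match


def build_cohort(records: Iterable[PatientRecord]) -> List[str]:
    """Return the IDs of all patients matching the phenotype."""
    return [record.patient_id for record in records if has_phenotype(record)]


records = [
    PatientRecord("p1", {"CODE_A"}),
    PatientRecord("p2", {"CODE_Z"}, ["History of Example Condition noted."]),
    PatientRecord("p3", {"CODE_Z"}, ["No relevant findings."]),
]
cohort = build_cohort(records)
```

Naive keyword matching of this kind ignores negation and context ("no evidence of example condition" would match), which is one reason phenotypes that use free text typically rely on natural language processing and require validation before reuse.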
Importance of portable and FAIR CPs
There is tremendous excitement for using real-world EHR data to discover important findings about human health and well-being. However, for discovery to happen, researchers need mechanisms like CPs to identify study cohorts for analysis. Beginning in the early 2010s, a growing literature explores various methods for the secondary use of EHR data for patient phenotyping to arrive at consistent study cohorts (Shivade et al. 2014, Banda et al. 2018). The heterogeneous nature of EHR data has inspired a wide variety of phenotyping methods, from those which rely solely on documented codes linked to terms in existing vocabularies to those which combine such codes with other concepts extracted from free text using natural language processing.
Our current focus is on packaging CPs inside FDOs for classifying patients as having or not having a phenotype of interest. This can be done within an individual health system, or at scale across a clinical data research network. Using CPs for cohort identification can reduce the time and expense of traditional data set building and clinical trial recruitment, and expand the potential scope of a study population (Boland et al. 2013).
Creating and validating CPs requires time, resources, and both clinical and technical expertise. One estimate is that it can take 6-10 months to develop and validate a CP (Shang et al. 2019). And, as there is no standard data model within EHRs in the United States, many CPs are designed for performance at a single site, rather than for portability, which is understood as the ability to implement a phenotype at a different site with similar performance (Shang et al. 2019). While portability is increasingly recognized as an important element of phenotyping, and there have been recent efforts to develop more portable CPs, many of these processes still require significant technical expertise at the implementation site to adapt the phenotype for use on local data.
There may also be significant advantages to making CPs FAIR. These include transparency in cohort selection, and better generalizability of results. FAIR CPs may also increase the potential for robust comparisons of data from related studies, leading to better evidence synthesis to improve delivery of care and ultimately human health.
Defining a new class of FDOs to hold and convey CPs
We believe that packaging validated CPs inside digital objects may alleviate many of the pressures mentioned above, and contribute to making both the processes and products of clinical research more FAIR. To this end, our current work focuses on packaging a validated CP inside a machine-processable FDO. The phenotype of interest identifies pediatric and adult patients with a rare disease (Oliverio et al. 2021), and has several features which make it ideal for transformation to an executable FDO. First, the phenotype utilizes standards to define the clinical characteristics of interest, and is based on a common data model; these features increase the potential for both interoperability and reuse. Additionally, because the phenotype has been validated across three sites, its portability has already been demonstrated. Finally, the full computable phenotype has been shared as a series of SQL queries, including scripts for patient identification, deriving statistics, and validation, which have been annotated with instructions for implementation at other sites.
The goals of this work are:
- To develop CPs as executable DOs, leveraging previous work to develop executable Knowledge Objects (KO) (Flynn et al. 2018)
- To advance our understanding of how to define computable phenotypes as a class of FDO, including what is needed to meet the requirements of binding, abstraction, and encapsulation (Wittenburg et al. 2019)
Conclusion
Computable phenotypes, packaged as FDOs, may increase the potential both for the portability of a phenotype and the reusability of data resulting from its implementation. Providing CPs as executable FDOs may also reduce barriers to portability and local implementation. In this presentation, we describe our work to develop an FDO computable phenotype from an existing validated phenotype. Lessons learned from this process will increase our understanding of both the technical requirements and how to address the necessary components of abstraction, binding, and encapsulation so that these objects can function as FAIR Digital Objects.
|
65
|
Momtazmanesh S, Nowroozi A, Rezaei N. Artificial Intelligence in Rheumatoid Arthritis: Current Status and Future Perspectives: A State-of-the-Art Review. Rheumatol Ther 2022; 9:1249-1304. [PMID: 35849321 PMCID: PMC9510088 DOI: 10.1007/s40744-022-00475-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/01/2022] [Accepted: 06/24/2022] [Indexed: 11/23/2022] Open
Abstract
Investigation of the potential applications of artificial intelligence (AI), including machine learning (ML) and deep learning (DL) techniques, is an exponentially growing field in medicine and healthcare. These methods can be critical in providing high-quality care to patients with chronic rheumatological diseases lacking an optimal treatment, like rheumatoid arthritis (RA), which is the second most prevalent autoimmune disease. Herein, after reviewing the basic concepts of AI, we summarize the advances in its applications in RA clinical practice and research. We provide directions for future investigations in this field after reviewing the current knowledge gaps and technical and ethical challenges in applying AI. Automated models have been largely used to improve RA diagnosis since the early 2000s, and they have used a wide variety of techniques, e.g., support vector machine, random forest, and artificial neural networks. AI algorithms can facilitate screening and identification of susceptible groups, diagnosis using omics, imaging, clinical, and sensor data, patient detection within electronic health records (EHR), i.e., phenotyping, treatment response assessment, monitoring disease course, determining prognosis, novel drug discovery, and enhancing basic science research. They can also aid in risk assessment for incidence of comorbidities, e.g., cardiovascular diseases, in patients with RA. However, the proposed models may vary significantly in their performance and reliability. Despite the promising results achieved by AI models in enhancing early diagnosis and management of patients with RA, they are not fully ready to be incorporated into clinical practice. Future investigations are required to ensure development of reliable and generalizable algorithms while carefully checking for any potential source of bias or misconduct. We showed that a growing body of evidence supports the potential role of AI in revolutionizing screening, diagnosis, and management of patients with RA. However, multiple obstacles hinder clinical applications of AI models. Incorporating machine and/or deep learning algorithms into real-world settings would be a key step in the progress of AI in medicine.
Affiliation(s)
- Sara Momtazmanesh
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
- Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran
| | - Ali Nowroozi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
| | - Nima Rezaei
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
- Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran
- Department of Immunology, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
| |
|
66
|
Liang C, Weissman S, Olatosi B, Poon EG, Yarrington ME, Li X. Curating a knowledge base for individuals with coinfection of HIV and SARS-CoV-2: a study protocol of EHR-based data mining and clinical implementation. BMJ Open 2022; 12:e067204. [PMID: 36100301 PMCID: PMC9471209 DOI: 10.1136/bmjopen-2022-067204] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/05/2022] [Accepted: 08/25/2022] [Indexed: 11/04/2022] Open
Abstract
INTRODUCTION Despite a higher risk of severe COVID-19 disease in individuals with HIV, the interactions between SARS-CoV-2 and HIV infections remain unclear. To delineate these interactions, multicentre Electronic Health Records (EHR) hold promise to provide full-spectrum and longitudinal clinical data, demographics and sociobehavioural data at the individual level. Presently, a comprehensive EHR-based cohort for the HIV/SARS-CoV-2 coinfection has not been established; EHR integration and data mining methods tailored for studying the coinfection are urgently needed yet remain underdeveloped. METHODS AND ANALYSIS The overarching goal of this exploratory/developmental study is to establish an EHR-based cohort for individuals with HIV/SARS-CoV-2 coinfection and perform large-scale EHR-based data mining to examine the interactions between HIV and SARS-CoV-2 infections and systematically identify and validate factors contributing to the severe clinical course of the coinfection. We will use a nationwide EHR database in the USA, namely, the National COVID Cohort Collaborative (N3C). Ultimately, collected clinical evidence will be implemented and used to pilot test a clinical decision support prototype to assist providers in screening and referral of at-risk patients in real-world clinics. ETHICS AND DISSEMINATION The study was approved by the institutional review boards at the University of South Carolina (Pro00121828) as a non-human subject study. Study findings will be presented at academic conferences and published in peer-reviewed journals. This study will disseminate urgently needed clinical evidence for guiding clinical practice for individuals with the coinfection at Prisma Health, a collaborating healthcare system.
Affiliation(s)
- Chen Liang
- Department of Health Services Policy and Management, University of South Carolina, Columbia, South Carolina, USA
- Big Data Health Science Center, University of South Carolina, Columbia, South Carolina, USA
| | - Sharon Weissman
- Big Data Health Science Center, University of South Carolina, Columbia, South Carolina, USA
- Department of Internal Medicine, University of South Carolina, Columbia, South Carolina, USA
| | - Bankole Olatosi
- Department of Health Services Policy and Management, University of South Carolina, Columbia, South Carolina, USA
- Big Data Health Science Center, University of South Carolina, Columbia, South Carolina, USA
| | - Eric G Poon
- Department of Medicine, Duke University, Durham, North Carolina, USA
| | | | - Xiaoming Li
- Big Data Health Science Center, University of South Carolina, Columbia, South Carolina, USA
- Department of Health Promotion Education and Behavior, University of South Carolina, Columbia, South Carolina, USA
| |
|
67
|
Levenson M, He W, Chen L, Dharmarajan S, Izem R, Meng Z, Pang H, Rockhold F. Statistical consideration for fit-for-use real-world data to support regulatory decision making in drug development. Stat Biopharm Res 2022. [DOI: 10.1080/19466315.2022.2120533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
| | - Weili He
- Global Medical Affairs Statistics, Data and Statistical Sciences, AbbVie, North Chicago, IL
| | - Li Chen
- Global Medical Affairs Statistics, Data and Statistical Sciences, AbbVie, North Chicago, IL
| | | | - Rima Izem
- Novartis Institutes for BioMedical Research Basel, Basel, Basel-Stadt, CH
| | | | | | - Frank Rockhold
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC
- Duke Clinical Research Institute, Duke University, Durham, NC
| |
|
68
|
Albert K, Delano M. Sex trouble: Sex/gender slippage, sex confusion, and sex obsession in machine learning using electronic health records. PATTERNS (NEW YORK, N.Y.) 2022; 3:100534. [PMID: 36033589 PMCID: PMC9403398 DOI: 10.1016/j.patter.2022.100534] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
False assumptions that sex and gender are binary, static, and concordant are deeply embedded in the medical system. As machine learning researchers use medical data to build tools to solve novel problems, understanding how existing systems represent sex/gender incorrectly is necessary to avoid perpetuating harm. In this perspective, we identify and discuss three factors to consider when working with sex/gender in research: "sex/gender slippage," the frequent substitution of sex and sex-related terms for gender and vice versa; "sex confusion," the fact that any given sex variable holds many different potential meanings; and "sex obsession," the idea that the relevant variable for most inquiries related to sex/gender is sex assigned at birth. We then explore how these phenomena show up in medical machine learning research using electronic health records, with a specific focus on HIV risk prediction. Finally, we offer recommendations about how machine learning researchers can engage more carefully with questions of sex/gender.
Affiliation(s)
- Kendra Albert
- Cyberlaw Clinic, Harvard Law School, Cambridge, MA 02138, USA
| | - Maggie Delano
- Engineering Department, Swarthmore College, Swarthmore, PA 19146, USA
| |
|
69
|
Liu J, Capurro D, Nguyen A, Verspoor K. "Note Bloat" impacts deep learning-based NLP models for clinical prediction tasks. J Biomed Inform 2022; 133:104149. [PMID: 35878821 DOI: 10.1016/j.jbi.2022.104149] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/22/2022] [Revised: 05/28/2022] [Accepted: 07/19/2022] [Indexed: 10/17/2022]
Abstract
One unintended consequence of Electronic Health Record (EHR) implementation is the overuse of content-importing technology, such as copy-and-paste, that creates "bloated" notes containing large amounts of textual redundancy. Despite the rising interest in applying machine learning models to learn from real patient data, it is unclear how the phenomenon of note bloat might affect the Natural Language Processing (NLP) models derived from these notes. Therefore, in this work we examine the impact of redundancy on deep learning-based NLP models, considering four clinical prediction tasks using a publicly available EHR database. We applied two deduplication methods to the hospital notes, identifying large quantities of redundancy, and found that removing the redundancy usually has little negative impact on downstream performance, and can in certain circumstances help models achieve significantly better results. We also showed it is possible to attack model predictions by simply adding note duplicates, changing correct predictions made by trained models into wrong predictions. In conclusion, we demonstrated that EHR text redundancy substantively affects NLP models for clinical prediction tasks, showing that awareness of clinical contexts and robust modeling methods are important for creating effective and reliable NLP systems in healthcare contexts.
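The abstract does not specify the two deduplication methods, so the following Python sketch should be read only as one plausible illustration of the general idea: removing sentences that were copied forward from earlier notes in the same patient's record. The function name, the naive regex sentence splitter, and the exact-match criterion are all assumptions and are not the authors' methods.

```python
import re
from typing import List


def deduplicate_notes(notes: List[str]) -> List[str]:
    """Drop sentences already seen in earlier notes of the same patient.

    Notes are assumed to be in chronological order. Sentences are
    compared after lowercasing and whitespace normalization, so only
    verbatim (copy-and-paste style) repeats are removed.
    """
    seen = set()
    deduplicated = []
    for note in notes:
        # Naive sentence split on terminal punctuation followed by whitespace.
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]
        kept = []
        for sentence in sentences:
            key = " ".join(sentence.lower().split())
            if key not in seen:
                seen.add(key)
                kept.append(sentence)
        deduplicated.append(" ".join(kept))
    return deduplicated


# Illustrative notes only.
notes = [
    "Patient admitted with chest pain. Started aspirin.",
    "Patient admitted with chest pain. Started aspirin. Pain resolved overnight.",
]
clean_notes = deduplicate_notes(notes)
```

Exact matching misses near-duplicates in which a copied sentence was lightly edited. A fuzzier criterion (for example, token-overlap thresholds) would catch more redundancy at the risk of discarding clinically meaningful changes.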
Affiliation(s)
- Jinghui Liu
- School of Computing and Information Systems, The University of Melbourne, Victoria, Australia; Australian e-Health Research Centre, CSIRO, Brisbane, Australia
| | - Daniel Capurro
- School of Computing and Information Systems, The University of Melbourne, Victoria, Australia; Centre for Digital Transformation of Health, Melbourne Medical School, The University of Melbourne, Victoria, Australia
| | - Anthony Nguyen
- Australian e-Health Research Centre, CSIRO, Brisbane, Australia
| | - Karin Verspoor
- School of Computing and Information Systems, The University of Melbourne, Victoria, Australia; Centre for Digital Transformation of Health, Melbourne Medical School, The University of Melbourne, Victoria, Australia; School of Computing Technologies, RMIT University, Victoria, Australia
| |
|
70
|
Brandt PS, Pacheco JA, Adekkanattu P, Sholle ET, Abedian S, Stone DJ, Knaack DM, Xu J, Xu Z, Peng Y, Benda NC, Wang F, Luo Y, Jiang G, Pathak J, Rasmussen LV. Design and validation of a FHIR-based EHR-driven phenotyping toolbox. J Am Med Inform Assoc 2022; 29:1449-1460. [PMID: 35799370 PMCID: PMC9382394 DOI: 10.1093/jamia/ocac063] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/27/2021] [Revised: 04/04/2022] [Accepted: 06/17/2022] [Indexed: 12/14/2022] Open
Abstract
OBJECTIVES To develop and validate a standards-based phenotyping tool to author electronic health record (EHR)-based phenotype definitions and demonstrate execution of the definitions against heterogeneous clinical research data platforms. MATERIALS AND METHODS We developed an open-source, standards-compliant phenotyping tool known as the PhEMA Workbench that enables a phenotype representation using the Fast Healthcare Interoperability Resources (FHIR) and Clinical Quality Language (CQL) standards. We then demonstrated how this tool can be used to conduct EHR-based phenotyping, including phenotype authoring, execution, and validation. We validated the performance of the tool by executing a thrombotic event phenotype definition at 3 sites, Mayo Clinic (MC), Northwestern Medicine (NM), and Weill Cornell Medicine (WCM), and used manual review to determine precision and recall. RESULTS An initial version of the PhEMA Workbench has been released, which supports phenotype authoring, execution, and publishing to a shared phenotype definition repository. The resulting thrombotic event phenotype definition consisted of 11 CQL statements and 24 value sets containing a total of 834 codes. Technical validation showed satisfactory performance (both NM and MC had 100% precision and recall and WCM had a precision of 95% and a recall of 84%). CONCLUSIONS We demonstrate that the PhEMA Workbench can facilitate EHR-driven phenotype definition, execution, and phenotype sharing in heterogeneous clinical research data environments. A phenotype definition that integrates with existing standards-compliant systems and the use of a formal representation facilitate automation and can decrease the potential for human error.
Affiliation(s)
- Pascal S Brandt
- Corresponding Author: Pascal S. Brandt, Department of Biomedical Informatics & Medical Education, University of Washington, Box 358047, Seattle, WA 98195, USA
| | - Jennifer A Pacheco
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Prakash Adekkanattu
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Evan T Sholle
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Sajjad Abedian
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Daniel J Stone
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
| | - David M Knaack
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
| | - Jie Xu
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Zhenxing Xu
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Yifan Peng
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Natalie C Benda
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Fei Wang
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Yuan Luo
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Guoqian Jiang
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
| | - Jyotishman Pathak
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Luke V Rasmussen
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| |
|
71
|
Fan B, Klatt J, Moor MM, Daniels LA. Prediction of recovery from multiple organ dysfunction syndrome in pediatric sepsis patients. Bioinformatics 2022; 38:i101-i108. [PMID: 35758775 PMCID: PMC9236580 DOI: 10.1093/bioinformatics/btac229] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022] Open
Abstract
MOTIVATION Sepsis is a leading cause of death and disability in children globally, accounting for ∼3 million childhood deaths per year. In pediatric sepsis patients, the multiple organ dysfunction syndrome (MODS) is considered a significant risk factor for adverse clinical outcomes characterized by high mortality and morbidity in the pediatric intensive care unit. The rapidly growing availability of electronic health records (EHRs) has allowed researchers to develop data-driven approaches such as machine learning in healthcare, with considerable success. However, effective machine learning models that can accurately predict, at an early stage, the recovery of pediatric sepsis patients from MODS to a mild state, and thus assist clinicians in the decision-making process, are still lacking. RESULTS This study develops a machine learning-based approach to predict the recovery from MODS to zero or single organ dysfunction 1 week in advance in the Swiss Pediatric Sepsis Study (SPSS) cohort of children with blood-culture confirmed bacteremia. Our model achieves internal validation performance on the SPSS cohort with an area under the receiver operating characteristic curve (AUROC) of 79.1% and area under the precision-recall curve (AUPRC) of 73.6%, and it was also externally validated on another pediatric sepsis patient cohort collected in the USA, yielding an AUROC of 76.4% and AUPRC of 72.4%. These results indicate that our model has the potential to be integrated into EHR systems and contribute to patient assessment and triage in pediatric sepsis patient care. AVAILABILITY AND IMPLEMENTATION Code available at https://github.com/BorgwardtLab/MODS-recovery. The data underlying this article are not publicly available, to protect the privacy of individuals that participated in the study. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bowen Fan
- Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Juliane Klatt
- Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Michael M Moor
- Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Latasha A Daniels
- Division of Critical Care, Ann and Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL, USA
| |
|
72
|
Binkheder S, Wu HY, Quinney SK, Zhang S, Zitu MM, Chiang CW, Wang L, Jones J, Li L. PhenoDEF: a corpus for annotating sentences with information of phenotype definitions in biomedical literature. J Biomed Semantics 2022; 13:17. [PMID: 35690873 PMCID: PMC9188713 DOI: 10.1186/s13326-022-00272-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/07/2019] [Accepted: 05/18/2022] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Adverse events induced by drug-drug interactions are a major concern in the United States. Current research is moving toward using electronic health record (EHR) data, including for adverse drug events discovery. One of the first steps in EHR-based studies is to define a phenotype for establishing a cohort of patients. However, phenotype definitions are not readily available for all phenotypes. One of the first steps of developing automated text mining tools is building a corpus. Therefore, this study aimed to develop annotation guidelines and a gold standard corpus to facilitate building future automated approaches for mining phenotype definitions contained in the literature. Furthermore, our aim is to improve the understanding of how these published phenotype definitions are presented in the literature and how we annotate them for future text mining tasks. RESULTS Two annotators manually annotated the corpus on a sentence-level for the presence of evidence for phenotype definitions. Three major categories (inclusion, intermediate, and exclusion) with a total of ten dimensions were proposed characterizing major contextual patterns and cues for presenting phenotype definitions in published literature. The developed annotation guidelines were used to annotate the corpus that contained 3971 sentences: 1923 out of 3971 (48.4%) for the inclusion category, 1851 out of 3971 (46.6%) for the intermediate category, and 2273 out of 3971 (57.2%) for exclusion category. The highest number of annotated sentences was 1449 out of 3971 (36.5%) for the "Biomedical & Procedure" dimension. The lowest number of annotated sentences was 49 out of 3971 (1.2%) for "The use of NLP". The overall percent inter-annotator agreement was 97.8%. Percent and Kappa statistics also showed high inter-annotator agreement across all dimensions. 
CONCLUSIONS The corpus and annotation guidelines can serve as a foundational informatics approach for annotating and mining phenotype definitions in literature, and can be used later for text mining applications.
Affiliation(s)
- Samar Binkheder
- Department of Biohealth Informatics, Indiana University School of Informatics and Computing, Indianapolis, IN, USA
- Medical Informatics Unit, Department of Medical Education, College of Medicine, King Saud University, Riyadh, Saudi Arabia
| | - Heng-Yi Wu
- Development Science Informatics, Genentech, South San Francisco, CA, USA
| | - Sara K Quinney
- Department of Obstetrics and Gynecology, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Shijun Zhang
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
| | - Md Muntasir Zitu
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
| | - Chien-Wei Chiang
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
| | - Lei Wang
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
| | - Josette Jones
- Department of Biohealth Informatics, Indiana University School of Informatics and Computing, Indianapolis, IN, USA
| | - Lang Li
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- 250 Lincoln Tower, 1800 Cannon Drive, Columbus, OH, 43210, USA
| |
|
73
|
Huang RJ, Kwon NSE, Tomizawa Y, Choi AY, Hernandez-Boussard T, Hwang JH. A Comparison of Logistic Regression Against Machine Learning Algorithms for Gastric Cancer Risk Prediction Within Real-World Clinical Data Streams. JCO Clin Cancer Inform 2022; 6:e2200039. [PMID: 35763703 DOI: 10.1200/cci.22.00039] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/15/2022] Open
Abstract
PURPOSE Noncardia gastric cancer (NCGC) is a leading cause of global cancer mortality, and is often diagnosed at advanced stages. Development of NCGC risk models within electronic health records (EHR) may allow for improved cancer prevention. There has been much recent interest in use of machine learning (ML) for cancer prediction, but few studies comparing ML with classical statistical models for NCGC risk prediction. METHODS We trained models using logistic regression (LR) and four commonly used ML algorithms to predict NCGC from age-/sex-matched controls in two EHR systems: Stanford University and the University of Washington (UW). The LR model contained well-established NCGC risk factors (intestinal metaplasia histology, prior Helicobacter pylori infection, race, ethnicity, nativity status, smoking history, anemia), whereas ML models agnostically selected variables from the EHR. Models were developed and internally validated in the Stanford data, and externally validated in the UW data. Hyperparameter tuning of models was achieved using cross-validation. Model performance was compared by accuracy, sensitivity, and specificity. RESULTS In internal validation, LR performed with comparable accuracy (0.732; 95% CI, 0.698 to 0.764), sensitivity (0.697; 95% CI, 0.647 to 0.744), and specificity (0.767; 95% CI, 0.720 to 0.809) to penalized lasso, support vector machine, K-nearest neighbor, and random forest models. In external validation, LR continued to demonstrate high accuracy, sensitivity, and specificity. Although K-nearest neighbor demonstrated higher accuracy and specificity, this was offset by significantly lower sensitivity. No ML model consistently outperformed LR across evaluation criteria. CONCLUSION Drawing data from two independent EHRs, we find LR on the basis of established risk factors demonstrated comparable performance to optimized ML algorithms. 
This study demonstrates that classical models built on robust, hand-chosen predictor variables may not be inferior to data-driven models for NCGC risk prediction.
Affiliation(s)
- Robert J Huang
- Division of Gastroenterology and Hepatology, Stanford University School of Medicine, Stanford, CA
| | - Nicole Sung-Eun Kwon
- Division of Gastroenterology and Hepatology, Stanford University School of Medicine, Stanford, CA
| | - Yutaka Tomizawa
- Division of Gastroenterology, University of Washington, Seattle, WA
| | - Alyssa Y Choi
- Division of Gastroenterology and Hepatology, University of California Irvine, Irvine, CA
| | | | - Joo Ha Hwang
- Division of Gastroenterology and Hepatology, Stanford University School of Medicine, Stanford, CA
| |
|
74
|
Khoury P, Srinivasan R, Kakumanu S, Ochoa S, Keswani A, Sparks R, Rider NL. A Framework for Augmented Intelligence in Allergy and Immunology Practice and Research—A Work Group Report of the AAAAI Health Informatics, Technology, and Education Committee. The Journal of Allergy and Clinical Immunology: In Practice 2022; 10:1178-1188. [PMID: 35300959 PMCID: PMC9205719 DOI: 10.1016/j.jaip.2022.01.047] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 07/16/2021] [Revised: 01/19/2022] [Accepted: 01/20/2022] [Indexed: 10/18/2022]
Abstract
Artificial and augmented intelligence (AI) and machine learning (ML) methods are expanding into the health care space. Big data are increasingly used in patient care applications, diagnostics, and treatment decisions in allergy and immunology. How these technologies will be evaluated, approved, and assessed for their impact is an important consideration for researchers and practitioners alike. With the potential of ML, deep learning, natural language processing, and other assistive methods to redefine health care usage, a scaffold for the impact of AI technology on research and patient care in allergy and immunology is needed. An American Academy of Asthma Allergy and Immunology Health Information Technology and Education subcommittee workgroup was convened to perform a scoping review of AI within health care as well as the specialty of allergy and immunology to address impacts on allergy and immunology practice and research as well as potential challenges including education, AI governance, ethical and equity considerations, and potential opportunities for the specialty. There are numerous potential clinical applications of AI in allergy and immunology that range from disease diagnosis to multidimensional data reduction in electronic health records or immunologic datasets. For appropriate application and interpretation of AI, specialists should be involved in the design, validation, and implementation of AI in allergy and immunology. Challenges include incorporation of data science and bioinformatics into training of future allergists-immunologists.
|
75
|
Chamberlain AM, Roger VL, Noseworthy PA, Chen LY, Weston SA, Jiang R, Alonso A. Identification of Incident Atrial Fibrillation From Electronic Medical Records. J Am Heart Assoc 2022; 11:e023237. [PMID: 35348008 PMCID: PMC9075468 DOI: 10.1161/jaha.121.023237] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 01/09/2023]
Abstract
Background Electronic medical records are increasingly used to identify disease cohorts; however, computable phenotypes using electronic medical record data are often unable to distinguish between prevalent and incident cases. Methods and Results We identified all Olmsted County, Minnesota residents aged ≥18 with a first-ever International Classification of Diseases, Ninth Revision (ICD-9) diagnostic code for atrial fibrillation or atrial flutter from 2000 to 2014 (N=6177), and a random sample with an International Classification of Diseases, Tenth Revision (ICD-10) code from 2016 to 2018 (N=200). Trained nurse abstractors reviewed all medical records to validate the events and ascertain the date of onset (incidence date). Various algorithms based on number and types of codes (inpatient/outpatient), medications, and procedures were evaluated. Positive predictive value (PPV) and sensitivity of the algorithms were calculated. The lowest PPV was observed for 1 code (64.4%), and the highest PPV was observed for 2 codes (any type) >7 days apart but within 1 year (71.6%). Requiring either 1 inpatient or 2 outpatient codes separated by >7 days but within 1 year had the best balance between PPV (69.9%) and sensitivity (95.5%). PPVs were slightly higher using ICD-10 codes. Requiring an anticoagulant or antiarrhythmic prescription or electrical cardioversion in addition to diagnostic code(s) modestly improved the PPVs at the expense of large reductions in sensitivity. Conclusions We developed simple, exportable, computable phenotypes for atrial fibrillation using structured electronic medical record data. However, use of diagnostic codes to identify incident atrial fibrillation is prone to some misclassification. Further study is warranted to determine whether more complex phenotypes, including unstructured data sources or using machine learning techniques, may improve the accuracy of identifying incident atrial fibrillation.
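The best-balanced rule reported in this abstract (1 inpatient code, or 2 outpatient codes more than 7 days apart but within 1 year) can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the `(date, setting)` input layout, and the reading of "within 1 year" as at most 365 days are assumptions made here.

```python
from datetime import date

def meets_af_rule(codes):
    """Return True if a patient's AF/flutter codes satisfy the rule.

    `codes` is a list of (code_date, setting) tuples, where setting is
    "inpatient" or "outpatient" (hypothetical layout, assumed here).
    """
    # A single inpatient code is sufficient on its own.
    if any(setting == "inpatient" for _, setting in codes):
        return True
    # Otherwise look for two outpatient codes >7 days apart but within
    # 1 year (assumed here to mean a gap of at most 365 days).
    outpatient = sorted(d for d, setting in codes if setting == "outpatient")
    for i, first in enumerate(outpatient):
        for second in outpatient[i + 1:]:
            gap_days = (second - first).days
            if 7 < gap_days <= 365:
                return True
    return False

def ppv_and_sensitivity(flagged, confirmed):
    """PPV and sensitivity of a rule, given sets of patient IDs.

    `flagged` holds patients selected by the rule; `confirmed` holds
    patients validated as true cases by chart review.
    """
    true_positives = len(flagged & confirmed)
    ppv = true_positives / len(flagged) if flagged else float("nan")
    sensitivity = true_positives / len(confirmed) if confirmed else float("nan")
    return ppv, sensitivity

if __name__ == "__main__":
    # Toy data, invented for illustration.
    visits = [(date(2010, 1, 1), "outpatient"), (date(2010, 1, 20), "outpatient")]
    print(meets_af_rule(visits))
    print(ppv_and_sensitivity({1, 2, 3, 4}, {1, 2, 3, 5}))
```

Tightening the rule (for example, also requiring a medication or procedure) would shrink `flagged`, which is how PPV can rise while sensitivity falls, the trade-off the abstract describes.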
Affiliation(s)
- Alanna M. Chamberlain
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN
| | - Véronique L. Roger
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN
- Epidemiology and Community Health Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD
| | | | - Lin Y. Chen
- Cardiovascular Division, Department of Medicine, University of Minnesota Medical School, Minneapolis, MN
| | - Susan A. Weston
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN
| | - Ruoxiang Jiang
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN
| | - Alvaro Alonso
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA
| |
|
76
|
Almowil Z, Zhou SM, Brophy S, Croxall J. Concept Libraries for Repeatable and Reusable Research: Qualitative Study Exploring the Needs of Users. JMIR Hum Factors 2022; 9:e31021. [PMID: 35289755 PMCID: PMC8965669 DOI: 10.2196/31021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/07/2021] [Revised: 11/17/2021] [Accepted: 12/05/2021] [Indexed: 12/05/2022] Open
Abstract
Background Big data research in the field of health sciences is hindered by a lack of agreement on how to identify and define different conditions and their medications. This means that researchers and health professionals often have different phenotype definitions for the same condition. This lack of agreement makes it difficult to compare different study findings and hinders the ability to conduct repeatable and reusable research. Objective This study aims to examine the requirements of various users, such as researchers, clinicians, machine learning experts, and managers, in the development of a data portal for phenotypes (a concept library). Methods This was a qualitative study using interviews and focus group discussion. One-to-one interviews were conducted with researchers, clinicians, machine learning experts, and senior research managers in health data science (N=6) to explore their specific needs in the development of a concept library. In addition, a focus group discussion with researchers (N=14) working with the Secured Anonymized Information Linkage databank, a national eHealth data linkage infrastructure, was held to perform a SWOT (strengths, weaknesses, opportunities, and threats) analysis for the phenotyping system and the proposed concept library. The interviews and focus group discussion were transcribed verbatim, and 2 thematic analyses were performed. Results Most of the participants thought that the prototype concept library would be a very helpful resource for conducting repeatable research, but they specified that many requirements are needed before its development. Although all the participants stated that they were aware of some existing concept libraries, most of them expressed negative perceptions about them. 
The participants mentioned several facilitators that would stimulate them to share their work and reuse the work of others, and they pointed out several barriers that could inhibit them from sharing their work and reusing the work of others. The participants suggested some developments that they would like to see to improve reproducible research output using routine data. Conclusions The study indicated that most interviewees valued a concept library for phenotypes. However, only half of the participants felt that they would contribute by providing definitions for the concept library, and they reported many barriers regarding sharing their work on a publicly accessible platform. Analysis of interviews and the focus group discussion revealed that different stakeholders have different requirements, facilitators, barriers, and concerns about a prototype concept library.
Affiliation(s)
- Zahra Almowil
- Data Science Building, Medical School, Swansea University, Swansea, Wales, United Kingdom
| | - Shang-Ming Zhou
- Centre For Health Technology, Faculty of Health, University of Plymouth, Plymouth, United Kingdom
| | - Sinead Brophy
- Data Science Building, Medical School, Swansea University, Swansea, Wales, United Kingdom
| | - Jodie Croxall
- Data Science Building, Medical School, Swansea University, Swansea, Wales, United Kingdom
| |
|
77
|
Machine Learning Approaches to Investigate Clostridioides difficile Infection and Outcomes: A Systematic Review. Int J Med Inform 2022; 160:104706. [DOI: 10.1016/j.ijmedinf.2022.104706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Revised: 12/21/2021] [Accepted: 01/22/2022] [Indexed: 11/20/2022]
|
78
|
Applying Machine Learning in Distributed Data Networks for Pharmacoepidemiologic and Pharmacovigilance Studies: Opportunities, Challenges, and Considerations. Drug Saf 2022; 45:493-510. [PMID: 35579813 PMCID: PMC9112258 DOI: 10.1007/s40264-022-01158-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 02/13/2022] [Indexed: 01/28/2023]
Abstract
Increasing availability of electronic health databases capturing real-world experiences with medical products has garnered much interest in their use for pharmacoepidemiologic and pharmacovigilance studies. The traditional practice of having numerous groups use single databases to accomplish similar tasks and address common questions about medical products can be made more efficient through well-coordinated multi-database studies, greatly facilitated through distributed data network (DDN) architectures. Access to larger amounts of electronic health data within DDNs has created a growing interest in using data-adaptive machine learning (ML) techniques that can automatically model complex associations in high-dimensional data with minimal human guidance. However, the siloed storage and diverse nature of the databases in DDNs create unique challenges for using ML. In this paper, we discuss opportunities, challenges, and considerations for applying ML in DDNs for pharmacoepidemiologic and pharmacovigilance studies. We first discuss major types of activities performed by DDNs and how ML may be used. Next, we discuss practical data-related factors influencing how DDNs work in practice. We then combine these discussions and jointly consider how opportunities for ML are affected by practical data-related factors for DDNs, leading to several challenges. We present different approaches for addressing these challenges and highlight efforts that real-world DDNs have taken or are currently taking to help mitigate them. Despite these challenges, the time is ripe for the emerging interest to use ML in DDNs, and the utility of these data-adaptive modeling techniques in pharmacoepidemiologic and pharmacovigilance studies will likely continue to increase in the coming years.
|
79
|
Lossio-Ventura JA, Sun R, Boussard S, Hernandez-Boussard T. Clinical concept recognition: Evaluation of existing systems on EHRs. Front Artif Intell 2022; 5:1051724. [PMID: 36714202 PMCID: PMC9880223 DOI: 10.3389/frai.2022.1051724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Accepted: 12/15/2022] [Indexed: 01/15/2023] Open
Abstract
Objective The adoption of electronic health records (EHRs) has produced enormous amounts of data, creating research opportunities in clinical data sciences. Several concept recognition systems have been developed to facilitate clinical information extraction from these data. While studies exist that compare the performance of many concept recognition systems, they are typically developed internally and may be biased due to different internal implementations, parameters used, and limited number of systems included in the evaluations. The goal of this research is to evaluate the performance of existing systems to retrieve relevant clinical concepts from EHRs. Methods We investigated six concept recognition systems, including CLAMP, cTAKES, MetaMap, NCBO Annotator, QuickUMLS, and ScispaCy. Clinical concepts extracted included procedures, disorders, medications, and anatomical location. The system performance was evaluated on two datasets: the 2010 i2b2 and the MIMIC-III. Additionally, we assessed the performance of these systems in five challenging situations, including negation, severity, abbreviation, ambiguity, and misspelling. Results For clinical concept extraction, CLAMP achieved the best performance on exact and inexact matching, with an F-score of 0.70 and 0.94, respectively, on i2b2; and 0.39 and 0.50, respectively, on MIMIC-III. Across the five challenging situations, ScispaCy excelled in extracting abbreviation information (F-score: 0.86) followed by NCBO Annotator (F-score: 0.79). CLAMP outperformed in extracting severity terms (F-score 0.73) followed by NCBO Annotator (F-score: 0.68). CLAMP outperformed other systems in extracting negated concepts (F-score 0.63). Conclusions Several concept recognition systems exist to extract clinical information from unstructured data. This study provides an external evaluation by end-users of six commonly used systems across different extraction tasks. 
Our findings suggest that CLAMP provides the most comprehensive set of annotations for clinical concept extraction tasks and associated challenges. Comparing standard extraction tasks across systems provides guidance to other clinical researchers when selecting a concept recognition system relevant to their clinical information extraction task.
Affiliation(s)
- Juan Antonio Lossio-Ventura
- Biomedical Informatics Research, Stanford University, Stanford, CA, United States
- National Institute of Mental Health, National Institutes of Health, Bethesda, MD, United States
| | - Ran Sun
- Biomedical Informatics Research, Stanford University, Stanford, CA, United States
| | | | - Tina Hernandez-Boussard
- Biomedical Informatics Research, Stanford University, Stanford, CA, United States
- Department of Biomedical Data Sciences, Stanford University, Stanford, CA, United States
- Department of Surgery, Stanford University, Stanford, CA, United States
| |
|
80
|
Chapman M, Mumtaz S, Rasmussen LV, Karwath A, Gkoutos GV, Gao C, Thayer D, Pacheco JA, Parkinson H, Richesson RL, Jefferson E, Denaxas S, Curcin V. Desiderata for the development of next-generation electronic health record phenotype libraries. Gigascience 2021; 10:giab059. [PMID: 34508578 PMCID: PMC8434766 DOI: 10.1093/gigascience/giab059] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/24/2021] [Revised: 07/15/2021] [Accepted: 08/18/2021] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND High-quality phenotype definitions are desirable to enable the extraction of patient cohorts from large electronic health record repositories and are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that is able to ensure the quality of hosted definitions by combining the functionality currently offered by disparate tooling. METHODS A group of researchers examined work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed as a part of their own phenomics communities. Existing phenotype frameworks were also examined. This work was translated and refined by all the authors into a set of best practices. RESULTS We present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modelling, logging, validation, and sharing and warehousing. CONCLUSIONS There are a number of choices to be made when constructing phenotype libraries. Our considerations distil the best practices in the field and include pointers towards their further development to support portable, reproducible, and clinically valid phenotype design. The provision of high-quality phenotype definitions enables electronic health record data to be more effectively used in medical domains.
Affiliation(s)
- Martin Chapman
- Department of Population Health Sciences, King's College London, London, SE1 1UL, UK
| | - Shahzad Mumtaz
- Health Informatics Centre (HIC), University of Dundee, Dundee, DD1 9SY, UK
| | - Luke V Rasmussen
- Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
| | - Andreas Karwath
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
| | - Georgios V Gkoutos
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
| | - Chuang Gao
- Health Informatics Centre (HIC), University of Dundee, Dundee, DD1 9SY, UK
| | - Dan Thayer
- SAIL Databank, Swansea University, Swansea, SA2 8PP, UK
| | - Jennifer A Pacheco
- Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
| | - Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, CB10 1SD, UK
| | - Rachel L Richesson
- Department of Learning Health Sciences, University of Michigan Medical School, MI 48109, USA
| | - Emily Jefferson
- Health Informatics Centre (HIC), University of Dundee, Dundee, DD1 9SY, UK
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, NW1 2DA, UK
| | - Vasa Curcin
- Department of Population Health Sciences, King's College London, London, SE1 1UL, UK
| |
|
81
|
De Freitas JK, Johnson KW, Golden E, Nadkarni GN, Dudley JT, Bottinger EP, Glicksberg BS, Miotto R. Phe2vec: Automated disease phenotyping based on unsupervised embeddings from electronic health records. Patterns (New York, N.Y.) 2021; 2:100337. [PMID: 34553174 PMCID: PMC8441576 DOI: 10.1016/j.patter.2021.100337] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/29/2021] [Revised: 06/30/2021] [Accepted: 08/05/2021] [Indexed: 11/23/2022]
Abstract
Robust phenotyping of patients from electronic health records (EHRs) at scale is a challenge in clinical informatics. Here, we introduce Phe2vec, an automated framework for disease phenotyping from EHRs based on unsupervised learning, and assess its effectiveness against standard rule-based algorithms from the Phenotype KnowledgeBase (PheKB). Phe2vec is based on pre-computing embeddings of medical concepts and patients' clinical history. Disease phenotypes are then derived from a seed concept and its neighbors in the embedding space. Patients are linked to a disease if their embedded representation is close to the disease phenotype. Comparing Phe2vec and PheKB cohorts head-to-head using chart review, Phe2vec performed on par or better in nine out of ten diseases. Unlike other approaches, it can scale to any condition and was validated against widely adopted expert-based standards. Phe2vec aims to optimize clinical informatics research by augmenting current frameworks to characterize patients by condition and derive reliable disease cohorts.
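The seed-and-neighbors idea described in this abstract can be illustrated with a toy sketch. It is not the Phe2vec implementation: cosine similarity, averaging the neighbor vectors into one phenotype vector, the fixed similarity threshold, and all names and vectors below are assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def phenotype_concepts(seed, concept_vecs, k):
    """Seed concept plus its k nearest neighbors in the embedding space."""
    others = [c for c in concept_vecs if c != seed]
    ranked = sorted(
        others,
        key=lambda c: cosine(concept_vecs[seed], concept_vecs[c]),
        reverse=True,
    )
    return [seed] + ranked[:k]

def phenotype_vector(concepts, concept_vecs):
    """Average the concept vectors (an assumed aggregation choice)."""
    dim = len(concept_vecs[concepts[0]])
    return [
        sum(concept_vecs[c][i] for c in concepts) / len(concepts)
        for i in range(dim)
    ]

def link_patients(patient_vecs, pheno_vec, threshold):
    """Patients whose embedding is close enough to the phenotype vector."""
    return sorted(
        p for p, v in patient_vecs.items() if cosine(v, pheno_vec) >= threshold
    )

if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings, invented for this example.
    concept_vecs = {
        "T2DM": [1.0, 0.0],
        "metformin": [0.9, 0.1],
        "HbA1c": [0.8, 0.2],
        "fracture": [0.0, 1.0],
    }
    patient_vecs = {"p1": [1.0, 0.1], "p2": [0.1, 1.0]}
    concepts = phenotype_concepts("T2DM", concept_vecs, k=2)
    pheno = phenotype_vector(concepts, concept_vecs)
    print(concepts)
    print(link_patients(patient_vecs, pheno, threshold=0.8))
```

In practice the embeddings would be learned from EHR data rather than written by hand, and the neighborhood size and closeness cutoff would need to be tuned and validated per disease.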
Affiliation(s)
- Jessica K. De Freitas
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| | - Kipp W. Johnson
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| | - Eddye Golden
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| | - Girish N. Nadkarni
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| | - Joel T. Dudley
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| | - Erwin P. Bottinger
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Digital Health Center at Hasso Plattner Institute, University of Potsdam, Professor-Dr.-Helmert-Str 2–3, 14482 Potsdam, Germany
| | - Benjamin S. Glicksberg
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| | - Riccardo Miotto
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| |
|
82
|
Estiri H, Strasser ZH, Murphy SN. High-throughput phenotyping with temporal sequences. J Am Med Inform Assoc 2021; 28:772-781. [PMID: 33313899 DOI: 10.1093/jamia/ocaa288] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 07/01/2020] [Accepted: 11/04/2020] [Indexed: 12/15/2022]
Abstract
OBJECTIVE High-throughput electronic phenotyping algorithms can accelerate translational research using data from electronic health record (EHR) systems. The temporal information buried in EHRs is often underutilized in developing computational phenotypic definitions. This study aims to develop a high-throughput phenotyping method, leveraging temporal sequential patterns from EHRs. MATERIALS AND METHODS We develop a representation mining algorithm to extract 5 classes of representations from EHR diagnosis and medication records: the aggregated vector of the records (aggregated vector representation), the standard sequential patterns (sequential pattern mining), the transitive sequential patterns (transitive sequential pattern mining), and 2 hybrid classes. Using EHR data on 10 phenotypes from the Mass General Brigham Biobank, we train and validate phenotyping algorithms. RESULTS Phenotyping with temporal sequences resulted in a superior classification performance across all 10 phenotypes compared with the standard representations in electronic phenotyping. The high-throughput algorithm's classification performance was superior or similar to the performance of previously published electronic phenotyping algorithms. We characterize and evaluate the top transitive sequences of diagnosis records paired with the records of risk factors, symptoms, complications, medications, or vaccinations. DISCUSSION The proposed high-throughput phenotyping approach enables seamless discovery of sequential record combinations that may be difficult to assume from raw EHR data. Transitive sequences offer more accurate characterization of the phenotype, compared with its individual components, and reflect the actual lived experiences of the patients with that particular disease. CONCLUSION Sequential data representations provide a precise mechanism for incorporating raw EHR records into downstream machine learning. 
Our approach starts with user interpretability and works backward to the technology.
Affiliation(s)
- Hossein Estiri
- Harvard Medical School, Boston, Massachusetts, USA; Massachusetts General Hospital, Boston, Massachusetts, USA; Mass General Brigham, Boston, Massachusetts, USA
| | - Zachary H Strasser
- Harvard Medical School, Boston, Massachusetts, USA; Massachusetts General Hospital, Boston, Massachusetts, USA; Mass General Brigham, Boston, Massachusetts, USA
| | - Shawn N Murphy
- Harvard Medical School, Boston, Massachusetts, USA; Massachusetts General Hospital, Boston, Massachusetts, USA; Mass General Brigham, Boston, Massachusetts, USA
| |
|
83
|
Liu L, Bustamante R, Earles A, Demb J, Messer K, Gupta S. A strategy for validation of variables derived from large-scale electronic health record data. J Biomed Inform 2021; 121:103879. [PMID: 34329789 PMCID: PMC9615095 DOI: 10.1016/j.jbi.2021.103879] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/14/2021] [Revised: 07/21/2021] [Accepted: 07/24/2021] [Indexed: 11/16/2022]
Abstract
Purpose: Standardized approaches for rigorous validation of phenotyping from large-scale electronic health record (EHR) data have not been widely reported. We proposed a methodologically rigorous and efficient approach to guide such validation, including strategies for sampling cases and controls, determining sample sizes, estimating algorithm performance, and terminating the validation process, hereafter referred to as the San Diego Approach to Variable Validation (SDAVV). Methods: We propose sample size formulae which should be used prior to chart review, based on pre-specified critical lower bounds for positive predictive value (PPV) and negative predictive value (NPV). We also propose a stepwise strategy for iterative algorithm development/validation cycles, updating sample sizes for data abstraction until both PPV and NPV achieve target performance. Results: We applied the SDAVV to a Department of Veterans Affairs study in which we created two phenotyping algorithms, one for distinguishing normal colonoscopy cases from abnormal colonoscopy controls and one for identifying aspirin exposure. Estimated PPV and NPV both reached 0.970 with a 95% confidence lower bound of 0.915, estimated sensitivity was 0.963 and specificity was 0.975 for identifying normal colonoscopy cases. The phenotyping algorithm for identifying aspirin exposure reached a PPV of 0.990 (a 95% lower bound of 0.950), an NPV of 0.980 (a 95% lower bound of 0.930), and sensitivity and specificity were 0.960 and 1.000. Conclusions: A structured approach for prospectively developing and validating phenotyping algorithms from large-scale EHR data can be successfully implemented, and should be considered to improve the quality of “big data” research.
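The abstract reports PPV and NPV estimates together with 95% confidence lower bounds and compares them against pre-specified critical lower bounds. As a rough illustration of that kind of check, the sketch below computes a one-sided Wilson score lower bound for a proportion. This is an assumption: the abstract does not state which interval or sample size formulae the SDAVV uses, and the counts (97 confirmed out of 100 reviewed charts) and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def wilson_lower_bound(successes, n, confidence=0.95):
    """One-sided Wilson score lower confidence bound for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    z = NormalDist().inv_cdf(confidence)  # one-sided critical value
    p = successes / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)

def meets_target(successes, n, critical_lower_bound, confidence=0.95):
    """True when the lower confidence bound reaches the pre-specified target."""
    return wilson_lower_bound(successes, n, confidence) >= critical_lower_bound

# Hypothetical chart review: 97 of 100 algorithm-positive charts confirmed.
ppv_lower = wilson_lower_bound(97, 100)
print(round(ppv_lower, 3))
print(meets_target(97, 100, critical_lower_bound=0.90))
```

With a fixed observed proportion, reviewing more charts tightens the bound, which is why an iterative scheme can update the sample size until both PPV and NPV clear their targets.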
Affiliation(s)
- Lin Liu
- VA San Diego Healthcare System, 3500 La Jolla Village Dr, San Diego, CA 92161, USA; University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA.
| | - Ranier Bustamante
- University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
| | - Ashley Earles
- Veterans Medical Research Foundation, 3350 La Jolla Village Dr, San Diego, CA 92161, USA
| | - Joshua Demb
- University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
| | - Karen Messer
- University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
| | - Samir Gupta
- VA San Diego Healthcare System, 3500 La Jolla Village Dr, San Diego, CA 92161, USA; University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA.
| |
|
84
|
Abstract
Machine learning can be used to make sense of healthcare data. Probabilistic machine learning models help provide a complete picture of observed data in healthcare. In this review, we examine how probabilistic machine learning can advance healthcare. We consider challenges in the predictive model building pipeline where probabilistic models can be beneficial, including calibration and missing data. Beyond predictive models, we also investigate the utility of probabilistic machine learning models in phenotyping, in generative models for clinical use cases, and in reinforcement learning.
Affiliation(s)
- Irene Y Chen
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | | | - Marzyeh Ghassemi
- Vector Institute, Toronto, Ontario M5G 1M1, Canada; Institute for Medical and Evaluative Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Rajesh Ranganath
- Department of Computer Science, Courant Institute, New York University, New York, NY 10012, USA; Center for Data Science, New York University, New York, NY 10012, USA; Department of Population Health, New York University Grossman School of Medicine, New York, NY 10016, USA
| |
|
85
|
Gibson TB, Nguyen MD, Burrell T, Yoon F, Wong J, Dharmarajan S, Ouellet-Hellstrom R, Hua W, Ma Y, Baro E, Bloemers S, Pack C, Kennedy A, Toh S, Ball R. Electronic phenotyping of health outcomes of interest using a linked claims-electronic health record database: Findings from a machine learning pilot project. J Am Med Inform Assoc 2021; 28:1507-1517. [PMID: 33712852 DOI: 10.1093/jamia/ocab036] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/20/2020] [Accepted: 02/19/2021] [Indexed: 01/04/2023]
Abstract
OBJECTIVE Claims-based algorithms are used in the Food and Drug Administration Sentinel Active Risk Identification and Analysis System to identify occurrences of health outcomes of interest (HOIs) for medical product safety assessment. This project aimed to apply machine learning classification techniques to demonstrate the feasibility of developing a claims-based algorithm to predict an HOI in structured electronic health record (EHR) data. MATERIALS AND METHODS We used the 2015-2019 IBM MarketScan Explorys Claims-EMR Data Set, linking administrative claims and EHR data at the patient level. We focused on a single HOI, rhabdomyolysis, defined by EHR laboratory test results. Using claims-based predictors, we applied machine learning techniques to predict the HOI: logistic regression, LASSO (least absolute shrinkage and selection operator), random forests, support vector machines, artificial neural nets, and an ensemble method (Super Learner). RESULTS The study cohort included 32 956 patients and 39 499 encounters. Model performance (positive predictive value [PPV], sensitivity, specificity, area under the receiver-operating characteristic curve) varied considerably across techniques. The area under the receiver-operating characteristic curve exceeded 0.80 in most model variations. DISCUSSION For the main Food and Drug Administration use case of assessing risk of rhabdomyolysis after drug use, a model with a high PPV is typically preferred. The Super Learner ensemble model without adjustment for class imbalance achieved a PPV of 75.6%, substantially better than a previously used human expert-developed model (PPV = 44.0%). CONCLUSIONS It is feasible to use machine learning methods to predict an EHR-derived HOI with claims-based predictors. Modeling strategies can be adapted for intended uses, including surveillance, identification of cases for chart review, and outcomes research.
Affiliation(s)
- Teresa B Gibson
- Government Health and Human Services, IBM Watson Health, Bethesda, Maryland, USA
| | | | - Timothy Burrell
- Government Health and Human Services, IBM Watson Health, Bethesda, Maryland, USA
| | - Frank Yoon
- Government Health and Human Services, IBM Watson Health, Bethesda, Maryland, USA
| | - Jenna Wong
- Harvard Medical School and Harvard Pilgrim Health Care Institute, Harvard Medical School, Boston, Massachusetts, USA
| | - Sai Dharmarajan
- Office of Biostatistics, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
| | - Rita Ouellet-Hellstrom
- Division of Epidemiology II, Office of Pharmacovigilance and Epidemiology, Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
| | - Wei Hua
- Food and Drug Administration, Silver Spring, Maryland, USA
| | - Yong Ma
- Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
| | - Elande Baro
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
| | - Sarah Bloemers
- Government Health and Human Services, IBM Watson Health, Bethesda, Maryland, USA
| | - Cory Pack
- Government Health and Human Services, IBM Watson Health, Bethesda, Maryland, USA
| | - Adee Kennedy
- Harvard Medical School and Harvard Pilgrim Health Care Institute, Harvard Medical School, Boston, Massachusetts, USA
| | - Sengwee Toh
- Harvard Medical School and Harvard Pilgrim Health Care Institute, Harvard Medical School, Boston, Massachusetts, USA
| | - Robert Ball
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
| |
|
86
|
Callahan A, Polony V, Posada JD, Banda JM, Gombar S, Shah NH. ACE: the Advanced Cohort Engine for searching longitudinal patient records. J Am Med Inform Assoc 2021; 28:1468-1479. [PMID: 33712854 PMCID: PMC8279796 DOI: 10.1093/jamia/ocab027] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/08/2020] [Accepted: 02/23/2021] [Indexed: 01/02/2023]
Abstract
OBJECTIVE To propose a paradigm for a scalable time-aware clinical data search, and to describe the design, implementation and use of a search engine realizing this paradigm. MATERIALS AND METHODS The Advanced Cohort Engine (ACE) uses a temporal query language and in-memory datastore of patient objects to provide a fast, scalable, and expressive time-aware search. ACE accepts data in the Observational Medicine Outcomes Partnership Common Data Model, and is configurable to balance performance with compute cost. ACE's temporal query language supports automatic query expansion using clinical knowledge graphs. The ACE API can be used with R, Python, Java, HTTP, and a Web UI. RESULTS ACE offers an expressive query language for complex temporal search across many clinical data types with multiple output options. ACE enables electronic phenotyping and cohort-building with subsecond response times in searching the data of millions of patients for a variety of use cases. DISCUSSION ACE enables fast, time-aware search using a patient object-centric datastore, thereby overcoming many technical and design shortcomings of relational algebra-based querying. Integrating electronic phenotype development with cohort-building enables a variety of high-value uses for a learning health system. Tradeoffs include the need to learn a new query language and the technical setup burden. CONCLUSION ACE is a tool that combines a unique query language for time-aware search of longitudinal patient records with a patient object datastore for rapid electronic phenotyping, cohort extraction, and exploratory data analyses.
Affiliation(s)
- Alison Callahan
- Center for Biomedical Informatics Research, School of Medicine, Stanford University, Stanford, California, USA
| | - Vladimir Polony
- Center for Biomedical Informatics Research, School of Medicine, Stanford University, Stanford, California, USA
| | - José D Posada
- Center for Biomedical Informatics Research, School of Medicine, Stanford University, Stanford, California, USA
| | - Juan M Banda
- Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
| | - Saurabh Gombar
- Department of Pathology, School of Medicine, Stanford University, Stanford, California, USA
| | - Nigam H Shah
- Center for Biomedical Informatics Research, School of Medicine, Stanford University, Stanford, California, USA
| |
|
87
|
Yuan Q, Cai T, Hong C, Du M, Johnson BE, Lanuti M, Cai T, Christiani DC. Performance of a Machine Learning Algorithm Using Electronic Health Record Data to Identify and Estimate Survival in a Longitudinal Cohort of Patients With Lung Cancer. JAMA Netw Open 2021; 4:e2114723. [PMID: 34232304 PMCID: PMC8264641 DOI: 10.1001/jamanetworkopen.2021.14723] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Indexed: 01/21/2023]
Abstract
IMPORTANCE Electronic health records (EHRs) provide a low-cost means of accessing detailed longitudinal clinical data for large populations. A lung cancer cohort assembled from EHR data would be a powerful platform for clinical outcome studies. OBJECTIVE To investigate whether a clinical cohort assembled from EHRs could be used in a lung cancer prognosis study. DESIGN, SETTING, AND PARTICIPANTS In this cohort study, patients with lung cancer were identified among 76 643 patients with at least 1 lung cancer diagnostic code deposited in an EHR in Mass General Brigham health care system from July 1988 to October 2018. Patients were identified via a semisupervised machine learning algorithm, for which clinical information was extracted from structured and unstructured data via natural language processing tools. Data completeness and accuracy were assessed by comparing with the Boston Lung Cancer Study and against criterion standard EHR review results. A prognostic model for non-small cell lung cancer (NSCLC) overall survival was further developed for clinical application. Data were analyzed from March 2019 through July 2020. EXPOSURES Clinical data deposited in EHRs for cohort construction and variables of interest for the prognostic model were collected. MAIN OUTCOMES AND MEASURES The primary outcomes were the performance of the lung cancer classification model and the quality of the extracted variables; the secondary outcome was the performance of the prognostic model. RESULTS Among 76 643 patients with at least 1 lung cancer diagnostic code, 42 069 patients were identified as having lung cancer, with a positive predictive value of 94.4%. The study cohort consisted of 35 375 patients (16 613 men [47.0%] and 18 756 women [53.0%]; 30 140 White individuals [85.2%], 1040 Black individuals [2.9%], and 857 Asian individuals [2.4%]) after excluding patients with lung cancer history and less than 14 days of follow-up after initial diagnosis. 
The median (interquartile range) age at diagnosis was 66.7 (58.4-74.1) years. The area under the receiver operating characteristic curves of the prognostic model for overall survival with NSCLC were 0.828 (95% CI, 0.815-0.842) for 1-year prediction, 0.825 (95% CI, 0.812-0.836) for 2-year prediction, 0.814 (95% CI, 0.800-0.826) for 3-year prediction, 0.814 (95% CI, 0.799-0.828) for 4-year prediction, and 0.812 (95% CI, 0.798-0.825) for 5-year prediction. CONCLUSIONS AND RELEVANCE These findings suggest the feasibility of assembling a large-scale EHR-based lung cancer cohort with detailed longitudinal clinical measurements and that EHR data may be applied in cancer progression with a set of generalizable approaches.
Affiliation(s)
- Qianyu Yuan
- Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - Tianrun Cai
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital, Boston, Massachusetts
| | - Chuan Hong
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
| | - Mulong Du
- Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Bruce E. Johnson
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts
- Center for Cancer Genomics, Dana-Farber Cancer Institute, Boston, Massachusetts
| | - Michael Lanuti
- Center for Thoracic Cancers, Division of Thoracic Surgery, Massachusetts General Hospital Cancer Center, Boston, Massachusetts
| | - Tianxi Cai
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
| | - David C. Christiani
- Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Department of Medicine, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts
| |
|
88
|
Jorm LR. Commentary: Towards machine learning-enabled epidemiology. Int J Epidemiol 2021; 49:1770-1773. [PMID: 33485274 DOI: 10.1093/ije/dyaa242] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Affiliation(s)
- Louisa R Jorm
- Centre for Big Data Research in Health, Faculty of Medicine, University of New South Wales, Sydney, Australia
| |
|
89
|
Shung D, Tsay C, Laine L, Chang D, Li F, Thomas P, Partridge C, Simonov M, Hsiao A, Tay JK, Taylor A. Early identification of patients with acute gastrointestinal bleeding using natural language processing and decision rules. J Gastroenterol Hepatol 2021; 36:1590-1597. [PMID: 33105045 PMCID: PMC11874507 DOI: 10.1111/jgh.15313] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/18/2020] [Revised: 08/21/2020] [Accepted: 10/13/2020] [Indexed: 12/14/2022]
Abstract
BACKGROUND AND AIM Guidelines recommend risk stratification scores in patients presenting with gastrointestinal bleeding (GIB), but such scores are uncommonly employed in practice. Automation and deployment of risk stratification scores in real time within electronic health records (EHRs) would overcome a major impediment. This requires an automated mechanism to accurately identify ("phenotype") patients with GIB at the time of presentation. The goal is to identify patients with acute GIB by developing and evaluating EHR-based phenotyping algorithms for emergency department (ED) patients. METHODS We specified criteria using structured data elements to create rules for identifying patients and also developed multiple natural language processing (NLP)-based approaches for automated phenotyping of patients, tested them with tenfold cross-validation for 10 iterations (n = 7144) and external validation (n = 2988) and compared them with a standard method to identify patient conditions, the Systematized Nomenclature of Medicine. The gold standard for GIB diagnosis was the independent dual manual review of medical records. The primary outcome was the positive predictive value. RESULTS A decision rule using GIB-specific terms from ED triage and ED review-of-systems assessment performed better than the Systematized Nomenclature of Medicine on internal validation and external validation (positive predictive value = 85% [confidence interval: 83%-87%] vs 69% [confidence interval: 66%-72%]; P < 0.001). The syntax-based NLP algorithm and Bidirectional Encoder Representation from Transformers neural network-based NLP algorithm had similar performance to the structured-data fields decision rule. CONCLUSIONS An automated decision rule employing GIB-specific triage and review-of-systems terms can be used to trigger EHR-based deployment of risk stratification models to guide clinical decision making in real time for patients with acute GIB presenting to the ED.
Affiliation(s)
- Dennis Shung
- Yale School of Medicine, New Haven
- Section of Digestive Diseases, Department of Medicine, New Haven
| | | | - Loren Laine
- Section of Digestive Diseases, Department of Medicine, New Haven
- Department of Medicine, VA Connecticut Healthcare System, West Haven, Connecticut
| | - David Chang
- Computational Biology and Bioinformatics, Yale University, New Haven
| | - Fan Li
- Department of Biostatistics, Yale School of Public Health, New Haven
| | - Prem Thomas
- Yale School of Medicine, New Haven
- Clinical Informatics, Yale-New Haven Health System, New Haven
| | | | | | - Allen Hsiao
- Yale School of Medicine, New Haven
- Clinical Informatics, Yale-New Haven Health System, New Haven
| | - J Kenneth Tay
- Department of Statistics, Stanford University, Palo Alto, California, USA
| | - Andrew Taylor
- Department of Emergency Medicine, Yale School of Medicine, New Haven
| |
|
90
|
Peer K, Adams WG, Legler A, Sandel M, Levy JI, Boynton-Jarrett R, Kim C, Leibler JH, Fabian MP. Developing and evaluating a pediatric asthma severity computable phenotype derived from electronic health records. J Allergy Clin Immunol 2021; 147:2162-2170. [PMID: 33338540 PMCID: PMC8328264 DOI: 10.1016/j.jaci.2020.11.045] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/29/2020] [Revised: 11/23/2020] [Accepted: 11/26/2020] [Indexed: 10/22/2022]
Abstract
BACKGROUND Extensive data available in electronic health records (EHRs) have the potential to improve asthma care and understanding of factors influencing asthma outcomes. However, this work can be accomplished only when the EHR data allow for accurate measures of severity, which at present are complex and inconsistent. OBJECTIVE Our aims were to create and evaluate a standardized pediatric asthma severity phenotype based in clinical asthma guidelines for use in EHR-based health initiatives and studies and also to examine the presence and absence of these data in relation to patient characteristics. METHODS We developed an asthma severity computable phenotype and compared the concordance of different severity components contributing to the phenotype to trends in the literature. We used multivariable logistic regression to assess the presence of EHR data relevant to asthma severity. RESULTS The asthma severity computable phenotype performs as expected in comparison with national statistics and the literature. Severity classification for a child is maximized when based on the long-term medication regimen component and minimized when based only on the symptom data component. Use of the severity phenotype results in better, clinically grounded classification. Children for whom severity could be ascertained from these EHR data were more likely to be seen for asthma in the outpatient setting and less likely to be older or Hispanic. Black children were less likely to have lung function testing data present. CONCLUSION We developed a pragmatic computable phenotype for pediatric asthma severity that is transportable to other EHRs.
Affiliation(s)
- Komal Peer
- Department of Environmental Health, Boston University School of Public Health, Boston, Mass.
| | - William G Adams
- Boston Medical Center, Boston, Mass; Department of Pediatrics, Boston University School of Medicine, Boston, Mass
| | | | - Megan Sandel
- Boston Medical Center, Boston, Mass; Department of Pediatrics, Boston University School of Medicine, Boston, Mass
| | - Jonathan I Levy
- Department of Environmental Health, Boston University School of Public Health, Boston, Mass
| | - Renée Boynton-Jarrett
- Boston Medical Center, Boston, Mass; Department of Pediatrics, Boston University School of Medicine, Boston, Mass
| | - Chanmin Kim
- Department of Statistics, SungKyunKwan University, Seoul, Korea
| | - Jessica H Leibler
- Department of Environmental Health, Boston University School of Public Health, Boston, Mass
| | - M Patricia Fabian
- Department of Environmental Health, Boston University School of Public Health, Boston, Mass
| |
|
91
|
Lee J, Liu C, Kim JH, Butler A, Shang N, Pang C, Natarajan K, Ryan P, Ta C, Weng C. Comparative effectiveness of medical concept embedding for feature engineering in phenotyping. JAMIA Open 2021; 4:ooab028. [PMID: 34142015 PMCID: PMC8206403 DOI: 10.1093/jamiaopen/ooab028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/17/2020] [Revised: 02/23/2021] [Accepted: 05/03/2021] [Indexed: 01/20/2023]
Abstract
OBJECTIVE Feature engineering is a major bottleneck in phenotyping. Properly learned medical concept embeddings (MCEs) capture the semantics of medical concepts, thus are useful for retrieving relevant medical features in phenotyping tasks. We compared the effectiveness of MCEs learned from knowledge graphs and electronic healthcare records (EHR) data in retrieving relevant medical features for phenotyping tasks. MATERIALS AND METHODS We implemented 5 embedding methods including node2vec, singular value decomposition (SVD), LINE, skip-gram, and GloVe with 2 data sources: (1) knowledge graphs obtained from the observational medical outcomes partnership (OMOP) common data model; and (2) patient-level data obtained from the OMOP compatible electronic health records (EHR) from Columbia University Irving Medical Center (CUIMC). We used phenotypes with their relevant concepts developed and validated by the electronic medical records and genomics (eMERGE) network to evaluate the performance of learned MCEs in retrieving phenotype-relevant concepts. Hits@k% in retrieving phenotype-relevant concepts based on a single and multiple seed concept(s) was used to evaluate MCEs. RESULTS Among all MCEs, MCEs learned by using node2vec with knowledge graphs showed the best performance. Of MCEs based on knowledge graphs and EHR data, MCEs learned by using node2vec with knowledge graphs and MCEs learned by using GloVe with EHR data outperforms other MCEs, respectively. CONCLUSION MCE enables scalable feature engineering tasks, thereby facilitating phenotyping. Based on current phenotyping practices, MCEs learned by using knowledge graphs constructed by hierarchical relationships among medical concepts outperformed MCEs learned by using EHR data.
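The evaluation described above ranks candidate concepts by their closeness to one or more seed concepts and then checks how many phenotype-relevant concepts are retrieved near the top. The sketch below shows one plausible reading of a hits-at-k style metric: the fraction of relevant concepts that appear among the top k retrieved. This definition is an assumption, since the abstract names Hits@k% without spelling out its formula, and the concept labels are hypothetical placeholders.

```python
def hits_at_k(ranked, relevant, k):
    """Fraction of relevant items that appear in the first k ranked items."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    top_k = set(ranked[:k])
    return len(top_k & relevant_set) / len(relevant_set)

# Hypothetical retrieval result: concepts ordered by similarity to the seed.
ranked_concepts = ["a", "b", "c", "d", "e"]
relevant_concepts = {"b", "e", "z"}  # "z" is never retrieved

score_top2 = hits_at_k(ranked_concepts, relevant_concepts, k=2)
score_top5 = hits_at_k(ranked_concepts, relevant_concepts, k=5)
print(score_top2, score_top5)
```

A percentage-based cutoff could be obtained by setting k to a fixed share of the candidate list length, which keeps scores comparable across vocabularies of different sizes.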
Affiliation(s)
- Junghwan Lee
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Cong Liu
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Jae Hyun Kim
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Alex Butler
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Ning Shang
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Chao Pang
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Karthik Natarajan
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Patrick Ryan
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Casey Ta
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA
|
92
|
Ferté T, Cossin S, Schaeverbeke T, Barnetche T, Jouhet V, Hejblum BP. Automatic phenotyping of electronical health record: PheVis algorithm. J Biomed Inform 2021; 117:103746. [PMID: 33746080 DOI: 10.1016/j.jbi.2021.103746] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/28/2020] [Revised: 03/02/2021] [Accepted: 03/05/2021] [Indexed: 11/18/2022]
Abstract
Electronic Health Records (EHRs) often lack reliable annotation of patient medical conditions. PheNorm, an automated unsupervised algorithm to identify patient medical conditions from EHR data, has been developed. PheVis extends PheNorm at the visit resolution. PheVis combines diagnosis codes together with medical concepts extracted from medical notes, incorporating past history in a machine learning approach to provide an interpretable parametric predictor of the occurrence probability for a given medical condition at each visit. PheVis is applied to two real-world use-cases using the data warehouse of the University Hospital of Bordeaux: i) rheumatoid arthritis, a chronic condition; ii) tuberculosis, an acute condition. Cross-validated AUROC were respectively 0.943 [0.940; 0.945] and 0.987 [0.983; 0.990]. Cross-validated AUPRC were respectively 0.754 [0.744; 0.763] and 0.299 [0.198; 0.403]. PheVis performs well for chronic conditions, though absence of exclusion of past medical history by natural language processing tools limits its performance in French for acute conditions. It achieves significantly better performance than state-of-the-art unsupervised methods especially for chronic diseases.
Affiliation(s)
- Thomas Ferté
  - Bordeaux Hospital University Center, Pôle de santé publique, Service d'information médicale, Unité Informatique et Archivistique Médicales, F-33000 Bordeaux, France; Univ. Bordeaux ISPED, Inserm Bordeaux Population Health Research Center UMR 1219, Inria BSO, team SISTM, F-33000 Bordeaux, France
- Sébastien Cossin
  - Bordeaux Hospital University Center, Pôle de santé publique, Service d'information médicale, Unité Informatique et Archivistique Médicales, F-33000 Bordeaux, France; Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center, team ERIAS, UMR 1219, F-33000 Bordeaux, France
- Thierry Schaeverbeke
  - Rheumatology department, FHU ACRONIM, Bordeaux University Hospital, F-33076 Bordeaux, France
- Thomas Barnetche
  - Rheumatology department, FHU ACRONIM, Bordeaux University Hospital, F-33076 Bordeaux, France
- Vianney Jouhet
  - Bordeaux Hospital University Center, Pôle de santé publique, Service d'information médicale, Unité Informatique et Archivistique Médicales, F-33000 Bordeaux, France; Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center, team ERIAS, UMR 1219, F-33000 Bordeaux, France
- Boris P Hejblum
  - Univ. Bordeaux ISPED, Inserm Bordeaux Population Health Research Center UMR 1219, Inria BSO, team SISTM, F-33000 Bordeaux, France
|
93
|
Kashyap M, Seneviratne M, Banda JM, Falconer T, Ryu B, Yoo S, Hripcsak G, Shah NH. Development and validation of phenotype classifiers across multiple sites in the observational health data sciences and informatics network. J Am Med Inform Assoc 2021; 27:877-883. [PMID: 32374408 PMCID: PMC7309227 DOI: 10.1093/jamia/ocaa032] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 08/22/2019] [Revised: 12/17/2019] [Accepted: 03/12/2020] [Indexed: 11/16/2022] Open
Abstract
Objective Accurate electronic phenotyping is essential to support collaborative observational research. Supervised machine learning methods can be used to train phenotype classifiers in a high-throughput manner using imperfectly labeled data. We developed 10 phenotype classifiers using this approach and evaluated performance across multiple sites within the Observational Health Data Sciences and Informatics (OHDSI) network. Materials and Methods We constructed classifiers using the Automated PHenotype Routine for Observational Definition, Identification, Training and Evaluation (APHRODITE) R-package, an open-source framework for learning phenotype classifiers using datasets in the Observational Medical Outcomes Partnership Common Data Model. We labeled training data based on the presence of multiple mentions of disease-specific codes. Performance was evaluated on cohorts derived using rule-based definitions and real-world disease prevalence. Classifiers were developed and evaluated across 3 medical centers, including 1 international site. Results Compared to the multiple mentions labeling heuristic, classifiers showed a mean recall boost of 0.43 with a mean precision loss of 0.17. Performance decreased slightly when classifiers were shared across medical centers, with mean recall and precision decreasing by 0.08 and 0.01, respectively, at a site within the USA, and by 0.18 and 0.10, respectively, at an international site. Discussion and Conclusion We demonstrate a high-throughput pipeline for constructing and sharing phenotype classifiers across sites within the OHDSI network using APHRODITE. Classifiers exhibit good portability between sites within the USA but limited portability internationally, indicating that classifier generalizability may have geographic limitations, and, consequently, sharing the classifier-building recipe, rather than the pretrained classifiers, may be more useful for facilitating collaborative observational research.
Affiliation(s)
- Mehr Kashyap
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Martin Seneviratne
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Juan M Banda
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
  - Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
- Thomas Falconer
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Borim Ryu
  - Office of eHealth and Business, Seoul National University Bundang Hospital, Gyeonggi-do, South Korea
- Sooyoung Yoo
  - Office of eHealth and Business, Seoul National University Bundang Hospital, Gyeonggi-do, South Korea
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Nigam H Shah
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
|
94
|
Estiri H, Vasey S, Murphy SN. Generative transfer learning for measuring plausibility of EHR diagnosis records. J Am Med Inform Assoc 2021; 28:559-568. [PMID: 33043366 PMCID: PMC7936395 DOI: 10.1093/jamia/ocaa215] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/02/2020] [Accepted: 08/18/2020] [Indexed: 12/12/2022] Open
Abstract
OBJECTIVE Due to a complex set of processes involved with the recording of health information in the Electronic Health Records (EHRs), the truthfulness of EHR diagnosis records is questionable. We present a computational approach to estimate the probability that a single diagnosis record in the EHR reflects the true disease. MATERIALS AND METHODS Using EHR data on 18 diseases from the Mass General Brigham (MGB) Biobank, we develop generative classifiers on a small set of disease-agnostic features from EHRs that aim to represent Patients, pRoviders, and their Interactions within the healthcare SysteM (PRISM features). RESULTS We demonstrate that PRISM features and the generative PRISM classifiers are potent for estimating disease probabilities and exhibit generalizable and transferable distributional characteristics across diseases and patient populations. The joint probabilities we learn about diseases through the PRISM features via PRISM generative models are transferable and generalizable to multiple diseases. DISCUSSION The Generative Transfer Learning (GTL) approach with PRISM classifiers enables the scalable validation of computable phenotypes in EHRs without the need for domain-specific knowledge about specific disease processes. CONCLUSION Probabilities computed from the generative PRISM classifier can enhance and accelerate applied Machine Learning research and discoveries with EHR data.
Affiliation(s)
- Hossein Estiri
  - Harvard Medical School, Boston, Massachusetts, USA
  - Massachusetts General Hospital, Boston, Massachusetts, USA
  - Mass General Brigham, Boston, Massachusetts, USA
- Sebastien Vasey
  - Department of Mathematics, Harvard University, Cambridge, Massachusetts, USA
- Shawn N Murphy
  - Harvard Medical School, Boston, Massachusetts, USA
  - Massachusetts General Hospital, Boston, Massachusetts, USA
  - Mass General Brigham, Boston, Massachusetts, USA
|
95
|
Pellathy T, Saul M, Clermont G, Dubrawski AW, Pinsky MR, Hravnak M. Accuracy of identifying hospital acquired venous thromboembolism by administrative coding: implications for big data and machine learning research. J Clin Monit Comput 2021; 36:397-405. [PMID: 33558981 DOI: 10.1007/s10877-021-00664-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/19/2020] [Accepted: 01/20/2021] [Indexed: 12/23/2022]
Abstract
Big data analytics research using heterogeneous electronic health record (EHR) data requires accurate identification of disease phenotype cases and controls. Overreliance on ground truth determination based on administrative data can lead to biased and inaccurate findings. Hospital-acquired venous thromboembolism (HA-VTE) is challenging to identify due to its temporal evolution and variable EHR documentation. To establish ground truth for machine learning modeling, we compared accuracy of HA-VTE diagnoses made by administrative coding to manual review of gold standard diagnostic test results. We performed retrospective analysis of EHR data on 3680 adult stepdown unit patients identifying HA-VTE. International Classification of Diseases, Ninth Revision (ICD-9-CM) codes for VTE were identified. 4544 radiology reports associated with VTE diagnostic tests were screened using terminology extraction and then manually reviewed by a clinical expert to confirm diagnosis. Of 415 cases with ICD-9-CM codes for VTE, 219 were identified with acute onset type codes. Test report review identified 158 new-onset HA-VTE cases. Only 40% of ICD-9-CM coded cases (n = 87) were confirmed by a positive diagnostic test report, leaving the majority of administratively coded cases unsubstantiated by confirmatory diagnostic test. Additionally, 45% of diagnostic test confirmed HA-VTE cases lacked corresponding ICD codes. ICD-9-CM coding missed diagnostic test-confirmed HA-VTE cases and inaccurately assigned cases without confirmed VTE, suggesting dependence on administrative coding leads to inaccurate HA-VTE phenotyping. Alternative methods to develop more sensitive and specific VTE phenotype solutions portable across EHR vendor data are needed to support case-finding in big-data analytics.
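The two percentages in the abstract can be reproduced from the reported counts. The sketch below is a reader's arithmetic check, not part of the study: it assumes the 87 coded-and-confirmed cases are measured against the 219 acute-onset coded cases, and that those 87 cases are a subset of the 158 test-confirmed cases. The variable names are illustrative.

```python
# Counts reported in the abstract.
coded_acute = 219          # cases with acute-onset ICD-9-CM VTE codes
coded_and_confirmed = 87   # coded cases confirmed by a positive test report
test_confirmed = 158       # new-onset HA-VTE cases found by test report review

# Assumption: the 87 confirmed coded cases are a subset of the 158
# test-confirmed cases, so they act as the true positives of the coding.
ppv = coded_and_confirmed / coded_acute             # share of coded cases confirmed
sensitivity = coded_and_confirmed / test_confirmed  # share of confirmed cases coded
missed = 1 - sensitivity                            # confirmed cases lacking a code

print(f"PPV of coding:         {ppv:.1%}")
print(f"Sensitivity of coding: {sensitivity:.1%}")
print(f"Confirmed but uncoded: {missed:.1%}")
```

Under these assumptions the positive predictive value rounds to the reported 40%, and the uncoded share of confirmed cases rounds to the reported 45%.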
Affiliation(s)
- Tiffany Pellathy
  - University of Pittsburgh School of Nursing, 336 Victoria Hall; 3500 Victoria Street, Pittsburgh, PA, 15213, USA
- Melissa Saul
  - University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213, USA
- Gilles Clermont
  - University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213, USA
- Artur W Dubrawski
  - School of Computer Science, Auton Lab, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Michael R Pinsky
  - University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213, USA
- Marilyn Hravnak
  - University of Pittsburgh School of Nursing, 336 Victoria Hall; 3500 Victoria Street, Pittsburgh, PA, 15213, USA
|
96
|
Ma X, Imai T, Shinohara E, Kasai S, Kato K, Kagawa R, Ohe K. EHR2CCAS: A framework for mapping EHR to disease knowledge presenting causal chain of disorders - chronic kidney disease example. J Biomed Inform 2021; 115:103692. [PMID: 33548543 DOI: 10.1016/j.jbi.2021.103692] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2020] [Revised: 01/26/2021] [Accepted: 01/27/2021] [Indexed: 10/22/2022]
Abstract
OBJECTIVE The goal of this work was to capture diseases in patients by comprehending the fine-grained medical conditions and disease progression manifested by transitions in medical conditions. We realize this by introducing our earlier work on a state-of-the-art knowledge presentation, which defines a disease as a causal chain of abnormal states (CCAS). Here, we propose a framework, EHR2CCAS, for constructing a system to map electronic health record (EHR) data to CCAS. MATERIALS AND METHODS EHR2CCAS is a framework consisting of modules that access heterogeneous EHR to estimate the presence of abnormal states in a CCAS for a patient in a given time window. EHR2CCAS applies expert-driven (rule-based) and data-driven (machine learning) methods to identify abnormal states from structured and unstructured EHR data. It features data-driven approaches for unlocking clinical texts and imputations based on the EHR temporal properties and the causal CCAS structure. This study presents the CCAS of chronic kidney disease as an example. A mapping system between the EHR from the University of Tokyo Hospital and CCAS of chronic kidney disease was constructed and evaluated against expert annotation. RESULTS The system achieved high prediction performance in identifying abnormal states that had strong agreement among annotators. Our handling of narrative varieties in texts and our imputation of the presence of an abnormal state markedly improved the prediction performance. EHR2CCAS presents patient data describing the temporal presence of abnormal states in CCAS, which is useful in individual disease progression management. Further analysis of the differentiation of transition among abnormal states outputted by EHR2CCAS can contribute to detecting disease subtypes. CONCLUSION This work represents the first step toward combining disease knowledge and EHR to extract abnormality related to a disease defined as fine-grained abnormal states and transitions among them. This can aid in disease progression management and deep phenotyping.
Affiliation(s)
- Xiaojun Ma
  - Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Takeshi Imai
  - Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Emiko Shinohara
  - Department of Artificial Intelligence in Healthcare, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Satoshi Kasai
  - Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Department of Healthcare Information Management, The University of Tokyo Hospital, Tokyo, Japan
- Kosuke Kato
  - Department of Obstetrics and Gynecology, The University of Tokyo Hospital, Tokyo, Japan
- Rina Kagawa
  - Department of Biomedical Informatics and Management, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
- Kazuhiko Ohe
  - Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Department of Healthcare Information Management, The University of Tokyo Hospital, Tokyo, Japan
|
97
|
Shung DL. Advancing care for acute gastrointestinal bleeding using artificial intelligence. J Gastroenterol Hepatol 2021; 36:273-278. [PMID: 33624892 PMCID: PMC11874509 DOI: 10.1111/jgh.15372] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/30/2020] [Revised: 12/07/2020] [Accepted: 12/08/2020] [Indexed: 12/14/2022]
Abstract
The future of gastrointestinal bleeding will include the integration of machine learning algorithms to enhance clinician risk assessment and decision making. Machine learning algorithms have shown promise in outperforming existing clinical risk scores for both upper and lower gastrointestinal bleeding but have not been validated in any prospective clinical trials. The adoption of electronic health records provides an exciting opportunity to deploy risk prediction tools in real time and also to expand the data available to train predictive models. Machine learning algorithms can be used to identify patients with acute gastrointestinal bleeding using data extracted from the electronic health record. This can lead to an automated process to find patients with symptoms of acute gastrointestinal bleeding so that risk prediction tools can be then triggered to consistently provide decision support to the physician. Neural network models can be used to provide continuous risk predictions for patients who are at higher risk, which can be used to guide triage of patients to appropriate levels of care. Finally, the future will likely include neural network-based analysis of endoscopic stigmata of bleeding to help guide best practices for hemostasis during the endoscopic procedure. Machine learning will enhance the delivery of care at every level for patients with acute gastrointestinal bleeding through identifying very low risk patients for outpatient management, triaging high risk patients for higher levels of care, and guiding optimal intervention during endoscopy.
|
98
|
Hernandez-Boussard T, Monda KL, Crespo BC, Riskin D. Real world evidence in cardiovascular medicine: ensuring data validity in electronic health record-based studies. J Am Med Inform Assoc 2021; 26:1189-1194. [PMID: 31414700 PMCID: PMC6798570 DOI: 10.1093/jamia/ocz119] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 01/15/2019] [Revised: 05/17/2019] [Accepted: 06/19/2019] [Indexed: 01/11/2023] Open
Abstract
Objective With growing availability of digital health data and technology, health-related studies are increasingly augmented or implemented using real world data (RWD). Recent federal initiatives promote the use of RWD to make clinical assertions that influence regulatory decision-making. Our objective was to determine whether traditional real world evidence (RWE) techniques in cardiovascular medicine achieve accuracy sufficient for credible clinical assertions, also known as “regulatory-grade” RWE. Design Retrospective observational study using electronic health records (EHR), 2010–2016. Methods A predefined set of clinical concepts was extracted from EHR structured (EHR-S) and unstructured (EHR-U) data using traditional query techniques and artificial intelligence (AI) technologies, respectively. Performance was evaluated against manually annotated cohorts using standard metrics. Accuracy was compared to pre-defined criteria for regulatory-grade. Differences in accuracy were compared using Chi-square test. Results The dataset included 10 840 clinical notes. Individual concept occurrence ranged from 194 for coronary artery bypass graft to 4502 for diabetes mellitus. Average recall and precision were 51.7% and 98.3%, respectively, in EHR-S and 95.5% and 95.3%, respectively, in EHR-U. For each clinical concept, EHR-S accuracy was below regulatory-grade, while EHR-U met or exceeded criteria, with the exception of medications. Conclusions Identifying an appropriate RWE approach is dependent on cohorts studied and accuracy required. In this study, recall varied greatly between EHR-S and EHR-U. Overall, EHR-S did not meet regulatory grade criteria, while EHR-U did. These results suggest that recall should be routinely measured in EHR-based studies intended for regulatory use. Furthermore, advanced data and technologies may be required to achieve regulatory grade results.
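To see why the abstract singles out recall, the reported averages can be combined into a single harmonic-mean score (F1). This is only an illustration: the abstract does not report F1, and an F1 computed from averaged precision and recall generally differs from the average of per-concept F1 scores. The helper name `f1` is hypothetical.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Average precision and recall as reported in the abstract.
ehr_s = f1(precision=0.983, recall=0.517)  # structured data
ehr_u = f1(precision=0.953, recall=0.955)  # unstructured data

print(f"EHR-S illustrative F1: {ehr_s:.3f}")
print(f"EHR-U illustrative F1: {ehr_u:.3f}")
```

Because the harmonic mean is dominated by the smaller term, the low recall of structured data pulls its combined score down despite near-perfect precision.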
Affiliation(s)
- Tina Hernandez-Boussard
  - Department of Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
  - Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
- Keri L Monda
  - The Center for Observational Research and Medical Affairs, Amgen, Inc., Thousand Oaks, California, USA
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Blai Coll Crespo
  - The Center for Observational Research and Medical Affairs, Amgen, Inc., Thousand Oaks, California, USA
- Dan Riskin
  - Department of Medicine, Stanford University, Stanford, California, USA
  - Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
  - Verantos Inc, Menlo Park, California, USA
|
99
|
Bhavnani SP. Digital Health: Opportunities and Challenges to Develop the Next-Generation Technology-Enabled Models of Cardiovascular Care. Methodist Debakey Cardiovasc J 2021; 16:296-303. [PMID: 33500758 DOI: 10.14797/mdcj-16-4-296] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/15/2022] Open
Abstract
The wide gap between the development of new healthcare technologies and their integration into clinical practice argues for a deeper understanding of how effective quality improvement can be designed to meet the needs of patients and their clinical teams. The COVID-19 pandemic has forced us to address this gap and create long-term strategies to bridge it. On the one hand, it has enabled the rapid implementation of telehealth. On the other hand, it has raised important questions about our preparedness to adopt and employ new digital tools as part of a new process of care. While healthcare organizations are seeking to improve the quality of care by integrating innovations in digital health, they must also address key issues such as patient experience, develop clinical decision support systems that analyze digital health data trends, and create efficient clinical workflows. Given the breadth of such requirements, embracing new technologies as a core competency of a modern healthcare system introduces a host of questions, such as "How best do patients participate in digital health programs that promote behavioral changes and mitigate risk?" and "What type of data analytics are required that enable a deeper understanding of disease phenotypes and corresponding treatment decisions?" This review presents the challenges in implementing digital health technology and discusses how patient-centered digital health programs are designed within real-world models of remote monitoring. It also provides a framework for developing new devices and wearables for the next generation of data-driven, technology-enabled cardiovascular care.
|
100
|
Coquet J, Bievre N, Billaut V, Seneviratne M, Magnani CJ, Bozkurt S, Brooks JD, Hernandez-Boussard T. Assessment of a Clinical Trial-Derived Survival Model in Patients With Metastatic Castration-Resistant Prostate Cancer. JAMA Netw Open 2021; 4:e2031730. [PMID: 33481032 PMCID: PMC7823224 DOI: 10.1001/jamanetworkopen.2020.31730] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022] Open
Abstract
IMPORTANCE Randomized clinical trials (RCTs) are considered the criterion standard for clinical evidence. Despite their many benefits, RCTs have limitations, such as costliness, that may reduce the generalizability of their findings among diverse populations and routine care settings. OBJECTIVE To assess the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic castration-resistant prostate cancer (CRPC) when the model is applied to real-world data from electronic health records (EHRs). DESIGN, SETTING, AND PARTICIPANTS The RCT-trained model and patient data from the RCTs were obtained from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge for prostate cancer, which occurred from March 16 to July 27, 2015. This challenge included 4 phase 3 clinical trials of patients with metastatic CRPC. Real-world data were obtained from the EHRs of a tertiary care academic medical center that includes a comprehensive cancer center. In this study, the DREAM challenge RCT-trained model was applied to real-world data from January 1, 2008, to December 31, 2019; the model was then retrained using EHR data with optimized feature selection. Patients with metastatic CRPC were divided into RCT and EHR cohorts based on data source. Data were analyzed from March 23, 2018, to October 22, 2020. EXPOSURES Patients who received treatment for metastatic CRPC. MAIN OUTCOMES AND MEASURES The primary outcome was the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic CRPC when the model is applied to real-world data. Model performance was compared using 10-fold cross-validation according to time-dependent integrated area under the curve (iAUC) statistics. RESULTS Among 2113 participants with metastatic CRPC, 1600 participants were included in the RCT cohort, and 513 participants were included in the EHR cohort. 
The RCT cohort comprised a larger proportion of White participants (1390 patients [86.9%] vs 337 patients [65.7%]) and a smaller proportion of Hispanic participants (14 patients [0.9%] vs 42 patients [8.2%]), Asian participants (41 patients [2.6%] vs 88 patients [17.2%]), and participants older than 75 years (388 patients [24.3%] vs 191 patients [37.2%]) compared with the EHR cohort. Participants in the RCT cohort also had fewer comorbidities (mean [SD], 1.6 [1.8] comorbidities vs 2.5 [2.6] comorbidities, respectively) compared with those in the EHR cohort. Of the 101 variables used in the RCT-derived model, 10 were not available in the EHR data set, 3 of which were among the top 10 features in the DREAM challenge RCT model. The best-performing EHR-trained model included only 25 of the 101 variables included in the RCT-trained model. The performance of the RCT-trained and EHR-trained models was adequate in the EHR cohort (mean [SD] iAUC, 0.722 [0.118] and 0.762 [0.106], respectively); model optimization was associated with improved performance of the best-performing EHR model (mean [SD] iAUC, 0.792 [0.097]). The EHR-trained model classified 256 patients as having a high risk of mortality and 256 patients as having a low risk of mortality (hazard ratio, 2.7; 95% CI, 2.0-3.7; log-rank P < .001). CONCLUSIONS AND RELEVANCE In this study, although the RCT-trained models did not perform well when applied to real-world EHR data, retraining the models using real-world EHR data and optimizing variable selection was beneficial for model performance. As clinical evidence evolves to include more real-world data, both industry and academia will likely search for ways to balance model optimization with generalizability. This study provides a pragmatic approach to applying RCT-trained models to real-world data.
Affiliation(s)
- Jean Coquet
  - Department of Medicine, Stanford University School of Medicine, Stanford, California
- Nicolas Bievre
  - Department of Statistics, Stanford University, Stanford, California
- Vincent Billaut
  - Department of Statistics, Stanford University, Stanford, California
- Martin Seneviratne
  - Department of Medicine, Stanford University School of Medicine, Stanford, California
  - Department of Biomedical Data Science, Stanford University, Stanford, California
- Selen Bozkurt
  - Department of Medicine, Stanford University School of Medicine, Stanford, California
- James D. Brooks
  - Department of Urology, Stanford University School of Medicine, Stanford, California
  - Stanford Cancer Institute, Stanford University School of Medicine, Stanford, California
- Tina Hernandez-Boussard
  - Department of Medicine, Stanford University School of Medicine, Stanford, California
  - Department of Biomedical Data Science, Stanford University, Stanford, California
  - Department of Surgery, Stanford University School of Medicine, Stanford, California
|