Reference Citation Analysis

For: Gibson TB, Nguyen MD, Burrell T, Yoon F, Wong J, Dharmarajan S, Ouellet-Hellstrom R, Hua W, Ma Y, Baro E, Bloemers S, Pack C, Kennedy A, Toh S, Ball R. Electronic phenotyping of health outcomes of interest using a linked claims-electronic health record database: Findings from a machine learning pilot project. J Am Med Inform Assoc 2021;28:1507-1517. [PMID: 33712852 DOI: 10.1093/jamia/ocab036] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/20/2020] [Accepted: 02/19/2021] [Indexed: 01/04/2023] Open

Cited by Other Article(s)

Chen J, Li XN, Lu CC, Yuan S, Yung G, Ye J, Tian H, Lin J. Considerations for master protocols using external controls. J Biopharm Stat 2025;35:297-319. [PMID: 38363805 DOI: 10.1080/10543406.2024.2311248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 01/24/2024] [Indexed: 02/18/2024]

Dimitsaki S, Natsiavas P, Jaulent MC. Applying AI to Structured Real-World Data for Pharmacovigilance Purposes: Scoping Review. J Med Internet Res 2024;26:e57824. [PMID: 39753222 PMCID: PMC11729787 DOI: 10.2196/57824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 10/03/2024] [Accepted: 10/27/2024] [Indexed: 01/14/2025] Open

Abstract

BACKGROUND

Artificial intelligence (AI) applied to real-world data (RWD; eg, electronic health care records) has been identified as a potentially promising technical paradigm for the pharmacovigilance field. There are several instances of AI approaches applied to RWD; however, most studies focus on unstructured RWD (conducting natural language processing on various data sources, eg, clinical notes, social media, and blogs). Hence, it is essential to investigate how AI is currently applied to structured RWD in pharmacovigilance and how new approaches could enrich the existing methodology.

OBJECTIVE

This scoping review depicts the emerging use of AI on structured RWD for pharmacovigilance purposes to identify relevant trends and potential research gaps.

METHODS

The scoping review methodology is based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) methodology. We queried the MEDLINE database through the PubMed search engine. Relevant scientific manuscripts published from January 2010 to January 2024 were retrieved. The included studies were "mapped" against a set of evaluation criteria, including applied AI approaches, code availability, description of the data preprocessing pipeline, clinical validation of AI models, and implementation of trustworthy AI criteria following the guidelines of the FUTURE (Fairness, Universality, Traceability, Usability, Robustness, and Explainability)-AI initiative.

RESULTS

The scoping review ultimately yielded 36 studies. There has been a significant increase in relevant studies after 2019. Most of the articles focused on adverse drug reaction detection procedures (23/36, 64%) for specific adverse effects. Furthermore, a substantial number of studies (34/36, 94%) used nonsymbolic AI approaches, emphasizing classification tasks. Random forest was the most popular machine learning approach identified in this review (17/36, 47%). The most common RWD sources used were electronic health care records (28/36, 78%). Typically, these data were not available in a widely acknowledged data model to facilitate interoperability, and they came from proprietary databases, limiting their availability for reproducing results. On the basis of the evaluation criteria classification, 10% (4/36) of the studies published their code in public registries, 16% (6/36) tested their AI models in clinical environments, and 36% (13/36) provided information about the data preprocessing pipeline. In addition, in terms of trustworthy AI, 89% (32/36) of the studies followed at least half of the trustworthy AI initiative guidelines. Finally, selection and confounding biases were the most common biases in the included studies.
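The proportions quoted in the Results above can be recomputed directly from the counts. The following is a minimal sketch, not part of the review itself; the dictionary labels and the helper name `pct` are chosen here for illustration, and the counts are copied from the paragraph above.

```python
# Counts out of the 36 included studies, copied from the Results paragraph.
TOTAL = 36

counts = {
    "adverse drug reaction detection": 23,
    "nonsymbolic AI approaches": 34,
    "random forest": 17,
    "electronic health care records": 28,
    "code published in public registries": 4,
    "tested in clinical environments": 6,
    "data preprocessing pipeline described": 13,
    "followed at least half of the guidelines": 32,
}


def pct(n, total=TOTAL):
    """Return n/total as a percentage rounded to one decimal place."""
    return round(100 * n / total, 1)


if __name__ == "__main__":
    for label, n in counts.items():
        print(f"{label}: {n}/{TOTAL} = {pct(n)}%")
```

Rounded to whole numbers, most recomputed values agree with the abstract. The exceptions are 4/36 and 6/36, which come out at 11.1% and 16.7% rather than the quoted 10% and 16%; this may reflect a different rounding convention or denominator in the source.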

CONCLUSIONS

AI, along with structured RWD, constitutes a promising line of work for drug safety and pharmacovigilance. However, in terms of AI, some approaches have not been examined extensively in this field (such as explainable AI and causal AI). Moreover, it would be helpful to have a data preprocessing protocol for RWD to support pharmacovigilance processes. Finally, because of personal data sensitivity, evaluation procedures have to be investigated further.

Chikamochi T, Ishiguro C, Mimura W, Maeda M, Murata F, Fukuda H. Validation Study of the Claims-Based Algorithm Using the International Classification of Diseases Codes to Identify Patients With Coronavirus Disease in Japan From 2020 to 2022: The VENUS Study. Pharmacoepidemiol Drug Saf 2024;33:e70032. [PMID: 39449609 DOI: 10.1002/pds.70032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2024] [Revised: 08/28/2024] [Accepted: 09/20/2024] [Indexed: 10/26/2024]

Simon GE, Shortreed SM, Johnson E, Yaseen ZS, Stone M, Mosholder AD, Ahmedani BK, Coleman KJ, Coley RY, Penfold RB, Toh S. Predicting risk of suicidal behavior from insurance claims data vs. linked data from insurance claims and electronic health records. Pharmacoepidemiol Drug Saf 2024;33:e5734. [PMID: 38112287 PMCID: PMC10843611 DOI: 10.1002/pds.5734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 10/16/2023] [Accepted: 11/10/2023] [Indexed: 12/21/2023]

Abstract

PURPOSE

Observational studies assessing effects of medical products on suicidal behavior often rely on health record data to account for pre-existing risk. We assess whether high-dimensional models predicting suicide risk using data derived from insurance claims and electronic health records (EHRs) are superior to models using data from insurance claims alone.

METHODS

Data from seven large health systems were used to identify outpatient mental health visits by patients aged 11 or older between 1/1/2009 and 9/30/2017. Data for the 5 years prior to each visit identified potential predictors of suicidal behavior typically available from insurance claims (e.g., mental health diagnoses, procedure codes, medication dispensings) and additional potential predictors available from EHRs (self-reported race and ethnicity, responses to Patient Health Questionnaire or PHQ-9 depression questionnaires). Nonfatal self-harm events following each visit were identified from insurance claims data, and fatal self-harm events were identified by linkage to state mortality records. Random forest models predicting nonfatal or fatal self-harm over 90 days following each visit were developed in a 70% random sample of visits and validated in a held-out sample of 30%. Performance of models using linked claims and EHR data was compared to models using claims data only.

RESULTS

Among 15 845 047 encounters by 1 574 612 patients, 99 098 (0.6%) were followed by a self-harm event within 90 days. Overall classification performance did not differ between the best-fitting model using all data (area under the receiver operating curve or AUC = 0.846, 95% CI 0.839-0.854) and the best-fitting model limited to data available from insurance claims (AUC = 0.846, 95% CI 0.838-0.853). Competing models showed similar classification performance across a range of cut-points and similar calibration performance across a range of risk strata. Results were similar when the sample was limited to health systems and time periods where PHQ-9 depression questionnaires were recorded more frequently.
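The AUC values reported above summarize how often a model ranks a visit followed by self-harm above a visit that was not. As a minimal sketch of that metric only (not the authors' code, and far too slow for millions of encounters), the pairwise definition can be written with the standard library; the function name `auc` and the toy labels and scores are assumptions made for illustration.

```python
def auc(labels, scores):
    """Pairwise AUC: the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical held-out labels (1 = event within 90 days) and risk
    # scores from two competing models; the values are illustrative only.
    labels = [0, 0, 1, 0, 1, 1]
    scores_linked = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
    scores_claims = [0.15, 0.38, 0.36, 0.22, 0.75, 0.72]
    print("linked data AUC:", auc(labels, scores_linked))
    print("claims only AUC:", auc(labels, scores_claims))
```

Because the metric depends only on the ordering of scores, two models whose scores differ slightly but rank the held-out visits the same way yield identical AUCs, as the two toy score lists here do.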

CONCLUSION

Investigators using health record data to account for pre-existing risk in observational studies of suicidal behavior need not limit that research to databases including linked EHR data.

Ostropolets A, Hripcsak G, Husain SA, Richter LR, Spotnitz M, Elhussein A, Ryan PB. Scalable and interpretable alternative to chart review for phenotype evaluation using standardized structured data from electronic health records. J Am Med Inform Assoc 2023;31:119-129. [PMID: 37847668 PMCID: PMC10746303 DOI: 10.1093/jamia/ocad202] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/23/2023] [Revised: 09/23/2023] [Accepted: 10/02/2023] [Indexed: 10/19/2023] Open

Maro JC, Nguyen MD, Kolonoski J, Schoeplein R, Huang TY, Dutcher SK, Dal Pan GJ, Ball R. Six Years of the US Food and Drug Administration's Postmarket Active Risk Identification and Analysis System in the Sentinel Initiative: Implications for Real World Evidence Generation. Clin Pharmacol Ther 2023;114:815-824. [PMID: 37391385 DOI: 10.1002/cpt.2979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 05/25/2023] [Indexed: 07/02/2023]

He T, Belouali A, Patricoski J, Lehmann H, Ball R, Anagnostou V, Kreimeyer K, Botsis T. Trends and opportunities in computable clinical phenotyping: A scoping review. J Biomed Inform 2023;140:104335. [PMID: 36933631 DOI: 10.1016/j.jbi.2023.104335] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/09/2022] [Revised: 03/07/2023] [Accepted: 03/09/2023] [Indexed: 03/18/2023]

Abstract

Identifying patient cohorts meeting the criteria of specific phenotypes is essential in biomedicine and particularly timely in precision medicine. Many research groups deliver pipelines that automatically retrieve and analyze data elements from one or more sources to automate this task and deliver high-performing computable phenotypes. We applied a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines to conduct a thorough scoping review on computable clinical phenotyping. Five databases were searched using a query that combined the concepts of automation, clinical context, and phenotyping. Subsequently, four reviewers screened 7960 records (after removing over 4000 duplicates) and selected 139 that satisfied the inclusion criteria. This dataset was analyzed to extract information on target use cases, data-related topics, phenotyping methodologies, evaluation strategies, and portability of developed solutions. Most studies supported patient cohort selection without discussing the application to specific use cases, such as precision medicine. Electronic Health Records were the primary source in 87.1% (N = 121) of all studies, and International Classification of Diseases codes were heavily used in 55.4% (N = 77) of all studies; however, only 25.9% (N = 36) of the records described compliance with a common data model. In terms of the presented methods, traditional Machine Learning (ML) was the dominant method, often combined with natural language processing and other approaches, while external validation and portability of computable phenotypes were pursued in many cases. These findings revealed that defining target use cases precisely, moving away from sole ML strategies, and evaluating the proposed solutions in the real setting are essential opportunities for future work. There is also momentum and an emerging need for computable phenotyping to support clinical and epidemiological research and precision medicine.

Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023;30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open

Abstract

OBJECTIVE

Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used.

MATERIALS AND METHODS

We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies.

RESULTS

Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions.

DISCUSSION

Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released.

CONCLUSION

Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.

Brown JS, Mendelsohn AB, Nam YH, Maro JC, Cocoros NM, Rodriguez-Watson C, Lockhart CM, Platt R, Ball R, Dal Pan GJ, Toh S. The US Food and Drug Administration Sentinel System: a national resource for a learning health system. J Am Med Inform Assoc 2022;29:2191-2200. [PMID: 36094070 PMCID: PMC9667154 DOI: 10.1093/jamia/ocac153] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 03/10/2022] [Revised: 06/18/2022] [Accepted: 08/18/2022] [Indexed: 07/23/2023] Open

Levenson M, He W, Chen L, Dharmarajan S, Izem R, Meng Z, Pang H, Rockhold F. Statistical consideration for fit-for-use real-world data to support regulatory decision making in drug development. Stat Biopharm Res 2022. [DOI: 10.1080/19466315.2022.2120533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]

Pharmacovigilance and Pharmacoepidemiology as a Guarantee of Patient Safety: The Role of the Clinical Pharmacologist. J Clin Med 2022;11:jcm11123552. [PMID: 35743619 PMCID: PMC9225198 DOI: 10.3390/jcm11123552] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/16/2022] [Accepted: 06/17/2022] [Indexed: 11/21/2022] Open

Ball R, Dal Pan G. "Artificial Intelligence" for Pharmacovigilance: Ready for Prime Time? Drug Saf 2022;45:429-438. [PMID: 35579808 PMCID: PMC9112277 DOI: 10.1007/s40264-022-01157-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Accepted: 02/10/2022] [Indexed: 01/28/2023]

Desai RJ, Matheny ME, Johnson K, Marsolo K, Curtis LH, Nelson JC, Heagerty PJ, Maro J, Brown J, Toh S, Nguyen M, Ball R, Pan GD, Wang SV, Gagne JJ, Schneeweiss S. Broadening the reach of the FDA Sentinel system: A roadmap for integrating electronic health record data in a causal analysis framework. NPJ Digit Med 2021;4:170. [PMID: 34931012 PMCID: PMC8688411 DOI: 10.1038/s41746-021-00542-0] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 09/23/2021] [Accepted: 11/28/2021] [Indexed: 11/09/2022] Open

Affiliation(s)

Rishi J Desai: Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
Michael E Matheny: Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
Kevin Johnson: Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
Keith Marsolo: Department of Population Health Sciences, Duke University, Durham, NC, USA
Lesley H Curtis: Department of Population Health Sciences, Duke University, Durham, NC, USA
Jennifer C Nelson: Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
Patrick J Heagerty: Department of Biostatistics, University of Washington, Seattle, WA, USA
Judith Maro: Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA, USA
Jeffery Brown: Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA, USA
Sengwee Toh: Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA, USA
Michael Nguyen: Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, FDA, Silver Spring, MD, USA
Robert Ball: Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, FDA, Silver Spring, MD, USA
Gerald Dal Pan: Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, FDA, Silver Spring, MD, USA
Shirley V Wang: Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
Joshua J Gagne: Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Johnson & Johnson, New Brunswick, NJ, USA
Sebastian Schneeweiss: Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
