Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Meng W, Ou W, Chandwani S, Chen X, Black W, Cai Z. Temporal phenotyping by mining healthcare data to derive lines of therapy for cancer. J Biomed Inform 2019;100:103335. [PMID: 31689549 DOI: 10.1016/j.jbi.2019.103335] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 04/01/2019] [Revised: 10/28/2019] [Accepted: 10/30/2019] [Indexed: 01/29/2023]

Cited by Other Article(s)

Swaminathan A, Ren AL, Wu JY, Bhargava-Shah A, Lopez I, Srivastava U, Alexopoulos V, Pizzitola R, Bui B, Alkhani L, Lee S, Mohit N, Seo N, Macedo N, Cheng W, Wang W, Tran E, Thomas R, Gevaert O. Extraction of Unstructured Electronic Health Records to Evaluate Glioblastoma Treatment Patterns. JCO Clin Cancer Inform 2024;8:e2300091. [PMID: 38857465 DOI: 10.1200/cci.23.00091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 11/15/2023] [Accepted: 03/12/2024] [Indexed: 06/12/2024]

Falchetto L, Bender B, Erhard I, Zeiner KN, Stratmann JA, Koll FJ, Wagner S, Reiser M, Gasimli K, Stehle A, Voss M, Ballo O, Vehreschild JJ, Maier D. Concepts of lines of therapy in cancer treatment: findings from an expert interview-based study. BMC Res Notes 2024;17:137. [PMID: 38750530 PMCID: PMC11094945 DOI: 10.1186/s13104-024-06789-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 04/25/2024] [Indexed: 05/19/2024]

Affiliation(s)

Lisa Falchetto Institute for Digital Medicine and Clinical Data Science, Goethe University Frankfurt, Faculty of Medicine, Frankfurt, Germany
Bernd Bender Institute for Digital Medicine and Clinical Data Science, Goethe University Frankfurt, Faculty of Medicine, Frankfurt, Germany; German Cancer Consortium (DKTK), partner site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany
Ian Erhard Institute for Digital Medicine and Clinical Data Science, Goethe University Frankfurt, Faculty of Medicine, Frankfurt, Germany; German Cancer Consortium (DKTK), partner site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany
Kim N Zeiner Department for Dermatology, Venerology and Allergology, University Hospital Frankfurt, Frankfurt, Germany
Jan A Stratmann Medical Department 2 (Hematology/Oncology), Center for Internal Medicine, University Hospital Frankfurt, Goethe University Frankfurt, Frankfurt, Germany
Florestan J Koll Department of Urology, University Hospital Frankfurt, Frankfurt, Germany
Sebastian Wagner Medical Department 2 (Hematology/Oncology), Center for Internal Medicine, University Hospital Frankfurt, Goethe University Frankfurt, Frankfurt, Germany
Marcel Reiser PIOH Praxis Internistischer Onkologie und Hämatologie, Cologne, Germany
Khayal Gasimli Clinic for Gynecology and Obstetrics, University Hospital Frankfurt, Frankfurt, Germany
Angelika Stehle Department for Internal Medicine 1, University Hospital Frankfurt, Frankfurt, Germany
Martin Voss Department Neuro-Oncology, University Hospital Frankfurt, Frankfurt, Germany
Olivier Ballo Medical Department 2 (Hematology/Oncology), Center for Internal Medicine, University Hospital Frankfurt, Goethe University Frankfurt, Frankfurt, Germany
Jörg Janne Vehreschild Institute for Digital Medicine and Clinical Data Science, Goethe University Frankfurt, Faculty of Medicine, Frankfurt, Germany; Department I of Internal Medicine, University Hospital of Cologne, Cologne, Germany; German Center for Infection Research (DZIF) partner site Bonn Cologne, Cologne, Germany
Daniel Maier Institute for Digital Medicine and Clinical Data Science, Goethe University Frankfurt, Faculty of Medicine, Frankfurt, Germany; German Cancer Consortium (DKTK), partner site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany

Shields RK, Yücel E, Turzhitsky V, Merchant S, Min JS, Watanabe AH. Real-world evaluation of imipenem/cilastatin/relebactam across US medical centres. J Glob Antimicrob Resist 2024;37:190-194. [PMID: 38588973 DOI: 10.1016/j.jgar.2024.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 02/25/2024] [Accepted: 03/05/2024] [Indexed: 04/10/2024]

Gadgeel SM, Rai P, Annavarapu S, Alam S, Goldschmidt JH, West HJ, Santorelli M, Martins RE. Frontline pembrolizumab monotherapy for metastatic non-small cell lung cancer with PD-L1 expression ≥50%: real-world outcomes in a US community oncology setting. Front Oncol 2024;14:1298603. [PMID: 38525422 PMCID: PMC10958653 DOI: 10.3389/fonc.2024.1298603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 02/12/2024] [Indexed: 03/26/2024]

Abstract

Background

This study investigated real-world time on treatment (rwToT) and overall survival (OS) for patients with metastatic non-small cell lung cancer (mNSCLC) who initiated first-line (1L) pembrolizumab monotherapy. We also explored discontinuation reasons and subsequent treatments, stratified by number of cycles among those who completed ≥17 cycles of 1L pembrolizumab.

Methods

Patients with mNSCLC without actionable genetic aberrations, Eastern Cooperative Oncology Group performance status (ECOG PS) 0-2 and unknown, and PD-L1 TPS ≥50% starting 1L pembrolizumab monotherapy between 24-Oct-2016 and 31-Dec-2018 within The US Oncology Network were identified retrospectively and evaluated using structured data, with a data cutoff of 30-Sep-2021. Patient characteristics and disposition were summarized using descriptive statistics. OS and rwToT were evaluated using the Kaplan-Meier method for all ECOG PS and PS 0-1. A subgroup of patients who completed ≥17 cycles was evaluated using supplemental chart review data to discern reasons for discontinuation.
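As an illustration of the Kaplan-Meier method named above, the following is a minimal sketch in plain Python. It is not the study's code: the function names `kaplan_meier` and `median_time` and the example durations are hypothetical, chosen only to show how censored observations leave the risk set without counting as events.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient (for example, months).
    events: 1 if the event (death or discontinuation) was observed,
            0 if the patient was censored at that time.
    """
    observed = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


def median_time(curve):
    """First time at which estimated survival is 0.5 or lower, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up


# Made-up example data, not taken from the study.
example_durations = [2, 3, 3, 5, 8, 8, 12, 13]
example_events = [1, 1, 0, 1, 1, 0, 1, 0]
example_curve = kaplan_meier(example_durations, example_events)
```

With real data, a maintained survival analysis library would normally be used, since it also provides confidence intervals such as those reported in the Results.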

Results

Of the 505 patients with mNSCLC with PD-L1 TPS ≥50%, 61% had ECOG PS 0-1, 23% had ECOG PS 2, and 65% had nonsquamous histology. Median rwToT and OS of pembrolizumab were 7.0 (95% CI, 6.0-8.4) months and 24.5 (95% CI, 20.1-29.3) months, respectively. In the subgroup with ECOG PS 0-1, they were 7.6 months (95% CI, 6.2-9.2) and 28.8 months (95% CI, 22.4-37.5), respectively. Of the 103 patients who completed ≥17 cycles, 57 (55.3%) patients received 17-34 cycles and 46 (44.7%) patients received ≥35 cycles. Approximately 7.7% of the study population received pembrolizumab beyond 35 cycles. Most common reasons for discontinuation were disease progression (38.6%) and toxicity (19.3%) among patients who received 17-34 cycles of pembrolizumab, and disease progression (13.0%) and completion of therapy (10.9%) among patients who received ≥35 cycles.

Conclusion

Consistent with findings from KEYNOTE-024 and other real-world studies, this study demonstrates the long-term effectiveness of pembrolizumab monotherapy as 1L treatment for mNSCLC with PD-L1 TPS ≥50%. Among patients who completed ≥17 cycles, nearly half completed ≥35 cycles. Disease progression and toxicity were the most common reasons for discontinuation among patients who received 17-34 cycles of pembrolizumab. Reasons for discontinuation beyond 35 cycles need further exploration.

Ru B, Sillah A, Desai K, Chandwani S, Yao L, Kothari S. Real-World Data Quality Framework for Oncology Time to Treatment Discontinuation Use Case: Implementation and Evaluation Study. JMIR Med Inform 2024;12:e47744. [PMID: 38446504 PMCID: PMC10955397 DOI: 10.2196/47744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 11/30/2023] [Accepted: 01/14/2024] [Indexed: 03/07/2024]

Abstract

BACKGROUND

The importance of real-world evidence is widely recognized in observational oncology studies. However, the lack of interoperable data quality standards in the fragmented health information technology landscape represents an important challenge. Therefore, adopting validated systematic methods for evaluating data quality is important for oncology outcomes research leveraging real-world data (RWD).

OBJECTIVE

This study aims to implement real-world time to treatment discontinuation (rwTTD) for a systemic anticancer therapy (SACT) as a new use case for the Use Case Specific Relevance and Quality Assessment, a framework linking data quality and relevance in fit-for-purpose RWD assessment.

METHODS

To define the rwTTD use case, we mapped the operational definition of rwTTD to RWD elements commonly available from oncology electronic health record-derived data sets. We identified 20 tasks to check the completeness and plausibility of data elements concerning SACT use, line of therapy (LOT), death date, and length of follow-up. Using descriptive statistics, we illustrated how to implement the Use Case Specific Relevance and Quality Assessment on 2 oncology databases (Data sets A and B) to estimate the rwTTD of an SACT drug (target SACT) for patients with advanced head and neck cancer diagnosed on or after January 1, 2015.

RESULTS

A total of 1200 (24.96%) of 4808 patients in Data set A and 237 (5.92%) of 4003 patients in Data set B received the target SACT, suggesting better relevance of the former in estimating the rwTTD of the target SACT. The 2 data sets differed with regard to the terminology used for SACT drugs, LOT format, and target SACT LOT distribution over time. Data set B appeared to have less complete SACT records, longer lags in incorporating the latest data, and incomplete mortality data, suggesting a lack of fitness for estimating rwTTD.

CONCLUSIONS

The fit-for-purpose data quality assessment demonstrated substantial variability in the quality of the 2 real-world data sets. The data quality specifications applied for rwTTD estimation can be expanded to support a broad spectrum of oncology use cases.

Hur B, Verspoor KM, Baldwin T, Hardefeldt LY, Pfeiffer C, Mansfield C, Scarborough R, Gilkerson JR. Using natural language processing and patient journey clustering for temporal phenotyping of antimicrobial therapies for cat bite abscesses. Prev Vet Med 2024;223:106112. [PMID: 38176151 DOI: 10.1016/j.prevetmed.2023.106112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 11/09/2023] [Accepted: 12/18/2023] [Indexed: 01/06/2024]

Abstract

BACKGROUND

Temporal phenotyping of patient journeys, which capture the common sequence patterns of interventions in the treatment of a specific condition, is useful to support understanding of antimicrobial usage in veterinary patients. Identifying and describing these phenotypes can inform antimicrobial stewardship programs designed to fight antimicrobial resistance, a major health crisis affecting both humans and animals, in which veterinarians have an important role to play.

OBJECTIVE

This research proposes a framework for extracting temporal phenotypes of patient journeys from clinical practice data through the application of natural language processing (NLP) and unsupervised machine learning (ML) techniques, using cat bite abscesses as a model condition. By constructing temporal phenotypes from key events, the relationship between antimicrobial administration and surgical interventions can be described, and similar treatment patterns can be grouped together to describe outcomes associated with specific antimicrobial selection.

METHODS

Cases identified as having a cat bite abscess as a diagnosis were extracted from VetCompass Australia, a database of veterinary clinical records. A classifier was trained and used to label the most clinically relevant event features in each record as chosen by a group of veterinarians. The labeled records were processed into coded character strings, where each letter represents a summary of specific types of treatments performed at a given visit. The sequences of letters representing the cases were clustered based on weighted Levenshtein edit distances with KMeans++ to identify the main variations of the patient treatment journeys, including the antimicrobials used and their duration of administration.
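The distance underlying the clustering step described above can be sketched as follows. This is a minimal illustration of a weighted Levenshtein edit distance over coded visit strings, not the authors' implementation: the function name `weighted_levenshtein`, the cost parameters, and the letter codes in the example (A and B for two antimicrobial visit types, S for surgery) are assumptions made for the example. The clustering itself is omitted.

```python
def weighted_levenshtein(a, b, ins_cost=1.0, del_cost=1.0, sub_cost=None):
    """Edit distance between two coded event strings with configurable costs.

    sub_cost: optional dict mapping an (x, y) letter pair to the cost of
    substituting one for the other; unlisted pairs cost 1.0.
    """
    sub_cost = sub_cost or {}
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * del_cost]
        for j, y in enumerate(b, start=1):
            if x == y:
                step = 0.0
            else:
                step = sub_cost.get((x, y), sub_cost.get((y, x), 1.0))
            curr.append(min(prev[j] + del_cost,       # delete x from a
                            curr[j - 1] + ins_cost,   # insert y into a
                            prev[j - 1] + step))      # substitute x with y
        prev = curr
    return prev[-1]


# Hypothetical coding: swapping one antimicrobial for another is treated as a
# smaller change than adding or removing a visit.
cheap_swap = {("A", "B"): 0.5}
distance = weighted_levenshtein("AAS", "ABS", sub_cost=cheap_swap)
```

Lowering the substitution cost for clinically similar events makes journeys that differ only in such a swap land closer together, which is the kind of weighting a clustering step can then exploit.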

RESULTS

A total of 13,744 records that met the selection criteria were extracted and grouped into 8,436 cases. There were 9 clinically distinct event sequence patterns (temporal phenotypes) of patient journeys identified, representing the main sequences in which surgery and antimicrobial interventions are performed. Patients receiving amoxicillin and surgery had the shortest duration of antimicrobial administration (median of 3.4 days) and patients receiving cefovecin with no surgical intervention had the longest antimicrobial treatment duration (median of 27 days).

CONCLUSION

Our study demonstrates methods to extract and provide an overview of temporal phenotypes of patient journeys, which can be applied to text-based clinical records for multiple species or clinical conditions. We demonstrate the effectiveness of this approach to derive real-world evidence of treatment impacts using cat bite abscesses as a model condition to describe patterns of antimicrobial therapy prescriptions and their outcomes.

Wang Y, Stroh JN, Hripcsak G, Low Wang CC, Bennett TD, Wrobel J, Der Nigoghossian C, Mueller SW, Claassen J, Albers DJ. A methodology of phenotyping ICU patients from EHR data: High-fidelity, personalized, and interpretable phenotypes estimation. J Biomed Inform 2023;148:104547. [PMID: 37984547 PMCID: PMC10802138 DOI: 10.1016/j.jbi.2023.104547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 11/13/2023] [Accepted: 11/16/2023] [Indexed: 11/22/2023]

Abstract

OBJECTIVE

Computing phenotypes that provide high-fidelity, time-dependent characterizations and yield personalized interpretations is challenging, especially given the complexity of physiological and healthcare systems and clinical data quality. This paper develops a methodological pipeline to estimate unmeasured physiological parameters and produce high-fidelity, personalized phenotypes anchored to physiological mechanics from electronic health record (EHR) data.

METHODS

A methodological phenotyping pipeline is developed that computes new phenotypes defined with unmeasurable computational biomarkers quantifying specific physiological properties in real time. Working within the inverse problem framework, this pipeline is applied to the glucose-insulin system for ICU patients using data assimilation to estimate an established mathematical physiological model with stochastic optimization. This produces physiological model parameter vectors of clinically unmeasured endocrine properties, here insulin secretion, clearance, and resistance, estimated for individual patients. These physiological parameter vectors are used as inputs to unsupervised machine learning methods to produce phenotypic labels and discrete physiological phenotypes. These phenotypes are inherently interpretable because they are based on parametric physiological descriptors. To establish potential clinical utility, the computed phenotypes are evaluated with external EHR data for consistency and reliability and with clinician face validation.

RESULTS

The phenotype computation was performed on a cohort of 109 ICU patients who received no or short-acting insulin therapy, rendering continuous and discrete physiological phenotypes as specific computational biomarkers of unmeasured insulin secretion, clearance, and resistance on time windows of three days. Six, six, and five discrete phenotypes were found in the first, middle, and last three-day periods of ICU stays, respectively. Computed phenotypic labels were predictive with an average accuracy of 89%. External validation of discrete phenotypes showed coherence and consistency in clinically observable differences based on laboratory measurements and ICD 9/10 codes and clinical concordance from face validity. A particularly clinically impactful parameter, insulin secretion, had a concordance accuracy of 83%±27%.

CONCLUSION

The new physiological phenotypes computed with individual patient ICU data and defined by estimates of mechanistic model parameters have high physiological fidelity, are continuous, time-specific, personalized, interpretable, and predictive. This methodology is generalizable to other clinical and physiological settings and opens the door for discovering deeper physiological information to personalize medical care.

Affiliation(s)

Yanran Wang Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, 13001 East 17th Place, 3rd Floor, Mail Stop B119, Aurora, CO 80045, United States of America; Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America.
J N Stroh Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America; Department of Biomedical Engineering, University of Colorado, 12705 East Montview Boulevard, Suite 100, Aurora, CO 80045, United States of America
George Hripcsak Biomedical Informatics, Columbia University, 622 W. 168th Street, PH20, New York, NY 10032, United States of America
Cecilia C Low Wang Division of Endocrinology, Metabolism and Diabetes, Department of Medicine, University of Colorado School of Medicine, 12801 East 17th Avenue, 7103, Aurora, CO 80045, United States of America
Tellen D Bennett Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America
Julia Wrobel Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Rd, NE Atlanta, GA 30322, United States of America
Caroline Der Nigoghossian Columbia University School of Nursing, 560 West 168th Street, New York, NY 10032, United States of America
Scott W Mueller Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, 12850 East Montview Boulevard, Aurora, CO 80045, United States of America
Jan Claassen The Neurological Institute of New York, Columbia University Irving Medical Center, 710 West 168th Street, New York NY 10032, United States of America
D J Albers Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, 13001 East 17th Place, 3rd Floor, Mail Stop B119, Aurora, CO 80045, United States of America; Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America; Department of Biomedical Engineering, University of Colorado, 12705 East Montview Boulevard, Suite 100, Aurora, CO 80045, United States of America; Biomedical Informatics, Columbia University, 622 W. 168th Street, PH20, New York, NY 10032, United States of America

Zu K, Arunachalam A, Hohlbauch A, Silver M, Robert N. Real-world utilization of immune checkpoint inhibitors in extensive stage small-cell lung cancer in community settings. Immunotherapy 2023;15:1375-1387. [PMID: 37694560 DOI: 10.2217/imt-2023-0073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]

Flothow A, Novelli A, Sundmacher L. Analytical methods for identifying sequences of utilization in health data: a scoping review. BMC Med Res Methodol 2023;23:212. [PMID: 37759162 PMCID: PMC10523647 DOI: 10.1186/s12874-023-02019-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 08/08/2023] [Indexed: 09/29/2023]

Abstract

BACKGROUND

Healthcare, as with other sectors, has undergone progressive digitalization, generating an ever-increasing wealth of data that enables research and the analysis of patient movement. This can help to evaluate treatment processes and outcomes, and in turn improve the quality of care. This scoping review provides an overview of the algorithms and methods that have been used to identify care pathways from healthcare utilization data.

METHOD

This review was conducted according to the methodology of the Joanna Briggs Institute and the Preferred Reporting Items for Systematic Reviews Extension for Scoping Reviews (PRISMA-ScR) Checklist. The PubMed, Web of Science, Scopus, and EconLit databases were searched and studies published in English between 2000 and 2021 considered. The search strategy used keywords divided into three categories: the method of data analysis, the requirement profile for the data, and the intended presentation of results. Criteria for inclusion were that health data were analyzed, the methodology used was described and that the chronology of care events was considered. In a two-stage review process, records were reviewed by two researchers independently for inclusion. Results were synthesized narratively.

RESULTS

The literature search yielded 2,865 entries; 51 studies met the inclusion criteria. Health data from different countries and of different types of disease were analyzed with respect to different care events. Applied methods can be divided into those identifying subsequences of care and those describing full care trajectories. Variants of pattern mining or Markov models were mostly used to extract subsequences, with clustering often applied to find care trajectories. Statistical algorithms such as rule mining, probability-based machine learning algorithms or a combination of methods were also applied. Clustering methods were sometimes used for data preparation or result compression. Further characteristics of the included studies are presented.

CONCLUSION

Various data mining methods are already being applied to gain insight from health data. The great heterogeneity of the methods used shows the need for a scoping review. We performed a narrative review and found that clustering methods currently dominate the literature for identifying complete care trajectories, while variants of pattern mining dominate for identifying subsequences of limited length.

Carlos Souto Maior Borba MA, de Mendonça Batista P, Falcão Almeida M, do Carmo Rego MA, Brandão Serra F, Barbour Oliveira JC, Nakajima K, Silva Julian G, Amorim G. Treatment patterns and healthcare resource utilization for triple negative breast cancer in the Brazilian private healthcare system: a database study. Sci Rep 2023;13:15785. [PMID: 37737435 PMCID: PMC10516856 DOI: 10.1038/s41598-023-43131-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2022] [Accepted: 09/20/2023] [Indexed: 09/23/2023]

O'Rourke J, Warnick J, Doole J, De Keyser L, Drebert Z, Wan O, Thompson CN, London JW, Fairchild K, Palchuk MB. Exploring Breast Cancer Systemic Drug Therapy Patterns in Real-World Data. JCO Clin Cancer Inform 2023;7:e2300061. [PMID: 37851942 PMCID: PMC10642877 DOI: 10.1200/cci.23.00061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Revised: 07/06/2023] [Accepted: 07/07/2023] [Indexed: 10/20/2023]

Abstract

PURPOSE

To explore medications and their administration patterns in real-world patients with breast cancer.

METHODS

A retrospective study was performed using TriNetX, a federated network of deidentified, Health Insurance Portability and Accountability Act-compliant data from 21 health care organizations across North America. Patients diagnosed with breast cancer between January 1, 2013, and May 31, 2022, were included. We investigated a rule-based and unsupervised learning algorithm to extract medications and their administration patterns. To group similar administration patterns, we used three features in k-means clustering: total number of administrations, median number of days between administrations, and standard deviation of the days between administrations. We explored the first three lines of therapy for patients classified into six groups on the basis of their stage at diagnosis (early as stages I-III v late as stage IV) and the sensitivity of the tumor's receptors to targeted therapies: hormone receptor-positive/human epidermal growth factor 2-negative (HR+/ERBB2-), ERBB2-positive (ERBB2+/HR±), or triple-negative (TN; HR-/ERBB2-). To add credence to the derived regimens, we compared them to the National Comprehensive Cancer Network (NCCN): Breast Cancer (version 2.2023) recommendations.
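The three clustering features listed above can be computed from a medication's administration dates roughly as follows. This is a sketch, not the study's code: the function name `administration_features` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the abstract does not say which variant was used.

```python
from datetime import date
from statistics import median, pstdev


def administration_features(dates):
    """Return (total administrations, median gap in days, SD of gaps in days).

    dates: administration dates for one medication in one patient.
    A single administration has no gaps, so both gap features default to 0.
    """
    ordered = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return (len(ordered), 0, 0.0)
    # Assumption: population standard deviation; the source does not specify.
    return (len(ordered), median(gaps), pstdev(gaps))


# Made-up example: four administrations spaced exactly 21 days apart.
every_three_weeks = [date(2022, 1, 3), date(2022, 1, 24),
                     date(2022, 2, 14), date(2022, 3, 7)]
features = administration_features(every_three_weeks)
```

Feature tuples like these, one per medication and patient, are the kind of input a k-means step would group; a regular schedule shows up as a gap standard deviation near zero.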

RESULTS

In early-stage HR+/ERBB2- and TN groups, the most common regimens were (1) cyclophosphamide and docetaxel, administered once every 3 weeks for three to six cycles and (2) cyclophosphamide and doxorubicin, administered once every 2 weeks for four cycles, followed by paclitaxel administered once every week for 12 cycles. In the early-stage ERBB2+/HR± group, most patients were administered carboplatin and docetaxel with or without pertuzumab and with trastuzumab (for six or more cycles). Medications most commonly administered in our data set (7,798 patients) agreed with recommendations from the NCCN in terms of medications (regimens), number of administrations (cycles), and days between administrations (cycle length).

CONCLUSION

Although there is a general agreement with the NCCN Guidelines, real-world medication data exhibit variability in the medications and their administration patterns.

Wang Y, Stroh JN, Hripcsak G, Low Wang CC, Bennett TD, Wrobel J, Der Nigoghossian C, Mueller S, Claassen J, Albers DJ. A methodology of phenotyping ICU patients from EHR data: high-fidelity, personalized, and interpretable phenotypes estimation. medRxiv 2023:2023.03.15.23287315. [PMID: 37662404 PMCID: PMC10473766 DOI: 10.1101/2023.03.15.23287315] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/05/2023]

Gendrin A, Souliotis L, Loudon-Griffiths J, Aggarwal R, Amoako D, Desouza G, Dimitrievska S, Metcalfe P, Louvet E, Sahni H. Identifying Patient Populations in Texts Describing Drug Approvals Through Deep Learning-Based Information Extraction: Development of a Natural Language Processing Algorithm. JMIR Form Res 2023;7:e44876. [PMID: 37347514 DOI: 10.2196/44876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 03/30/2023] [Accepted: 04/17/2023] [Indexed: 06/23/2023]

Abstract

BACKGROUND

New drug treatments are regularly approved, and it is challenging to remain up-to-date in this rapidly changing environment. Fast and accurate visualization is important to allow a global understanding of the drug market. Automation of this information extraction provides a helpful starting point for the subject matter expert, helps to mitigate human errors, and saves time.

OBJECTIVE

We aimed to semiautomate disease population extraction from the free text of oncology drug approval descriptions from the BioMedTracker database for 6 selected drug targets. More specifically, we intended to extract (1) line of therapy, (2) stage of cancer of the patient population described in the approval, and (3) the clinical trials that provide evidence for the approval. We aimed to use these results in downstream applications, aiding the searchability of relevant content against related drug project sources.

METHODS

We fine-tuned a state-of-the-art deep learning model, Bidirectional Encoder Representations from Transformers, for each of the 3 desired outputs. We independently applied rule-based text mining approaches. We compared the performances of deep learning and rule-based approaches and selected the best method, which was then applied to new entries. The results were manually curated by a subject matter expert and then used to train new models.

RESULTS

The training data set is currently small (433 entries) and will grow over time as new approval descriptions become available or if a choice is made to take another drug target into account. The deep learning models achieved 61% and 56% 5-fold cross-validated accuracies for line of therapy and stage of cancer, respectively, which were treated as classification tasks. Trial identification is treated as a named entity recognition task, and the 5-fold cross-validated F₁-score is currently 87%. Although the scores of the classification tasks could seem low, the models comprise 5 classes each, and such scores are a marked improvement when compared to random classification. Moreover, we expect improved performance as the input data set grows, since deep learning models need to be trained on a large enough amount of data to be able to learn the task they are taught. The rule-based approach achieved 60% and 74% 5-fold cross-validated accuracies for line of therapy and stage of cancer, respectively. No attempt was made to define a rule-based approach for trial identification.

CONCLUSIONS

We developed a natural language processing algorithm that is currently assisting subject matter experts in disease population extraction, which supports health authority approvals. This algorithm achieves semiautomation, enabling subject matter experts to leverage the results for deeper analysis and to accelerate information retrieval in a crowded clinical environment such as oncology.

Zhou Y, Shi J, Stein R, Liu X, Baldassano RN, Forrest CB, Chen Y, Huang J. Missing data matter: an empirical evaluation of the impacts of missing EHR data in comparative effectiveness research. J Am Med Inform Assoc 2023;30:1246-1256. [PMID: 37337922 PMCID: PMC10280351 DOI: 10.1093/jamia/ocad066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Revised: 03/20/2023] [Accepted: 04/08/2023] [Indexed: 06/21/2023]

Kim DH, Jensen A, Jones K, Raghavan S, Phillips LS, Hung A, Sun YV, Li G, Reaven P, Zhou H, Zhou JJ. A platform for phenotyping disease progression and associated longitudinal risk factors in large-scale EHRs, with application to incident diabetes complications in the UK Biobank. JAMIA Open 2023;6:ooad006. [PMID: 36789288 PMCID: PMC9912368 DOI: 10.1093/jamiaopen/ooad006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2022] [Revised: 01/19/2023] [Accepted: 01/31/2023] [Indexed: 02/12/2023] Open

Abstract

Objective

Modern healthcare data reflect massive multi-level and multi-scale information collected over many years. The majority of the existing phenotyping algorithms use case-control definitions of disease. This paper aims to study the time to disease onset and progression and identify the time-varying risk factors that drive them.

Materials and Methods

We developed an algorithmic approach to phenotyping the incidence of diseases by consolidating data sources from the UK Biobank (UKB), including primary care electronic health records (EHRs). The approach comprises defining events, event dates, and their censoring times; including relevant terms and existing phenotypes; excluding generic, rare, or semantically distant terms; forward-mapping terminology terms; and expert review. We applied our approach to phenotyping diabetes complications, including a composite cardiovascular disease (CVD) outcome, diabetic kidney disease (DKD), and diabetic retinopathy (DR), in the UKB study.

Results

We identified 49 049 participants with diabetes. Among them, 1023 had type 1 diabetes (T1D), and 40 193 had type 2 diabetes (T2D). A total of 23 833 diabetes subjects had linked primary care records. There were 3237, 3113, and 4922 patients with CVD, DKD, and DR events, respectively. The risk prediction performance for each outcome was assessed, and the resulting areas under the receiver operating characteristic curve (AUCs) are consistent with those of standard risk prediction models from cohort studies.

Discussion and Conclusion

Our publicly available pipeline and platform enable streamlined curation of incidence events, identification of time-varying risk factors underlying disease progression, and the definition of a relevant cohort for time-to-event analyses. These important steps need to be considered simultaneously to study disease progression.

Dragoni M, Eccher C, Ferro A, Bailoni T, Maimone R, Zorzi A, Bacchiega A, Stulzer G, Ghidini C. Supporting patients and clinicians during the breast cancer care path with AI: The Arianna solution. Artif Intell Med 2023;138:102514. [PMID: 36990591 DOI: 10.1016/j.artmed.2023.102514] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2022] [Revised: 12/09/2022] [Accepted: 02/17/2023] [Indexed: 02/24/2023]

Kerchberger VE, Peterson JF, Wei WQ. Scanning the medical phenome to identify new diagnoses after recovery from COVID-19 in a US cohort. J Am Med Inform Assoc 2023;30:233-244. [PMID: 36005898 PMCID: PMC9452157 DOI: 10.1093/jamia/ocac159] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2022] [Revised: 06/29/2022] [Accepted: 08/23/2022] [Indexed: 01/20/2023] Open

Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023;30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open

Abstract

OBJECTIVE

Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used.

MATERIALS AND METHODS

We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies.

RESULTS

Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions.

DISCUSSION

Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released.

CONCLUSION

Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.

Epstein RS, Nelms J, Moran D, Girman C, Huang H, Chioda M. Treatment patterns and burden of myelosuppression for patients with small cell lung cancer: A SEER-medicare study. Cancer Treat Res Commun 2022;31:100555. [PMID: 35421820 DOI: 10.1016/j.ctarc.2022.100555] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2022] [Revised: 03/28/2022] [Accepted: 03/29/2022] [Indexed: 06/14/2023]

Abstract

PURPOSE

To depict the treatment journey for patients with small cell lung cancer (SCLC) and evaluate health care resource utilization (HCRU) associated with myelosuppression, a complication induced by chemotherapy or chemotherapy plus radiation therapy.

PATIENTS AND METHODS

This was a descriptive, retrospective study of patients with SCLC aged ≥65 years, identified from linked Surveillance, Epidemiology, and End Results (SEER)-Medicare data curated between January 2012 and December 2015. Treatment types (chemotherapy, radiation therapy, surgery) were classified as first, second, or third line, depending on the temporal sequence in which regimens were prescribed. For each year, the proportions of patients completing 4- or 6-cycle chemotherapy regimens, with hospital admissions associated with myelosuppression, or who used granulocyte colony-stimulating factors (G-CSFs), blood/platelet transfusions, or erythropoiesis-stimulating agents (ESAs), were calculated.
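The classification of treatments by temporal sequence can be illustrated with a minimal sketch that orders each patient's regimens by start date and numbers them consecutively. This is not the study's algorithm: the record layout, the function name `assign_lines`, and the example regimens are hypothetical, and real line-of-therapy rules typically also handle treatment gaps, regimen modifications, and maintenance therapy.

```python
from collections import defaultdict
from datetime import date


def assign_lines(records):
    """Number each patient's regimens 1, 2, 3, ... in order of start date.

    records: iterable of (patient_id, regimen, start_date) tuples.
    Returns {patient_id: [(line_number, regimen, start_date), ...]}.
    """
    by_patient = defaultdict(list)
    for patient_id, regimen, start in records:
        by_patient[patient_id].append((start, regimen))

    lines = {}
    for patient_id, items in by_patient.items():
        # Sort chronologically; the sort is stable, so same-day ties keep input order.
        items.sort(key=lambda item: item[0])
        lines[patient_id] = [
            (number, regimen, start)
            for number, (start, regimen) in enumerate(items, start=1)
        ]
    return lines


# Hypothetical records for illustration only.
records = [
    ("p1", "radiation therapy", date(2013, 6, 1)),
    ("p1", "chemotherapy", date(2013, 1, 15)),
    ("p2", "chemotherapy", date(2014, 3, 2)),
]
for patient_id, assigned in sorted(assign_lines(records).items()):
    print(patient_id, [(number, regimen) for number, regimen, _ in assigned])
```

Capping the numbering at three lines, as the study's first, second, and third line categories suggest, would be a small extension of this sketch.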

RESULTS

Chemotherapy was administered as initial treatment in 7,807/11,907 (65.6%) patients whose treatment journey was recorded. Approximately one-third (n = 3,985) subsequently received radiation therapy. In total, 5,791 (57.8%) patients completed the guideline-recommended 4-6 cycles of chemotherapy. Among all chemotherapy-treated patients, 10,370 (74.3%) experienced ≥1 inpatient admission associated with myelosuppression (anemia, 7,366 [52.8%]; neutropenia, 4,642 [33.3%]; thrombocytopenia, 2,375 [17.0%]; pancytopenia, 1,983 [14.2%]). Supportive care interventions included G-CSF (6,756 [48.4%] patients), ESAs (1,534 [11.0%]), and transfusions (3,674 [26.3%]).

CONCLUSION

Chemotherapy remains a cornerstone of care for patients with SCLC. Slightly over half of patients completed the recommended number of cycles, underscoring the frailty of patients and aggressiveness of SCLC. HCRU associated with myelosuppression was prominent, suggesting a substantial burden on older patients with SCLC.

Meng W, Mosesso KM, Lane KA, Roberts AR, Griffith A, Ou W, Dexter PR. An Automated Line-of-Therapy Algorithm for Adults With Metastatic Non-Small Cell Lung Cancer: Validation Study Using Blinded Manual Chart Review. JMIR Med Inform 2021;9:e29017. [PMID: 34636730 PMCID: PMC8548977 DOI: 10.2196/29017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2021] [Revised: 06/22/2021] [Accepted: 07/02/2021] [Indexed: 11/25/2022] Open

Abstract

BACKGROUND

Extraction of line-of-therapy (LOT) information from electronic health record and claims data is essential for determining longitudinal changes in systemic anticancer therapy in real-world clinical settings.

OBJECTIVE

The aim of this retrospective cohort analysis is to validate and refine our previously described open-source LOT algorithm by comparing the output of the algorithm with results obtained through blinded manual chart review.

METHODS

We used structured electronic health record data and clinical documents to identify 500 adult patients treated for metastatic non-small cell lung cancer with systemic anticancer therapy from 2011 to mid-2018; we assigned patients to training (n=350) and test (n=150) cohorts, randomly divided proportional to the overall ratio of simple:complex cases (n=254:246). Simple cases were patients who received one LOT and no maintenance therapy; complex cases were patients who received more than one LOT and/or maintenance therapy. Algorithmic changes were performed using the training cohort data, after which the refined algorithm was evaluated against the test cohort.
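A split that preserves the simple:complex ratio can be sketched as a stratified random split. The abstract does not describe how the division was implemented, so the per-stratum rounding, the fixed seed, and the patient identifiers below are assumptions made for illustration; only the cohort sizes (254 simple, 246 complex, 350 training, 150 test) come from the text.

```python
import random


def stratified_split(cases, train_fraction, seed=0):
    """Split patient IDs into train and test lists, stratum by stratum.

    cases: dict mapping patient_id -> stratum label (e.g. "simple" or "complex").
    """
    rng = random.Random(seed)
    strata = {}
    for patient_id, label in cases.items():
        strata.setdefault(label, []).append(patient_id)

    train, test = [], []
    for label in sorted(strata):
        ids = sorted(strata[label])
        rng.shuffle(ids)
        # Take the same fraction from every stratum so the ratio carries over.
        cut = round(len(ids) * train_fraction)
        train.extend(ids[:cut])
        test.extend(ids[cut:])
    return train, test


# Hypothetical patient IDs; only the counts are taken from the abstract.
cases = {f"simple_{i}": "simple" for i in range(254)}
cases.update({f"complex_{i}": "complex" for i in range(246)})

train, test = stratified_split(cases, train_fraction=0.7)
print(len(train), len(test))
```

With these counts the per-stratum rounding happens to give exactly 350 training and 150 test patients; for other sizes the totals can be off by one per stratum and may need adjusting.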

RESULTS

For simple cases, 16 instances of discordance between the LOT algorithm and chart review prerefinement were reduced to 8 instances postrefinement; in the test cohort, there was no discordance between algorithm and chart review. For complex cases, algorithm refinement reduced the discordance from 68 to 62 instances, with 37 instances in the test cohort. The percentage agreement between LOT algorithm output and chart review for patients who received one LOT was 89% prerefinement, 93% postrefinement, and 93% for the test cohort, whereas the likelihood of precise matching between algorithm output and chart review decreased with an increasing number of unique regimens. Several areas of discordance that arose from differing definitions of LOTs and maintenance therapy could not be objectively resolved because of a lack of precise definitions in the medical literature.

CONCLUSIONS

Our findings identify common sources of discordance between the LOT algorithm and clinician documentation, providing the possibility of targeted algorithm refinement.

Hess LM, Li X, Wu Y, Goodloe RJ, Cui ZL. Defining treatment regimens and lines of therapy using real-world data in oncology. Future Oncol 2021;17:1865-1877. [DOI: 10.2217/fon-2020-1041] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Open

Majnarić LT, Babič F, O’Sullivan S, Holzinger A. AI and Big Data in Healthcare: Towards a More Comprehensive Research Framework for Multimorbidity. J Clin Med 2021;10:jcm10040766. [PMID: 33672914 PMCID: PMC7918668 DOI: 10.3390/jcm10040766] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2020] [Revised: 02/02/2021] [Accepted: 02/11/2021] [Indexed: 12/11/2022] Open

Guo Y, Yang PT, Wang ZW, Xu K, Kou WH, Luo H. Identification of Three Autophagy-Related Long Non-Coding RNAs as a Novel Head and Neck Squamous Cell Carcinoma Prognostic Signature. Front Oncol 2021;10:603864. [PMID: 33575215 PMCID: PMC7871905 DOI: 10.3389/fonc.2020.603864] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2020] [Accepted: 11/09/2020] [Indexed: 01/08/2023] Open

Abstract

Head and neck squamous cell carcinoma (HNSCC) has a poor prognosis. Considerable evidence indicates that autophagy and non-coding RNA play essential roles in the biological processes involved in cancers, but associations between autophagy-related long non-coding RNAs (lncRNAs) and HNSCC remain unclear. In the present study, HNSCC RNA sequences and autophagy-related gene data were extracted from The Cancer Genome Atlas database and the Human Autophagy Database. A total of 1,153 autophagy-related lncRNAs were selected by calculating Pearson’s correlation coefficient. Three prognosis-related autophagy lncRNAs were identified via univariate Cox regression, least absolute shrinkage and selection operator analysis, and multivariate Cox regression analysis. We also constructed a prognostic model based on these autophagy-related lncRNAs and evaluated its ability to accurately and independently predict the prognosis of HNSCC patients. The area under the curve (AUC) was 0.864 (3-year) and 0.836 (5-year), and our model can independently predict the prognosis of HNSCC. The prognostic value of the three autophagy lncRNAs was confirmed via analysis of samples from five databases. To further identify the functions of the three lncRNAs, a co-expression network was constructed and pathway analysis was performed. In that analysis, the lncRNAs were correlated with 189 related genes and 20 autophagy-related genes, and they were mainly involved in homologous recombination, the Fanconi anemia pathway, the autophagy-related pathway, and immune-related pathways. In addition, we validated the expression levels of the three lncRNAs and autophagy markers (ATG12, BECN1, and MAP1LC3B) based on TIMER, Oncomine, and HPA database analysis. Our results indicated that TTTY15 was increased in HPV-positive and HPV-negative HNSCC patients, and the three autophagy markers were up-regulated in all HNSCC patients. Lastly, an association analysis between the three lncRNAs and the autophagy markers was performed, and our results showed that TTTY15 and MIF-AS1 were associated with autophagy markers. Collectively, these results suggest that the three autophagy-related lncRNAs have prognostic value in HNSCC patients.

Digital systems for improving outcomes in patients with primary immune defects. Curr Opin Pediatr 2020;32:772-779. [PMID: 33060445 DOI: 10.1097/mop.0000000000000963] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Weng C, Shah NH, Hripcsak G. Deep phenotyping: Embracing complexity and temporality-Towards scalability, portability, and interoperability. J Biomed Inform 2020;105:103433. [PMID: 32335224 PMCID: PMC7179504 DOI: 10.1016/j.jbi.2020.103433] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2020] [Accepted: 04/20/2020] [Indexed: 01/07/2023]