1. Hasan N, Mehrotra K, Danzig CJ, Eichenbaum DA, Ewald A, Regillo C, Momenaei B, Sheth VS, Lally DR, Chhablani J. Screen failures in clinical trials in retina. Ophthalmol Retina 2024:S2468-6530(24)00263-X. [PMID: 38810882 DOI: 10.1016/j.oret.2024.05.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 05/16/2024] [Accepted: 05/21/2024] [Indexed: 05/31/2024]
Abstract
PURPOSE Disparities in clinical trials are a major problem due to significant underrepresentation of certain gender, racial and ethnic groups. Several factors, including stringent eligibility criteria and recruitment strategies, hinder our understanding of retinal disease. Thus, we aimed to study the reasons for screen failures and the patient and study characteristics associated with them. DESIGN This is a cross-sectional retrospective study. METHODS Screening data of 87 trials from 6 centers were analyzed. Study characteristics (disease studied, phase of trial, route of drug administration) and patient demographics (age, gender, race, ethnicity, and employment status) were compared among different causes of screen failures. Screen failures were broadly classified into six categories: exclusion due to vision-based criteria, exclusion due to imaging findings, exclusion due to other factors, patient-related criteria, physician-related criteria and miscellaneous. Descriptive statistics, Pearson Chi-square test and ANOVA were used for statistical analysis. MAIN OUTCOME MEASURES Prevalence of the various reasons for screen failures across multiple trials and their trends among different study and patient characteristics. RESULTS Among 87 trials and 962 patients, 465 (48.2%) patients were successfully randomized and 497 (51.8%) patients were classified as screen failures. The trials were conducted for various retinal diseases. Mean age was 76.50 ± 10.45 years and 59.4% were female. Screened patients were predominantly White (93.4%) and unemployed or retired (66.6%). Of the 497 screen failures, the largest share was due to patients not meeting inclusion criteria for imaging findings (n = 221 [44.5%]), followed by vision-based criteria (n = 73 [14.7%]), exclusion due to other factors (n = 75 [15.1%]), patient-related (n = 34 [6.8%]), physician-related (n = 28 [5.6%]) and miscellaneous reasons (n = 39 [7.8%]). Reason for screen failure was not available for 27 (5.4%) patients. A higher proportion of patients screened for surgical trials (15%) declined to participate in the study compared to non-invasive trials involving topical drugs and photobiomodulation (0%) (p = 0.02). CONCLUSION Failure to meet imaging and vision-based criteria was the most common reason for screen failure. White and unemployed patients predominantly participated in clinical trials. Patients are more inclined to continue participation in non-invasive clinical trials than in surgical trials. Better recruitment strategies and careful consideration of study criteria can aid in decreasing the rate of screen failures.
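The category counts reported in this abstract can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the study: the dictionary keys are shortened labels chosen here for illustration, and only the counts themselves come from the abstract.

```python
# Screen-failure counts by category, as reported in the abstract.
# The key names are illustrative labels, not the study's own wording.
failure_counts = {
    "imaging findings": 221,
    "vision-based criteria": 73,
    "other exclusion factors": 75,
    "patient-related": 34,
    "physician-related": 28,
    "miscellaneous": 39,
    "reason not available": 27,
}


def percentage_breakdown(counts):
    """Return each category's share of the total in percent, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_failures = sum(failure_counts.values())
shares = percentage_breakdown(failure_counts)

print(f"total screen failures: {total_failures}")
for name, share in shares.items():
    print(f"{name}: {failure_counts[name]} ({share}%)")
```

Recomputing the shares from the raw counts is a quick way to confirm that the categories add up to the stated number of screen failures.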
Affiliation(s)
- Nasiq Hasan
  - Ophthalmology, UPMC, Pittsburgh, PA, United States
- David A Eichenbaum
  - Retina Vitreous Associates of Florida, Saint Petersburg, FL; Morsani College of Medicine at the University of South Florida, Tampa, FL, United States
- Amy Ewald
  - Retina Vitreous Associates of Florida, Saint Petersburg, FL; Morsani College of Medicine at the University of South Florida, Tampa, FL, United States
- Carl Regillo
  - Retina Service, Wills Eye Hospital, Thomas Jefferson University, Philadelphia, PA, United States
- Bita Momenaei
  - Retina Service, Wills Eye Hospital, Thomas Jefferson University, Philadelphia, PA, United States
- Veeral S Sheth
  - University Retina and Macula Associates PC, Oak Forest, IL, United States
- David R Lally
  - New England Retina Associates, Springfield, MA, United States
2. Gulden C, Macho P, Reinecke I, Strantz C, Prokosch HU, Blasini R. recruIT: A cloud-native clinical trial recruitment support system based on Health Level 7 Fast Healthcare Interoperability Resources (HL7 FHIR) and the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). Comput Biol Med 2024; 174:108411. [PMID: 38626510 DOI: 10.1016/j.compbiomed.2024.108411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 03/17/2024] [Accepted: 04/02/2024] [Indexed: 04/18/2024]
Abstract
BACKGROUND Clinical trials (CTs) are foundational to the advancement of evidence-based medicine, and recruiting a sufficient number of participants is one of the crucial steps to their successful conduct. Yet, poor recruitment remains the most frequent reason for premature discontinuation or costly extension of clinical trials. METHODS We designed and implemented a novel, open-source software system to support the recruitment process in clinical trials by generating automatic recruitment recommendations. The development is guided by modern, cloud-native design principles and based on Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) as an interoperability standard with the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) being used as a source of patient data. We evaluated the usability using the system usability scale (SUS) after deploying the application for use by study personnel. RESULTS The implementation is based on the OMOP CDM as a repository of patient data that is continuously queried for possible trial candidates based on given clinical trial eligibility criteria. A web-based screening list can be used to display the candidates and email notifications about possible new trial participants can be sent automatically. All interactions between services use HL7 FHIR as the communication standard. The system can be installed using standard container technology and supports more sophisticated deployments on Kubernetes clusters. End-users (n = 19) rated the system with a SUS score of 79.9/100. CONCLUSION We contribute a novel, open-source implementation to support the patient recruitment process in clinical trials that can be deployed using state-of-the-art technologies. According to the SUS score, the system provides good usability.
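For readers unfamiliar with how a SUS score out of 100 is derived, the following minimal Python sketch applies the standard SUS scoring rule (ten items rated 1 to 5; odd items contribute the response minus 1, even items contribute 5 minus the response; the sum is multiplied by 2.5). The responses are hypothetical and are not the study's data; the function and variable names are chosen here for illustration.

```python
def sus_score(responses):
    """Score one completed SUS questionnaire (ten items, each rated 1-5)."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1  # odd items are positively worded
        else:
            total += 5 - response  # even items are negatively worded
    return total * 2.5  # rescale the 0-40 sum to 0-100


# Hypothetical questionnaires from two users (not taken from the study).
hypothetical_responses = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
]
scores = [sus_score(r) for r in hypothetical_responses]
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

A study-level figure such as the one reported here would typically be the mean of the individual users' scores, as in the last step of the sketch.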
Affiliation(s)
- Christian Gulden
  - Friedrich-Alexander-Universität Erlangen-Nürnberg, Department of Medical Informatics, Biometrics and Epidemiology, Medical Informatics, Erlangen, Germany
- Philipp Macho
  - Medical Informatics, Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- Ines Reinecke
  - Carl Gustav Carus Faculty of Medicine, Center for Medical Informatics, Institute for Medical Informatics and Biometry, Technische Universität Dresden, Dresden, Germany
- Cosima Strantz
  - Friedrich-Alexander-Universität Erlangen-Nürnberg, Department of Medical Informatics, Biometrics and Epidemiology, Medical Informatics, Erlangen, Germany
- Hans-Ulrich Prokosch
  - Friedrich-Alexander-Universität Erlangen-Nürnberg, Department of Medical Informatics, Biometrics and Epidemiology, Medical Informatics, Erlangen, Germany
- Romina Blasini
  - Institute of Medical Informatics, Justus Liebig University, Giessen, Germany
3. Beattie J, Neufeld S, Yang D, Chukwuma C, Gul A, Desai N, Jiang S, Dohopolski M. Utilizing Large Language Models for Enhanced Clinical Trial Matching: A Study on Automation in Patient Screening. Cureus 2024; 16:e60044. [PMID: 38854210 PMCID: PMC11162699 DOI: 10.7759/cureus.60044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/10/2024] [Indexed: 06/11/2024] [Open Access]
Abstract
Background Clinical trial matching, essential for advancing medical research, involves detailed screening of potential participants to ensure alignment with specific trial requirements. Research staff face challenges due to the high volume of eligible patients and the complexity of varying eligibility criteria. The traditional manual process, both time-consuming and error-prone, often leads to missed opportunities. Recently, large language models (LLMs), specifically generative pre-trained transformers (GPTs), have become impressive and impactful tools. Using such tools from artificial intelligence (AI) and natural language processing (NLP) may enhance the accuracy and efficiency of this process through automated patient screening against established criteria. Methods We used 202 longitudinal patient records from the National NLP Clinical Challenges (n2c2) 2018 Challenge. These records were annotated by medical professionals and evaluated against 13 selection criteria encompassing various health assessments. Our approach involved embedding medical documents into a vector database to determine relevant document sections and then using an LLM (OpenAI's GPT-3.5 Turbo and GPT-4) in tandem with structured and chain-of-thought prompting techniques for systematic document assessment against the criteria. Misclassified criteria were also examined to identify classification challenges. Results This study achieved an accuracy of 0.81, sensitivity of 0.80, specificity of 0.82, and a micro F1 score of 0.79 using GPT-3.5 Turbo, and an accuracy of 0.87, sensitivity of 0.85, specificity of 0.89, and micro F1 score of 0.86 using GPT-4. Notably, some criteria in the ground truth appeared mislabeled, an issue we could not explore further due to insufficient label generation guidelines on the website. Conclusion Our findings underscore the potential of AI and NLP technologies, including LLMs, in the clinical trial matching process. 
The study demonstrated strong capabilities in identifying eligible patients and minimizing false inclusions. Such automated systems promise to alleviate the workload of research staff and improve clinical trial enrollment, thus accelerating the process and enhancing the overall feasibility of clinical research. Further work is needed to determine the potential of this approach when implemented on real clinical data.
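The four metrics reported in this abstract all derive from confusion-matrix counts. The following minimal Python sketch shows one common way to compute them, with micro-averaging done by pooling counts over all criteria. The counts are hypothetical, the function name is chosen here for illustration, and whether the study pooled its 13 criteria in exactly this way is an assumption.

```python
def pooled_screening_metrics(per_criterion_counts):
    """Micro-average metrics by pooling confusion counts over all criteria.

    Each entry is a (tp, fp, tn, fn) tuple for one eligibility criterion.
    """
    tp = sum(c[0] for c in per_criterion_counts)
    fp = sum(c[1] for c in per_criterion_counts)
    tn = sum(c[2] for c in per_criterion_counts)
    fn = sum(c[3] for c in per_criterion_counts)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),    # share of truly met criteria found
        "specificity": tn / (tn + fp),    # share of unmet criteria rejected
        "micro_f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts for two criteria (not taken from the study).
hypothetical_counts = [(8, 2, 9, 1), (2, 0, 16, 2)]
metrics = pooled_screening_metrics(hypothetical_counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

High specificity corresponds to few false inclusions, while high sensitivity corresponds to few missed eligible patients.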
Affiliation(s)
- Jacob Beattie
  - Department of Radiation Oncology, University of Texas (UT) Southwestern Medical Center, Dallas, USA
- Sarah Neufeld
  - Department of Radiation Oncology, University of Texas (UT) Southwestern Medical Center, Dallas, USA
- Daniel Yang
  - Department of Radiation Oncology, University of Texas (UT) Southwestern Medical Center, Dallas, USA
- Christian Chukwuma
  - Department of Radiation Oncology, University of Texas (UT) Southwestern Medical Center, Dallas, USA
- Ahmed Gul
  - Department of Radiation Oncology, University of Texas (UT) Southwestern Medical Center, Dallas, USA
- Neil Desai
  - Department of Radiation Oncology, University of Texas (UT) Southwestern Medical Center, Dallas, USA
- Steve Jiang
  - Department of Radiation Oncology, University of Texas (UT) Southwestern Medical Center, Dallas, USA
- Michael Dohopolski
  - Department of Radiation Oncology, University of Texas (UT) Southwestern Medical Center, Dallas, USA
4. Nievas M, Basu A, Wang Y, Singh H. Distilling large language models for matching patients to clinical trials. J Am Med Inform Assoc 2024:ocae073. [PMID: 38641416 DOI: 10.1093/jamia/ocae073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 03/14/2024] [Accepted: 03/25/2024] [Indexed: 04/21/2024] [Open Access]
Abstract
OBJECTIVE The objective of this study is to systematically examine the efficacy of both proprietary (GPT-3.5, GPT-4) and open-source large language models (LLMs) (LLAMA 7B, 13B, 70B) in the context of matching patients to clinical trials in healthcare. MATERIALS AND METHODS The study employs a multifaceted evaluation framework, incorporating extensive automated and human-centric assessments along with a detailed error analysis for each model, and assesses LLMs' capabilities in analyzing patient eligibility against clinical trials' inclusion and exclusion criteria. To improve the adaptability of open-source LLMs, a specialized synthetic dataset was created using GPT-4, facilitating effective fine-tuning under constrained data conditions. RESULTS The findings indicate that open-source LLMs, when fine-tuned on this limited and synthetic dataset, achieve performance parity with their proprietary counterparts, such as GPT-3.5. DISCUSSION This study highlights the recent success of LLMs in the high-stakes domain of healthcare, specifically in patient-trial matching. The research demonstrates the potential of open-source models to match the performance of proprietary models when fine-tuned appropriately, addressing challenges like cost, privacy, and reproducibility concerns associated with closed-source proprietary LLMs. CONCLUSION The study underscores the opportunity for open-source LLMs in patient-trial matching. To encourage further research and applications in this field, the annotated evaluation dataset and the fine-tuned LLM, Trial-LLAMA, are released for public use.
Affiliation(s)
- Mauro Nievas
  - Triomics Research, Triomics, Inc., San Francisco, CA 94105, United States
- Aditya Basu
  - Triomics Research, Triomics, Inc., Bengaluru, Karnataka 560102, India
- Yanshan Wang
  - Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA 15260, United States
- Hrituraj Singh
  - Triomics Research, Triomics, Inc., San Francisco, CA 94105, United States
5. Wang K, Cui H, Zhu Y, Hu X, Hong C, Guo Y, An L, Zhang Q, Liu L. Evaluation of an artificial intelligence-based clinical trial matching system in Chinese patients with hepatocellular carcinoma: a retrospective study. BMC Cancer 2024; 24:246. [PMID: 38388861 PMCID: PMC10885498 DOI: 10.1186/s12885-024-11959-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 02/05/2024] [Indexed: 02/24/2024] [Open Access]
Abstract
BACKGROUND Artificial intelligence (AI)-assisted clinical trial screening is a promising prospect, although previous matching systems were developed in English, and relevant studies have only been conducted in Western countries. Therefore, we evaluated an AI-based clinical trial matching system (CTMS) that extracts medical data from the electronic health record system and matches them to clinical trials automatically. METHODS This study included 1,053 consecutive inpatients primarily diagnosed with hepatocellular carcinoma who were referred to the liver tumor center of an academic medical center in China between January and December 2019. The eligibility criteria extracted from two clinical trials, patient attributes, and gold standard were decided manually. We evaluated the performance of the CTMS against the established gold standard by measuring the accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and run time required. RESULTS The manual reviewers demonstrated acceptable interrater reliability (Cohen's kappa 0.65-0.88). The performance results for the CTMS were as follows: accuracy, 92.9-98.0%; sensitivity, 51.9-83.5%; specificity, 99.0-99.1%; PPV, 75.7-85.1%; and NPV, 97.4-98.9%. The time required for eligibility determination by the CTMS and manual reviewers was 2 and 150 h, respectively. CONCLUSIONS We found that the CTMS is particularly reliable in excluding ineligible patients in a significantly reduced amount of time. The CTMS excluded ineligible patients for clinical trials with good performance, reducing 98.7% of the work time. Thus, such AI-based systems with natural language processing and machine learning have potential utility in Chinese clinical trials.
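The reported reduction in work time follows directly from the two run times given in the abstract (2 h for the CTMS versus 150 h for manual review). The following minimal Python sketch reproduces that arithmetic; the function name is chosen here for illustration.

```python
def relative_time_reduction(manual_hours, automated_hours):
    """Percentage of manual work time saved by the automated system."""
    return 100 * (manual_hours - automated_hours) / manual_hours


# Run times as reported in the abstract: 150 h manual, 2 h for the CTMS.
reduction = relative_time_reduction(manual_hours=150, automated_hours=2)
print(f"work time reduced by {reduction:.1f}%")
```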
Affiliation(s)
- Kunyuan Wang
  - State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, No. 1838, North Guangzhou Avenue, Baiyun District, Guangzhou, China
- Hao Cui
  - State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, No. 1838, North Guangzhou Avenue, Baiyun District, Guangzhou, China
- Yun Zhu
  - State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, No. 1838, North Guangzhou Avenue, Baiyun District, Guangzhou, China
- Xiaoyun Hu
  - State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, No. 1838, North Guangzhou Avenue, Baiyun District, Guangzhou, China
- Chang Hong
  - State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, No. 1838, North Guangzhou Avenue, Baiyun District, Guangzhou, China
- Yabing Guo
  - State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, No. 1838, North Guangzhou Avenue, Baiyun District, Guangzhou, China
- Lingyao An
  - Research and Development Department, Huimei Technology Co., Ltd, Beijing, China
- Qi Zhang
  - Research and Development Department, Huimei Technology Co., Ltd, Beijing, China
- Li Liu
  - State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, No. 1838, North Guangzhou Avenue, Baiyun District, Guangzhou, China
  - Big Data Centre, Nanfang Hospital, Southern Medical University, Guangzhou, China
6. DO NV, ELBERS DC, FILLMORE NR, AJJARAPU S, BERGSTROM SJ, BIHN J, CORRIGAN JK, DHOND R, DIPIETRO S, DOLGIN A, FELDMAN TC, GORYACHEV SD, HUHMANN LB, Jennifer LA, MARCANTONIO PA, MCGRATH KM, MILLER SJ, NGUYEN VQ, SCHNEELOCH GR, SUNG FC, SWINNERTON KN, TARREN AH, TOSI HM, VALLEY D, VO AD, YILDIRIM C, ZHENG C, ZWOLINSKI R, SAROSY GA, LOOSE D, SHANNON C, BROPHY MT. Matching Patients to Accelerate Clinical Trials (MPACT): Enabling Technology for Oncology Clinical Trial Workflow. Stud Health Technol Inform 2024; 310:1086-1090. [PMID: 38269982 PMCID: PMC11128308 DOI: 10.3233/shti231132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2024]
Abstract
Clinical trial enrollment is impeded by the significant time burden placed on research coordinators screening eligible patients. With 50,000 new cancer cases every year, the Veterans Health Administration (VHA) has made increased access for Veterans to high-quality clinical trials a priority. To aid in this effort, we worked with research coordinators to build the MPACT (Matching Patients to Accelerate Clinical Trials) platform with a goal of improving efficiency in the screening process. MPACT supports both a trial prescreening workflow and a screening workflow, employing Natural Language Processing and Data Science methods to produce reliable phenotypes of trial eligibility criteria. MPACT can also track a patient's eligibility status over time. Qualitative feedback has been promising, with users reporting a reduction in time spent on identifying eligible patients.
Affiliation(s)
- Nhan V DO
  - VA Boston Healthcare System, Boston MA, USA
  - Boston University School of Medicine, Boston MA, USA
- Danne C ELBERS
  - VA Boston Healthcare System, Boston MA, USA
  - Harvard Medical School, Boston MA, USA
- Nathanael R FILLMORE
  - VA Boston Healthcare System, Boston MA, USA
  - Harvard Medical School, Boston MA, USA
- John BIHN
  - VA Boston Healthcare System, Boston MA, USA
- Rupali DHOND
  - VA Boston Healthcare System, Boston MA, USA
  - Boston University School of Medicine, Boston MA, USA
- Mary T BROPHY
  - VA Boston Healthcare System, Boston MA, USA
  - Boston University School of Medicine, Boston MA, USA
7. Xu Q, Liu Y, Sun D, Huang X, Li F, Zhai J, Li Y, Zhou Q, Qian N, Niu B. OncoCTMiner: streamlining precision oncology trial matching via molecular profile analysis. Database (Oxford) 2023; 2023:baad077. [PMID: 37935585 PMCID: PMC10630409 DOI: 10.1093/database/baad077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 09/08/2023] [Accepted: 10/21/2023] [Indexed: 11/09/2023]
Abstract
By establishing omics sequencing of patient tumors as a crucial element in cancer treatment, the extensive implementation of precision oncology necessitates effective and prompt execution of clinical studies for approving molecular-targeted therapies. However, the substantial volume of patient sequencing data, combined with strict clinical trial criteria, increasingly complicates the process of matching patients to precision oncology studies. To streamline enrollment in these studies, we developed OncoCTMiner, an automated pre-screening platform for molecular cancer clinical trials. Through manual tagging of eligibility criteria for 2227 oncology trials, we identified key bio-concepts such as cancer types, genes, alterations, drugs, biomarkers and therapies. Utilizing this manually annotated corpus along with open-source biomedical natural language processing tools, we trained multiple named entity recognition models specifically designed for precision oncology trials. These models analyzed 460 952 clinical trials, revealing 8.15 million precision medicine concepts, 9.32 million entity-criteria-trial triplets and a comprehensive precision oncology eligibility criteria database. Most significantly, we developed a patient-trial matching system based on cancer patients' clinical and genetic profiles, which can seamlessly integrate with the omics data analysis platform. This system expedites the pre-screening process for potentially suitable precision oncology trials, offering patients swifter access to promising treatment options. Database URL https://oncoctminer.chosenmedinfo.com.
Affiliation(s)
- Quan Xu
  - Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
  - Research and Development Center, ChosenMed Technology (Zhejiang) Co. Ltd., Room 101, Building 8, Jincheng International Science and Technology City, No. 26 Zhenxing East Road, Linping District, Hangzhou, 311103, China
- Yueyue Liu
  - Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
- Dawei Sun
  - Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
  - Research and Development Center, ChosenMed Technology (Zhejiang) Co. Ltd., Room 101, Building 8, Jincheng International Science and Technology City, No. 26 Zhenxing East Road, Linping District, Hangzhou, 311103, China
- Xiaoqian Huang
  - Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
- Feihong Li
  - Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
- JinCheng Zhai
  - Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
- Yang Li
  - Beijing International Center for Mathematical Research, Peking University, No. 5 Yiheyuan Road Haidian District, Beijing 100871, China
  - Chongqing Research Institute of Big Data, Peking University, Chongqing 401333, China
- Qiming Zhou
  - Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
  - Research and Development Center, ChosenMed Technology (Zhejiang) Co. Ltd., Room 101, Building 8, Jincheng International Science and Technology City, No. 26 Zhenxing East Road, Linping District, Hangzhou, 311103, China
- Niansong Qian
  - Department of Oncology, Senior Department of Respiratory and Critical Care Medicine, The Eighth Medical Center of Chinese PLA General Hospital, No.17 A Heishanhu Road, Haidian District, Beijing 100853, China
- Beifang Niu
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100190, China
8. Idnay B, Fang Y, Butler A, Moran J, Li Z, Lee J, Ta C, Liu C, Yuan C, Chen H, Stanley E, Hripcsak G, Larson E, Marder K, Chung W, Ruotolo B, Weng C. Uncovering key clinical trial features influencing recruitment. J Clin Transl Sci 2023; 7:e199. [PMID: 37830010 PMCID: PMC10565197 DOI: 10.1017/cts.2023.623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Revised: 08/21/2023] [Accepted: 08/22/2023] [Indexed: 10/14/2023] [Open Access]
Abstract
Background Randomized clinical trials (RCT) are the foundation for medical advances, but participant recruitment remains a persistent barrier to their success. This retrospective data analysis aims to (1) identify clinical trial features associated with successful participant recruitment measured by accrual percentage and (2) compare the characteristics of the RCTs by assessing the most and least successful recruitment, which are indicated by varying thresholds of accrual percentage such as ≥ 90% vs ≤ 10%, ≥ 80% vs ≤ 20%, and ≥ 70% vs ≤ 30%. Methods Data from the internal research registry at Columbia University Irving Medical Center and Aggregated Analysis of ClinicalTrials.gov were collected for 393 randomized interventional treatment studies closed to further enrollment. We compared two regularized linear regression and six tree-based machine learning models for accrual percentage (i.e., reported accrual to date divided by the target accrual) prediction. The outperforming model and Tree SHapley Additive exPlanations were used for feature importance analysis for participant recruitment. The identified features were compared between the two subgroups. Results CatBoost regressor outperformed the others. Key features positively associated with recruitment success, as measured by accrual percentage, include government funding and compensation. Meanwhile, cancer research and non-conventional recruitment methods (e.g., websites) are negatively associated with recruitment success. Statistically significant subgroup differences (corrected p-value < .05) were found in 15 of the top 30 most important features. Conclusion This multi-source retrospective study highlighted key features influencing RCT participant recruitment, offering actionable steps for improvement, including flexible recruitment infrastructure and appropriate participant compensation.
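The accrual percentage defined in this abstract (reported accrual to date divided by target accrual) and the threshold-based subgroups can be illustrated with a minimal Python sketch. The trial names and counts below are hypothetical, and the function names and group labels are chosen here for illustration; only the formula and the 90%/10% thresholds come from the abstract.

```python
def accrual_percentage(reported_accrual, target_accrual):
    """Reported accrual to date divided by the target accrual, in percent."""
    if target_accrual <= 0:
        raise ValueError("target accrual must be positive")
    return 100 * reported_accrual / target_accrual


def recruitment_group(pct, high=90, low=10):
    """Assign a trial to a subgroup using one pair of thresholds."""
    if pct >= high:
        return "most successful"
    if pct <= low:
        return "least successful"
    return "in between"


# Hypothetical trials as (reported accrual, target accrual) pairs.
hypothetical_trials = {
    "trial_a": (95, 100),
    "trial_b": (4, 80),
    "trial_c": (30, 60),
}
groups = {
    name: recruitment_group(accrual_percentage(reported, target))
    for name, (reported, target) in hypothetical_trials.items()
}
print(groups)
```

Passing `high=80, low=20` or `high=70, low=30` gives the other threshold pairs mentioned in the abstract.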
Affiliation(s)
- Betina Idnay
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Yilu Fang
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Alex Butler
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Joyce Moran
  - Department of Neurology, Columbia University Irving Medical Center, NY Research, New York, NY, USA
- Ziran Li
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Junghwan Lee
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Casey Ta
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Cong Liu
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Chi Yuan
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Huanyao Chen
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Edward Stanley
  - Compliance Applications, Information Technology, Columbia University, New York, NY, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Elaine Larson
  - School of Nursing, Columbia University Irving Medical Center, New York, NY, USA
  - New York Academy of Medicine, New York, NY, USA
- Karen Marder
  - Department of Neurology, Columbia University Irving Medical Center, NY Research, New York, NY, USA
- Wendy Chung
  - Department of Pediatrics, Columbia University Irving Medical Center, New York, NY, USA
- Brenda Ruotolo
  - Institutional Review Board for Human Subjects Research, Columbia University, New York, NY, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
9. Pierre K, Gupta M, Raviprasad A, Sadat Razavi SM, Patel A, Peters K, Hochhegger B, Mancuso A, Forghani R. Medical imaging and multimodal artificial intelligence models for streamlining and enhancing cancer care: opportunities and challenges. Expert Rev Anticancer Ther 2023; 23:1265-1279. [PMID: 38032181 DOI: 10.1080/14737140.2023.2286001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 11/16/2023] [Indexed: 12/01/2023]
Abstract
INTRODUCTION Artificial intelligence (AI) has the potential to transform oncologic care. There have been significant developments in AI applications in medical imaging and increasing interest in multimodal models. These are likely to enable improved oncologic care through more precise diagnosis, increasingly in a more personalized and less invasive manner. In this review, we provide an overview of the current state and challenges that clinicians, administrative personnel and policy makers need to be aware of and mitigate for the technology to reach its full potential. AREAS COVERED The article provides a brief targeted overview of AI, a high-level review of the current state and future potential AI applications in diagnostic radiology and to a lesser extent digital pathology, focusing on oncologic applications. This is followed by a discussion of emerging approaches, including multimodal models. The article concludes with a discussion of technical, regulatory challenges and infrastructure needs for AI to realize its full potential. EXPERT OPINION There is a large volume of promising research, and steadily increasing commercially available tools using AI. For the most advanced and promising precision diagnostic applications of AI to be used clinically, robust and comprehensive quality monitoring systems and informatics platforms will likely be required.
Affiliation(s)
- Kevin Pierre
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA
| | - Manas Gupta
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
| | - Abheek Raviprasad
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
- University of Florida College of Medicine, Gainesville, FL, USA
| | - Seyedeh Mehrsa Sadat Razavi
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
- University of Florida College of Medicine, Gainesville, FL, USA
| | - Anjali Patel
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
- University of Florida College of Medicine, Gainesville, FL, USA
| | - Keith Peters
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA
| | - Bruno Hochhegger
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA
| | - Anthony Mancuso
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA
| | - Reza Forghani
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL, USA
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA
- Division of Medical Physics, University of Florida College of Medicine, Gainesville, FL, USA
- Department of Neurology, Division of Movement Disorders, University of Florida College of Medicine, Gainesville, FL, USA
| |
|
10
|
Idnay B, Butler A, Fang Y, Li Z, Lee J, Ta C, Liu C, Ruotolo B, Yuan C, Chen H, Hripcsak G, Larson E, Weng C. Principal Investigators' Perceptions on Factors Associated with Successful Recruitment in Clinical Trials. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2023; 2023:281-290. [PMID: 37350899 PMCID: PMC10283115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2023]
Abstract
Participant recruitment continues to be a challenge to the success of randomized controlled trials, resulting in increased costs, extended trial timelines and delayed treatment availability. The literature provides evidence that study design features (e.g., trial phase, study site involvement) and trial sponsor are significantly associated with recruitment success. Principal investigators oversee the conduct of clinical trials, including recruitment. Through a cross-sectional survey and a thematic analysis of free-text responses, we assessed the perceptions of sixteen principal investigators regarding success factors for participant recruitment. Study site involvement and funding source do not necessarily make recruitment easier or more challenging from the perspective of the principal investigators. The most commonly used recruitment strategies are also the least effort-efficient (e.g., in-person recruitment, reviewing the electronic medical records for prescreening). Finally, we recommend actionable steps, such as improving staff support and leveraging informatics-driven approaches, to allow clinical researchers to enhance participant recruitment.
Affiliation(s)
| | | | - Yilu Fang
- Department of Biomedical Informatics
| | - Ziran Li
- Department of Biomedical Informatics
| | | | - Casey Ta
- Department of Biomedical Informatics
| | - Cong Liu
- Department of Biomedical Informatics
| | | | - Chi Yuan
- Department of Biomedical Informatics
| | | | | | - Elaine Larson
- School of Nursing, Columbia University, New York, NY
| | | |
|
11
|
Sun Z, Tao C. Named Entity Recognition and Normalization for Alzheimer's Disease Eligibility Criteria. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS 2023; 2023:558-564. [PMID: 38283164 PMCID: PMC10815931 DOI: 10.1109/ichi57859.2023.00100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2024]
Abstract
Alzheimer's Disease (AD) is a complex neurodegenerative disorder that affects millions of people worldwide. Finding effective treatments for this disease is crucial. Clinical trials play an essential role in developing and testing new treatments for AD. However, identifying eligible participants can be challenging, time-consuming, and costly. In recent years, the development of natural language processing (NLP) techniques, specifically named entity recognition (NER) and named entity normalization (NEN), have helped to automate the identification and extraction of relevant information from the eligibility criteria (EC) more efficiently, in order to facilitate semi-automatic patient recruitment and enable data FAIRness for clinical trial data. Nevertheless, most current biomedical NER models only provide annotations for a restricted set of entity types that may not be applicable to the clinical trial data. Additionally, accurately performing NEN on entities that are negated using a negative prefix currently lacks established techniques. In this paper, we introduce a pipeline designed for information extraction from AD clinical trial EC, which involves preprocessing of the EC data, clinical NER, and biomedical NEN to Unified Medical Language System (UMLS). Our NER model can identify named entities in seven pre-defined categories, while our NEN model employs a combination of exact match and partial match search strategies, as well as customized rules to accurately normalize entities with negative prefixes. To evaluate the performance of our pipeline, we measured the precision, recall, and F1 score for the NER component, and we manually reviewed the top five mapping results produced by the NEN component. Our evaluation of the pipeline's performance revealed that it can successfully normalize named entities in clinical trial ECs with optimal accuracies. 
The NER component achieved an overall F1 of 0.816, demonstrating its ability to accurately identify seven types of named entities in clinical text. The NEN component of the pipeline also performed well, with customized rules and a combination of exact and partial match strategies leading to an accuracy of 0.940 for normalized entities.
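The normalization strategy described in this abstract (exact match, then partial match, with a rule for negative prefixes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy vocabulary stands in for a UMLS lookup, the concept IDs are placeholders rather than real CUIs, and the prefix list and function names are hypothetical, not the authors' implementation.

```python
from typing import Optional, Tuple

# Toy vocabulary standing in for a UMLS lookup (assumption).
# The IDs are placeholders, not real UMLS concept identifiers.
TOY_VOCAB = {
    "diabetic": "TOY:0001",
    "smoker": "TOY:0002",
    "alzheimer's disease": "TOY:0003",
}

# Illustrative negative prefixes; a real rule set would be larger.
NEGATIVE_PREFIXES = ("non-", "non")


def lookup(text: str) -> Tuple[Optional[str], str]:
    """Try an exact match first, then fall back to a substring (partial) match."""
    if text in TOY_VOCAB:
        return TOY_VOCAB[text], "exact"
    for term, concept_id in TOY_VOCAB.items():
        if term in text or text in term:
            return concept_id, "partial"
    return None, "none"


def normalize(entity: str) -> Tuple[Optional[str], bool, str]:
    """Return (concept_id, negated, strategy) for one entity mention."""
    text = entity.strip().lower()
    for prefix in NEGATIVE_PREFIXES:
        stem = text[len(prefix):]
        # Only treat the prefix as a negation if the remaining stem maps to a concept.
        if text.startswith(prefix) and stem:
            concept_id, strategy = lookup(stem)
            if concept_id is not None:
                return concept_id, True, strategy
    concept_id, strategy = lookup(text)
    return concept_id, False, strategy
```

Under these assumptions, `normalize("Non-diabetic")` maps to the concept for "diabetic" and flags it as negated, instead of failing to match or matching the un-negated concept silently.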
Affiliation(s)
- Zenan Sun
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas
| | - Cui Tao
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas
| |
|
12
|
Meystre SM, Heider PM, Cates A, Bastian G, Pittman T, Gentilin S, Kelechi TJ. Piloting an automated clinical trial eligibility surveillance and provider alert system based on artificial intelligence and standard data models. BMC Med Res Methodol 2023; 23:88. [PMID: 37041475 PMCID: PMC10088225 DOI: 10.1186/s12874-023-01916-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Accepted: 04/04/2023] [Indexed: 04/13/2023] Open
Abstract
BACKGROUND To advance new therapies into clinical care, clinical trials must recruit enough participants. Yet, many trials fail to do so, leading to delays, early trial termination, and wasted resources. Under-enrolling trials make it impossible to draw conclusions about the efficacy of new therapies. An oft-cited reason for insufficient enrollment is lack of study team and provider awareness about patient eligibility. Automating clinical trial eligibility surveillance and study team and provider notification could offer a solution. METHODS To address this need for an automated solution, we conducted an observational pilot study of our TAES (TriAl Eligibility Surveillance) system. We tested the hypothesis that an automated system based on natural language processing and machine learning algorithms could detect patients eligible for specific clinical trials by linking the information extracted from trial descriptions to the corresponding clinical information in the electronic health record (EHR). To evaluate the TAES information extraction and matching prototype (i.e., TAES prototype), we selected five open cardiovascular and cancer trials at the Medical University of South Carolina and created a new reference standard of 21,974 clinical text notes from a random selection of 400 patients (including at least 100 enrolled in the selected trials), with a small subset of 20 notes annotated in detail. We also developed a simple web interface for a new database that stores all trial eligibility criteria, corresponding clinical information, and trial-patient match characteristics using the Observational Medical Outcomes Partnership (OMOP) common data model. Finally, we investigated options for integrating an automated clinical trial eligibility system into the EHR and for notifying health care providers promptly of potential patient eligibility without interrupting their clinical workflow. 
RESULTS Although the rapidly implemented TAES prototype achieved only moderate accuracy (recall up to 0.778; precision up to 1.000), it enabled us to assess options for integrating an automated system successfully into the clinical workflow at a healthcare system. CONCLUSIONS Once optimized, the TAES system could exponentially enhance identification of patients potentially eligible for clinical trials, while simultaneously decreasing the burden on research teams of manual EHR review. Through timely notifications, it could also raise physician awareness of patient eligibility for clinical trials.
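To put the reported recall and precision on a single scale, the standard F1 formula (the harmonic mean of precision and recall) can be applied. This is a sketch under one stated assumption: the abstract reports "up to" values, so the two figures may not come from the same configuration, and the resulting F1 is illustrative rather than a result reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Best-case values quoted in the abstract (assumed here to belong together).
taes_f1 = f1_score(precision=1.000, recall=0.778)
print(f"{taes_f1:.3f}")  # prints 0.875
```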
Affiliation(s)
- Stéphane M Meystre
- OnePlanet Research Center and imec, Toernooiveld 300, Nijmegen, 6525 EC, The Netherlands.
| | - Paul M Heider
- Medical University of South Carolina, Charleston, SC, USA
| | - Andrew Cates
- Medical University of South Carolina, Charleston, SC, USA
| | - Grace Bastian
- Medical University of South Carolina, Charleston, SC, USA
| | - Tara Pittman
- Medical University of South Carolina, Charleston, SC, USA
| | | | | |
|
13
|
Idnay B, Fang Y, Dreisbach C, Marder K, Weng C, Schnall R. Clinical research staff perceptions on a natural language processing-driven tool for eligibility prescreening: An iterative usability assessment. Int J Med Inform 2023; 171:104985. [PMID: 36638583 PMCID: PMC9912278 DOI: 10.1016/j.ijmedinf.2023.104985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2022] [Revised: 12/30/2022] [Accepted: 01/02/2023] [Indexed: 01/07/2023]
Abstract
BACKGROUND Participant recruitment is a barrier to successful clinical research. One strategy to improve recruitment is to conduct eligibility prescreening, a resource-intensive process where clinical research staff manually review electronic health records data to identify potentially eligible patients. Criteria2Query (C2Q) was developed to address this problem by capitalizing on natural language processing to generate queries to identify eligible participants from clinical databases semi-autonomously. OBJECTIVE We examined the clinical research staff's perceived usability of C2Q for clinical research eligibility prescreening. METHODS Twenty clinical research staff evaluated the usability of C2Q using a cognitive walkthrough with a think-aloud protocol and a Post-Study System Usability Questionnaire. On-screen activity and audio were recorded and transcribed. After every five evaluators completed an evaluation, usability problems were rated by informatics experts and prioritized for system refinement. There were four iterations of system refinement based on the evaluation feedback. Guided by the Organizational Framework for Intuitive Human-computer Interaction, we performed a directed deductive content analysis of the verbatim transcriptions. RESULTS Evaluators were aged 24 to 46 years (mean 33.8; SD: 7.32) and demonstrated high computer literacy (6.36; SD: 0.17); 75% were female, 35% were White, and 45% were clinical research coordinators. C2Q demonstrated high usability during the final cycle (2.26 out of 7 [lower scores are better], SD: 0.74). The number of unique usability issues decreased after each refinement. Fourteen subthemes emerged from three themes: seeking user goals, performing well-learned tasks, and determining what to do next. CONCLUSIONS The cognitive walkthrough with a think-aloud protocol informed iterative system refinement and demonstrated the usability of C2Q by clinical research staff. 
Key recommendations for system development and implementation include improving system intuitiveness and overall user experience through comprehensive consideration of user needs and requirements for task completion.
Affiliation(s)
- Betina Idnay
- Columbia University, School of Nursing, New York, NY, USA; Columbia University, Department of Neurology, New York, NY, USA; Columbia University, Department of Biomedical Informatics, New York, NY, USA.
| | - Yilu Fang
- Columbia University, Department of Biomedical Informatics, New York, NY, USA
| | | | - Karen Marder
- Columbia University, Department of Neurology, New York, NY, USA
| | - Chunhua Weng
- Columbia University, Department of Biomedical Informatics, New York, NY, USA
| | - Rebecca Schnall
- Columbia University, School of Nursing, New York, NY, USA; Columbia University, Mailman School of Public Health, Department of Population and Family Health, New York, NY, USA
| |
|
14
|
Xiang JJ, Roy A, Summers C, Delvy M, O’Donovan J, Christensen J, Dwy C, Perry L, Connery D, Rose MG, Sheehan K, Chao HH. Brief Report: Implementation of a Universal Prescreening Protocol to Increase Recruitment to Lung Cancer Studies at a Veterans Affairs Cancer Center. JTO Clin Res Rep 2022; 3:100357. [PMID: 35815320 PMCID: PMC9256656 DOI: 10.1016/j.jtocrr.2022.100357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Revised: 05/20/2022] [Accepted: 06/03/2022] [Indexed: 10/31/2022] Open
Abstract
Introduction The oncology clinical trial recruitment process is time, labor, and resource intensive, and poor accrual rates are common. We describe the VA Connecticut Cancer Center experience of implementing a standardized, universal prescreening protocol and its impact on thoracic oncology research recruitment. Methods Research coordinators prescreened potentially eligible patients with confirmed or suspected cancer from multiple clinical sources and entered relevant patient and research study information into a centralized electronic database. The database provided real-time lists of potential studies for each patient. This enabled the research team to alert the patient's oncologist in advance of clinic visits and to prepare documents needed for enrollment. Clinicians could ensure sufficient time and attention in clinic to the informed consent process, therefore maximizing enrollment opportunities. Patients were also monitored on waitlists for future studies. Results From March 2017 to December 2020, a total of 1518 patients with lung nodules and suspected or confirmed lung cancers were prescreened. Of these, 379 patients were enrolled to a study, 103 patients declined participation, and 639 were monitored for future studies. Our prescreening protocol identified all new patients with lung cancer who were ultimately added to the cancer registry. We found a substantial increase in study enrollment after prescreening implementation. Conclusions Universal prescreening was associated with improved patient enrollment to thoracic oncology studies. The protocol was integral in our VA becoming the top accruing VA site for National Cancer Institute's National Clinical Trials Network studies for 2019 to 2021.
Affiliation(s)
- Jenny J. Xiang
- Department of Internal Medicine, Yale University School of Medicine, New Haven, Connecticut
- Department of Internal Medicine, VA Connecticut Healthcare System, West Haven, Connecticut
- Corresponding author. Address for correspondence: Jenny J. Xiang, MD, Department of Internal Medicine, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06510.
| | - Alicia Roy
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
| | - Christine Summers
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
| | - Monica Delvy
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
| | - Jessica O’Donovan
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
| | - John Christensen
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
| | - Christopher Dwy
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
| | - Lydia Perry
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
| | - Donna Connery
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
| | - Michal G. Rose
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
- Yale Cancer Center and Yale School of Medicine, New Haven, Connecticut
| | - Kelsey Sheehan
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
- Yale Cancer Center and Yale School of Medicine, New Haven, Connecticut
| | - Herta H. Chao
- VA Connecticut Healthcare System Comprehensive Cancer Center, West Haven, Connecticut
- Yale Cancer Center and Yale School of Medicine, New Haven, Connecticut
| |
|
15
|
Unberath P, Mahlmeister L, Reimer N, Busch H, Boerries M, Christoph J. Searching of Clinical Trials Made Easier in cBioPortal Using Patients' Genetic and Clinical Profiles. Appl Clin Inform 2022; 13:363-369. [PMID: 35354211 PMCID: PMC8967483 DOI: 10.1055/s-0042-1743560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Background
Molecular tumor boards (MTBs) cope with the complexity of an increased usage of genome sequencing data in cancer treatment. Because guideline-based therapy options are exhausted for most of these patients, finding matching clinical trials is crucial. This search process is often performed manually and is therefore time-consuming and complex due to the heterogeneous and challenging dataset.
Objectives
In this study, a prototype for a search tool was developed to demonstrate how cBioPortal, as a clinical and genomic patient data source, can be integrated with ClinicalTrials.gov, a database of clinical studies, to simplify the search for trials based on genetic and clinical data of a patient. The design of this tool should rest on the specific needs of MTB participants. The architecture of the integration should be as lightweight as possible and should not require manual curation of trial data in advance, with the goal of quickly and easily finding a matching study.
Methods
A prototype was developed based on a requirements analysis that included interviews with MTB experts. It was further refined using a user-centered development process with multiple feedback loops. Finally, the usability of the application was evaluated with user interviews including the thinking-aloud protocol and the system usability scale (SUS) questionnaire.
Results
The integration of ClinicalTrials.gov in cBioPortal is achieved by a new tab in the patient view where the genomic profile for the search is prefilled and additional parameters can be adjusted. These parameters are then used to query the application programming interface (API) of ClinicalTrials.gov. The returned search results subsequently are ranked and presented to the user. The evaluation of the application resulted in an SUS score of 83.5.
Conclusion
This work demonstrates the integration of cBioPortal with ClinicalTrials.gov to use clinical and genomic patient data to search for appropriate trials within an MTB.
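The abstract states that returned search results are ranked before being shown to the user, but it does not describe the ranking method. The following is a minimal sketch of one plausible approach, ranking trials by how many of the patient's query terms appear in the trial text. The `Trial` type, the `rank_trials` function, the trial identifiers, and the substring-based scoring are all hypothetical and are not taken from the paper or from the ClinicalTrials.gov API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Trial:
    trial_id: str  # placeholder identifier, not a real registry number
    text: str      # title and description as returned by a trial search


def rank_trials(trials: List[Trial], genes: List[str], condition: str) -> List[Tuple[int, str]]:
    """Score each trial by the number of query terms found in its text.

    Returns (score, trial_id) pairs, highest score first; trials with no
    matching term are dropped. Matching is naive, case-insensitive substring
    matching (an assumption made for brevity).
    """
    terms = [gene.lower() for gene in genes] + [condition.lower()]
    scored = []
    for trial in trials:
        text = trial.text.lower()
        score = sum(1 for term in terms if term in text)
        if score > 0:
            scored.append((score, trial.trial_id))
    # Sort by descending score, then by identifier for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored
```

A usage example with made-up data: given trials mentioning "BRAF V600E melanoma", "KRAS lung cancer study", and "BRAF and KRAS in melanoma", a patient profile with genes `["BRAF", "KRAS"]` and condition `"melanoma"` would rank the third trial first because it matches all three terms.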
Affiliation(s)
- Philipp Unberath
- Friedrich-Alexander University Erlangen-Nuremberg, Chair of Medical Informatics, Erlangen, Bayern, Germany
| | - Lukas Mahlmeister
- Friedrich-Alexander University Erlangen-Nuremberg, Chair of Medical Informatics, Erlangen, Bayern, Germany
| | - Niklas Reimer
- Universität zu Lübeck, Group for Medical Systems Biology, Lübeck Institute of Experimental Dermatology, Lübeck, Schleswig-Holstein, Germany
| | - Hauke Busch
- Universität zu Lübeck, Group for Medical Systems Biology, Lübeck Institute of Experimental Dermatology, Lübeck, Schleswig-Holstein, Germany
| | - Melanie Boerries
- University of Freiburg Faculty of Medicine, Institute of Medical Bioinformatics and Systems Medicine, University Medical Center Freiburg, Freiburg, Baden-Württemberg, Germany; German Cancer Consortium (DKTK), partner site Freiburg, German Cancer Research Center (DKFZ), Heidelberg, Baden-Württemberg, Germany
| | - Jan Christoph
- Friedrich-Alexander University Erlangen-Nuremberg, Chair of Medical Informatics, Erlangen, Bayern, Germany; Martin-Luther-University Halle-Wittenberg, Faculty of Medicine, Junior Research Group (Bio-)Medical Data Science, Halle, Sachsen-Anhalt, Germany
| |
|
16
|
Pichon A, Idnay B, Marder K, Schnall R, Weng C. Cognitive Function Characterization Using Electronic Health Records Notes. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2022; 2021:999-1008. [PMID: 35308911 PMCID: PMC8861713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Cognitive impairment is a defining feature of neurological disorders such as Alzheimer's disease (AD), one of the leading causes of disability and mortality in the elderly population. Assessing cognitive impairment is important for diagnostic, clinical management, and research purposes. The Folstein Mini-Mental State Examination (MMSE) is the most common screening measure of cognitive function, yet this score is not consistently available in the electronic health records. We conducted a pilot study to extract frequently used concepts characterizing cognitive function from the clinical notes of AD patients in an Aging and Dementia clinical practice. Then we developed a model to infer the severity of cognitive impairment and created a subspecialized taxonomy for concepts associated with MMSE scores. We evaluated the taxonomy and the severity prediction model and presented example use cases of this model.
Affiliation(s)
| | - Betina Idnay
- School of Nursing
- Department of Neurology
- Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University, New York, New York, USA
| | - Karen Marder
- Department of Neurology
- Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University, New York, New York, USA
| | | | | |
|
17
|
Idnay B, Dreisbach C, Weng C, Schnall R. A systematic review on natural language processing systems for eligibility prescreening in clinical research. J Am Med Inform Assoc 2021; 29:197-206. [PMID: 34725689 PMCID: PMC8714283 DOI: 10.1093/jamia/ocab228] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/16/2021] [Revised: 08/30/2021] [Accepted: 10/04/2021] [Indexed: 11/14/2022] Open
Abstract
OBJECTIVE We conducted a systematic review to assess the effect of natural language processing (NLP) systems in improving the accuracy and efficiency of eligibility prescreening during the clinical research recruitment process. MATERIALS AND METHODS Guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) standards of quality for reporting systematic reviews, a protocol for study eligibility was developed a priori and registered in the PROSPERO database. Using predetermined inclusion criteria, studies published from database inception through February 2021 were identified from 5 databases. The Joanna Briggs Institute Critical Appraisal Checklist for Quasi-experimental Studies was adapted to determine the study quality and the risk of bias of the included articles. RESULTS Eleven studies representing 8 unique NLP systems met the inclusion criteria. These studies demonstrated moderate study quality and exhibited heterogeneity in the study design, setting, and intervention type. All 11 studies evaluated the NLP system's performance for identifying eligible participants; 7 studies evaluated the system's impact on time efficiency; 4 studies evaluated the system's impact on workload; and 2 studies evaluated the system's impact on recruitment. DISCUSSION NLP systems in clinical research eligibility prescreening are an understudied but promising field that requires further research to assess its impact on real-world adoption. Future studies should be centered on continuing to develop and evaluate relevant NLP systems to improve enrollment into clinical studies. CONCLUSION Understanding the role of NLP systems in improving eligibility prescreening is critical to the advancement of clinical research recruitment.
Affiliation(s)
- Betina Idnay
- School of Nursing, Columbia University, New York, New York, USA
- Department of Neurology, Columbia University, New York, New York, USA
| | - Caitlin Dreisbach
- Data Science Institute, Columbia University, New York, New York, USA
| | - Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York, New York, USA
| | - Rebecca Schnall
- School of Nursing, Columbia University, New York, New York, USA
| |
|
18
|
Zeng K, Xu Y, Lin G, Liang L, Hao T. Automated classification of clinical trial eligibility criteria text based on ensemble learning and metric learning. BMC Med Inform Decis Mak 2021; 21:129. [PMID: 34330259 PMCID: PMC8323220 DOI: 10.1186/s12911-021-01492-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/15/2021] [Accepted: 04/08/2021] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text by using machine learning methods improves recruitment efficiency and reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data. METHODS An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as a loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features become more distinguishable. Soft Voting is applied to produce the final classification of the ensemble model. The dataset is from the standard evaluation task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories. RESULTS Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that our model achieved a significant improvement. CONCLUSIONS A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that the classification performance was significantly improved by our ensemble model. 
In addition, metric learning was able to improve word embedding representation and the focal loss reduced the impact of data imbalance on model performance.
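Two of the techniques named in this abstract, soft voting and focal loss, have simple standard definitions that a short sketch can make concrete. This is a generic illustration, not the authors' code: the function names are hypothetical, the focusing parameter `gamma=2.0` is an assumed default (the abstract does not give the value used), and the optional class-weighting term of focal loss is omitted.

```python
import math
from typing import List, Tuple


def soft_vote(model_probs: List[List[float]]) -> Tuple[int, List[float]]:
    """Average the class probabilities of several models and pick the argmax.

    model_probs[m][c] is the probability model m assigns to class c.
    """
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    averaged = [
        sum(probs[c] for probs in model_probs) / n_models
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: averaged[c])
    return predicted, averaged


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    p_true is the predicted probability of the correct class (0 < p_true <= 1).
    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    return -((1.0 - p_true) ** gamma) * math.log(p_true)
```

The factor `(1 - p_t)^gamma` shrinks the loss of examples the model already classifies confidently, so rare, hard categories contribute relatively more to training. That is the mechanism by which focal loss is commonly used against class imbalance.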
Affiliation(s)
- Kun Zeng
- School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China
| | - Yibin Xu
- School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China
| | - Ge Lin
- National Engineering Research Center of Digital Life, Sun Yat-Sen University, Guangzhou, China
| | - Likeng Liang
- School of Computer Science, South China Normal University, Guangzhou, China
| | - Tianyong Hao
- School of Computer Science, South China Normal University, Guangzhou, China
| |
|
19
|
Cai T, Cai F, Dahal KP, Cremone G, Lam E, Golnik C, Seyok T, Hong C, Cai T, Liao KP. Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening. ACR Open Rheumatol 2021; 3:593-600. [PMID: 34296815 PMCID: PMC8449035 DOI: 10.1002/acr2.11289] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/30/2020] [Accepted: 05/18/2021] [Indexed: 11/22/2022] Open
Abstract
Objective Efficiently identifying eligible patients is a crucial first step for a successful clinical trial. The objective of this study was to test whether an approach using electronic health record (EHR) data and an ensemble machine learning algorithm incorporating billing codes and data from clinical notes processed by natural language processing (NLP) can improve the efficiency of eligibility screening. Methods We studied patients screened for a clinical trial of rheumatoid arthritis (RA) with one or more International Classification of Diseases (ICD) codes for RA and age greater than 35 years, from a tertiary care center and a community hospital. The following three groups of EHR features were considered for the algorithm: 1) structured features, 2) the counts of NLP concepts from notes, and 3) health care utilization. All features were linked to dates. We compared random forest and logistic regression with a least absolute shrinkage and selection operator penalty against the following two standard approaches: 1) one or more RA ICD codes and no ICD codes related to exclusion criteria (ScreenRAICD1+EX) and 2) two or more RA ICD codes (ScreenRAICD2). To test the portability, we trained the algorithm at one institution and tested it at the other. Results In total, 3359 patients at Brigham and Women’s Hospital (BWH) and 642 patients at Faulkner Hospital (FH) were studied, with 461 (13.7%) eligible patients at BWH and 84 (13.4%) at FH. The application of the algorithm reduced ineligible patients from chart review by 40.5% at the tertiary care center and by 57.0% at the community hospital. In contrast, ScreenRAICD2 reduced patients for chart review by 2.7% to 11.3%; ScreenRAICD1+EX reduced patients for chart review by 63% to 65% but excluded 22% to 27% of eligible patients. 
Conclusion The ensemble machine learning algorithm incorporating billing codes and NLP data increased the efficiency of eligibility screening by reducing the number of patients requiring chart review while not excluding eligible patients. Moreover, this approach can be trained at one institution and applied at another for multicenter clinical trials.
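Illustrative sketch (not taken from the paper): the two quantities the abstract reports for each screening approach, the share of ineligible patients removed from chart review and the share of eligible patients wrongly excluded, can be computed from per-patient flags roughly as follows. The function name, the data class, and the toy cohort are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScreeningSummary:
    ineligible_removed_pct: float  # share of ineligible patients spared chart review
    eligible_lost_pct: float       # share of eligible patients wrongly screened out


def summarize_screening(eligible, flagged):
    """Compare a screen against chart-review truth.

    eligible[i]: True if patient i is truly eligible (chart review).
    flagged[i]:  True if the screen keeps patient i for chart review.
    """
    if len(eligible) != len(flagged):
        raise ValueError("inputs must have the same length")
    n_eligible = sum(eligible)
    n_ineligible = len(eligible) - n_eligible
    removed_ineligible = sum(1 for e, f in zip(eligible, flagged) if not e and not f)
    lost_eligible = sum(1 for e, f in zip(eligible, flagged) if e and not f)
    return ScreeningSummary(
        100.0 * removed_ineligible / n_ineligible if n_ineligible else 0.0,
        100.0 * lost_eligible / n_eligible if n_eligible else 0.0,
    )


# Hypothetical toy cohort: 2 eligible and 8 ineligible patients.
eligible = [True, True] + [False] * 8
flagged = [True, True, True, True, True, True, False, False, False, False]
summary = summarize_screening(eligible, flagged)
print(summary)
```

A screen is attractive when the first number is high and the second stays near zero, which is the trade-off the abstract describes between the ensemble algorithm and the ICD-code rules.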
Affiliation(s)
- Tianrun Cai
  - Brigham and Women's Hospital, Boston, Massachusetts, United States
- Fiona Cai
  - Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
- Kumar P Dahal
  - Brigham and Women's Hospital, Boston, Massachusetts, United States
- Ethan Lam
  - Brigham and Women's Hospital, Boston, Massachusetts, United States
- Charlotte Golnik
  - Brigham and Women's Hospital, Boston, Massachusetts, United States
- Thany Seyok
  - Brigham and Women's Hospital, Boston, Massachusetts, United States
- Chuan Hong
  - Harvard University, Boston, Massachusetts, United States
- Tianxi Cai
  - Harvard University, Boston, Massachusetts, United States
- Katherine P Liao
  - Brigham and Women's Hospital, Harvard University, and Veterans Affairs Boston Healthcare System, Boston, Massachusetts, United States

20
O'Brien EC, Raman SR, Ellis A, Hammill BG, Berdan LG, Rorick T, Janmohamed S, Lampron Z, Hernandez AF, Curtis LH. The use of electronic health records for recruitment in clinical trials: a mixed methods analysis of the Harmony Outcomes Electronic Health Record Ancillary Study. Trials 2021; 22:465. [PMID: 34281607 PMCID: PMC8287813 DOI: 10.1186/s13063-021-05397-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/15/2020] [Accepted: 06/24/2021] [Indexed: 11/22/2022]
Abstract
Background The electronic health record (EHR) contains a wealth of clinical data that may be used to streamline the identification of potential clinical trial participants. However, there is little empirical information on site-level facilitators of and barriers to optimal use of EHR systems with respect to trial recruitment. Methods We conducted qualitative focus groups and quantitative surveys as part of the EHR Ancillary Study, which is being conducted alongside the multicenter, global, Harmony Outcomes Trial comparing albiglutide to standard care for the prevention of cardiovascular events in type 2 diabetes. Subject matter experts used findings from focus groups to draft a 20-question survey examining the use of the EHR for participant identification, common site recruitment strategies, and variation in perceived barriers to optimal use of the EHR. The final survey was fielded with 446 site investigators actively enrolling participants in the main trial. Results Nearly two-thirds of respondents were study coordinators (63.2%), 23.1% were principal investigators, and 13.7% held other research roles. Approximately half of the respondents reported using the EHR to find potential trial participants. Of these, 79.4% reported using EHR searches in conjunction with other recruitment methods, including reviewing of upcoming clinic schedules (75.3%) and contacting past trial participants (71.2%). Important barriers to optimal use of the EHR included the lack of availability of certain research-focused EHR modules and limitations on the ability to contact patients cared for by other providers. Of survey respondents who did not use the EHR to find potential participants, one-quarter reported that the EHR was not accessible in their country; this finding varied from 2.6% of respondents in North America to 50% of respondents in the Asia Pacific. 
Conclusions While EHR screening was commonly used for recruitment in a cardiovascular outcomes trial, important technical, governance, and regulatory barriers persist. Multifaceted, scalable, and customizable strategies are needed to support the optimal use of the EHR for trial participant identification. Trial registration ClinicalTrials.gov NCT02465515. Registered on 8 June 2015 Supplementary Information The online version contains supplementary material available at 10.1186/s13063-021-05397-0.
Affiliation(s)
- Emily C O'Brien
  - Duke Clinical Research Institute, Durham, NC, USA; Department of Population Health Sciences, Duke University School of Medicine, 215 Morris Street, Suite 210, Durham, NC, 27701, USA
- Sudha R Raman
  - Department of Population Health Sciences, Duke University School of Medicine, 215 Morris Street, Suite 210, Durham, NC, 27701, USA
- Alicia Ellis
  - Duke Clinical Research Institute, Durham, NC, USA; UCB, Durham, NC, USA
- Bradley G Hammill
  - Duke Clinical Research Institute, Durham, NC, USA; Department of Population Health Sciences, Duke University School of Medicine, 215 Morris Street, Suite 210, Durham, NC, 27701, USA
- Tyrus Rorick
  - Duke Clinical Research Institute, Durham, NC, USA
- Adrian F Hernandez
  - Duke Clinical Research Institute, Durham, NC, USA; Department of Medicine, Duke University School of Medicine, Durham, NC, USA
- Lesley H Curtis
  - Duke Clinical Research Institute, Durham, NC, USA; Department of Population Health Sciences, Duke University School of Medicine, 215 Morris Street, Suite 210, Durham, NC, 27701, USA

21
Callahan A, Polony V, Posada JD, Banda JM, Gombar S, Shah NH. ACE: the Advanced Cohort Engine for searching longitudinal patient records. J Am Med Inform Assoc 2021; 28:1468-1479. [PMID: 33712854 PMCID: PMC8279796 DOI: 10.1093/jamia/ocab027] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/08/2020] [Accepted: 02/23/2021] [Indexed: 01/02/2023]
Abstract
OBJECTIVE To propose a paradigm for a scalable time-aware clinical data search, and to describe the design, implementation and use of a search engine realizing this paradigm. MATERIALS AND METHODS The Advanced Cohort Engine (ACE) uses a temporal query language and in-memory datastore of patient objects to provide a fast, scalable, and expressive time-aware search. ACE accepts data in the Observational Medicine Outcomes Partnership Common Data Model, and is configurable to balance performance with compute cost. ACE's temporal query language supports automatic query expansion using clinical knowledge graphs. The ACE API can be used with R, Python, Java, HTTP, and a Web UI. RESULTS ACE offers an expressive query language for complex temporal search across many clinical data types with multiple output options. ACE enables electronic phenotyping and cohort-building with subsecond response times in searching the data of millions of patients for a variety of use cases. DISCUSSION ACE enables fast, time-aware search using a patient object-centric datastore, thereby overcoming many technical and design shortcomings of relational algebra-based querying. Integrating electronic phenotype development with cohort-building enables a variety of high-value uses for a learning health system. Tradeoffs include the need to learn a new query language and the technical setup burden. CONCLUSION ACE is a tool that combines a unique query language for time-aware search of longitudinal patient records with a patient object datastore for rapid electronic phenotyping, cohort extraction, and exploratory data analyses.
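Illustrative sketch (not ACE's query language or API): the idea of a time-aware search over in-memory patient objects can be shown with a small example that asks which patients had one event followed by another within a time window. The class, the function names, and the event codes (`DX_A`, `RX_B`) are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Patient:
    patient_id: int
    events: list = field(default_factory=list)  # (date, code) tuples


def followed_within(patient, first_code, second_code, max_days):
    """True if a `second_code` event occurs 0..max_days after a `first_code` event."""
    firsts = [d for d, c in patient.events if c == first_code]
    seconds = [d for d, c in patient.events if c == second_code]
    window = timedelta(days=max_days)
    return any(timedelta(0) <= s - f <= window for f in firsts for s in seconds)


def search(patients, first_code, second_code, max_days):
    """Return the IDs of patients whose records satisfy the temporal pattern."""
    return [
        p.patient_id
        for p in patients
        if followed_within(p, first_code, second_code, max_days)
    ]


patients = [
    Patient(1, [(date(2020, 1, 1), "DX_A"), (date(2020, 1, 20), "RX_B")]),  # 19 days apart
    Patient(2, [(date(2020, 1, 1), "DX_A"), (date(2020, 6, 1), "RX_B")]),   # too late
    Patient(3, [(date(2020, 3, 1), "RX_B"), (date(2020, 3, 5), "DX_A")]),   # wrong order
]
matches = search(patients, "DX_A", "RX_B", max_days=30)
print(matches)
```

Keeping each patient's events together in one object is what lets an ordering constraint like this be checked without joining event tables, which is the design point the abstract contrasts with relational querying.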
Affiliation(s)
- Alison Callahan
  - Center for Biomedical Informatics Research, School of Medicine, Stanford University, Stanford, California, USA
- Vladimir Polony
  - Center for Biomedical Informatics Research, School of Medicine, Stanford University, Stanford, California, USA
- José D Posada
  - Center for Biomedical Informatics Research, School of Medicine, Stanford University, Stanford, California, USA
- Juan M Banda
  - Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
- Saurabh Gombar
  - Department of Pathology, School of Medicine, Stanford University, Stanford, California, USA
- Nigam H Shah
  - Center for Biomedical Informatics Research, School of Medicine, Stanford University, Stanford, California, USA

22
von Itzstein MS, Hullings M, Mayo H, Beg MS, Williams EL, Gerber DE. Application of Information Technology to Clinical Trial Evaluation and Enrollment: A Review. JAMA Oncol 2021; 7:1559-1566. [PMID: 34236403 DOI: 10.1001/jamaoncol.2021.1165] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 12/28/2022]
Abstract
Importance As cancer treatment has become more individualized, oncologic clinical trials have become more complex. Increasingly numerous and stringent eligibility criteria frequently include tumor molecular or genomic characteristics that may not be readily identified in medical records, rendering it difficult to best match clinical trials with clinical sites and to identify potentially eligible patients once a clinical trial has been selected and activated. Partly because of these factors, enrollment rates for cancer clinical trials remain low, creating delays and increased costs for drug development. Information technology (IT) platforms have been applied to the implementation and conduct of clinical trials to improve efficiencies in several medical fields, and these platforms have recently been introduced to oncologic studies. Observations This review summarizes cancer and noncancer studies that used IT platforms for assistance with clinical trial site selection, patient recruitment, and patient screening. The review does not address the use of IT in other aspects of clinical research, such as wearable physical activity monitors or telehealth visits. A large number of IT platforms (which may be patient facing, site or investigator facing, or sponsor facing) are now commercially available. These applications use artificial intelligence and/or natural language processing to identify and summarize protocol eligibility criteria, institutional patient populations, and individual electronic health records. Although there is an expanding body of literature examining the role of this technology, relatively few studies to date have been performed in oncologic settings. Conclusions and Relevance This review found that an increasing number and variety of IT platforms were available to assist in the planning and conduct of clinical trials. 
Because oncologic clinical care and clinical trial protocols are particularly complex, nuanced, and individualized, published experience with this technology in other fields may not be fully applicable to cancer settings. The extent to which these services will overcome ongoing and increasing challenges in cancer clinical research remains unclear.
Affiliation(s)
- Mitchell S von Itzstein
  - Department of Internal Medicine, Division of Hematology-Oncology, The University of Texas Southwestern Medical Center, Dallas; Harold C. Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas
- Melanie Hullings
  - Harold C. Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas
- Helen Mayo
  - Southwestern Health Sciences Digital Library and Learning Center, The University of Texas, Dallas
- M Shaalan Beg
  - Department of Internal Medicine, Division of Hematology-Oncology, The University of Texas Southwestern Medical Center, Dallas; Harold C. Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas
- Erin L Williams
  - Harold C. Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas
- David E Gerber
  - Department of Internal Medicine, Division of Hematology-Oncology, The University of Texas Southwestern Medical Center, Dallas; Harold C. Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas; Department of Population and Data Sciences, The University of Texas, Southwestern Medical Center, Dallas

23
Kirshner J, Cohn K, Dunder S, Donahue K, Richey M, Larson P, Sutton L, Siu E, Donegan J, Chen Z, Nightingale C, Estévez M, Hamrick HJ. Automated Electronic Health Record-Based Tool for Identification of Patients With Metastatic Disease to Facilitate Clinical Trial Patient Ascertainment. JCO Clin Cancer Inform 2021; 5:719-727. [PMID: 34197178 DOI: 10.1200/cci.20.00180] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
Abstract
PURPOSE To facilitate identification of clinical trial participation candidates, we developed a machine learning tool that automates the determination of a patient's metastatic status, on the basis of unstructured electronic health record (EHR) data. METHODS This tool scans EHR documents, extracting text snippet features surrounding key words (such as metastatic, progression, and local). A regularized logistic regression model was trained and used to classify patients across five metastatic categories: highly likely and likely positive, highly likely and likely negative, and unknown. Using a real-world oncology database of patients with solid tumors with manually abstracted information as reference, we calculated sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). We validated the performance in a real-world data set, evaluating accuracy gains upon additional user review of tool's outputs after integration into clinic workflows. RESULTS In the training data set (N = 66,532), the model sensitivity and specificity (% [95% CI]) were 82.4 [81.9 to 83.0] and 95.5 [95.3 to 96.7], respectively; the PPV was 89.3 [88.8 to 90.0], and the NPV was 94.0 [93.8 to 94.2]. In the validation sample (n = 200 from five distinct care sites), after user review of model outputs, values increased to 97.1 [85.1 to 99.9] for sensitivity, 98.2 [94.8 to 99.6] for specificity, 91.9 [78.1 to 98.3] for PPV, and 99.4 [96.6 to 100.0] for NPV. The model assigned 163 of 200 patients to the highly likely categories. The error prevalence was 4% before and 2% after user review. CONCLUSION This tool infers metastatic status from unstructured EHR data with high accuracy and high confidence in more than 75% of cases, without requiring additional manual review. By enabling efficient characterization of metastatic status, this tool could mitigate a key barrier for patient ascertainment and clinical trial participation in community clinics.
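Worked example (an inference, not data reported in the abstract): the four accuracy measures named above follow directly from a 2×2 confusion matrix. The counts used below (34 true positives, 3 false positives, 1 false negative, 162 true negatives) are a back-calculation that happens to reproduce the post-review percentages quoted for the n = 200 validation sample; the abstract does not state the actual cell counts, so treat them as an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all true cases
        "specificity": tn / (tn + fp),   # true negatives among all non-cases
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "error_rate": (fp + fn) / (tp + fp + fn + tn),
    }


# Assumed counts (back-calculated, see the note above); they sum to 200.
metrics = diagnostic_metrics(tp=34, fp=3, fn=1, tn=162)
rounded = {name: round(100 * value, 1) for name, value in metrics.items()}
print(rounded)
```

With these assumed counts the rounded values are 97.1 (sensitivity), 98.2 (specificity), 91.9 (PPV), 99.4 (NPV), and a 2.0% error rate, which agrees with the figures quoted after user review.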
Affiliation(s)
- Jeffrey Kirshner
  - Hematology Oncology Associates of Central New York, East Syracuse, NY
- Kelly Cohn
  - Hematology Oncology Associates of Central New York, East Syracuse, NY

24
Jain NM, Culley A, Micheel CM, Osterman TJ, Levy MA. Learnings From Precision Clinical Trial Matching for Oncology Patients Who Received NGS Testing. JCO Clin Cancer Inform 2021; 5:231-238. [PMID: 33625867 PMCID: PMC8140789 DOI: 10.1200/cci.20.00142] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022]
Abstract
PURPOSE Tumor next-generation sequencing reports typically generate trial recommendations for patients based on their diagnosis and genomic profile. However, these require additional refinement and prescreening, which can add to physician burden. We wanted to use human prescreening efforts to efficiently refine these trial options and also elucidate the high-value parameters that have a major impact on efficient trial matching. METHODS Clinical trial recommendations were generated based on diagnosis and biomarker criteria using an informatics platform and were further refined by manual prescreening. The refined results were then compared with the initial trial recommendations and the reasons for false-positive matches were evaluated. RESULTS Manual prescreening significantly reduced the number of false positives from the informatics generated trial recommendations, as expected. We found that trial-specific criteria, especially recruiting status for individual trial arms, were a high value parameter and led to the largest number of automated false-positive matches. CONCLUSION Reflex clinical trial matching approaches that refine trial recommendations based on the clinical details as well as trial-specific criteria have the potential to help alleviate physician burden for selecting the most appropriate trial for their patient. Investing in publicly available resources that capture the recruiting status of a trial at the cohort or arm level would, therefore, allow us to make meaningful contributions to increase the clinical trial enrollments by eliminating false positives.
Affiliation(s)
- Neha M Jain
  - Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN
- Alison Culley
  - Vanderbilt-Ingram Cancer Center, Clinical Trial Shared Resource, Vanderbilt University Medical Center, Nashville, TN
- Christine M Micheel
  - Department of Medicine, Division of Hematology/Oncology, Vanderbilt University Medical Center, Nashville, TN
- Travis J Osterman
  - Department of Medicine, Division of Hematology/Oncology, Vanderbilt University Medical Center, Nashville, TN; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Mia A Levy
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN; Department of Internal Medicine, Division of Hematology/Oncology, Rush University Medical Center, Chicago, IL; Rush University Cancer Center, Rush University Medical Center, Chicago, IL

25
Stemerman R, Bunning T, Grover J, Kitzmiller R, Patel MD. Identifying Patient Phenotype Cohorts Using Prehospital Electronic Health Record Data. PREHOSP EMERG CARE 2021:1-14. [PMID: 33315497 DOI: 10.1080/10903127.2020.1859658] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/19/2020] [Accepted: 12/01/2020] [Indexed: 10/22/2022]
Abstract
Objective: Emergency medical services (EMS) provide critical interventions for patients with acute illness and injury and are important in implementing prehospital emergency care research. Retrospective, manual patient record review, the current reference standard for identifying patient cohorts, requires significant time and financial investment. We developed automated classification models to identify eligible patients for prehospital clinical trials using EMS clinical notes and compared model performance to manual review. Methods: With eligibility criteria for an ongoing prehospital study of chest pain patients, we used EMS clinical notes (n = 1208) to manually classify patients as eligible, ineligible, and indeterminate. We randomly split these same records into training and test sets to develop and evaluate machine-learning (ML) algorithms using natural language processing (NLP) for feature (variable) selection. We compared models to the manual classification to calculate sensitivity, specificity, accuracy, positive predictive value, and F1 measure. We measured clinical expert time to perform review for manual and automated methods. Results: ML models' sensitivity, specificity, accuracy, positive predictive value, and F1 measure ranged from 0.93 to 0.98. Compared to manual classification (N = 363 records), the automated method excluded 90.9% of records as ineligible, leaving only 33 records for manual review. Conclusions: Our ML-derived approach demonstrates the feasibility of developing a high-performing, automated classification system using EMS clinical notes to streamline the identification of a specific cardiac patient cohort. This efficient approach can be leveraged to facilitate prehospital patient-trial matching and patient phenotyping (e.g., influenza-like illness), and to create prehospital patient registries.
Affiliation(s)
- Rachel Stemerman
  - Carolina Health Informatics Program, University of North Carolina, Chapel Hill, North Carolina
- Thomas Bunning
  - Department of Anesthesiology, Duke University Medical Center, Durham, North Carolina
- Joseph Grover
  - Department of Emergency Medicine, University of North Carolina, Chapel Hill, North Carolina
- Rebecca Kitzmiller
  - Carolina Health Informatics Program, University of North Carolina, Chapel Hill, North Carolina
- Mehul D Patel
  - Department of Emergency Medicine, University of North Carolina, Chapel Hill, North Carolina

26
Wang Y, Lakoma A, Zogopoulos G. Building towards Precision Oncology for Pancreatic Cancer: Real-World Challenges and Opportunities. Genes (Basel) 2020; 11:E1098. [PMID: 32967105 PMCID: PMC7563487 DOI: 10.3390/genes11091098] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/05/2020] [Revised: 09/15/2020] [Accepted: 09/17/2020] [Indexed: 02/06/2023]
Abstract
The advent of next-generation sequencing (NGS) has provided unprecedented insight into the molecular complexity of pancreatic ductal adenocarcinoma (PDAC). This has led to the emergence of biomarker-driven treatment paradigms that challenge empiric treatment approaches. However, the growth of sequencing technologies is outpacing the development of the infrastructure required to implement precision oncology as routine clinical practice. Addressing these logistical barriers is imperative to maximize the clinical impact of molecular profiling initiatives. In this review, we examine the evolution of precision oncology in PDAC, spanning from germline testing for cancer susceptibility genes to multi-omic tumor profiling. Furthermore, we highlight real-world challenges to delivering precision oncology for PDAC, and propose strategies to improve the generation, interpretation, and clinical translation of molecular profiling data.
Affiliation(s)
- Yifan Wang
  - Department of Surgery, McGill University, Montreal, QC H4A 3J1, Canada
  - Research Institute of the McGill University Health Centre, McGill University, Montreal, QC H4A 3J1, Canada
  - The Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada
- Anna Lakoma
  - Department of Surgery, McGill University, Montreal, QC H4A 3J1, Canada
  - Research Institute of the McGill University Health Centre, McGill University, Montreal, QC H4A 3J1, Canada
  - The Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada
- George Zogopoulos
  - Department of Surgery, McGill University, Montreal, QC H4A 3J1, Canada
  - Research Institute of the McGill University Health Centre, McGill University, Montreal, QC H4A 3J1, Canada
  - The Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada

27
Johnson EA, Carrington JM. Clinical Research Integration Within the Electronic Health Record: A Literature Review. Comput Inform Nurs 2020; 39:129-135. [PMID: 33657055 DOI: 10.1097/cin.0000000000000659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Clinical trials have become commonplace as a treatment option. As clinical trial participants are integrated into all healthcare delivery settings, organizations are tasked with sustaining specific care regimens with appropriate documentation and maintenance of participant protections within electronic health records. Our aim was to identify the common elements necessary for electronic health record integration of clinical research for optimal trial conduct and participant management. Review of literature was conducted utilizing PubMed and CINAHL to identify relevant publications that described use of the electronic health record to directly support trial conduct, with a total of 15 publications ultimately meeting inclusion criteria. Three thematic groupings emerged that categorized common aspects of clinical research integration: functional, structural, and procedural components. These components include technological requirements (platform/system), regulatory and legal compliance, and stakeholder involvement with clinical trial procedures (recruitment of participants). Without a centralized means of providing clinicians with current treatment and adverse event management information, participant injury or likelihood of withdrawal will increase. Further research is required to develop an optimal model of research-related integration within commercial electronic health records.
Affiliation(s)
- Elizabeth A Johnson
  - Author Affiliations: The University of Arizona (Ms Johnson), Tucson; and University of Florida (Dr Carrington), Gainesville

28
Zeng K, Pan Z, Xu Y, Qu Y. An Ensemble Learning Strategy for Eligibility Criteria Text Classification for Clinical Trial Recruitment: Algorithm Development and Validation. JMIR Med Inform 2020; 8:e17832. [PMID: 32609092 PMCID: PMC7367522 DOI: 10.2196/17832] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/15/2020] [Revised: 03/09/2020] [Accepted: 03/14/2020] [Indexed: 01/09/2023]
Abstract
BACKGROUND Eligibility criteria are the main strategy for screening appropriate participants for clinical trials. Automatic analysis of clinical trial eligibility criteria by digital screening, leveraging natural language processing techniques, can improve recruitment efficiency and reduce the costs involved in promoting clinical research. OBJECTIVE We aimed to create a natural language processing model to automatically classify clinical trial eligibility criteria. METHODS We proposed a classifier for short text eligibility criteria based on ensemble learning, where a set of pretrained models was integrated. The pretrained models included state-of-the-art deep learning methods for training and classification, including Bidirectional Encoder Representations from Transformers (BERT), XLNet, and A Robustly Optimized BERT Pretraining Approach (RoBERTa). The classification results by the integrated models were combined as new features for training a Light Gradient Boosting Machine (LightGBM) model for eligibility criteria classification. RESULTS Our proposed method obtained an accuracy of 0.846, a precision of 0.803, and a recall of 0.817 on a standard data set from a shared task of an international conference. The macro F1 value was 0.807, outperforming the state-of-the-art baseline methods on the shared task. CONCLUSIONS We designed a model for screening short text classification criteria for clinical trials based on multimodel ensemble learning. Through experiments, we concluded that performance was improved significantly with a model ensemble compared to a single model. The introduction of focal loss could reduce the impact of class imbalance to achieve better performance.
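Illustrative sketch (an assumption, not the authors' implementation): the abstract credits focal loss with reducing the impact of class imbalance but does not give its form. The commonly used binary version scales cross-entropy by (1 - p_t)^gamma, so confidently correct examples contribute little and hard examples dominate. The function name and default parameters below are illustrative; the paper's multi-class setup may differ.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=1.0):
    """Binary focal loss for a single example.

    p: predicted probability of the positive class (0 < p < 1).
    y: true label, 0 or 1.
    gamma: focusing parameter; gamma = 0 recovers plain cross-entropy.
    alpha: constant weighting factor.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confidently correct prediction, heavily down-weighted
hard = focal_loss(0.1, 1)  # confidently wrong prediction, keeps most of its loss
print(easy, hard)
```

Because well-classified examples are typically those from the majority classes, shrinking their contribution shifts the training signal toward rarer, harder categories.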
Affiliation(s)
- Kun Zeng
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Zhiwei Pan
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Yibin Xu
  - School of Computer Science, South China Normal University, Guangzhou, China
- Yingying Qu
  - School of Business, Guangdong University of Foreign Studies, Guangzhou, China

29
Jain NM, Culley A, Knoop T, Micheel C, Osterman T, Levy M. Conceptual Framework to Support Clinical Trial Optimization and End-to-End Enrollment Workflow. JCO Clin Cancer Inform 2020; 3:1-10. [PMID: 31225983 PMCID: PMC6873934 DOI: 10.1200/cci.19.00033] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/19/2022]
Abstract
In this work, we present a conceptual framework to support clinical trial optimization and enrollment workflows and review the current state, limitations, and future trends in this space. This framework includes knowledge representation of clinical trials, clinical trial optimization, clinical trial design, enrollment workflows for prospective clinical trial matching, waitlist management, and, finally, evaluation strategies for assessing improvement.
Affiliation(s)
- Neha M Jain
  - Vanderbilt University Medical Center, Nashville, TN
- Teresa Knoop
  - Vanderbilt University Medical Center, Nashville, TN
- Mia Levy
  - Vanderbilt University Medical Center, Nashville, TN; Rush University Medical Center, Chicago, IL

30
Alexander M, Solomon B, Ball DL, Sheerin M, Dankwa-Mullan I, Preininger AM, Jackson GP, Herath DM. Evaluation of an artificial intelligence clinical trial matching system in Australian lung cancer patients. JAMIA Open 2020; 3:209-215. [PMID: 32734161 PMCID: PMC7382632 DOI: 10.1093/jamiaopen/ooaa002] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 12/13/2019] [Accepted: 01/31/2020] [Indexed: 11/21/2022]
Abstract
Objective The objective of this technical study was to evaluate the performance of an artificial intelligence (AI)-based system for clinical trials matching for a cohort of lung cancer patients in an Australian cancer hospital. Methods A lung cancer cohort was derived from clinical data from patients attending an Australian cancer hospital. Ten phases I–III clinical trials registered on clinicaltrials.gov and open to lung cancer patients at this institution were utilized for assessments. The trial matching system performance was compared to a gold standard established by clinician consensus for trial eligibility. Results The study included 102 lung cancer patients. The trial matching system evaluated 7252 patient attributes (per patient median 74, range 53–100) against 11 467 individual trial eligibility criteria (per trial median 597, range 243–4132). Median time for the system to run a query and return results was 15.5 s (range 7.2–37.8). In establishing the gold standard, clinician interrater agreement was high (Cohen’s kappa 0.70–1.00). On a per-patient basis, the performance of the trial matching system for eligibility was as follows: accuracy, 91.6%; recall (sensitivity), 83.3%; precision (positive predictive value), 76.5%; negative predictive value, 95.7%; and specificity, 93.8%. Discussion and Conclusion The AI-based clinical trial matching system allows efficient and reliable screening of cancer patients for clinical trials with 95.7% accuracy for exclusion and 91.6% accuracy for overall eligibility assessment; however, clinician input and oversight are still required. The automated system demonstrates promise as a clinical decision support tool to prescreen a large patient cohort to identify subjects suitable for further assessment.
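Illustrative sketch (toy data, not from the study): the gold standard above rests on clinician interrater agreement summarized by Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. A minimal computation for two raters, with hypothetical eligibility labels ("E" eligible, "I" ineligible), might look like this.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for ten patient-trial pairs.
rater_a = ["E", "E", "E", "I", "I", "I", "I", "I", "I", "I"]
rater_b = ["E", "E", "I", "I", "I", "I", "I", "I", "I", "E"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

In this toy case the raters agree on 8 of 10 items, but because chance agreement is 0.58, kappa is only about 0.52, which shows why raw agreement alone can overstate reliability.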
Affiliation(s)
- Marliese Alexander
  - Department of Pharmacy, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria, Australia
- Benjamin Solomon
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria, Australia; Department of Medical Oncology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- David L Ball
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria, Australia; Department of Radiation Oncology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Mimi Sheerin
  - IBM Watson Health, Cambridge, Massachusetts, USA
- Dishan M Herath
  - Department of Medical Oncology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
|
31
|
Yuan C, Ryan PB, Ta C, Guo Y, Li Z, Hardin J, Makadia R, Jin P, Shang N, Kang T, Weng C. Criteria2Query: a natural language interface to clinical databases for cohort definition. J Am Med Inform Assoc 2020; 26:294-305. [PMID: 30753493 PMCID: PMC6402359 DOI: 10.1093/jamia/ocy178] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 09/07/2018] [Revised: 11/16/2018] [Accepted: 11/29/2018] [Indexed: 11/12/2022]
Abstract
Objective Cohort definition is a bottleneck for conducting clinical research and depends on subjective decisions by domain experts. Data-driven cohort definition is appealing but requires substantial knowledge of terminologies and clinical data models. Criteria2Query is a natural language interface that facilitates human-computer collaboration for cohort definition and execution using clinical databases. Materials and Methods Criteria2Query uses a hybrid information extraction pipeline combining machine learning and rule-based methods to systematically parse eligibility criteria text, transforms it first into a structured criteria representation and next into sharable and executable clinical data queries represented as SQL queries conforming to the OMOP Common Data Model. Users can interactively review, refine, and execute queries in the ATLAS web application. To test effectiveness, we evaluated 125 criteria across different disease domains from ClinicalTrials.gov and 52 user-entered criteria. We evaluated F1 score and accuracy against 2 domain experts and calculated the average computation time for fully automated query formulation. We conducted an anonymous survey evaluating usability. Results Criteria2Query achieved 0.795 and 0.805 F1 score for entity recognition and relation extraction, respectively. Accuracies for negation detection, logic detection, entity normalization, and attribute normalization were 0.984, 0.864, 0.514 and 0.793, respectively. Fully automatic query formulation took 1.22 seconds/criterion. More than 80% (11+ of 13) of users would use Criteria2Query in their future cohort definition tasks. Conclusions We contribute a novel natural language interface to clinical databases. It is open source and supports fully automated and interactive modes for autonomous data-driven cohort definition by researchers with minimal human effort. We demonstrate its promising user friendliness and usability.
Affiliation(s)
- Chi Yuan
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA; Department of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing, Jiangsu Province, P.R. China
- Patrick B Ryan
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA; Epidemiology Analytics, Janssen Research and Development, Titusville, New Jersey, USA
- Casey Ta
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Yixuan Guo
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Ziran Li
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Jill Hardin
  - Epidemiology Analytics, Janssen Research and Development, Titusville, New Jersey, USA
- Rupa Makadia
  - Epidemiology Analytics, Janssen Research and Development, Titusville, New Jersey, USA
- Peng Jin
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Ning Shang
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Tian Kang
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
|
32
|
Abstract
OBJECTIVE Challenges with efficient patient recruitment including sociotechnical barriers for clinical trials are major barriers to the timely and efficacious conduct of translational studies. We conducted a time-and-motion study to investigate the workflow of clinical trial enrollment in a pediatric emergency department. METHODS We observed clinical research coordinators during 3 clinically staffed shifts. One clinical research coordinator was shadowed at a time. Tasks were marked in 30-second intervals and annotated to include patient screening, patient contact, performing procedures, and physician contact. Statistical analysis was conducted on the patient enrollment activities. RESULTS We conducted fifteen 120-minute observations from December 12, 2013, to January 3, 2014 and shadowed 8 clinical research coordinators. Patient screening took 31.62% of their time, patient contact took 18.67%, performing procedures took 17.6%, physician contact was 1%, and other activities took 31.0%. CONCLUSIONS Screening patients for eligibility constituted the most time. Automated screening methods could help reduce this time. The findings suggest improvement areas in recruitment planning to increase the efficiency of clinical trial enrollment.
|
33
|
Ni Y, Bermudez M, Kennebeck S, Liddy-Hicks S, Dexheimer J. A Real-Time Automated Patient Screening System for Clinical Trials Eligibility in an Emergency Department: Design and Evaluation. JMIR Med Inform 2019; 7:e14185. [PMID: 31342909 PMCID: PMC6685132 DOI: 10.2196/14185] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 03/28/2019] [Revised: 06/07/2019] [Accepted: 06/12/2019] [Indexed: 01/23/2023]
Abstract
Background One critical hurdle for clinical trial recruitment is the lack of an efficient method for identifying subjects who meet the eligibility criteria. Given the large volume of data documented in electronic health records (EHRs), it is labor-intensive for the staff to screen relevant information, particularly within the time frame needed. To facilitate subject identification, we developed a natural language processing (NLP) and machine learning–based system, Automated Clinical Trial Eligibility Screener (ACTES), which analyzes structured data and unstructured narratives automatically to determine patients’ suitability for clinical trial enrollment. In this study, we integrated the ACTES into clinical practice to support real-time patient screening. Objective This study aimed to evaluate ACTES’s impact on the institutional workflow, prospectively and comprehensively. We hypothesized that compared with the manual screening process, using EHR-based automated screening would improve efficiency of patient identification, streamline patient recruitment workflow, and increase enrollment in clinical trials. Methods The ACTES was fully integrated into the clinical research coordinators’ (CRC) workflow in the pediatric emergency department (ED) at Cincinnati Children’s Hospital Medical Center. The system continuously analyzed EHR information for current ED patients and recommended potential candidates for clinical trials. Relevant patient eligibility information was presented in real time on a dashboard available to CRCs to facilitate their recruitment. To assess the system’s effectiveness, we performed a multidimensional, prospective evaluation for a 12-month period, including a time-and-motion study, quantitative assessments of enrollment, and postevaluation usability surveys collected from the CRCs. Results Compared with manual screening, the use of ACTES reduced the patient screening time by 34% (P<.001). 
The saved time was redirected to other activities such as study-related administrative tasks (P=.03) and work-related conversations (P=.006) that streamlined teamwork among the CRCs. The quantitative assessments showed that automated screening improved the numbers of subjects screened, approached, and enrolled by 14.7%, 11.1%, and 11.1%, respectively, suggesting the potential of ACTES in streamlining recruitment workflow. Finally, the ACTES achieved a system usability scale of 80.0 in the postevaluation surveys, suggesting that it was a good computerized solution. Conclusions By leveraging NLP and machine learning technologies, the ACTES demonstrated good capacity for improving efficiency of patient identification. The quantitative assessments demonstrated the potential of ACTES in streamlining recruitment workflow and improving patient enrollment. The postevaluation surveys suggested that the system was a good computerized solution with satisfactory usability.
Affiliation(s)
- Yizhao Ni
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Monica Bermudez
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Stephanie Kennebeck
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Stacey Liddy-Hicks
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Judith Dexheimer
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
|
34
|
Shen W, Weaver AM, Salazar C, Samet JM, Diaz-Sanchez D, Tong H. Validation of a Dietary Questionnaire to Screen Omega-3 Fatty Acids Levels in Healthy Adults. Nutrients 2019; 11:nu11071470. [PMID: 31261632 PMCID: PMC6682879 DOI: 10.3390/nu11071470] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/16/2019] [Revised: 06/25/2019] [Accepted: 06/26/2019] [Indexed: 01/17/2023]
Abstract
To facilitate a clinical observational study to identify healthy volunteers with low (defined as ≤4%) and high (defined as ≥5.5%) omega-3 indices, a dietary questionnaire to rapidly assess habitual dietary intake of eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) was developed. This study aimed to determine the validity of this newly developed dietary questionnaire. One hundred and eight volunteers were included and were assessed for habitual dietary intake of EPA and DHA using the questionnaire. The United States Department of Agriculture food products database and nutrition fact label was referenced for calculation. Blood samples were collected for the analysis of fatty acids in whole blood specimens and to derive omega-3 indices. A linear correlation was observed between reported dietary consumption of EPA, DHA, EPA+DHA and the whole blood levels of EPA, DHA, and the omega-3 indices (r = 0.67, 0.62, 0.67, respectively, p < 0.001 for all). The findings also suggested that the questionnaire was substantially better at identifying volunteers with high omega-3 indices (sensitivity 89%, specificity 84%, and agreement 86%) compared to volunteers with low omega-3 indices (sensitivity 100%, specificity 66%, and agreement 42%). In conclusion, this newly developed questionnaire is an efficient tool for the assessment of omega-3 indices in study populations and is particularly effective in identifying individuals with high omega-3 indices.
Affiliation(s)
- Wan Shen
  - Oak Ridge Institute of Science and Education, 100 ORAU Way, Oak Ridge, TN 37830, USA
  - Department of Public and Allied Health, 119 Health and Human Services, Bowling Green State University, Bowling Green, OH 43403, USA
- Anne M Weaver
  - Environmental Public Health Division, National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, 104 Mason Farm Road, Chapel Hill, NC 27514, USA
- Claudia Salazar
  - Environmental Public Health Division, National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, 104 Mason Farm Road, Chapel Hill, NC 27514, USA
- James M Samet
  - Environmental Public Health Division, National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, 104 Mason Farm Road, Chapel Hill, NC 27514, USA
- David Diaz-Sanchez
  - Environmental Public Health Division, National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, 104 Mason Farm Road, Chapel Hill, NC 27514, USA
- Haiyan Tong
  - Environmental Public Health Division, National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, 104 Mason Farm Road, Chapel Hill, NC 27514, USA
|
35
|
Chen B, Jin H, Yang Z, Qu Y, Weng H, Hao T. An approach for transgender population information extraction and summarization from clinical trial text. BMC Med Inform Decis Mak 2019; 19:62. [PMID: 30961595 PMCID: PMC6454593 DOI: 10.1186/s12911-019-0768-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Background Gender information frequently exists in the eligibility criteria of clinical trial text as essential information for participant population recruitment. Particularly, current eligibility criteria text contains the incompleteness and ambiguity issues in expressing transgender population, leading to difficulties or even failure of transgender population recruitment in clinical trial studies. Methods A new gender model is proposed for providing comprehensive transgender requirement specification. In addition, an automated approach is developed to extract and summarize gender requirements from unstructured text in accordance with the gender model. This approach consists of: 1) the feature extraction module, and 2) the feature summarization module. The first module identifies and extracts gender features using heuristic rules and automatically-generated patterns. The second module summarizes gender requirements by relation inference. Results Based on 100,134 clinical trials from ClinicalTrials.gov, our approach was compared with 20 commonly applied machine learning methods. It achieved a macro-averaged precision of 0.885, a macro-averaged recall of 0.871 and a macro-averaged F1-measure of 0.878. The results illustrated that our approach outperformed all baseline methods in terms of both commonly used metrics and macro-averaged metrics. Conclusions This study presented a new gender model aiming for specifying the transgender requirement more precisely. We also proposed an approach for gender information extraction and summarization from unstructured clinical text to enhance transgender-related clinical trial population recruitment. The experiment results demonstrated that the approach was effective in transgender criteria extraction and summarization.
Affiliation(s)
- Boyu Chen
  - School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, China
- Hao Jin
  - School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, China
- Zhiwen Yang
  - School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, China
- Yingying Qu
  - School of Business, Guangdong University of Foreign Studies, Guangzhou, China
- Heng Weng
  - The Second Affiliated Hospital, Guangzhou University of Chinese Medicine, Guangzhou, China
- Tianyong Hao
  - School of Computer Science, South China Normal University, Guangzhou, China
|
36
|
Conte C, Vaysse C, Bosco P, Noize P, Fourrier-Reglat A, Despas F, Lapeyre-Mestre M. The value of a health insurance database to conduct pharmacoepidemiological studies in oncology. Therapie 2019; 74:279-288. [DOI: 10.1016/j.therap.2018.09.076] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/25/2018] [Accepted: 09/29/2018] [Indexed: 01/28/2023]
|
37
|
Brennan CW, Krumlauf M, Feigenbaum K, Gartrell K, Cusack G. Patient Acuity Related to Clinical Research: Concept Clarification and Literature Review. West J Nurs Res 2018; 41:1306-1331. [PMID: 30319047 DOI: 10.1177/0193945918804545] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
In research settings, clinical and research requirements contribute to nursing workload, staffing decisions, and resource allocation. The aim of this article is to define patient acuity in the context of clinical research, or research intensity, and report available instruments to measure it. The design was based on Centre for Reviews and Dissemination recommendations, including defining search terms, developing inclusion and exclusion criteria, followed by abstract review by three members of the team, thorough reading of each article by two team members, and data extraction procedures, including a quality appraisal of each article. Few instruments were available to measure research intensity. Findings provide foundational work for conceptual clarity and tool development, both of which are necessary before workforce allocation based on research intensity can occur.
Affiliation(s)
- Caitlin W Brennan
  - National Institutes of Health Clinical Center, Nursing Department, Bethesda, MD, USA
- Michael Krumlauf
  - National Institutes of Health Clinical Center, Nursing Department, Bethesda, MD, USA
- Kathryn Feigenbaum
  - National Institutes of Health Clinical Center, Nursing Department, Bethesda, MD, USA
- Kyungsook Gartrell
  - National Institutes of Health Clinical Center, Nursing Department, Bethesda, MD, USA; National Library of Medicine, Bethesda, MD, USA
- Georgie Cusack
  - National Heart, Lung, and Blood Institute, Bethesda, MD, USA
|
38
|
Mudaranthakam DP, Thompson J, Hu J, Pei D, Chintala SR, Park M, Fridley BL, Gajewski B, Koestler DC, Mayo MS. A Curated Cancer Clinical Outcomes Database (C3OD) for accelerating patient recruitment in cancer clinical trials. JAMIA Open 2018; 1:166-171. [PMID: 30474074 PMCID: PMC6241508 DOI: 10.1093/jamiaopen/ooy023] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/17/2018] [Revised: 04/29/2018] [Accepted: 05/29/2018] [Indexed: 11/13/2022]
Abstract
Data used to determine patient eligibility for cancer clinical trials often come from disparate sources that are typically maintained by different groups within an institution, use differing technologies, and are stored in different formats. Collecting data and resolving inconsistencies across sources increase the time it takes to screen eligible patients, potentially delaying study completion. To address these challenges, the Biostatistics and Informatics Shared Resource at The University of Kansas Cancer Center developed the Curated Cancer Clinical Outcomes Database (C3OD). C3OD merges data from the electronic medical record, tumor registry, bio-specimen and data registry, and allows querying through a single unified platform. By centralizing access and maintaining appropriate controls, C3OD allows researchers to more rapidly obtain detailed information about each patient in order to accelerate eligibility screening. This case report describes the design of this informatics platform as well as initial assessments of its reliability and usability.
Affiliation(s)
- Dinesh Pal Mudaranthakam
  - Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Cancer Center, Kansas City, Kansas, USA
- Jeffrey Thompson
  - Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Cancer Center, Kansas City, Kansas, USA
- Jinxiang Hu
  - Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Cancer Center, Kansas City, Kansas, USA
- Dong Pei
  - Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA
- Michele Park
  - University of Kansas Cancer Center, Kansas City, Kansas, USA
- Brooke L Fridley
  - Department of Biostatistics and Bioinformatics, Moffitt Cancer Center, Tampa, Florida, USA
- Byron Gajewski
  - Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Cancer Center, Kansas City, Kansas, USA
- Devin C Koestler
  - Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Cancer Center, Kansas City, Kansas, USA
- Matthew S Mayo
  - Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Cancer Center, Kansas City, Kansas, USA
|
39
|
Butler A, Wei W, Yuan C, Kang T, Si Y, Weng C. The Data Gap in the EHR for Clinical Research Eligibility Screening. AMIA Joint Summits on Translational Science Proceedings 2018; 2017:320-329. [PMID: 29888090 PMCID: PMC5961795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
Abstract
Much effort has been devoted to leverage EHR data for matching patients into clinical trials. However, EHRs may not contain all important data elements for clinical research eligibility screening. To better design research-friendly EHRs, an important step is to identify data elements frequently used for eligibility screening but not yet available in EHRs. This study fills this knowledge gap. Using the Alzheimer's disease domain as an example, we performed text mining on the eligibility criteria text in ClinicalTrials.gov to identify frequently used eligibility criteria concepts. We compared them to the EHR data elements of a cohort of Alzheimer's disease patients to assess the data gap by using the OMOP Common Data Model to standardize the representations for both criteria concepts and EHR data elements. We identified the most common SNOMED CT concepts used in Alzheimer's disease trials, and found 40% of common eligibility criteria concepts were not even defined in the concept space in the EHR dataset for a cohort of Alzheimer's disease patients, indicating a significant data gap may impede EHR-based eligibility screening. The results of this study can be useful for designing targeted research data collection forms to help fill the data gap in the EHR.
Affiliation(s)
- Alex Butler
  - Department of Biomedical Informatics, Columbia University, New York City, New York
- Wei Wei
  - Department of Biomedical Informatics, Columbia University, New York City, New York
- Chi Yuan
  - Department of Biomedical Informatics, Columbia University, New York City, New York
- Tian Kang
  - Department of Biomedical Informatics, Columbia University, New York City, New York
- Yuqi Si
  - The University of Texas Health Science Center at Houston, Houston, Texas
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York City, New York
|
40
|
Zhang K, Demner-Fushman D. Automated classification of eligibility criteria in clinical trials to facilitate patient-trial matching for specific patient populations. J Am Med Inform Assoc 2018; 24:781-787. [PMID: 28339690 DOI: 10.1093/jamia/ocw176] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 08/01/2016] [Accepted: 12/01/2016] [Indexed: 11/14/2022]
Abstract
Objective To develop automated classification methods for eligibility criteria in ClinicalTrials.gov to facilitate patient-trial matching for specific populations such as persons living with HIV or pregnant women. Materials and Methods We annotated 891 interventional cancer trials from ClinicalTrials.gov based on their eligibility for human immunodeficiency virus (HIV)-positive patients using their eligibility criteria. These annotations were used to develop classifiers based on regular expressions and machine learning (ML). After evaluating classification of cancer trials for eligibility of HIV-positive patients, we sought to evaluate the generalizability of our approach to more general diseases and conditions. We annotated the eligibility criteria for 1570 of the most recent interventional trials from ClinicalTrials.gov for HIV-positive and pregnancy eligibility, and the classifiers were retrained and reevaluated using these data. Results On the cancer-HIV dataset, the baseline regex model, the bag-of-words ML classifier, and the ML classifier with named entity recognition (NER) achieved macro-averaged F2 scores of 0.77, 0.87, and 0.87, respectively; the addition of NER did not result in a significant performance improvement. On the general dataset, ML + NER achieved macro-averaged F2 scores of 0.91 and 0.85 for HIV and pregnancy, respectively. Discussion and Conclusion The eligibility status of specific patient populations, such as persons living with HIV and pregnant women, for clinical trials is of interest to both patients and clinicians. We show that it is feasible to develop a high-performing, automated trial classification system for eligibility status that can be integrated into consumer-facing search engines as well as patient-trial matching systems.
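The macro-averaged F2 score reported in this abstract weights recall more heavily than precision and then averages the per-class scores without weighting by class size. The sketch below shows the standard F-beta formula under that reading; the function names `f_beta` and `macro_f_beta` and the per-class values are hypothetical and are not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta > 1 favours recall, beta = 1 gives F1."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # conventional value when both terms are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def macro_f_beta(per_class: list[tuple[float, float]], beta: float = 2.0) -> float:
    """Unweighted mean of per-class F-beta scores from (precision, recall) pairs."""
    scores = [f_beta(p, r, beta) for p, r in per_class]
    return sum(scores) / len(scores)


# Made-up (precision, recall) pairs for two classes, e.g. "eligible" and "ineligible".
classes = [(0.5, 1.0), (1.0, 1.0)]
print(f"macro F2 = {macro_f_beta(classes):.3f}")
print(f"macro F1 = {macro_f_beta(classes, beta=1.0):.3f}")
```

With the same inputs, F2 comes out higher than F1 whenever recall exceeds precision, which suits settings where wrongly excluding an eligible trial costs more than showing an extra candidate.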
Affiliation(s)
- Kevin Zhang
  - College of Medicine and Life Sciences, University of Toledo, Toledo, OH, USA
- Dina Demner-Fushman
  - Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
41
|
Conte C, Palmaro A, Grosclaude P, Daubisse-Marliac L, Despas F, Lapeyre-Mestre M. A novel approach for medical research on lymphomas: A study validation of claims-based algorithms to identify incident cases. Medicine (Baltimore) 2018; 97:e9418. [PMID: 29480830 PMCID: PMC5943849 DOI: 10.1097/md.0000000000009418] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/22/2022]
Abstract
The use of claims databases to study lymphomas in real-life conditions will be a crucial issue in the future. It is therefore essential to develop validated algorithms for the identification of lymphomas in these databases. The aim of this study was to assess the validity of diagnosis codes in the French health insurance database to identify incident cases of lymphomas, using the results of a regional cancer registry as the gold standard. Between 2010 and 2013, incident lymphomas were identified in hospital data through 2 selection algorithms. The results of the identification process and the characteristics of incident lymphoma cases were compared with data from the Tarn Cancer Registry. Each algorithm's performance was assessed by estimating sensitivity, positive predictive value, specificity (SPE), and negative predictive value. During the period, the registry recorded 476 incident cases of lymphomas, of which 52 were Hodgkin lymphomas and 424 non-Hodgkin lymphomas. For the corresponding area and period, algorithm 1 provided a number of incident cases close to that of the registry, whereas algorithm 2 overestimated the number of incident cases by approximately 30%. Both algorithms were highly specific (SPE = 99.9%) but moderately sensitive. The comparative analysis shows that similar distributions and characteristics are observed in both sources. Given these findings, claims databases can be considered a pertinent and powerful tool to conduct medico-economic or pharmacoepidemiological studies in lymphomas.
Affiliation(s)
- Cécile Conte
  - LEASP-UMR 1027, Inserm-University of Toulouse
  - Medical and Clinical Pharmacology Unit
- Aurore Palmaro
  - LEASP-UMR 1027, Inserm-University of Toulouse
  - Medical and Clinical Pharmacology Unit
  - CIC 1436, Toulouse University Hospital
- Pascale Grosclaude
  - LEASP-UMR 1027, Inserm-University of Toulouse
  - Claudius Regaud Institute, IUCT-O, Tarn Cancer Registry, Toulouse, France
- Laetitia Daubisse-Marliac
  - LEASP-UMR 1027, Inserm-University of Toulouse
  - Claudius Regaud Institute, IUCT-O, Tarn Cancer Registry, Toulouse, France
- Fabien Despas
  - LEASP-UMR 1027, Inserm-University of Toulouse
  - Medical and Clinical Pharmacology Unit
  - CIC 1436, Toulouse University Hospital
- Maryse Lapeyre-Mestre
  - LEASP-UMR 1027, Inserm-University of Toulouse
  - Medical and Clinical Pharmacology Unit
  - CIC 1436, Toulouse University Hospital
|
42
|
Speich B, von Niederhäusern B, Schur N, Hemkens LG, Fürst T, Bhatnagar N, Alturki R, Agarwal A, Kasenda B, Pauli-Magnus C, Schwenkglenks M, Briel M. Systematic review on costs and resource use of randomized clinical trials shows a lack of transparent and comprehensive data. J Clin Epidemiol 2017; 96:1-11. [PMID: 29288136 DOI: 10.1016/j.jclinepi.2017.12.018] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 07/31/2017] [Revised: 12/05/2017] [Accepted: 12/20/2017] [Indexed: 11/30/2022]
Abstract
OBJECTIVES Randomized clinical trials (RCTs) are costly. We aimed to provide a systematic overview of the available evidence on resource use and costs for RCTs to support budget planning. STUDY DESIGN AND SETTING We systematically searched MEDLINE, EMBASE, and HealthSTAR from inception until November 30, 2016 without language restrictions. We included any publication reporting empirical data on resource use and costs of RCTs and categorized them depending on whether they reported (i) resource and costs of all aspects at all study stages of an RCT (including conception, planning, preparation, conduct, and all tasks after the last patient has completed the RCT); (ii) on several aspects, (iii) on a single aspect (e.g., recruitment); or (iv) on overall costs for RCTs. Median costs of different recruitment strategies were calculated. Other results (e.g., overall costs) were listed descriptively. All cost data were converted into USD 2017. RESULTS A total of 56 articles that reported on cost or resource use of RCTs were included. None of the articles provided empirical resource use and cost data for all aspects of an entire RCT. Eight articles presented resource use and cost data on several aspects (e.g., aggregated cost data of different drug development phases, site-specific costs, selected cost components). Thirty-five articles assessed costs of one specific aspect of an RCT (i.e., 30 on recruitment; five others). The median costs per recruited patient were USD 409 (range: USD 41-6,990). Overall costs of an RCT, as provided in 16 articles, ranged from USD 43-103,254 per patient, and USD 0.2-611.5 Mio per RCT but the methodology of gathering these overall estimates remained unclear in 12 out of 16 articles (75%). CONCLUSION The usefulness of the available empirical evidence on resource use and costs of RCTs is limited. Transparent and comprehensive resource use and cost data are urgently needed to support budget planning for RCTs and help improve sustainability.
Affiliation(s)
- Benjamin Speich
- Department of Clinical Research, Basel Institute for Clinical Epidemiology and Biostatistics, University of Basel and University Hospital Basel, Switzerland
- Belinda von Niederhäusern
- Clinical Trial Unit, Department of Clinical Research, University of Basel and University Hospital Basel, Basel, Switzerland
- Nadine Schur
- Institute of Pharmaceutical Medicine, University of Basel, Basel, Switzerland
- Lars G Hemkens
- Department of Clinical Research, Basel Institute for Clinical Epidemiology and Biostatistics, University of Basel and University Hospital Basel, Switzerland
- Thomas Fürst
- Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, University of Basel, Basel, Switzerland; School of Public Health, Imperial College London, London, United Kingdom
- Neera Bhatnagar
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Reem Alturki
- Multi Organ Transplant Center, King Fahad Specialist Hospital Dammam, P.O. Box 15215, Dammam 31444, Saudi Arabia
- Arnav Agarwal
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; School of Medicine, University of Toronto, Toronto, Ontario, Canada
- Benjamin Kasenda
- Department of Clinical Research, Basel Institute for Clinical Epidemiology and Biostatistics, University of Basel and University Hospital Basel, Switzerland; Department of Medical Oncology, University of Basel and University Hospital Basel, Switzerland
- Christiane Pauli-Magnus
- Clinical Trial Unit, Department of Clinical Research, University of Basel and University Hospital Basel, Basel, Switzerland
- Matthias Schwenkglenks
- Institute of Pharmaceutical Medicine, University of Basel, Basel, Switzerland; Epidemiology, Biostatistics and Prevention Institute, University of Zürich, Zürich, Switzerland
- Matthias Briel
- Department of Clinical Research, Basel Institute for Clinical Epidemiology and Biostatistics, University of Basel and University Hospital Basel, Switzerland; Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.
|
43
|
Koopman B, Zuccon G, Bruza P. What makes an effective clinical query and querier? J Assoc Inf Sci Technol 2017. [DOI: 10.1002/asi.23959] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/08/2022]
Affiliation(s)
- Bevan Koopman
- Australian e-Health Research Centre, CSIRO, Brisbane, Australia
- Guido Zuccon
- School of Electrical Engineering & Computer Science, Queensland University of Technology, Brisbane, Australia
- Peter Bruza
- School of Information Systems, Queensland University of Technology, Brisbane, Australia
|
44
|
Shivade C, Hebert C, Regan K, Fosler-Lussier E, Lai AM. Automatic data source identification for clinical trial eligibility criteria resolution. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2017; 2016:1149-1158. [PMID: 28269912 PMCID: PMC5333255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Clinical trial coordinators refer to both structured and unstructured sources of data when evaluating a subject for eligibility. While some eligibility criteria can be resolved using structured data, some require manual review of clinical notes. An important step in automating the trial screening process is to be able to identify the right data source for resolving each criterion. In this work, we discuss the creation of an eligibility criteria dataset for clinical trials for patients with two disparate diseases, annotated with the preferred data source for each criterion (i.e., structured or unstructured) by annotators with medical training. The dataset includes 50 heart-failure trials with a total of 766 eligibility criteria and 50 trials for chronic lymphocytic leukemia (CLL) with 677 criteria. Further, we developed machine learning models to predict the preferred data source: kernel methods outperform simpler learning models when used with a combination of lexical, syntactic, semantic, and surface features. Evaluation of these models indicates that the performance is consistent across data from both diagnoses, indicating generalizability of our method. Our findings are an important step towards ongoing efforts for automation of clinical trial screening.
Affiliation(s)
- Courtney Hebert
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH
- Kelly Regan
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH
- Albert M Lai
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH; National Institutes of Health, Rehabilitation Medicine Department, Mark O. Hatfield Clinical Research Center, Bethesda, MD
|
45
|
Massett HA, Mishkin G, Rubinstein L, Ivy SP, Denicoff A, Godwin E, DiPiazza K, Bolognese J, Zwiebel JA, Abrams JS. Challenges Facing Early Phase Trials Sponsored by the National Cancer Institute: An Analysis of Corrective Action Plans to Improve Accrual. Clin Cancer Res 2016; 22:5408-5416. [PMID: 27401246 DOI: 10.1158/1078-0432.ccr-16-0338] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 02/05/2016] [Revised: 06/28/2016] [Accepted: 06/29/2016] [Indexed: 11/16/2022]
Abstract
Accruing patients in a timely manner represents a significant challenge to early phase cancer clinical trials. The NCI Cancer Therapy Evaluation Program analyzed 19 months of corrective action plans (CAP) received for slow-accruing phase I and II trials to identify slow accrual reasons, evaluate whether proposed corrective actions matched these reasons, and assess the CAP impact on trial accrual, duration, and likelihood of meeting primary scientific objectives. Of the 135 CAPs analyzed, 69 were for phase I trials and 66 for phase II trials. Primary reasons cited for slow accrual were safety/toxicity (phase I: 48%), design/protocol concerns (phase I: 42%, phase II: 33%), and eligibility criteria (phase I: 41%, phase II: 35%). The most commonly proposed corrective actions were adding institutions (phase I: 43%, phase II: 85%) and amending the trial to change eligibility or design (phase I: 55%, phase II: 44%). Only 40% of CAPs provided proposed corrective actions that matched the reasons given for slow accrual. Seventy percent of trials were closed to accrual at time of analysis (phase I = 48; phase II = 46). Of these, 67% of phase I and 70% of phase II trials met their primary objectives, but they were active three times longer than projected. Among closed trials, 24% had an accrual rate increase associated with a greater likelihood of meeting their primary scientific objectives. Ultimately, trials receiving CAPs saw improved accrual rates. Future trials may benefit from implementing CAPs early in trial life cycles, but it may be more beneficial to invest in earlier accrual planning. Clin Cancer Res; 22(22); 5408-16. ©2016 AACR. See related commentary by Mileham and Kim, p. 5397.
Affiliation(s)
- S Percy Ivy
- National Cancer Institute, Bethesda, Maryland
|
46
|
Kondylakis H, Claerhout B, Keyur M, Koumakis L, van Leeuwen J, Marias K, Perez-Rey D, De Schepper K, Tsiknakis M, Bucur A. The INTEGRATE project: Delivering solutions for efficient multi-centric clinical research and trials. J Biomed Inform 2016; 62:32-47. [PMID: 27224847 DOI: 10.1016/j.jbi.2016.05.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 12/14/2015] [Revised: 05/05/2016] [Accepted: 05/17/2016] [Indexed: 10/21/2022]
Abstract
The objective of the recently concluded INTEGRATE project (http://www.fp7-integrate.eu/) was the development of innovative biomedical applications focused on streamlining the execution of clinical research, on enabling multidisciplinary collaboration, on management and large-scale sharing of multi-level heterogeneous datasets, and on the development of new methodologies and of predictive multi-scale models in cancer. In this paper, we present the way the INTEGRATE consortium has approached important challenges such as the integration of multi-scale biomedical data in the context of post-genomic clinical trials, the development of predictive models, and the implementation of tools to facilitate the efficient execution of post-genomic multi-centric clinical trials in breast cancer. Furthermore, we provide a number of key "lessons learned" during the process and give directions for future research and development.
Affiliation(s)
- Haridimos Kondylakis
- Computational BioMedicine Laboratory, FORTH-ICS, N. Plastira 100, Heraklion, Greece.
- Brecht Claerhout
- Custodix NV, Kortrijksesteenweg 214b3, Sint-Martens-Latem, Belgium
- Mehta Keyur
- German Breast Group, GBG Forschungs GmbH, Managing Director: Prof. Dr. med. Gunter von Minckwitz, Commercial Register: Amtsgericht Offenbach, HRB 40477, registered office: Neu-Isenburg, Germany
- Lefteris Koumakis
- Computational BioMedicine Laboratory, FORTH-ICS, N. Plastira 100, Heraklion, Greece
- Kostas Marias
- Computational BioMedicine Laboratory, FORTH-ICS, N. Plastira 100, Heraklion, Greece
- David Perez-Rey
- Biomedical Informatics Group, DLSIIS & DIA, Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo S/N, 28660 Boadilla del Monte, Madrid, Spain
- Manolis Tsiknakis
- Computational BioMedicine Laboratory, FORTH-ICS, N. Plastira 100, Heraklion, Greece; Department of Informatics Engineering, Technological Educational Institute of Crete, Estavromenos 71004, Heraklion, Crete, Greece
- Anca Bucur
- PHILIPS Research Europe, High Tech Campus 34, Eindhoven, Netherlands
|
47
|
Eubank MH, Hyman DM, Kanakamedala AD, Gardos SM, Wills JM, Stetson PD. Automated eligibility screening and monitoring for genotype-driven precision oncology trials. J Am Med Inform Assoc 2016; 23:777-81. [PMID: 27016727 DOI: 10.1093/jamia/ocw020] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 08/29/2015] [Accepted: 01/31/2016] [Indexed: 11/13/2022] Open
Abstract
The Information Systems Department at Memorial Sloan Kettering Cancer Center developed the DARWIN Cohort Management System (DCMS). The DCMS identifies and tracks cohorts of patients based on genotypic and clinical data. It assists researchers and treating physicians in enrolling patients to genotype-matched IRB-approved clinical trials. The DCMS sends automated, actionable, and secure email notifications to users with information about eligible or enrolled patients before their upcoming appointments. The system also captures investigators' input via annotations on patient eligibility and preferences on future status updates. As of August 2015, the DCMS is tracking 159,893 patients on both clinical operations and research cohorts. 134 research cohorts have been established and track 64,473 patients. 51,192 of these have had one or more genomic tests including MSK-IMPACT, comprising the pool eligible for genotype-matched studies. This paper describes the design and evolution of this informatics solution.
Affiliation(s)
- Michael H Eubank
- Information Systems, Memorial Sloan Kettering Cancer Center, New York, NY
- David M Hyman
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Stuart M Gardos
- Information Systems, Memorial Sloan Kettering Cancer Center, New York, NY
- Jonathan M Wills
- Information Systems, Memorial Sloan Kettering Cancer Center, New York, NY
- Peter D Stetson
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY; Division of Health Informatics, Memorial Sloan Kettering Cancer Center, New York, NY
|
48
|
Shivade C, Hebert C, Lopetegui M, de Marneffe MC, Fosler-Lussier E, Lai AM. Textual inference for eligibility criteria resolution in clinical trials. J Biomed Inform 2015; 58 Suppl:S211-S218. [PMID: 26376462 DOI: 10.1016/j.jbi.2015.09.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/11/2015] [Revised: 09/02/2015] [Accepted: 09/04/2015] [Indexed: 10/23/2022]
Abstract
Clinical trials are essential for determining whether new interventions are effective. In order to determine the eligibility of patients to enroll into these trials, clinical trial coordinators often perform a manual review of clinical notes in the electronic health record of patients. This is a very time-consuming and exhausting task. Efforts in this process can be expedited if these coordinators are directed toward specific parts of the text that are relevant for eligibility determination. In this study, we describe the creation of a dataset that can be used to evaluate automated methods capable of identifying sentences in a note that are relevant for screening a patient's eligibility in clinical trials. Using this dataset, we also present results for four simple methods in natural language processing that can be used to automate this task. We found that this is a challenging task (maximum F-score=26.25), but it is a promising direction for further research.
Affiliation(s)
- Chaitanya Shivade
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA.
- Courtney Hebert
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Marcelo Lopetegui
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA; Clínica Alemana de Santiago, Facultad de Medicina Clínica Alemana, Universidad del Desarrollo, Santiago, Chile
- Eric Fosler-Lussier
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Albert M Lai
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
|
49
|
Ni Y, Wright J, Perentesis J, Lingren T, Deleger L, Kaiser M, Kohane I, Solti I. Increasing the efficiency of trial-patient matching: automated clinical trial eligibility pre-screening for pediatric oncology patients. BMC Med Inform Decis Mak 2015; 15:28. [PMID: 25881112 PMCID: PMC4407835 DOI: 10.1186/s12911-015-0149-3] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 08/13/2014] [Accepted: 03/24/2015] [Indexed: 11/22/2022] Open
Abstract
Background Manual eligibility screening (ES) for a clinical trial typically requires a labor-intensive review of patient records that utilizes many resources. Leveraging state-of-the-art natural language processing (NLP) and information extraction (IE) technologies, we sought to improve the efficiency of physician decision-making in clinical trial enrollment. In order to markedly reduce the pool of potential candidates for staff screening, we developed an automated ES algorithm to identify patients who meet core eligibility characteristics of an oncology clinical trial. Methods We collected narrative eligibility criteria from ClinicalTrials.gov for 55 clinical trials actively enrolling oncology patients in our institution between 12/01/2009 and 10/31/2011. In parallel, our ES algorithm extracted clinical and demographic information from the Electronic Health Record (EHR) data fields to represent profiles of all 215 oncology patients admitted to cancer treatment during the same period. The automated ES algorithm then matched the trial criteria with the patient profiles to identify potential trial-patient matches. Matching performance was validated on a reference set of 169 historical trial-patient enrollment decisions, and workload, precision, recall, negative predictive value (NPV) and specificity were calculated. Results Without automation, an oncologist would need to review 163 patients per trial on average to replicate the historical patient enrollment for each trial. This workload is reduced by 85% to 24 patients when using automated ES (precision/recall/NPV/specificity: 12.6%/100.0%/100.0%/89.9%). Without automation, an oncologist would need to review 42 trials per patient on average to replicate the patient-trial matches that occur in the retrospective data set. With automated ES this workload is reduced by 90% to four trials (precision/recall/NPV/specificity: 35.7%/100.0%/100.0%/95.5%). 
Conclusion By leveraging NLP and IE technologies, automated ES could dramatically increase the trial screening efficiency of oncologists and enable participation of small practices, which are often left out from trial enrollment. The algorithm has the potential to significantly reduce the effort to execute clinical research at a point in time when new initiatives of the cancer care community intend to greatly expand both the access to trials and the number of available trials.
Affiliation(s)
- Yizhao Ni
- Cincinnati Children's Hospital Medical Center, Department of Biomedical Informatics, 3333 Burnet Avenue, MLC 7024, Cincinnati, OH, USA.
- Jordan Wright
- Cancer and Blood Disease Institute, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- John Perentesis
- Cancer and Blood Disease Institute, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Todd Lingren
- Cincinnati Children's Hospital Medical Center, Department of Biomedical Informatics, 3333 Burnet Avenue, MLC 7024, Cincinnati, OH, USA
- Louise Deleger
- Cincinnati Children's Hospital Medical Center, Department of Biomedical Informatics, 3333 Burnet Avenue, MLC 7024, Cincinnati, OH, USA
- Megan Kaiser
- Cincinnati Children's Hospital Medical Center, Department of Biomedical Informatics, 3333 Burnet Avenue, MLC 7024, Cincinnati, OH, USA
- Isaac Kohane
- Center for Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Imre Solti
- Cincinnati Children's Hospital Medical Center, Department of Biomedical Informatics, 3333 Burnet Avenue, MLC 7024, Cincinnati, OH, USA; James M Anderson Center for Health Systems Excellence, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
|
50
|
Crowe CL, Tao C. Designing Ontology-based Patterns for the Representation of the Time-Relevant Eligibility Criteria of Clinical Protocols. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2015; 2015:173-7. [PMID: 26306263 PMCID: PMC4525239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The amount of time and money required to screen patients for clinical trial and guideline eligibility presents the need for an automated screening process to streamline clinical trial enrollment and guideline implementation. This paper introduces an ontology-based approach for defining a set of patterns that can be used to represent various types of time-relevant eligibility criteria that may appear in clinical protocols. With a focus only on temporal requirements, we examined the criteria of 600 protocols and extracted a set of 37 representative time-relevant eligibility criteria. 16 patterns were designed to represent these criteria. Using a test set of an additional 100 protocols, it was found that these 16 patterns could sufficiently represent 98.5% of the time-relevant criteria. Once the time-relevant criteria are modeled by these patterns, it becomes possible to (1) use natural language processing algorithms to automatically extract temporal constraints from criteria; and (2) develop computer rules and queries to automate the processing of the criteria.
Affiliation(s)
- Cui Tao
- University of Texas School of Biomedical Informatics, Houston, TX
|