1
Lee K, Liu Z, Chandran U, Kalsekar I, Laxmanan B, Higashi MK, Jun T, Ma M, Li M, Mai Y, Gilman C, Wang T, Ai L, Aggarwal P, Pan Q, Oh W, Stolovitzky G, Schadt E, Wang X. Detecting Ground Glass Opacity Features in Patients With Lung Cancer: Automated Extraction and Longitudinal Analysis via Deep Learning-Based Natural Language Processing. JMIR AI 2023;2:e44537. [PMID: 38875565; PMCID: PMC11041451; DOI: 10.2196/44537]
Abstract
BACKGROUND Ground-glass opacities (GGOs) appearing in computed tomography (CT) scans may indicate potential lung malignancy. Proper management of GGOs based on their features can prevent the development of lung cancer. Electronic health records are rich sources of information on GGO nodules and their granular features, but most of the valuable information is embedded in unstructured clinical notes. OBJECTIVE We aimed to develop, test, and validate a deep learning-based natural language processing (NLP) tool that automatically extracts GGO features to inform the longitudinal trajectory of GGO status from large-scale radiology notes. METHODS We developed a deep learning NLP pipeline based on a bidirectional long short-term memory network with a conditional random field (BiLSTM-CRF) to extract GGO and its granular features retrospectively from radiology notes of 13,216 patients with lung cancer. We evaluated the pipeline with quality assessments and characterized the cohort by the distribution of nodule features longitudinally to assess changes in size and solidity over time. RESULTS Our NLP pipeline built on the GGO ontology we developed achieved between 95% and 100% precision, 89% and 100% recall, and 92% and 100% F1-scores on different GGO features. We deployed this GGO NLP model to extract and structure comprehensive characteristics of GGOs from 29,496 radiology notes of 4521 lung cancer patients. Longitudinal analysis revealed that size increased in 16.8% (240/1424) of patients, decreased in 14.6% (208/1424), and remained unchanged in 68.5% (976/1424) in the last note compared with the first note. Among 1127 patients who had longitudinal radiology notes of GGO status, 815 (72.3%) were reported to have stable status, and 259 (23%) had increased/progressed status in the subsequent notes. CONCLUSIONS Our deep learning-based NLP pipeline can automatically extract granular GGO features at scale from electronic health records when this information is documented in radiology notes and help inform the natural history of GGO. This will open the way for a new paradigm in lung cancer prevention and early detection.
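As a rough illustration of the BiLSTM-CRF tagging approach described in this abstract (not the authors' pipeline), the sketch below builds a plain BiLSTM token tagger in PyTorch; the BIO-style GGO label set, toy vocabulary, and inputs are assumptions, and the CRF transition layer the paper adds on top of the emission scores is omitted for brevity.

```python
# Minimal sketch, assuming a BIO-style label set for GGO features.
# A real BiLSTM-CRF adds a CRF layer over these emission scores.
import torch
import torch.nn as nn

TAGS = ["O", "B-GGO", "I-GGO", "B-SIZE", "I-SIZE"]  # assumed label set

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, num_tags=len(TAGS)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2, bidirectional=True, batch_first=True)
        self.emissions = nn.Linear(hidden_dim, num_tags)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids)
        out, _ = self.lstm(x)
        return self.emissions(out)                 # (batch, seq_len, num_tags)

if __name__ == "__main__":
    # Toy stand-in for "ground glass opacity measuring 6 mm" as integer token ids.
    model = BiLSTMTagger(vocab_size=1000)
    tokens = torch.randint(1, 1000, (1, 7))
    gold = torch.tensor([[1, 2, 2, 0, 3, 4, 0]])   # B-GGO I-GGO I-GGO O B-SIZE I-SIZE O
    logits = model(tokens)
    loss = nn.CrossEntropyLoss()(logits.view(-1, len(TAGS)), gold.view(-1))
    loss.backward()
    print("loss:", float(loss), "predicted tags:", logits.argmax(-1).tolist())
```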
Affiliation(s)
- Urmila Chandran, Lung Cancer Initiative, Johnson & Johnson, New Brunswick, NJ, United States
- Iftekhar Kalsekar, Lung Cancer Initiative, Johnson & Johnson, New Brunswick, NJ, United States
- Balaji Laxmanan, Lung Cancer Initiative, Johnson & Johnson, New Brunswick, NJ, United States
- Tomi Jun, Sema4, Stamford, CT, United States
- Meng Ma, Sema4, Stamford, CT, United States
- Yun Mai, Sema4, Stamford, CT, United States
- Lei Ai, Sema4, Stamford, CT, United States
- Qi Pan, Sema4, Stamford, CT, United States
- William Oh, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Eric Schadt, Icahn School of Medicine at Mount Sinai, New York, NY, United States
2
The Role of Artificial Intelligence in Early Cancer Diagnosis. Cancers (Basel) 2022;14:1524. [PMID: 35326674; PMCID: PMC8946688; DOI: 10.3390/cancers14061524]
Abstract
Improving the proportion of patients diagnosed with early-stage cancer is a key priority of the World Health Organisation. In many tumour groups, screening programmes have led to improvements in survival, but patient selection and risk stratification are key challenges. In addition, there are concerns about limited diagnostic workforces, particularly in light of the COVID-19 pandemic, placing a strain on pathology and radiology services. In this review, we discuss how artificial intelligence algorithms could assist clinicians in (1) screening asymptomatic patients at risk of cancer, (2) investigating and triaging symptomatic patients, and (3) more effectively diagnosing cancer recurrence. We provide an overview of the main artificial intelligence approaches, including historical models such as logistic regression, as well as deep learning and neural networks, and highlight their early diagnosis applications. Many data types are suitable for computational analysis, including electronic healthcare records, diagnostic images, pathology slides and peripheral blood, and we provide examples of how these data can be utilised to diagnose cancer. We also discuss the potential clinical implications for artificial intelligence algorithms, including an overview of models currently used in clinical practice. Finally, we discuss the potential limitations and pitfalls, including ethical concerns, resource demands, data security and reporting standards.
3
Short RG, Dondlinger S, Wildman-Tobriner B. Management of Incidental Thyroid Nodules on Chest CT: Using Natural Language Processing to Assess White Paper Adherence and Track Patient Outcomes. Acad Radiol 2022;29:e18-e24. [PMID: 33757722; DOI: 10.1016/j.acra.2021.02.019]
Abstract
OBJECTIVE The purpose of this study was to develop a natural language processing (NLP) pipeline to identify incidental thyroid nodules (ITNs) meeting criteria for sonographic follow-up and to assess both adherence rates to white paper recommendations and downstream outcomes related to these incidental findings. METHODS A total of 21,583 non-contrast chest CT reports from 2017 and 2018 were retrospectively evaluated to identify reports that included either an explicit recommendation for thyroid ultrasound, a description of a nodule ≥ 1.5 cm, or a description of a nodule with suspicious features. Reports from 2018 were used to train an NLP algorithm called fastText for automated identification of such reports. Algorithm performance was then evaluated on the 2017 reports. Next, any patient from 2017 with a report meeting criteria for ultrasound follow-up was further evaluated with manual chart review to determine follow-up adherence rates and nodule-related outcomes. RESULTS NLP identified reports with ITNs meeting criteria for sonographic follow-up with an accuracy of 96.5% (95% CI 96.2-96.7) and sensitivity of 92.1% (95% CI 89.8-94.3). In 10,006 chest CTs from 2017, ITN follow-up ultrasound was indicated according to white paper criteria in 81 patients (0.8%), explicitly recommended in 46.9% (38/81) of patients, and obtained in less than half of the patients for whom it was appropriately recommended (17/35, 48.6%). DISCUSSION NLP accurately identified chest CT reports meeting criteria for ITN ultrasound follow-up. Radiologist adherence to white paper guidelines and subsequent referrer adherence to radiologist recommendations showed room for improvement.
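The classifier named here is fastText trained on labeled report text. A minimal sketch of that workflow, assuming the `fasttext` Python package and invented labels, file names, and example sentences (none of which come from the paper), might look like this:

```python
# Sketch of supervised fastText report classification under assumed labels/data.
import fasttext

# Tiny, invented training set in fastText's "__label__<tag> <text>" format.
train_lines = [
    "__label__followup 1.8 cm right thyroid nodule , dedicated ultrasound recommended",
    "__label__followup suspicious thyroid nodule with microcalcifications",
    "__label__no_followup subcentimeter thyroid nodule , no suspicious features",
    "__label__no_followup thyroid unremarkable , no nodules seen",
]
with open("train_reports.txt", "w") as f:
    f.write("\n".join(train_lines) + "\n")

# wordNgrams=2 lets the model learn phrases like "thyroid nodule".
model = fasttext.train_supervised(input="train_reports.txt", epoch=25, lr=0.5, wordNgrams=2)

labels, probs = model.predict("incidental 2.1 cm left thyroid nodule ; ultrasound is recommended")
print(labels[0], round(float(probs[0]), 3))
```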
Affiliation(s)
- Ryan G Short, Mallinckrodt Institute of Radiology, Washington University School of Medicine in Saint Louis, 510 South Kingshighway Blvd., Saint Louis, MO 63110
4
Hunter B, Reis S, Campbell D, Matharu S, Ratnakumar P, Mercuri L, Hindocha S, Kalsi H, Mayer E, Glampson B, Robinson EJ, Al-Lazikani B, Scerri L, Bloch S, Lee R. Development of a Structured Query Language and Natural Language Processing Algorithm to Identify Lung Nodules in a Cancer Centre. Front Med (Lausanne) 2021;8:748168. [PMID: 34805217; PMCID: PMC8599820; DOI: 10.3389/fmed.2021.748168]
Abstract
Importance: The stratification of indeterminate lung nodules is a growing problem, but the burden of lung nodules on healthcare services is not well-described. Manual service evaluation and research cohort curation can be time-consuming and potentially improved by automation. Objective: To automate lung nodule identification in a tertiary cancer centre. Methods: This retrospective cohort study used Electronic Healthcare Records to identify CT reports generated between 31st October 2011 and 24th July 2020. A structured query language/natural language processing tool was developed to classify reports according to lung nodule status. Performance was externally validated. Sentences were used to train machine-learning classifiers to predict concerning nodule features in 2,000 patients. Results: 14,586 patients with lung nodules were identified. The cancer types most commonly associated with lung nodules were lung (39%), neuro-endocrine (38%), skin (35%), colorectal (33%) and sarcoma (33%). Lung nodule patients had a greater proportion of metastatic diagnoses (45 vs. 23%, p < 0.001), a higher mean post-baseline scan number (6.56 vs. 1.93, p < 0.001), and a shorter mean scan interval (4.1 vs. 5.9 months, p < 0.001) than those without nodules. Inter-observer agreement for sentence classification was 0.94 internally and 0.98 externally. Sensitivity and specificity for nodule identification were 93 and 99% internally, and 100 and 100% at external validation, respectively. A linear-support vector machine model predicted concerning sentence features with 94% accuracy. Conclusion: We have developed and validated an accurate tool for automated lung nodule identification that is valuable for service evaluation and research data acquisition.
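The sentence-level classifier reported here is a linear support vector machine over report sentences. As a rough, hedged sketch of that kind of model (invented sentences and labels, not the study's data or features), a TF-IDF plus linear SVM pipeline could look like this:

```python
# Illustrative sketch only: a linear SVM over TF-IDF sentence features,
# similar in spirit to the concerning-feature classifier described above.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

sentences = [
    "spiculated 12 mm nodule in the right upper lobe, concerning for malignancy",
    "stable 4 mm nodule, unchanged from prior, likely benign",
    "new 9 mm part-solid nodule with irregular margins",
    "tiny calcified granuloma, no suspicious nodules",
]
concerning = [1, 0, 1, 0]  # 1 = concerning nodule features, 0 = not concerning

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(sentences, concerning)

print(clf.predict(["enlarging spiculated nodule suspicious for malignancy"]))
```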
Affiliation(s)
- Benjamin Hunter
- The Royal Marsden National Health Service (NHS) Foundation Trust, Lung Unit, London, United Kingdom.,Department of Surgery and Cancer, Imperial College London, London, United Kingdom
| | - Sara Reis
- The Royal Marsden National Health Service (NHS) Foundation Trust, Lung Unit, London, United Kingdom
| | - Des Campbell
- The Royal Marsden National Health Service (NHS) Foundation Trust, Lung Unit, London, United Kingdom
| | - Sheila Matharu
- The Royal Marsden National Health Service (NHS) Foundation Trust, Lung Unit, London, United Kingdom
| | | | - Luca Mercuri
- Imperial College Healthcare National Health Service (NHS) Trust, Imperial Clinical Analytics, Research and Evaluation, London, United Kingdom
| | - Sumeet Hindocha
- The Royal Marsden National Health Service (NHS) Foundation Trust, Lung Unit, London, United Kingdom.,Department of Surgery and Cancer, Imperial College London, London, United Kingdom
| | - Hardeep Kalsi
- The Royal Marsden National Health Service (NHS) Foundation Trust, Lung Unit, London, United Kingdom.,Department of Surgery and Cancer, Imperial College London, London, United Kingdom
| | - Erik Mayer
- Department of Surgery and Cancer, Imperial College London, London, United Kingdom.,Imperial College Healthcare National Health Service (NHS) Trust, Imperial Clinical Analytics, Research and Evaluation, London, United Kingdom
| | - Ben Glampson
- Imperial College Healthcare National Health Service (NHS) Trust, Imperial Clinical Analytics, Research and Evaluation, London, United Kingdom
| | - Emily J Robinson
- The Royal Marsden National Health Service (NHS) Foundation Trust, Royal Marsden Clinical Trials Unit, London, United Kingdom
| | - Bisan Al-Lazikani
- The Institute for Cancer Research, Computational Biology and Chromogenetics, London, United Kingdom
| | - Lisa Scerri
- The Royal Marsden National Health Service (NHS) Foundation Trust, Lung Unit, London, United Kingdom
| | - Susannah Bloch
- Imperial College Healthcare Trust, Respiratory Medicine, London, United Kingdom
| | - Richard Lee
- The Royal Marsden National Health Service (NHS) Foundation Trust, Lung Unit, London, United Kingdom.,Imperial College London, National Heart and Lung Institute, London, United Kingdom.,The Institute for Cancer Research, Early Diagnosis and Detection, Genetics and Epidemiology, London, United Kingdom
| |
5
Automatic detection of actionable radiology reports using bidirectional encoder representations from transformers. BMC Med Inform Decis Mak 2021;21:262. [PMID: 34511100; PMCID: PMC8436473; DOI: 10.1186/s12911-021-01623-6]
Abstract
Background It is essential for radiologists to communicate actionable findings to the referring clinicians reliably. Natural language processing (NLP) has been shown to help identify free-text radiology reports that include actionable findings. However, the application of recent deep learning techniques to radiology reports, which can improve detection performance, has not been thoroughly examined. Moreover, the free text that clinicians enter in the ordering form (order information) has seldom been used to identify actionable reports. This study aims to evaluate the benefits of two new approaches: (1) bidirectional encoder representations from transformers (BERT), a recent deep learning architecture in NLP, and (2) using order information in addition to radiology reports. Methods We performed a binary classification to distinguish actionable reports (i.e., radiology reports tagged as actionable in actual radiological practice) from non-actionable ones (those without an actionable tag). A total of 90,923 Japanese radiology reports from our hospital were used, of which 788 (0.87%) were actionable. We evaluated four methods: statistical machine learning with logistic regression (LR) and with a gradient-boosting decision tree (GBDT), and deep learning with a bidirectional long short-term memory (LSTM) model and a publicly available Japanese BERT model. Each method was used with two different inputs, radiology reports alone and pairs of order information and radiology reports. Thus, eight experiments were conducted to examine the performance. Results Without order information, BERT achieved the highest area under the precision-recall curve (AUPRC) of 0.5138, which showed a statistically significant improvement over LR, GBDT, and LSTM, and the highest area under the receiver operating characteristic curve (AUROC) of 0.9516. Simply coupling the order information with the radiology reports slightly increased the AUPRC of BERT but did not lead to a statistically significant improvement. This may be due to the complexity of clinical decisions made by radiologists. Conclusions BERT appears to be useful for detecting actionable reports. More sophisticated methods are required to use order information effectively.
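One concrete way to couple order information with the report body, as this study does, is to feed the two texts to BERT as a sentence pair. The sketch below is a reconstruction under stated assumptions, not the authors' code: the checkpoint is a generic multilingual placeholder rather than their Japanese BERT, the examples are invented, and the classification head is untrained (fine-tuning would be required in practice).

```python
# Sketch: (order information, report) as a BERT sentence pair for binary
# "actionable report" scoring, with AUPRC as the evaluation metric.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from sklearn.metrics import average_precision_score

checkpoint = "bert-base-multilingual-cased"   # placeholder, not the paper's model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
model.eval()

# Invented (order information, report, label) examples; 1 = actionable.
examples = [
    ("r/o pneumonia", "New 2 cm mass in the left lower lobe; urgent referral advised.", 1),
    ("routine follow-up", "Stable post-surgical changes. No new findings.", 0),
]

scores, labels = [], []
for order_info, report, label in examples:
    enc = tokenizer(order_info, report, truncation=True, return_tensors="pt")
    with torch.no_grad():
        p_actionable = torch.softmax(model(**enc).logits, dim=-1)[0, 1].item()
    scores.append(p_actionable)
    labels.append(label)

# AUPRC suits this highly imbalanced task; an untrained head scores near chance.
print("AUPRC on toy examples:", average_precision_score(labels, scores))
```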
6
Casey A, Davidson E, Poon M, Dong H, Duma D, Grivas A, Grover C, Suárez-Paniagua V, Tobin R, Whiteley W, Wu H, Alex B. A systematic review of natural language processing applied to radiology reports. BMC Med Inform Decis Mak 2021;21:179. [PMID: 34082729; PMCID: PMC8176715; DOI: 10.1186/s12911-021-01533-7]
Abstract
BACKGROUND Natural language processing (NLP) has a significant role in advancing healthcare and has been found to be key in extracting structured information from radiology reports. Understanding recent developments in NLP applications to radiology is important, but recent reviews on this topic are limited. This study systematically assesses and quantifies recent literature in NLP applied to radiology reports. METHODS We conduct an automated literature search yielding 4836 results, using automated filtering, metadata-enriching steps, and a citation search combined with manual review. Our analysis is based on 21 variables including radiology characteristics, NLP methodology, performance, study, and clinical application characteristics. RESULTS We present a comprehensive analysis of the 164 publications retrieved, with publications in 2019 almost triple those in 2015. Each publication is categorised into one of 6 clinical application categories. Deep learning use increases over the period, but conventional machine learning approaches are still prevalent. Deep learning remains challenged when data are scarce, and there is little evidence of adoption into clinical practice. Despite 17% of studies reporting F1 scores greater than 0.85, it is hard to comparatively evaluate these approaches given that most of them use different datasets. Only 14 studies made their data available and 15 made their code available, with 10 externally validating their results. CONCLUSIONS Automated understanding of the clinical narratives in radiology reports has the potential to enhance the healthcare process, and we show that research in this field continues to grow. Reproducibility and explainability of models are important if the domain is to move applications into clinical use. More could be done to share code enabling validation of methods on different institutional data and to reduce heterogeneity in reporting of study properties allowing inter-study comparisons. Our results have significance for researchers in the field, providing a systematic synthesis of existing work to build on, identify gaps and opportunities for collaboration, and avoid duplication.
Affiliation(s)
- Arlene Casey
- School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
| | - Emma Davidson
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
| | - Michael Poon
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
| | - Hang Dong
- Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland
- Health Data Research UK, London, UK
| | - Daniel Duma
- School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
| | - Andreas Grivas
- Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
| | - Claire Grover
- Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
| | - Víctor Suárez-Paniagua
- Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland
- Health Data Research UK, London, UK
| | - Richard Tobin
- Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
| | - William Whiteley
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - Honghan Wu
- Health Data Research UK, London, UK
- Institute of Health Informatics, University College London, London, UK
| | - Beatrice Alex
- School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
- Edinburgh Futures Institute, University of Edinburgh, Edinburgh, Scotland
| |
7
Senders JT, Karhade AV, Cote DJ, Mehrtash A, Lamba N, DiRisio A, Muskens IS, Gormley WB, Smith TR, Broekman MLD, Arnaout O. Natural Language Processing for Automated Quantification of Brain Metastases Reported in Free-Text Radiology Reports. JCO Clin Cancer Inform 2020;3:1-9. [PMID: 31002562; DOI: 10.1200/cci.18.00138]
Abstract
PURPOSE Although patient-generated health data are increasing exponentially in volume, their use is impeded because most data come in an unstructured format, namely as free-text clinical reports. A variety of natural language processing (NLP) methods, ranging from statistical to deep learning-based models, have emerged to automate the processing of free text; however, the optimal approach for medical text analysis remains to be determined. The aim of this study was to provide a head-to-head comparison of novel NLP techniques and inform future studies about their utility for automated medical text analysis. PATIENTS AND METHODS Magnetic resonance imaging reports of patients with brain metastases treated in two tertiary centers were retrieved and manually annotated using a binary classification (single metastasis v two or more metastases). Multiple bag-of-words and sequence-based NLP models were developed and compared after randomly splitting the annotated reports into training and test sets in an 80:20 ratio. RESULTS A total of 1,479 radiology reports of patients diagnosed with brain metastases were retrieved. The least absolute shrinkage and selection operator (LASSO) regression model demonstrated the best overall performance on the hold-out test set with an area under the receiver operating characteristic curve of 0.92 (95% CI, 0.89 to 0.94), accuracy of 83% (95% CI, 80% to 87%), calibration intercept of -0.06 (95% CI, -0.14 to 0.01), and calibration slope of 1.06 (95% CI, 0.95 to 1.17). CONCLUSION Among various NLP techniques, the bag-of-words approach combined with a LASSO regression model demonstrated the best overall performance in extracting binary outcomes from free-text clinical reports. This study provides a framework for the development of machine learning-based NLP models as well as a clinical vignette of patients diagnosed with brain metastases.
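The winning recipe here is bag-of-words features with an L1-penalized (LASSO-style) logistic regression and an 80:20 split. The sketch below illustrates that recipe with scikit-learn; the reports, labels, and hyperparameters are invented, and the real study trained on 1,479 annotated MRI reports.

```python
# Illustrative sketch of a bag-of-words + L1-penalized logistic regression
# classifier (single vs. multiple metastases), under assumed toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline

reports = [
    "single enhancing lesion in the right frontal lobe consistent with metastasis",
    "numerous enhancing lesions scattered throughout both hemispheres",
    "solitary cerebellar metastasis, no other lesions identified",
    "multiple new metastases in the left parietal and occipital lobes",
] * 10                      # repeat so the toy 80:20 split has enough samples
labels = [0, 1, 0, 1] * 10  # 0 = single metastasis, 1 = two or more

X_train, X_test, y_train, y_test = train_test_split(
    reports, labels, test_size=0.2, random_state=42, stratify=labels
)

lasso_bow = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
)
lasso_bow.fit(X_train, y_train)
print("AUROC:", roc_auc_score(y_test, lasso_bow.predict_proba(X_test)[:, 1]))
```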
Affiliation(s)
- Joeky T Senders, Aditya V Karhade, David J Cote, Alireza Mehrtash, Nayan Lamba, Aislyn DiRisio, Ivo S Muskens, Timothy R Smith, and Omar Arnaout, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
8
Sanchez R, Bailey G, Kaboli PJ, Zeliadt SB, Lang JA, Hoffman RM. Applying a Text-Search Algorithm to Radiology Reports Can Find More Patients With Pulmonary Nodules Than Radiology Coding Alone. Fed Pract 2020;37:S32-S37. [PMID: 32952385; PMCID: PMC7497875]
Abstract
INTRODUCTION Chest imaging often incidentally finds indeterminate nodules that need to be monitored to ensure early detection of lung cancers. Health care systems need effective approaches for identifying these lung nodules. We compared the diagnostic performance of 2 approaches for identifying patients with lung nodules on imaging studies (chest/abdomen): (1) relying on radiologists to code imaging studies with lung nodules; and (2) applying a text search algorithm to identify references to lung nodules in radiology reports. METHODS We assessed all radiology studies performed between January 1, 2016 and November 30, 2016 in a single Veterans Health Administration hospital. We first identified imaging reports with a diagnostic code for a pulmonary nodule. We then applied a text search algorithm to identify imaging reports with key words associated with lung nodules. We reviewed medical records for all patients with a suspicious radiology report based on either search strategy to confirm the presence of a lung nodule. We calculated the yield and the positive predictive value (PPV) of each search strategy for finding pulmonary nodules. RESULTS We identified 12,983 imaging studies with a potential lung nodule. Chart review confirmed 8,516 imaging studies with lung nodules, representing 2,912 unique patients. The text search algorithm identified all the patients with lung nodules identified by the radiology coding (n = 1,251) as well as an additional 1,661 patients. The PPV of the text search was 72% (2,912/4,071) and the PPV of the radiology code was 92% (1,251/1,363). Among the patients with nodules missed by radiology coding but identified by the text search algorithm, 130 had lung nodules > 8 mm in diameter. CONCLUSIONS The text search algorithm can identify additional patients with lung nodules compared to the radiology coding; however, this strategy requires substantial clinical review time to confirm nodules. Health care systems adopting nodule-tracking approaches should recognize that relying only on radiology coding might miss clinically important nodules.
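The text-search strategy described here amounts to flagging reports whose narrative mentions a lung nodule and then confirming by chart review. The sketch below is only a rough reconstruction with invented keywords (the study's actual search terms are not given); it also shows why the flagged set still needs manual review, since a negated mention is flagged as a false positive.

```python
# Rough sketch: keyword/regex search over report text, then PPV against review.
import re

NODULE_PATTERN = re.compile(
    r"\b(pulmonary|lung)\s+nodul\w*|\bground[- ]glass\s+nodul\w*", re.IGNORECASE
)

def text_search_flags(report_text: str) -> bool:
    """Return True if the report text mentions a lung nodule (invented keyword set)."""
    return bool(NODULE_PATTERN.search(report_text))

# (report text, nodule confirmed on chart review) -- toy data
reports = [
    ("4 mm pulmonary nodule in the right middle lobe, follow-up advised", True),
    ("No pulmonary nodules or masses identified", False),   # negation: false positive
    ("Ground-glass nodule measuring 9 mm", True),
    ("Clear lungs, no acute findings", False),
]

flagged = [(text, truth) for text, truth in reports if text_search_flags(text)]
true_pos = sum(truth for _, truth in flagged)
ppv = true_pos / len(flagged) if flagged else float("nan")
print(f"flagged={len(flagged)}, PPV={ppv:.2f}")
```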
Affiliation(s)
- Rolando Sanchez, George Bailey, Peter J Kaboli, Steven B Zeliadt, Julie A Lang, and Richard M Hoffman
- University of Iowa Carver College of Medicine, Iowa City; Center for Access and Delivery Research and Evaluation (CADRE), Iowa City VA Healthcare System; Seattle-Denver Center of Innovation for Veteran-Centered and Value-Driven Care, VA Puget Sound Health Care System and University of Washington School of Public Health, Seattle
9
Rasmussen LV, Brandt PS, Jiang G, Kiefer RC, Pacheco JA, Adekkanattu P, Ancker JS, Wang F, Xu Z, Pathak J, Luo Y. Considerations for Improving the Portability of Electronic Health Record-Based Phenotype Algorithms. AMIA Annu Symp Proc 2020;2019:755-764. [PMID: 32308871; PMCID: PMC7153055]
Abstract
With the increased adoption of electronic health records, data collected for routine clinical care is used for health outcomes and population sciences research, including the identification of phenotypes. In recent years, research networks, such as eMERGE, OHDSI and PCORnet, have been able to increase statistical power and population diversity by combining patient cohorts. These networks share phenotype algorithms that are executed at each participating site. Here we observe experiences with phenotype algorithm portability across seven research networks and propose a generalizable framework for phenotype algorithm portability. Several strategies exist to increase the portability of phenotype algorithms, reducing the implementation effort needed by each site. These include using a common data model, standardized representation of the phenotype algorithm logic, and technical solutions to facilitate federated execution of queries. Portability is achieved by tradeoffs across three domains: Data, Authoring and Implementation, and multiple approaches were observed in representing portable phenotype algorithms. Our proposed framework will help guide future research in operationalizing phenotype algorithm portability at scale.
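To make the portability idea concrete, a phenotype can be shipped as a declarative definition (code sets plus logic) and executed locally against a common-data-model-style table. The sketch below is purely conceptual, not from the paper; the phenotype name, code set, and loosely OMOP-like column names are all invented.

```python
# Conceptual sketch: a shareable phenotype definition executed against a
# common-data-model-style table (illustrative codes and column names).
import pandas as pd

PHENOTYPE = {
    "name": "incidental_pulmonary_nodule",
    "include_condition_codes": {"D2.0", "R91.1"},   # invented code set
    "min_occurrences": 1,
}

# Each site materializes the same minimal table shape from its local EHR.
condition_occurrence = pd.DataFrame({
    "person_id":      [1, 1, 2, 3],
    "condition_code": ["R91.1", "E11.9", "D2.0", "I10"],
})

def run_phenotype(conditions: pd.DataFrame, phenotype: dict) -> set:
    hits = conditions[conditions["condition_code"].isin(phenotype["include_condition_codes"])]
    counts = hits.groupby("person_id").size()
    return set(counts[counts >= phenotype["min_occurrences"]].index)

print(run_phenotype(condition_occurrence, PHENOTYPE))   # {1, 2}
```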
Affiliation(s)
- Fei Wang, Northwestern University, Chicago, IL
- Yuan Luo, Northwestern University, Chicago, IL
10
Wadia R, Akgun K, Brandt C, Fenton BT, Levin W, Marple AH, Garla V, Rose MG, Taddei T, Taylor C. Comparison of Natural Language Processing and Manual Coding for the Identification of Cross-Sectional Imaging Reports Suspicious for Lung Cancer. JCO Clin Cancer Inform 2019;2:1-7. [PMID: 30652545; PMCID: PMC6873962; DOI: 10.1200/cci.17.00069]
Abstract
Purpose To compare the accuracy and reliability of a natural language processing (NLP) algorithm with manual coding by radiologists, and the combination of the two methods, for the identification of patients whose computed tomography (CT) reports raised the concern for lung cancer. Methods An NLP algorithm was developed using Clinical Text Analysis and Knowledge Extraction System (cTAKES) with the Yale cTAKES Extensions and trained to differentiate between language indicating benign lesions and lesions concerning for lung cancer. A random sample of 450 chest CT reports performed at Veterans Affairs Connecticut Healthcare System between January 2014 and July 2015 was selected. A reference standard was created by the manual review of reports to determine if the text stated that follow-up was needed for concern for cancer. The NLP algorithm was applied to all reports and compared with case identification using the manual coding by the radiologists. Results A total of 450 reports representing 428 patients were analyzed. NLP had higher sensitivity and lower specificity than manual coding (77.3% v 51.5% and 72.5% v 82.5%, respectively). NLP and manual coding had similar positive predictive values (88.4% v 88.9%), and NLP had a higher negative predictive value than manual coding (54% v 38.5%). When NLP and manual coding were combined, sensitivity increased to 92.3%, with a decrease in specificity to 62.85%. Combined NLP and manual coding had a positive predictive value of 87.0% and a negative predictive value of 75.2%. Conclusion Our NLP algorithm was more sensitive than manual coding of CT chest reports for the identification of patients who required follow-up for suspicion of lung cancer. The combination of NLP and manual coding is a sensitive way to identify patients who need further workup for lung cancer.
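The sensitivity gain reported here comes from flagging a report if either the NLP algorithm or manual coding flags it (a logical OR), at the cost of specificity. The sketch below illustrates that evaluation logic with invented flags, not the study's data, and recomputes sensitivity, specificity, PPV, and NPV for each strategy and the combination.

```python
# Sketch of the evaluation logic: combine NLP and manual coding with a logical
# OR, then recompute the standard confusion-matrix metrics (toy flags only).
def confusion_metrics(pred, truth):
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum((not p) and t for p, t in zip(pred, truth))
    tn = sum((not p) and (not t) for p, t in zip(pred, truth))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

truth       = [1, 1, 1, 1, 0, 0, 0, 0]  # concern for lung cancer on reference review
nlp_flag    = [1, 1, 1, 0, 1, 0, 0, 0]  # invented NLP output
manual_flag = [1, 1, 0, 0, 0, 1, 0, 0]  # invented radiologist coding
combined    = [n or m for n, m in zip(nlp_flag, manual_flag)]

for name, pred in [("NLP", nlp_flag), ("manual", manual_flag), ("combined", combined)]:
    print(name, confusion_metrics(pred, truth))
```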
Affiliation(s)
- Roxanne Wadia, Kathleen Akgun, Cynthia Brandt, Brenda T Fenton, Andrew H Marple, Vijay Garla, Michal G Rose, and Tamar Taddei, Yale University School of Medicine, New Haven, CT
- Roxanne Wadia, Kathleen Akgun, Cynthia Brandt, Brenda T Fenton, Woody Levin, Michal G Rose, Tamar Taddei, and Caroline Taylor, Veterans Affairs Connecticut Healthcare System, West Haven, CT
11
Kang SK, Garry K, Chung R, Moore WH, Iturrate E, Swartz JL, Kim DC, Horwitz LI, Blecker S. Natural Language Processing for Identification of Incidental Pulmonary Nodules in Radiology Reports. J Am Coll Radiol 2019;16:1587-1594. [PMID: 31132331; DOI: 10.1016/j.jacr.2019.04.026]
Abstract
PURPOSE To develop natural language processing (NLP) to identify incidental lung nodules (ILNs) in radiology reports for assessment of management recommendations. METHODS AND MATERIALS We searched the electronic health records for patients who underwent chest CT during 2014 and 2017, before and after implementation of a department-wide dictation macro of the Fleischner Society recommendations. We randomly selected 950 unstructured chest CT reports and reviewed manually for ILNs. An NLP tool was trained and validated against the manually reviewed set, for the task of automated detection of ILNs with exclusion of previously known or definitively benign nodules. For ILNs found in the training and validation sets, we assessed whether reported management recommendations agreed with Fleischner Society guidelines. The guideline concordance of management recommendations was compared between 2014 and 2017. RESULTS The NLP tool identified ILNs with sensitivity and specificity of 91.1% and 82.2%, respectively, in the validation set. Positive and negative predictive values were 59.7% and 97.0%. In reports of ILNs in the training and validation sets before versus after introduction of a Fleischner reporting macro, there was no difference in the proportion of reports with ILNs (108 of 500 [21.6%] versus 101 of 450 [22.4%]; P = .8), or in the proportion of reports with ILNs containing follow-up recommendations (75 of 108 [69.4%] versus 80 of 101 [79.2%]; P = .2). Rates of recommendation guideline concordance were not significantly different before and after implementation of the standardized macro (52 of 75 [69.3%] versus 60 of 80 [75.0%]; P = .43). CONCLUSION NLP reliably automates identification of ILNs in unstructured reports, pertinent to quality improvement efforts for ILN management.
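The before/after comparisons reported here are tests of proportions. As a hedged reconstruction (not the authors' code), the guideline-concordance comparison of 52/75 versus 60/80 can be reproduced with a chi-square test:

```python
# Sketch of the concordance comparison using the counts reported above.
from scipy.stats import chi2_contingency

table = [
    [52, 75 - 52],   # pre-macro:  concordant, non-concordant
    [60, 80 - 60],   # post-macro: concordant, non-concordant
]
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2={chi2:.2f}, P={p:.2f}")   # P comes out near the .43 reported above
```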
Affiliation(s)
- Stella K Kang, Department of Radiology, NYU Langone Health, New York, New York; Department of Population Health, NYU Langone Health, New York, New York
- Kira Garry, Department of Population Health, NYU Langone Health, New York, New York
- Ryan Chung, Department of Radiology, NYU Langone Health, New York, New York
- William H Moore, Department of Radiology, NYU Langone Health, New York, New York
- Jordan L Swartz, Department of Emergency Medicine, NYU Langone Health, New York, New York
- Danny C Kim, Department of Radiology, NYU Langone Health, New York, New York
- Leora I Horwitz, Department of Population Health, NYU Langone Health, New York, New York; Department of Medicine, NYU Langone Health, New York, New York
- Saul Blecker, Department of Population Health, NYU Langone Health, New York, New York; Department of Medicine, NYU Langone Health, New York, New York
12
Monitoring Lung Cancer Screening Use and Outcomes at Four Cancer Research Network Sites. Ann Am Thorac Soc 2018;14:1827-1835. [PMID: 28683215; DOI: 10.1513/annalsats.201703-237oc]
Abstract
RATIONALE Lung cancer screening registries can monitor screening outcomes and improve quality of care. OBJECTIVES To describe nascent lung cancer screening programs and share efficient data collection approaches for mandatory registry reporting in four integrated health care systems of the National Cancer Institute-funded Cancer Research Network. METHODS We documented the distinctive characteristics of lung cancer screening programs, and we provide examples of strategies to facilitate data collection and describe early challenges and possible solutions. In addition, we report preliminary data on use and outcomes of screening with low-dose computed tomography at each of the participating sites. RESULTS Programs varied in approaches to confirming patient eligibility, ordering screening low-dose computed tomographic scans, and coordinating follow-up care. Most data elements were collected from structured fields in electronic health records, but sites also made use of standardized order templates, local procedure codes, identifiable hashtags in radiology reports, and natural language processing algorithms. Common challenges included incomplete documentation of tobacco smoking history, difficulty distinguishing between scans performed for screening versus diagnosis or surveillance, and variable adherence with use of standardized templates. Adherence with eligibility criteria as well as the accuracy and completeness of data collection appeared to depend at least partly on availability of personnel and other resources to support the successful implementation of screening. CONCLUSIONS To maximize the effectiveness of lung cancer screening, minimize the burden of data collection, and facilitate research and quality improvement, clinical workflow and information technology should be purposefully designed to ensure that patients meet eligibility criteria and receive appropriate follow-up testing.
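One data-collection strategy mentioned above is embedding an identifiable hashtag in screening low-dose CT report templates so registry scripts can separate screening exams from diagnostic or surveillance CTs. A tiny sketch of that idea follows; the tag and report text are invented, not drawn from any of the participating sites.

```python
# Toy sketch of the "identifiable hashtag" strategy (the tag is hypothetical).
import re

SCREENING_TAG = re.compile(r"#LCS[-_ ]?SCREEN", re.IGNORECASE)

reports = [
    "Low-dose CT chest for lung cancer screening. #LCS-SCREEN Lung-RADS 2.",
    "CT chest with contrast for staging of known adenocarcinoma.",
]
screening_reports = [r for r in reports if SCREENING_TAG.search(r)]
print(len(screening_reports), "screening exam(s) captured for the registry")
```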