Natural Language Processing for the Identification of Incidental Lung Nodules in Computed Tomography Reports: A Quality Control Tool.
JCO Glob Oncol 2023;
9:e2300191. [PMID:
37769221 PMCID:
PMC10581645 DOI:
10.1200/go.23.00191]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2023] [Revised: 07/09/2023] [Accepted: 08/22/2023] [Indexed: 09/30/2023] Open
Abstract
PURPOSE
To evaluate the diagnostic performance of a natural language processing (NLP) model in detecting incidental lung nodules (ILNs) in unstructured chest computed tomography (CT) reports.
METHODS
All unstructured consecutive reports of chest CT scans performed at a tertiary hospital between 2020 and 2021 were retrospectively reviewed (n = 21,542) to train the NLP tool. Internal validation was performed using reference readings by two radiologists of both CT scans and reports, using a different external cohort of 300 chest CT scans. Second, external validation was performed in a cohort of all random unstructured chest CT reports from 57 different hospitals conducted in May 2022. A review by the same thoracic radiologists was used as the gold standard. The sensitivity, specificity, and accuracy were calculated.
RESULTS
Of 21,542 CT reports, 484 mentioned at least one ILN (mean age, 71 ± 17.6 [standard deviation] years; women, 52%) and were included in the training set. In the internal validation (n = 300), the NLP tool detected ILN with a sensitivity of 100.0% (95% CI, 97.6 to 100.0), a specificity of 95.9% (95% CI, 91.3 to 98.5), and an accuracy of 98.0% (95% CI, 95.7 to 99.3). In the external validation (n = 977), the NLP tool yielded a sensitivity of 98.4% (95% CI, 94.5 to 99.8), a specificity of 98.6% (95% CI, 97.5 to 99.3), and an accuracy of 98.6% (95% CI, 97.6 to 99.2). Twelve months after the initial reports, 8 (8.60%) patients had a final diagnosis of lung cancer, among which 2 (2.15%) would have been lost to follow-up without the NLP tool.
CONCLUSION
NLP can be used to identify ILNs in unstructured reports with high accuracy, allowing a timely recall of patients and a potential diagnosis of early-stage lung cancer that might have been lost to follow-up.
Collapse