1
Wang TW, Hong JS, Chiu HY, Chao HS, Chen YM, Wu YT. Standalone deep learning versus experts for diagnosis lung cancer on chest computed tomography: a systematic review. Eur Radiol 2024; 34:7397-7407. [PMID: 38777902 PMCID: PMC11519296 DOI: 10.1007/s00330-024-10804-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Revised: 03/10/2024] [Accepted: 04/01/2024] [Indexed: 05/25/2024]
Abstract
PURPOSE To compare the diagnostic performance of standalone deep learning (DL) algorithms and human experts in lung cancer detection on chest computed tomography (CT) scans.
MATERIALS AND METHODS This study searched for studies on PubMed, Embase, and Web of Science from their inception until November 2023. We focused on adult lung cancer patients and compared the efficacy of DL algorithms and expert radiologists in disease diagnosis on CT scans. Quality assessment was performed using QUADAS-2, QUADAS-C, and CLAIM. Bivariate random-effects and subgroup analyses were performed for tasks (malignancy classification vs invasiveness classification), imaging modalities (CT vs low-dose CT [LDCT] vs high-resolution CT), study region, software used, and publication year.
RESULTS We included 20 studies on various aspects of lung cancer diagnosis on CT scans. Quantitatively, DL algorithms exhibited superior sensitivity (82%) and specificity (75%) compared to human experts (sensitivity 81%, specificity 69%). However, the difference in specificity was statistically significant, whereas the difference in sensitivity was not statistically significant. The DL algorithms' performance varied across different imaging modalities and tasks, demonstrating the need for tailored optimization of DL algorithms. Notably, DL algorithms matched experts in sensitivity on standard CT, surpassing them in specificity, but showed higher sensitivity with lower specificity on LDCT scans.
CONCLUSION DL algorithms demonstrated improved accuracy over human readers in malignancy and invasiveness classification on CT scans. However, their performance varies by imaging modality, underlining the importance of continued research to fully assess DL algorithms' diagnostic effectiveness in lung cancer.
CLINICAL RELEVANCE STATEMENT DL algorithms have the potential to refine lung cancer diagnosis on CT, matching human sensitivity and surpassing in specificity. These findings call for further DL optimization across imaging modalities, aiming to advance clinical diagnostics and patient outcomes.
KEY POINTS Lung cancer diagnosis by CT is challenging and can be improved with AI integration. DL shows higher accuracy in lung cancer detection on CT than human experts. Enhanced DL accuracy could lead to improved lung cancer diagnosis and outcomes.
Affiliation(s)
- Ting-Wei Wang
  - Institute of Biophotonics, National Yang-Ming Chiao Tung University, Taipei, Taiwan
  - School of Medicine, National Yang-Ming Chiao Tung University, Taipei, Taiwan
- Jia-Sheng Hong
  - Institute of Biophotonics, National Yang-Ming Chiao Tung University, Taipei, Taiwan
- Hwa-Yen Chiu
  - Institute of Biophotonics, National Yang-Ming Chiao Tung University, Taipei, Taiwan
  - School of Medicine, National Yang-Ming Chiao Tung University, Taipei, Taiwan
  - Department of Chest Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
- Heng-Sheng Chao
  - Department of Chest Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
- Yuh-Min Chen
  - School of Medicine, National Yang-Ming Chiao Tung University, Taipei, Taiwan
  - Department of Chest Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
- Yu-Te Wu
  - Institute of Biophotonics, National Yang-Ming Chiao Tung University, Taipei, Taiwan
2
Liu J, Qi L, Wang Y, Li F, Chen J, Cheng S, Zhou Z, Yu Y, Wang J. Diagnostic performance of a deep learning-based method in differentiating malignant from benign subcentimeter (≤10 mm) solid pulmonary nodules. J Thorac Dis 2023; 15:5475-5484. [PMID: 37969262 PMCID: PMC10636433 DOI: 10.21037/jtd-23-985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 09/08/2023] [Indexed: 11/17/2023]
Abstract
Background This study assessed the diagnostic performance of a deep learning (DL)-based model for differentiating malignant subcentimeter (≤10 mm) solid pulmonary nodules (SSPNs) from benign ones in computed tomography (CT) images compared against radiologists with 10 and 15 years of experience in thoracic imaging (medium-senior seniority).
Methods Overall, 200 SSPNs (100 benign and 100 malignant) were retrospectively collected. Malignancy was confirmed by pathology, and benignity was confirmed by follow-up or pathology. CT images were fed into the DL model to obtain the probability of malignancy (range, 0-100%) for each nodule. According to the diagnostic results, enrolled nodules were classified into benign, malignant, or indeterminate. The accuracy and diagnostic composition of the model were compared with those of the radiologists using the McNemar-Bowker test. Enrolled nodules were divided into 3-6-, 6-8-, and 8-10-mm subgroups. For each subgroup, the diagnostic results of the model were compared with those of the radiologists.
Results The accuracy of the DL model, in differentiating malignant and benign SSPNs, was significantly higher than that of the radiologists (71.5% vs. 38.5%, P<0.001). The DL model reported more benign or malignant deterministic results and fewer indeterminate results. In subgroup analysis of nodule size, the DL model also yielded higher performance in comparison with that of the radiologists, providing fewer indeterminate results. The accuracy of the two methods in the 3-6-, 6-8-, and 8-10-mm subgroups was 75.5% vs. 28.3% (P<0.001), 62.0% vs. 28.2% (P<0.001), and 77.6% vs. 55.3% (P=0.001), respectively, and the indeterminate results were 3.8% vs. 66.0%, 8.5% vs. 66.2%, and 2.6% vs. 35.5% (all P<0.001), respectively.
Conclusions The DL-based method yielded higher performance in comparison with that of the radiologists in differentiating malignant and benign SSPNs. This DL model may reduce uncertainty in diagnosis and improve diagnostic accuracy, especially for SSPNs smaller than 8 mm.
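The subgroup accuracies reported in this abstract can be checked against the overall figures with a short calculation. The sketch below is illustrative only: the subgroup sizes (53, 71, and 76 nodules) are not stated in the abstract; they are assumed values chosen because they sum to the 200 enrolled nodules and are consistent with the reported percentages. The names `subgroups` and `pooled_accuracy` are hypothetical.

```python
# Subgroup sizes ("n") are ASSUMED, not taken from the abstract; they are
# inferred so that they sum to 200 and match the reported percentages.
subgroups = {
    "3-6 mm": {"n": 53, "dl_acc": 0.755, "rad_acc": 0.283},
    "6-8 mm": {"n": 71, "dl_acc": 0.620, "rad_acc": 0.282},
    "8-10 mm": {"n": 76, "dl_acc": 0.776, "rad_acc": 0.553},
}


def pooled_accuracy(groups, key):
    """Size-weighted mean of the per-subgroup accuracies stored under `key`."""
    total = sum(g["n"] for g in groups.values())
    correct = sum(g["n"] * g[key] for g in groups.values())
    return correct / total


dl_overall = pooled_accuracy(subgroups, "dl_acc")
rad_overall = pooled_accuracy(subgroups, "rad_acc")
print(f"DL model: {dl_overall:.1%}, radiologists: {rad_overall:.1%}")
```

Under these assumed sizes, the pooled values round to the overall accuracies given in the Results (71.5% and 38.5%).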
Affiliation(s)
- Jianing Liu
  - Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Linlin Qi
  - Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yawen Wang
  - Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Fenglan Li
  - Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Jiaqi Chen
  - Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Sainan Cheng
  - Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zhen Zhou
  - Beijing Deepwise & League of PhD Technology Co., Ltd., Beijing, China
- Yizhou Yu
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
- Jianwei Wang
  - Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
3
Ewals LJS, van der Wulp K, van den Borne BEEM, Pluyter JR, Jacobs I, Mavroeidis D, van der Sommen F, Nederend J. The Effects of Artificial Intelligence Assistance on the Radiologists' Assessment of Lung Nodules on CT Scans: A Systematic Review. J Clin Med 2023; 12:jcm12103536. [PMID: 37240643 DOI: 10.3390/jcm12103536] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/15/2023] [Revised: 04/19/2023] [Accepted: 05/16/2023] [Indexed: 05/28/2023] Open
Abstract
To reduce the number of missed or misdiagnosed lung nodules on CT scans by radiologists, many Artificial Intelligence (AI) algorithms have been developed. Some algorithms are currently being implemented in clinical practice, but the question is whether radiologists and patients really benefit from the use of these novel tools. This study aimed to review how AI assistance for lung nodule assessment on CT scans affects the performance of radiologists. We searched for studies that evaluated radiologists' performance in the detection or malignancy prediction of lung nodules with and without AI assistance. For detection, radiologists with AI assistance achieved a higher sensitivity and AUC, while the specificity was slightly lower. For malignancy prediction, radiologists with AI assistance generally achieved a higher sensitivity, specificity, and AUC. The radiologists' workflows for using the AI assistance were often described only in limited detail in the papers. As recent studies showed improved performance of radiologists with AI assistance, AI assistance for lung nodule assessment holds great promise. For AI tools to add value to lung nodule assessment in clinical practice, more research is required on the clinical validation of AI tools, their impact on follow-up recommendations, and ways of using AI tools.
Affiliation(s)
- Lotte J S Ewals
  - Department of Radiology, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands
  - Department of Electrical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Kasper van der Wulp
  - Department of Radiology, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands
- Ben E E M van den Borne
  - Department of Pulmonology, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands
- Jon R Pluyter
  - Department of Experience Design, Royal Philips, 5656 AE Eindhoven, The Netherlands
- Igor Jacobs
  - Department of Hospital Services and Informatics, Philips Research, 5656 AE Eindhoven, The Netherlands
- Dimitrios Mavroeidis
  - Department of Data Science, Philips Research, 5656 AE Eindhoven, The Netherlands
- Fons van der Sommen
  - Department of Electrical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Joost Nederend
  - Department of Radiology, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands
4
Liu Z, Ran H, Yu X, Wu Q, Zhang C. Immunocyte count combined with CT features for distinguishing pulmonary tuberculoma from malignancy among non-calcified solitary pulmonary solid nodules. J Thorac Dis 2023; 15:386-398. [PMID: 36910060 PMCID: PMC9992615 DOI: 10.21037/jtd-22-1024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 12/02/2022] [Indexed: 02/04/2023]
Abstract
Background Tuberculoma is the most common type of surgically removed benign solid solitary pulmonary nodule (SPN) and can lead to a high risk of misdiagnoses for clinicians. This study aimed to discuss the value of the immunocyte count combined with computed tomography (CT) features in distinguishing pulmonary tuberculoma from malignancy among non-calcified solid SPNs.
Methods Forty-eight patients with pulmonary tuberculoma and 52 patients with lung cancer were retrospectively included in our study. Univariate and multivariate analyses were conducted to screen the independent predictors. Receiver operating characteristic (ROC) analysis was performed to investigate the validity of the predictive model.
Results The univariate and multivariate analyses revealed that a coarse margin, vacuole, lobulation, pleural indentation, cluster of differentiation (CD)3+ T-lymphocyte count, and CD4+ T-lymphocyte count were independent predictors for distinguishing pulmonary tuberculoma from malignancy. The sensitivity, specificity, accuracy, and the area under the ROC curve of the model comprising the CD3+ T-lymphocyte count were 79.2%, 75%, 74.5%, and 0.845 [95% confidence interval (CI), 0.759-0.910], respectively, and those of the model involving the CD4+ T-lymphocyte count were 77.1%, 78.8%, 77.1%, and 0.857 (95% CI, 0.773-0.919), respectively.
Conclusions Immunocyte count combined with CT features is efficient in distinguishing pulmonary tuberculoma from malignancy among non-calcified solid SPNs and has applicable clinical value.
Affiliation(s)
- Zihao Liu
  - Department of Cardiothoracic Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Haoyu Ran
  - Department of Cardiothoracic Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Xiran Yu
  - Department of Cardiothoracic Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Qingchen Wu
  - Department of Cardiothoracic Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Cheng Zhang
  - Department of Cardiothoracic Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
5
Li X, Zhang G, Gao S, Xue Q, He J. Knowledge mapping visualization of the pulmonary ground-glass opacity published in the web of science. Front Oncol 2022; 12:1075350. [PMID: 36620580 PMCID: PMC9815441 DOI: 10.3389/fonc.2022.1075350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022] Open
Abstract
Objectives With low-dose computed tomography (CT) lung cancer screening, many studies with an increasing number of patients with ground-glass opacity (GGO) have been published. Hence, the present study aimed to analyze the published studies on GGO using bibliometric analysis. The findings could provide a basis for future research in GGO and for understanding past advances and trends in the field.
Methods Published studies on GGO were obtained from the Web of Science Core Collection. A bibliometric analysis was conducted using the R package and VOSviewer for countries, institutions, journals, authors, keywords, and articles relevant to GGO. In addition, a bibliometric map was created to visualize the relationship.
Results The number of publications on GGO has been increasing since 2011. China is ranked as the most prolific country; however, Japan has the highest number of citations for its published articles. Seoul National University and Professor Jin Mo Goo from Korea had the highest number of publications. Most of the top 10 journals specialized in the field of lung diseases. Radiology is a comprehensive journal with a greater number of citations and a higher H-index than the other journals. Using bibliometric analysis, research topics on "prognosis and diagnosis," "artificial intelligence," "treatment," "preoperative positioning and minimally invasive surgery," and "pathology of GGO" were identified. Artificial intelligence diagnosis and minimally invasive treatment may be the future of GGO. In addition, most of the top 10 publications in this field were guidelines for lung cancer and pulmonary nodules.
Conclusions The publication volume of GGO has increased rapidly. The top three countries with the highest number of published articles were China, Japan, and the United States. Japan had the most significant number of citations for published articles. Most key journals specialized in the field of lung diseases. Artificial intelligence diagnosis and minimally invasive treatment may be the future of GGO.
Affiliation(s)
- Qi Xue
- Jie He
- *Correspondence: Qi Xue; Jie He
6
Radiomics based on enhanced CT for differentiating between pulmonary tuberculosis and pulmonary adenocarcinoma presenting as solid nodules or masses. J Cancer Res Clin Oncol 2022:10.1007/s00432-022-04256-y. [PMID: 35939114 DOI: 10.1007/s00432-022-04256-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/11/2022] [Accepted: 08/02/2022] [Indexed: 10/15/2022]
Abstract
PURPOSE To investigate the incremental value of enhanced CT-based radiomics in discriminating between pulmonary tuberculosis (PTB) and pulmonary adenocarcinoma (PAC) presenting as solid nodules or masses and to develop an optimal radiomics model.
METHODS A total of 128 lesions (from 123 patients) from three hospitals were retrospectively analyzed and were randomly divided into training and test datasets at a ratio of 7:3. Independent predictors in subjective image features were used to develop the subjective image model (SIM). The plain CT-based and enhanced CT-based radiomics features were screened by the correlation coefficient method, univariate analysis, and the least absolute shrinkage and selection operator, then used to build the plain CT radiomics model (PRM) and enhanced CT radiomics model (ERM), respectively. Finally, the combined model (CM) combining PRM and ERM was established. In addition, the performance of three radiologists and one respiratory physician was evaluated. The areas under the receiver operating characteristic curve (AUCs) were used to assess the performance of each model.
RESULTS The differential diagnostic capability of the ERM (training: AUC = 0.933; test: AUC = 0.881) was better than that of the PRM (training: AUC = 0.861; test: AUC = 0.756) and the SIM (training: AUC = 0.760; test: AUC = 0.611). The CM was optimal (training: AUC = 0.948; test: AUC = 0.917) and outperformed the respiratory physician and most radiologists.
CONCLUSIONS The ERM was more helpful than the PRM for identifying PTB and PAC that present as solid nodules or masses, and the CM was the best.
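For readers unfamiliar with the AUC metric used to compare the models in this abstract, the following is a minimal sketch of how an AUC can be computed from model scores. It is not the authors' pipeline: the scores are invented for illustration, treating PAC as the positive class is an assumption, and the function name `auc` is hypothetical. It uses the rank-based definition, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model outputs, NOT data from the study.
pac_scores = [0.9, 0.8, 0.7, 0.4]  # assumed positive class (PAC)
ptb_scores = [0.6, 0.3, 0.2, 0.1]  # assumed negative class (PTB)

print(auc(pac_scores, ptb_scores))  # 15 of 16 pairs ranked correctly -> 0.9375
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported training and test AUCs should be read.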