1
van der Linden LR, Vavliakis I, de Groot TM, Jutte PC, Doornberg JN, Lozano-Calderon SA, Groot OQ. Artificial Intelligence in bone Metastases: A systematic review in guideline adherence of 92 studies. J Bone Oncol 2025; 52:100682. [PMID: 40337637 PMCID: PMC12056386 DOI: 10.1016/j.jbo.2025.100682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Revised: 02/09/2025] [Accepted: 04/15/2025] [Indexed: 05/09/2025] Open
Abstract
Background: The last decade has witnessed a surge in artificial intelligence (AI). With bone metastases becoming more prevalent, there is an increasing call for personalized treatment options, a domain where AI can greatly contribute. However, integrating AI into clinical settings has proven to be difficult. Therefore, we aimed to provide an overview of AI modalities for treating bone metastases and recommend implementation-worthy models based on TRIPOD, CLAIM, and UPM scores.
Methods: This systematic review included 92 studies on AI models in bone metastases between 2008 and 2024. Using three assessment tools, we provided a reliable foundation for recommending AI modalities fit for clinical use (TRIPOD or CLAIM ≥ 70% and UPM score ≥ 10).
Results: Most models focused on survival prediction (44/92; 48%), followed by imaging studies (37/92; 40%). Median TRIPOD completeness was 70% (IQR 64-81%), CLAIM completeness was 57% (IQR 48-67%), and UPM score was 7 (IQR 5-9). In total, 10% (9/92) of AI modalities were deemed fit for clinical use.
Conclusion: Transparent reporting using the three evaluation tools above is essential for effectively integrating AI models into clinical practice, as currently only 10% of AI models for bone metastases are deemed fit for clinical use. Such transparency ensures that both patients and clinicians can benefit from clinically useful AI models, potentially enhancing AI-driven personalized cancer treatment.
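The inclusion rule stated in the Methods (TRIPOD or CLAIM ≥ 70% and UPM score ≥ 10) can be sketched as a small Python check. This is only an illustration under stated assumptions: the `StudyScores` class, its field names, and the example values are hypothetical and do not come from the review, and treating a missing TRIPOD or CLAIM score as "not assessed" is an assumption about how the two reporting tools are applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudyScores:
    """Hypothetical container for one study's assessment scores."""
    tripod_pct: Optional[float]  # TRIPOD completeness in percent, None if not assessed
    claim_pct: Optional[float]   # CLAIM completeness in percent, None if not assessed
    upm: int                     # UPM score


def fit_for_clinical_use(scores: StudyScores,
                         completeness_cutoff: float = 70.0,
                         upm_cutoff: int = 10) -> bool:
    """Apply the rule: (TRIPOD or CLAIM >= 70%) and UPM >= 10."""
    # At least one reporting-completeness score must reach the cutoff.
    reporting_ok = any(
        pct is not None and pct >= completeness_cutoff
        for pct in (scores.tripod_pct, scores.claim_pct)
    )
    return reporting_ok and scores.upm >= upm_cutoff


# Made-up example values, not data from the review.
examples = [
    StudyScores(tripod_pct=72.0, claim_pct=None, upm=10),
    StudyScores(tripod_pct=81.0, claim_pct=None, upm=7),
    StudyScores(tripod_pct=None, claim_pct=57.0, upm=11),
]
flags = [fit_for_clinical_use(s) for s in examples]
```

Both conditions must hold, so a study with high reporting completeness but a UPM score below 10 is not flagged as fit for clinical use.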
Affiliation(s)
- Lotte R. van der Linden
  - Department of Orthopaedic Surgery, University Medical Center Groningen, Groningen, the Netherlands
  - Department of Orthopaedic Surgery, Massachusetts General Hospital, Boston, MA, USA
- Ioannis Vavliakis
  - Department of Orthopaedic Surgery, University Medical Center Groningen, Groningen, the Netherlands
- Tom M. de Groot
  - Department of Orthopaedic Surgery, University Medical Center Groningen, Groningen, the Netherlands
  - Department of Orthopaedic Surgery, Massachusetts General Hospital, Boston, MA, USA
- Paul C. Jutte
  - Department of Orthopaedic Surgery, University Medical Center Groningen, Groningen, the Netherlands
- Job N. Doornberg
  - Department of Orthopaedic Surgery, University Medical Center Groningen, Groningen, the Netherlands
- Olivier Q. Groot
  - Department of Orthopaedic Surgery, Massachusetts General Hospital, Boston, MA, USA
  - Department of Orthopaedic Surgery, University Medical Center Utrecht, Utrecht, the Netherlands
2
Pak S, Son HJ, Kim D, Woo JY, Yang I, Hwang HS, Rim D, Choi MS, Lee SH. Comparison of CNNs and Transformer Models in Diagnosing Bone Metastases in Bone Scans Using Grad-CAM. Clin Nucl Med 2025:00003072-990000000-01645. [PMID: 40237349 DOI: 10.1097/rlu.0000000000005898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2025] [Accepted: 03/09/2025] [Indexed: 04/18/2025]
Abstract
PURPOSE: Convolutional neural networks (CNNs) have been studied for detecting bone metastases on bone scans; however, the application of ConvNeXt and transformer models has not yet been explored. This study aims to evaluate the performance of various deep learning models, including the ConvNeXt and transformer models, in diagnosing metastatic lesions from bone scans.
MATERIALS AND METHODS: We retrospectively analyzed bone scans from patients with cancer obtained at 2 institutions: the training and validation sets (n=4626) were from Hospital 1 and the test set (n=1428) was from Hospital 2. The deep learning models evaluated included ResNet18, the Data-Efficient Image Transformer (DeiT), the Vision Transformer (ViT Large 16), the Swin Transformer (Swin Base), and ConvNeXt Large. Gradient-weighted class activation mapping (Grad-CAM) was used for visualization.
RESULTS: Both the validation set and the test set demonstrated that the ConvNeXt Large model (0.969 and 0.885, respectively) exhibited the best performance, followed by the Swin Base model (0.965 and 0.840, respectively), both of which significantly outperformed ResNet (0.892 and 0.725, respectively). Subgroup analyses revealed that all the models demonstrated greater diagnostic accuracy for patients with polymetastasis compared with those with oligometastasis. Grad-CAM visualization revealed that the ConvNeXt Large model focused more on identifying local lesions, whereas the Swin Base model focused on global areas such as the axial skeleton and pelvis.
CONCLUSIONS: Compared with traditional CNN and transformer models, the ConvNeXt model demonstrated superior diagnostic performance in detecting bone metastases from bone scans, especially in cases of polymetastasis, suggesting its potential in medical image analysis.
Affiliation(s)
- Sehyun Pak
  - Department of Medicine, Hallym University College of Medicine, Chuncheon, Gangwon, Republic of Korea
- Hye Joo Son
  - Department of Nuclear Medicine, Dankook University Medical Center, Cheonan, Chungnam, Republic of Korea
- Dongwoo Kim
  - Department of Nuclear Medicine, Hallym University Sacred Heart Hospital, Hallym University College of Medicine, Anyang, Gyeonggi, Republic of Korea
- Ji Young Woo
  - Department of Radiology, Hallym University Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Republic of Korea
- Ik Yang
  - Department of Radiology, Hallym University Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Republic of Korea
- Hee Sung Hwang
  - Department of Nuclear Medicine, Hallym University Sacred Heart Hospital, Hallym University College of Medicine, Anyang, Gyeonggi, Republic of Korea
- Min Seok Choi
  - PE Data Solution, SK hynix, Icheon, Gyeonggi, Republic of Korea
- Suk Hyun Lee
  - Department of Radiology, Hallym University Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Republic of Korea
3
McKinney AM, Moore JA, Campbell K, Braga TA, Rykken JB, Jagadeesan BD, McKinney ZJ. Automated vs. manual coding of neuroimaging reports via natural language processing, using the international classification of diseases, tenth revision. Heliyon 2024; 10:e30106. [PMID: 38799748 PMCID: PMC11126795 DOI: 10.1016/j.heliyon.2024.e30106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 04/19/2024] [Accepted: 04/19/2024] [Indexed: 05/29/2024] Open
Abstract
Objective: Natural language processing (NLP) can generate diagnosis codes from imaging reports. Meanwhile, the International Classification of Diseases (ICD-10) codes are the United States' standard for billing/coding, which enables tracking of disease burden and outcomes. This cross-sectional study aimed to test the feasibility of an NLP algorithm and to compare its performance with radiologists' and physicians' manual coding.
Methods: Three neuroradiologists and one non-radiologist physician reviewer manually coded 200 craniospinal CT and MRI reports randomly selected from a pool of >10,000. The NLP algorithm (Radnosis, VEEV, Inc., Minneapolis, MN) subdivided each report's Impression into "phrases", with multiple ICD-10 matches for each phrase. Viewing only the Impression, the physician reviewers selected the single best ICD-10 code for each phrase. Codes selected by the physicians and the algorithm were compared for agreement.
Results: The algorithm extracted the reports' Impressions into 645 phrases, each having ranked ICD-10 matches. Regarding the reviewers' selected codes, pairwise agreement was unreliable (Krippendorff α = 0.39-0.63). Using unanimous reviewer agreement as "ground truth", the algorithm's sensitivity/specificity/F2 for the top 5 codes was 0.88/0.80/0.83, and for the single best code was 0.67/0.82/0.67. The engine tabulated "pertinent negatives" as negative codes for stated findings (e.g. "no intracranial hemorrhage"). The engine's matching was more specific for shorter than for full-length ICD-10 codes (p = 0.00582 × 10⁻³).
Conclusions: Manual coding by physician reviewers has significant variability and is time-consuming, while the NLP algorithm's top 5 diagnosis codes are relatively accurate. This preliminary work demonstrates the feasibility and potential for generating codes with reliability and consistency. Future work may include correlating diagnosis codes with clinical encounter codes to evaluate imaging's impact on, and relevance to, care.
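For readers unfamiliar with the three metrics reported in the Results, the following is a minimal Python sketch of how sensitivity, specificity, and the F2 score are conventionally computed from confusion-matrix counts. The function names and the counts in the example are hypothetical; the study's actual counts and its exact computation are not given in the abstract.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate (recall): TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score; beta = 2 weights recall more heavily than precision.

    Equivalent to (1 + b^2) * P * R / (b^2 * P + R), written in terms of
    counts to avoid dividing by a zero precision or recall.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts for illustration only (not taken from the study).
tp, fp, fn, tn = 80, 20, 20, 80
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
f2 = f_beta(tp, fp, fn, beta=2.0)
```

Because F2 weights recall above precision, it suits a setting where missing a correct code is considered more costly than suggesting an extra one.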
Affiliation(s)
- Alexander M. McKinney
  - Department of Radiology, University of Miami-Miller School of Medicine, Miami, FL, USA
- Thiago A. Braga
  - Department of Radiology, University of Miami-Miller School of Medicine, Miami, FL, USA
- Jeffrey B. Rykken
  - Department of Radiology, University of Minnesota School of Medicine, Minneapolis, MN, USA
- Bharathi D. Jagadeesan
  - Departments of Radiology and Neurosurgery, University of Minnesota School of Medicine, Minneapolis, MN, USA
- Zeke J. McKinney
  - HealthPartners Occupational and Environmental Medicine Residency, Minneapolis, MN, USA
  - University of Minnesota School of Public Health, Minneapolis, MN, USA
  - HealthPartners Institute, Minneapolis, MN, USA
4
Wu Z, Guo K, Luo E, Wang T, Wang S, Yang Y, Zhu X, Ding R. Medical long-tailed learning for imbalanced data: Bibliometric analysis. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 247:108106. [PMID: 38452661 DOI: 10.1016/j.cmpb.2024.108106] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/15/2023] [Revised: 02/24/2024] [Accepted: 02/26/2024] [Indexed: 03/09/2024]
Abstract
BACKGROUND: In the last decade, long-tail learning has become a popular research focus in deep learning applications in medicine. However, no scientometric reports have provided a systematic overview of this scientific field. We utilized bibliometric techniques to identify and analyze the literature on long-tailed learning in deep learning applications in medicine and investigate research trends, core authors, and core journals. We expanded our understanding of the primary components and principal methodologies of long-tail learning research in the medical field.
METHODS: Web of Science was utilized to collect all articles on long-tailed learning in medicine published until December 2023. The suitability of all retrieved titles and abstracts was evaluated. For bibliometric analysis, all numerical data were extracted. CiteSpace was used to create clustered and visual knowledge graphs based on keywords.
RESULTS: A total of 579 articles met the evaluation criteria. Over the last decade, the annual number of publications and citation frequency both showed significant growth, following a power-law and exponential trend, respectively. Noteworthy contributors to this field include Husanbir Singh Pannu, Fadi Thabtah, and Talha Mahboob Alam, while leading journals such as IEEE ACCESS, COMPUTERS IN BIOLOGY AND MEDICINE, IEEE TRANSACTIONS ON MEDICAL IMAGING, and COMPUTERIZED MEDICAL IMAGING AND GRAPHICS have emerged as pivotal platforms for disseminating research in this area. The core of long-tailed learning research within the medical domain is encapsulated in six principal themes: deep learning for imbalanced data, model optimization, neural networks in image analysis, data imbalance in health records, CNN in diagnostics and risk assessment, and genetic information in disease mechanisms.
CONCLUSION: This study summarizes recent advancements in applying long-tail learning to deep learning in medicine through bibliometric analysis and visual knowledge graphs. It explains new trends, sources, core authors, journals, and research hotspots. Although this field has shown great promise in medical deep learning research, our findings will provide pertinent and valuable insights for future research and clinical practice.
Affiliation(s)
- Zheng Wu
  - School of Information Engineering, Hunan University of Science and Engineering, Yongzhou 425199, China.
- Kehua Guo
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China.
- Entao Luo
  - School of Information Engineering, Hunan University of Science and Engineering, Yongzhou 425199, China.
- Tian Wang
  - BNU-UIC Institute of Artificial Intelligence and Future Networks, Beijing Normal University (BNU Zhuhai), Zhuhai, China.
- Shoujin Wang
  - Data Science Institute, University of Technology Sydney, Sydney, Australia.
- Yi Yang
  - Department of Computer Science, Northeastern Illinois University, Chicago, IL 60625, USA.
- Xiangyuan Zhu
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China.
- Rui Ding
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China.