1
González G, Galant J, Salinas JM, Benítez E, Sánchez-Valverde MD, Calbo J, Cerrolaza N. Classification and segmentation of hip fractures in x-rays: highlighting fracture regions for interpretable diagnosis. Insights Imaging 2025; 16:86. [PMID: 40232323 PMCID: PMC12000489 DOI: 10.1186/s13244-025-01958-y] [Received: 11/27/2024] [Accepted: 03/20/2025] [Indexed: 04/16/2025] Open access
Abstract
OBJECTIVE To develop an artificial intelligence (AI) system capable of classifying and segmenting femoral fractures, and to compare its performance against existing state-of-the-art methods. METHODS This Institutional Review Board (IRB)-approved retrospective study did not require informed consent. A total of 10,308 hip x-rays from 2618 patients were retrieved from the hospital PACS; 986 were randomly selected for annotation and randomly split into training, validation, and test sets at the patient level. Two radiologists segmented and classified femoral fractures based on their location (femoral neck, pertrochanteric region, or subtrochanteric region) and grade, using the Garden and Evans scales for neck and pertrochanteric regions, respectively. A YOLOv8 segmentation convolutional neural network (CNN) was trained to generate fracture masks and indicate their class and grade. Classification CNNs were trained on the same dataset for method comparison. RESULTS On the test set, YOLOv8 achieved a Dice coefficient of 0.77 (95% CI: 0.56-0.98) for segmenting fractures, an accuracy of 86.2% (95% CI: 80.77-90.55) for classification and grading, and an AUC of 0.981 (95% CI: 0.965-0.997) for fracture detection. These metrics are on par with or exceed those of previously published AI methods, demonstrating the efficacy of our approach. CONCLUSIONS The high accuracy and AUC values demonstrate the potential of the proposed neural network as a reliable tool in clinical settings. Further, it is the first to provide a precise segmentation of femoral fractures, as indicated by the Dice scores, which may enhance interpretability. A formal evaluation is planned to further assess its clinical applicability. CRITICAL RELEVANCE STATEMENT The proposed system offers high granularity in fracture classification and is the first to segment femoral fractures, ensuring interpretability. KEY POINTS We present the first AI method that segments and grades femoral fractures. The method classifies fractures by fracture location and type. High accuracy and interpretability promise utility in clinical practice.
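The Dice coefficient reported in this abstract measures the overlap between a predicted fracture mask and the radiologists' annotation. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `dice_coefficient` and the toy 4x4 masks are hypothetical, and the convention of returning 1.0 for two empty masks is an assumption.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]],
                     truth: Sequence[Sequence[int]]) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = 0
    pred_area = 0
    truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            pred_area += 1 if p else 0
            truth_area += 1 if t else 0
    if pred_area + truth_area == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / (pred_area + truth_area)


# Hypothetical toy masks: the prediction covers 4 pixels, the annotation 6,
# and all 4 predicted pixels lie inside the annotation.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
annotated = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

print(f"Dice: {dice_coefficient(predicted, annotated):.2f}")  # 2*4 / (4+6)
```

In this toy case the prediction misses one row of the annotated region, so the overlap is penalized even though no predicted pixel falls outside the annotation.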
Affiliation(s)
- Germán González
  - Robotics, Vision and Intelligent Technologies, Department of Computational Sciences and Artificial Intelligence, University of Alicante, Alicante, Spain
- Joaquín Galant
  - Radiology Service, Hospital of San Juan de Alicante, Alicante, Spain
- José María Salinas
  - Robotics, Vision and Intelligent Technologies, Department of Computational Sciences and Artificial Intelligence, University of Alicante, Alicante, Spain
  - IT Service, Hospital of San Juan de Alicante, Alicante, Spain
- Emilia Benítez
  - Radiology Service, Hospital de la Vega Baja, Alicante, Spain
- Jorge Calbo
  - Radiology Service, Hospital of San Juan de Alicante, Alicante, Spain
- Nicolás Cerrolaza
  - Orthopedics Surgery, Hospital of San Juan de Alicante, Alicante, Spain
2
Lo Mastro A, Grassi E, Berritto D, Russo A, Reginelli A, Guerra E, Grassi F, Boccia F. Artificial intelligence in fracture detection on radiographs: a literature review. Jpn J Radiol 2025; 43:551-585. [PMID: 39538068 DOI: 10.1007/s11604-024-01702-4] [Received: 07/26/2024] [Accepted: 11/04/2024] [Indexed: 11/16/2024]
Abstract
Fractures are one of the most common reasons for admission to the emergency department, affecting individuals of all ages and regions worldwide, and they can be misdiagnosed during radiologic examination. Accurate and timely diagnosis of fractures is crucial for patients, and artificial intelligence, which uses algorithms to imitate human intelligence to aid or enhance human performance, is a promising solution to this issue. In the last few years, numerous commercially available algorithms have been developed to enhance radiology practice, and a large number of studies apply artificial intelligence to fracture detection. Recent contributions in the literature have described numerous advantages, showing that artificial intelligence performs better than doctors who have less experience in interpreting musculoskeletal X-rays, and that assisting radiologists increases diagnostic accuracy and sensitivity, improves efficiency, and reduces interpretation time. Furthermore, algorithms perform better when they are trained with big data on a wide range of fracture patterns and variants, and they can provide standardized fracture identification across different radiologists, thanks to the structured report. In this review article, we discuss the use of artificial intelligence in fracture identification and its benefits and disadvantages. We also discuss its current potential impact on the field of radiology and radiomics.
Affiliation(s)
- Antonio Lo Mastro
  - Department of Radiology, University of Campania "Luigi Vanvitelli", Naples, Italy
- Enrico Grassi
  - Department of Orthopaedics, University of Florence, Florence, Italy
- Daniela Berritto
  - Department of Clinical and Experimental Medicine, University of Foggia, Foggia, Italy
- Anna Russo
  - Department of Radiology, University of Campania "Luigi Vanvitelli", Naples, Italy
- Alfonso Reginelli
  - Department of Radiology, University of Campania "Luigi Vanvitelli", Naples, Italy
- Egidio Guerra
  - Emergency Radiology Department, "Policlinico Riuniti Di Foggia", Foggia, Italy
- Francesca Grassi
  - Department of Radiology, University of Campania "Luigi Vanvitelli", Naples, Italy
- Francesco Boccia
  - Department of Radiology, University of Campania "Luigi Vanvitelli", Naples, Italy
3
van der Gaast N, Bagave P, Assink N, Broos S, Jaarsma RL, Edwards MJR, Hermans E, IJpma FFA, Ding AY, Doornberg JN, Oosterhoff JHF. Deep learning for tibial plateau fracture detection and classification. Knee 2025; 54:81-89. [PMID: 40023913 DOI: 10.1016/j.knee.2025.02.001] [Received: 07/18/2024] [Revised: 01/17/2025] [Accepted: 02/04/2025] [Indexed: 03/04/2025]
Abstract
BACKGROUND Deep learning (DL) has been shown to be successful in interpreting radiographs and aiding in fracture detection and classification. However, no study has aimed to develop a computer vision model for tibial plateau fractures using the Schatzker classification. Therefore, this study aims to develop a deep learning model for (1) detection of tibial plateau fractures and (2) classification according to the Schatzker classification. METHODS Radiographs of patients with tibial plateau fractures were collected using a multicenter approach. Both anteroposterior and lateral images were uploaded into annotation software and manually labelled and annotated. The dataset was balanced to optimize model development and split into a training set and a test set. We trained two convolutional neural networks (GoogleNet and ResNet) for the detection and classification of tibial plateau fractures following the Schatzker classification. RESULTS A total of 1506 knee radiographs from 753 patients, including 368 tibial plateau fractures and 385 healthy knees, were used to create the algorithm. The GoogleNet algorithm demonstrated high sensitivity (92.7%) but intermediate accuracy (70.4%) and positive predictive value (64.4%) in detecting tibial plateau fractures, indicating reliable detection of fractured cases. It exhibited limited success in accurately classifying fractures according to the Schatzker system, achieving an accuracy of only 34.6% and a sensitivity of 32.1%. CONCLUSION This study shows that detection of tibial plateau fractures is a task that a DL algorithm can grasp; further refinement is necessary to enhance its accuracy in fracture classification. Computer vision models might improve using different classification systems, as the current Schatzker classification suffers from low interobserver agreement on conventional radiographs.
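Sensitivity, accuracy, and positive predictive value, as reported in this abstract, all derive from the same four confusion-matrix counts. The sketch below shows how a detector can combine high sensitivity with a lower positive predictive value; the counts are hypothetical and are not taken from the study, and the names `ConfusionCounts` and `detection_metrics` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    true_positive: int
    false_positive: int
    true_negative: int
    false_negative: int


def detection_metrics(counts: ConfusionCounts) -> dict:
    """Standard binary detection metrics; assumes every denominator is non-zero."""
    tp, fp = counts.true_positive, counts.false_positive
    tn, fn = counts.true_negative, counts.false_negative
    return {
        "sensitivity": tp / (tp + fn),   # fractures that were detected
        "specificity": tn / (tn + fp),   # healthy knees correctly cleared
        "ppv": tp / (tp + fp),           # positive calls that were real fractures
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts: few missed fractures, but many false alarms.
example = ConfusionCounts(true_positive=90, false_positive=30,
                          true_negative=70, false_negative=10)
metrics = detection_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, sensitivity is 0.900 while the positive predictive value is 0.750, because false positives enter the PPV denominator but not the sensitivity denominator.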
Affiliation(s)
- N van der Gaast
  - Department of Orthopaedics & Trauma Surgery, Flinders Medical Centre and Flinders University, Adelaide, SA, Australia; Department of Trauma Surgery, Radboud University Medical Center, Radboud University Nijmegen, the Netherlands
- P Bagave
  - Department of Engineering Systems and Services, Faculty of Technology Policy and Management, Delft University of Technology, Delft, the Netherlands
- N Assink
  - Department of Orthopaedic and Trauma Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- S Broos
  - Department of Orthopaedics & Trauma Surgery, Flinders Medical Centre and Flinders University, Adelaide, SA, Australia
- R L Jaarsma
  - Department of Orthopaedics & Trauma Surgery, Flinders Medical Centre and Flinders University, Adelaide, SA, Australia
- M J R Edwards
  - Department of Trauma Surgery, Radboud University Medical Center, Radboud University Nijmegen, the Netherlands
- E Hermans
  - Department of Trauma Surgery, Radboud University Medical Center, Radboud University Nijmegen, the Netherlands
- F F A IJpma
  - Department of Orthopaedic and Trauma Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- A Y Ding
  - Department of Engineering Systems and Services, Faculty of Technology Policy and Management, Delft University of Technology, Delft, the Netherlands
- J N Doornberg
  - Department of Orthopaedics & Trauma Surgery, Flinders Medical Centre and Flinders University, Adelaide, SA, Australia; Department of Orthopaedic and Trauma Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- J H F Oosterhoff
  - Department of Engineering Systems and Services, Faculty of Technology Policy and Management, Delft University of Technology, Delft, the Netherlands; Department of Orthopaedic and Trauma Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
4
Cheng CT, Ooyang CH, Liao CH, Kang SC. Applications of deep learning in trauma radiology: A narrative review. Biomed J 2025; 48:100743. [PMID: 38679199 PMCID: PMC11751421 DOI: 10.1016/j.bj.2024.100743] [Received: 11/13/2023] [Revised: 03/26/2024] [Accepted: 04/24/2024] [Indexed: 05/01/2024] Open access
Abstract
Diagnostic imaging is essential in modern trauma care for initial evaluation and identifying injuries requiring intervention. Deep learning (DL) has become mainstream in medical image analysis and has shown promising efficacy for classification, segmentation, and lesion detection. This narrative review provides the fundamental concepts for developing DL algorithms in trauma imaging and presents an overview of current progress in each modality. DL has been applied to detect free fluid on Focused Assessment with Sonography for Trauma (FAST), traumatic findings on chest and pelvic X-rays, and computed tomography (CT) scans, identify intracranial hemorrhage on head CT, detect vertebral fractures, and identify injuries to organs like the spleen, liver, and lungs on abdominal and chest CT. Future directions involve expanding dataset size and diversity through federated learning, enhancing model explainability and transparency to build clinician trust, and integrating multimodal data to provide more meaningful insights into traumatic injuries. Though some commercial artificial intelligence products are Food and Drug Administration-approved for clinical use in the trauma field, adoption remains limited, highlighting the need for multi-disciplinary teams to engineer practical, real-world solutions. Overall, DL shows immense potential to improve the efficiency and accuracy of trauma imaging, but thoughtful development and validation are critical to ensure these technologies positively impact patient care.
Affiliation(s)
- Chi-Tung Cheng
  - Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Taoyuan, Taiwan; School of Medicine, Chang Gung University, Taoyuan, Taiwan
- Chun-Hsiang Ooyang
  - Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Taoyuan, Taiwan
- Chien-Hung Liao
  - Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Taoyuan, Taiwan
- Shih-Ching Kang
  - Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Taoyuan, Taiwan
5
Liu XS, Nie R, Duan AW, Yang L, Li X, Zhang LT, Guo GK, Guo QS, Zhao DC, Li Y, Zhang HH. YOLOX-SwinT algorithm improves the accuracy of AO/OTA classification of intertrochanteric fractures by orthopedic trauma surgeons. Chin J Traumatol 2025; 28:69-75. [PMID: 38762418 PMCID: PMC11840343 DOI: 10.1016/j.cjtee.2024.04.002] [Received: 11/10/2023] [Revised: 03/18/2024] [Accepted: 04/09/2024] [Indexed: 05/20/2024] Open access
Abstract
PURPOSE Intertrochanteric fracture (ITF) classification is crucial for surgical decision-making. However, orthopedic trauma surgeons have shown lower accuracy in ITF classification than expected. The objective of this study was to utilize an artificial intelligence (AI) method to improve the accuracy of ITF classification. METHODS We trained a network called YOLOX-SwinT, which is based on the You Only Look Once X (YOLOX) object detection network with Swin Transformer (SwinT) as the backbone architecture, using 762 radiographic ITF examinations as the training set. Subsequently, we recruited 5 senior orthopedic trauma surgeons (SOTS) and 5 junior orthopedic trauma surgeons (JOTS) to classify the 85 original images in the test set, as well as the images with the prediction results of the network model, in sequence. Statistical analysis was performed using SPSS 20.0 (IBM Corp., Armonk, NY, USA) to compare the differences among the SOTS, JOTS, SOTS + AI, JOTS + AI, SOTS + JOTS, and SOTS + JOTS + AI groups. All images were classified according to the AO/OTA 2018 classification system by 2 experienced trauma surgeons and verified by another expert in this field. Based on actual clinical needs, after discussion, we integrated 8 subgroups into 5 new subgroups, and the dataset was divided into training, validation, and test sets at a ratio of 8:1:1. RESULTS The mean average precision at an intersection over union (IoU) of 0.5 (mAP50) for subgroup detection reached 90.29%. The classification accuracy values of the SOTS, JOTS, SOTS + AI, and JOTS + AI groups were 56.24% ± 4.02%, 35.29% ± 18.07%, 79.53% ± 7.14%, and 71.53% ± 5.22%, respectively. The paired t-test results showed that the difference between the SOTS and SOTS + AI groups was statistically significant, as were the differences between the JOTS and JOTS + AI groups and between the SOTS + JOTS and SOTS + JOTS + AI groups. Moreover, the difference between the SOTS + JOTS and SOTS + JOTS + AI groups in each subgroup was statistically significant, with all p < 0.05. The independent samples t-test results showed that the difference between the SOTS and JOTS groups was statistically significant, while the difference between the SOTS + AI and JOTS + AI groups was not statistically significant. With the assistance of AI, the subgroup classification accuracy of both SOTS and JOTS was significantly improved, and JOTS achieved the same level as SOTS. CONCLUSION In conclusion, the YOLOX-SwinT network algorithm enhances the accuracy of AO/OTA subgroup classification of ITF by orthopedic trauma surgeons.
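The mAP50 figure in this abstract rests on an intersection-over-union (IoU) threshold of 0.5: a predicted box counts as a hit only if it overlaps the reference box by at least that much. The sketch below covers only this thresholding step, under the assumption of axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`; a full mAP computation also requires class matching and averaging over the precision-recall curve, which is omitted here. The names `box_iou` and `is_match_at_50` and all coordinates are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match_at_50(predicted: Box, reference: Box) -> bool:
    """True when the prediction would count as a hit at the IoU >= 0.5 threshold."""
    return box_iou(predicted, reference) >= 0.5


reference = (0.0, 0.0, 10.0, 10.0)
close_prediction = (2.0, 0.0, 12.0, 10.0)  # overlap 80, union 120
loose_prediction = (5.0, 0.0, 15.0, 10.0)  # overlap 50, union 150

print(is_match_at_50(close_prediction, reference))
print(is_match_at_50(loose_prediction, reference))
```

The first box shifts by two units and still passes the threshold; the second shifts by five units and fails, even though half of it lies inside the reference box.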
Affiliation(s)
- Xue-Si Liu
  - Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Rui Nie
  - Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Ao-Wen Duan
  - Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Li Yang
  - Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Xiang Li
  - Department of Information, Southwest Hospital, Army Medical University, Chongqing, 400038, China
- Le-Tian Zhang
  - Department of Radiology, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Guang-Kuo Guo
  - Department of Radiology, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Qing-Shan Guo
  - Division of Trauma and War Injury, Daping Hospital, Army Medical University of PLA, State Key Laboratory of Trauma and Chemical Poisoning, Chongqing, 400042, China
- Dong-Chu Zhao
  - Division of Trauma and War Injury, Daping Hospital, Army Medical University of PLA, State Key Laboratory of Trauma and Chemical Poisoning, Chongqing, 400042, China
- Yang Li
  - Division of Trauma and War Injury, Daping Hospital, Army Medical University of PLA, State Key Laboratory of Trauma and Chemical Poisoning, Chongqing, 400042, China
- He-Hua Zhang
  - Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
6
Noda M, Takahara S, Hayashi S, Inui A, Oe K, Matsushita T. Evaluating ChatGPT's Performance in Classifying Pertrochanteric Fractures Based on Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association (AO/OTA) Standards. Cureus 2025; 17:e78068. [PMID: 40018458 PMCID: PMC11865862 DOI: 10.7759/cureus.78068] [Accepted: 01/26/2025] [Indexed: 03/01/2025] Open access
Abstract
Introduction Chat Generative Pre-trained Transformer (ChatGPT) has become widely recognized for its capability to generate text, synthesize complex information, and perform a variety of tasks without requiring human specialists for data collection. The latest iteration, ChatGPT-4, is a large multimodal model capable of integrating both text and image inputs, rendering it particularly promising for medical applications. However, its efficacy in analyzing radiographic images remains largely unexplored. Aim This study aims to (i) address the lack of data on the accuracy of ChatGPT in classifying radiographic fractures as stable or unstable under the revised Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association (AO/OTA) classification system, a task also performed by surgeons, and (ii) compare the agreement achieved by surgeons with that achieved by ChatGPT. The study hypothesizes that ChatGPT would achieve moderate agreement with orthopedic surgeons. Materials and methods Patients diagnosed with pertrochanteric fractures were retrospectively collected. Patients with both preoperative two-directional plain radiographs and three-dimensional CT (3D-CT) images were eligible for enrollment. Two orthopedic surgeons (observer 1 and observer 2) and one resident (observer 3) each classified the fractures once as A1 (stable) or A2 (unstable) based on the AO/OTA classification using two-directional plain radiographs. Prior to the ChatGPT study, all anteroposterior images were trimmed to the fractured side and given figure names that included gender and age; they were then input into OpenAI ChatGPT-4. Radiological evaluation prompts were designed to initiate ChatGPT's classification analysis of the uploaded radiographic images. A single observer (MN) decided the classification patterns by examining 3D CT scan images as well as plain radiographs. This judgment of A1 (stable) or A2 (unstable) served as the benchmark against which the plain-radiograph results of the observers and ChatGPT were scored. Results After exclusions, the cohort consisted of 29 males and 90 females, with a mean age of 87 years. The fractures were classified into A1 (stable) and A2 (unstable) groups based on CT imaging. The A1 group included 50 patients (13 males, 37 females; mean age: 86.2 ± 7.8 years), while the A2 group included 69 patients (16 males, 53 females; mean age: 87.0 ± 7.9 years). Kappa values comparing the plain-radiograph classifications by the three observers and ChatGPT with the CT-based gold standard showed fair to moderate agreement: Observer 1: 0.494 (95% CI: 0.337-0.650), Observer 2: 0.390 (95% CI: 0.227-0.553), Observer 3: 0.360 (95% CI: 0.198-0.521), and ChatGPT: 0.420 (95% CI: 0.255-0.585). ChatGPT demonstrated accuracy, sensitivity, specificity, and positive and negative predictive values comparable to those of the human observers, suggesting moderate reliability. Conclusion This study demonstrates that ChatGPT can classify pertrochanteric fractures into A1 (stable) and A2 (unstable) under the Revised AO/OTA Classification System. Its moderate agreement with CT-based assessments (κ = 0.420) is comparable to the performance of orthopedic surgeons. Moreover, ChatGPT is straightforward to integrate into clinical workflows, requiring minimal data collection for training.
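The kappa values in this abstract express agreement between each rater and the CT-based benchmark after correcting for agreement expected by chance. The following is a minimal sketch of unweighted Cohen's kappa for two label sequences; the ten ratings are hypothetical, not study data, and the function name `cohen_kappa` is illustrative.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same label
    # if each labelled independently at their own marginal rates.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings: benchmark versus one rater, agreeing on 8 of 10 cases.
benchmark = ["A1"] * 5 + ["A2"] * 5
rater = ["A1", "A1", "A1", "A1", "A2", "A2", "A2", "A2", "A2", "A1"]

print(f"kappa = {cohen_kappa(benchmark, rater):.2f}")
```

Here raw agreement is 80%, but because half of that would be expected by chance with balanced labels, kappa comes out at 0.60.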
Affiliation(s)
- Shunsuke Takahara
  - Orthopedics, Hyogo Prefectural Kakogawa Medical Center, Kakogawa, JPN
- Shinya Hayashi
  - Orthopedics, Kobe University Graduate School of Medicine, Kobe, JPN
- Atsuyuki Inui
  - Orthopedics, Kobe University Graduate School of Medicine, Kobe, JPN
- Keisuke Oe
  - Orthopedics, Kobe University Graduate School of Medicine, Kobe, JPN
7
Zhang JY, Yang JM, Wang XM, Wang HL, Zhou H, Yan ZN, Xie Y, Liu PR, Hao ZW, Ye ZW. Application and Prospects of Deep Learning Technology in Fracture Diagnosis. Curr Med Sci 2024; 44:1132-1140. [PMID: 39551854 DOI: 10.1007/s11596-024-2928-5] [Received: 06/23/2024] [Accepted: 08/18/2024] [Indexed: 11/19/2024]
Abstract
Artificial intelligence (AI) is an interdisciplinary field that combines computer technology, mathematics, and several other fields. Recently, with the rapid development of machine learning (ML) and deep learning (DL), significant progress has been made in the field of AI. As one of the fastest-growing branches, DL can effectively extract features from big data and optimize the performance of various tasks. Moreover, with advancements in digital imaging technology, DL has become a key tool for processing high-dimensional medical image data and conducting medical image analysis in clinical applications. With the development of this technology, the diagnosis of orthopedic diseases has undergone significant changes. In this review, we describe recent research progress on DL in fracture diagnosis and discuss the value of DL in this field, providing a reference for better integration and development of DL technology in orthopedics.
Affiliation(s)
- Jia-Yao Zhang
  - Department of Orthopedics, Fuzhou University Affiliated Provincial Hospital, Fuzhou, 350013, China
  - Department of Orthopedics, Fujian Provincial Hospital, Fuzhou, 350013, China
  - Intelligent Medical Laboratory, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Jia-Ming Yang
  - Intelligent Medical Laboratory, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
  - Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Xin-Meng Wang
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Dali University, Dali, 671000, China
- Hong-Lin Wang
  - Intelligent Medical Laboratory, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
  - Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Hong Zhou
  - Intelligent Medical Laboratory, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
  - Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Zi-Neng Yan
  - Intelligent Medical Laboratory, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
  - Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Yi Xie
  - Intelligent Medical Laboratory, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
  - Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Peng-Ran Liu
  - Intelligent Medical Laboratory, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
  - Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Zhi-Wei Hao
  - School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China
- Zhe-Wei Ye
  - Intelligent Medical Laboratory, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
  - Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
8
Breu R, Avelar C, Bertalan Z, Grillari J, Redl H, Ljuhar R, Quadlbauer S, Hausner T. Artificial intelligence in traumatology. Bone Joint Res 2024; 13:588-595. [PMID: 39417424 PMCID: PMC11484119 DOI: 10.1302/2046-3758.1310.bjr-2023-0275.r3] [Indexed: 10/19/2024] Open access
Abstract
Aims The aim of this study was to create artificial intelligence (AI) software with the purpose of providing a second opinion to physicians to support distal radius fracture (DRF) detection, and to compare the accuracy of fracture detection of physicians with and without software support. Methods The dataset consisted of 26,121 anonymized anterior-posterior (AP) and lateral standard view radiographs of the wrist, with and without DRF. The convolutional neural network (CNN) model was trained to detect the presence of a DRF by comparing the radiographs containing a fracture to the inconspicuous ones. A total of 11 physicians (six surgeons in training and five hand surgeons) assessed 200 pairs of randomly selected digital radiographs of the wrist (AP and lateral) for the presence of a DRF. The same images were first evaluated without, and then with, the support of the CNN model, and the diagnostic accuracy of the two methods was compared. Results At the time of the study, the CNN model showed an area under the receiver operating characteristic curve of 0.97. AI assistance improved the physicians' sensitivity (correct fracture detection) from 80% to 87%, and the specificity (correct fracture exclusion) from 91% to 95%. The overall error rate (combined false positive and false negative) was reduced from 14% without AI to 9% with AI. Conclusion The use of a CNN model as a second opinion can improve the diagnostic accuracy of DRF detection in the study setting.
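The area under the receiver operating characteristic curve quoted in this abstract can be read as the probability that a randomly chosen fractured radiograph receives a higher model score than a randomly chosen non-fractured one. The sketch below computes the AUC directly from that pairwise definition, counting ties as half; the scores are hypothetical and the function name `roc_auc` is illustrative. The pairwise loop is quadratic and is meant for clarity rather than for large datasets.

```python
from typing import Sequence


def roc_auc(scores_positive: Sequence[float], scores_negative: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    if not scores_positive or not scores_negative:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical model scores for fractured and non-fractured radiographs.
fracture_scores = [0.90, 0.80, 0.60, 0.55]
normal_scores = [0.70, 0.40, 0.30, 0.10]

print(f"AUC = {roc_auc(fracture_scores, normal_scores):.3f}")
```

Of the 16 possible pairs, 14 are ranked correctly (only the normal image scored 0.70 outranks two fractured images), which gives 0.875.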
Affiliation(s)
- Rosmarie Breu
  - Orthopedic Hospital Vienna-Speising, Vienna, Austria
  - AUVA Trauma Hospital Lorenz Böhler, Vienna, Austria
  - Ludwig Boltzmann Institute for Traumatology, the Research Center in Cooperation with AUVA, Vienna, Austria
- Johannes Grillari
  - Ludwig Boltzmann Institute for Traumatology, the Research Center in Cooperation with AUVA, Vienna, Austria
  - Institute of Molecular Biotechnology, University of Natural Resources and Life Sciences, Vienna, Austria
  - Austrian Cluster for Tissue Regeneration, Vienna, Austria
- Heinz Redl
  - Ludwig Boltzmann Institute for Traumatology, the Research Center in Cooperation with AUVA, Vienna, Austria
  - Austrian Cluster for Tissue Regeneration, Vienna, Austria
- Richard Ljuhar
  - ImageBiopsy Lab, Vienna, Austria
  - Institute of Molecular Biotechnology, University of Natural Resources and Life Sciences, Vienna, Austria
- Thomas Hausner
  - AUVA Trauma Hospital Lorenz Böhler, Vienna, Austria
  - Ludwig Boltzmann Institute for Traumatology, the Research Center in Cooperation with AUVA, Vienna, Austria
  - Austrian Cluster for Tissue Regeneration, Vienna, Austria
  - Department for Orthopedic Surgery and Traumatology, Paracelsus Medical University, Salzburg, Austria
9
Drogt J, Milota M, van den Brink A, Jongsma K. Ethical guidance for reporting and evaluating claims of AI outperforming human doctors. NPJ Digit Med 2024; 7:271. [PMID: 39358556 PMCID: PMC11447248 DOI: 10.1038/s41746-024-01255-w] [Received: 05/27/2024] [Accepted: 09/08/2024] [Indexed: 10/04/2024] Open access
Affiliation(s)
- Megan Milota
  - University Medical Center, Utrecht, The Netherlands
10
Nguyen HH, Le DT, Shore-Lorenti C, Chen C, Schilcher J, Eklund A, Zebaze R, Milat F, Sztal-Mazer S, Girgis CM, Clifton-Bligh R, Cai J, Ebeling PR. AFFnet - a deep convolutional neural network for the detection of atypical femur fractures from anteriorposterior radiographs. Bone 2024; 187:117215. [PMID: 39074569 DOI: 10.1016/j.bone.2024.117215] [Received: 05/16/2024] [Revised: 07/14/2024] [Accepted: 07/26/2024] [Indexed: 07/31/2024]
Abstract
Despite well-defined criteria for radiographic diagnosis of atypical femur fractures (AFFs), missed and delayed diagnoses are common. AFF diagnostic software could provide timely AFF detection to prevent progression of incomplete AFFs or development of contralateral AFFs. In this study, we investigated the ability of an artificial intelligence (AI)-based application, using deep learning models (DLMs), particularly convolutional neural networks (CNNs), to detect AFFs from femoral radiographs. A labelled Australian dataset of pre-operative complete AFF (cAFF), incomplete AFF (iAFF), typical femoral shaft fracture (TFF), and non-fractured femoral (NFF) X-ray images in anterior-posterior view was used for training (N = 213, 49, 394, 1359, respectively). An AFFnet model was developed using a pretrained (ImageNet dataset) ResNet-50 backbone, and a novel Box Attention Guide (BAG) module to guide the model's scanning patterns to enhance its learning. All images were used to train and internally test the model using a 5-fold cross validation approach, and the model was further validated on an external dataset. External validation of the model's performance was conducted on a Swedish dataset comprising 733 TFF and 290 AFF images. Precision, sensitivity, specificity, F1-score and AUC were measured and compared between AFFnet and a global approach with ResNet-50. Excellent diagnostic performance was recorded in both models (all AUC >0.97); however, AFFnet recorded a lower number of prediction errors, and improved sensitivity, F1-score and precision compared to ResNet-50 in both internal and external testing. Sensitivity in the detection of iAFF was higher for AFFnet than ResNet-50 (82% vs 56%). In conclusion, AFFnet achieved excellent diagnostic performance on internal and external validation, which was superior to a pre-existing model. Accurate AI-based AFF diagnostic software has the potential to improve AFF diagnosis, reduce radiologist error, and allow urgent intervention, thus improving patient outcomes.
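As an illustration of the metrics named in this abstract (precision, sensitivity, specificity, F1-score), the sketch below derives them from binary confusion-matrix counts. It is not the authors' evaluation code: the `BinaryCounts` container, the `binary_metrics` function, and the example counts are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    """Hypothetical container for binary confusion-matrix counts."""
    tp: int  # fractures correctly flagged
    fp: int  # non-fractures incorrectly flagged
    tn: int  # non-fractures correctly passed
    fn: int  # fractures missed


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a class is absent.
    return numerator / denominator if denominator else float("nan")


def binary_metrics(c):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),   # recall / true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),   # true negative rate
        "precision": _ratio(c.tp, c.tp + c.fp),     # positive predictive value
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Illustrative counts only; they do not come from the study.
example = binary_metrics(BinaryCounts(tp=8, fp=1, tn=9, fn=2))
```

With imbalanced classes (for example, few iAFF images), sensitivity and F1 tend to be more informative than accuracy, which may be why the abstract reports them per class.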
Affiliation(s)
- Hanh H Nguyen
  - Department of Medicine, School of Clinical Sciences, Monash University, Victoria, Australia; Department of Endocrinology, Monash Health, Victoria, Australia; Department of Endocrinology and Diabetes, Western Health, Victoria, Australia; Department of Medicine, The University of Melbourne, Victoria, Australia.
- Duy Tho Le
  - Department of Medicine, School of Clinical Sciences, Monash University, Victoria, Australia; Department of Information Technology, Monash University, Victoria, Australia
- Cat Shore-Lorenti
  - Department of Medicine, School of Clinical Sciences, Monash University, Victoria, Australia
- Colin Chen
  - Department of Medicine, School of Clinical Sciences, Monash University, Victoria, Australia
- Jorg Schilcher
  - Department of Biomedical and Clinical Sciences, and the Wallenberg Centre for Molecular Medicine, Linköping University, Linköping, Sweden
- Anders Eklund
  - Department of Biomedical Engineering, Linköping University, Linköping, Sweden
- Roger Zebaze
  - Department of Medicine, School of Clinical Sciences, Monash University, Victoria, Australia
- Frances Milat
  - Department of Medicine, School of Clinical Sciences, Monash University, Victoria, Australia; Department of Endocrinology, Monash Health, Victoria, Australia
- Shoshana Sztal-Mazer
  - Department of Endocrinology and Diabetes, Alfred Health, Victoria, Australia; Department of Public Health and Preventative Medicine, Monash University, Melbourne, Australia
- Christian M Girgis
  - Department of Endocrinology, Royal North Shore Hospital, New South Wales, Australia; Department of Diabetes and Endocrinology, Westmead Hospital, New South Wales, Australia; Faculty of Medicine and Health, The University of Sydney, New South Wales, Australia
- Roderick Clifton-Bligh
  - Department of Endocrinology, Royal North Shore Hospital, New South Wales, Australia; Department of Diabetes and Endocrinology, Westmead Hospital, New South Wales, Australia
- Jianfei Cai
  - Department of Information Technology, Monash University, Victoria, Australia
- Peter R Ebeling
  - Department of Medicine, School of Clinical Sciences, Monash University, Victoria, Australia; Department of Endocrinology, Monash Health, Victoria, Australia
|
11
|
Lee SH, Jeon J, Lee GJ, Park JY, Kim YJ, Kim KG. Automated Association for Osteosynthesis Foundation and Orthopedic Trauma Association classification of pelvic fractures on pelvic radiographs using deep learning. Sci Rep 2024; 14:20548. [PMID: 39232189 PMCID: PMC11374898 DOI: 10.1038/s41598-024-71654-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 08/29/2024] [Indexed: 09/06/2024] Open
Abstract
High-energy impacts, like vehicle crashes or falls, can lead to pelvic ring injuries. Rapid diagnosis and treatment are crucial due to the risks of severe bleeding and organ damage. Pelvic radiography promptly assesses fracture extent and location, but struggles to diagnose bleeding. The AO/OTA classification system grades pelvic instability, but its complexity limits its use in emergency settings. This study develops and evaluates a deep learning algorithm to classify pelvic fractures on radiographs per the AO/OTA system. Pelvic radiographs of 773 patients with pelvic fractures and 167 patients without pelvic fractures were retrospectively analyzed at a single center. Pelvic fractures were classified into types A, B, and C using medical records categorized by an orthopedic surgeon according to the AO/OTA classification system. Accuracy, Dice Similarity Coefficient (DSC), and F1 score were measured to evaluate the diagnostic performance of the deep learning algorithms. The segmentation model showed high performance with 0.98 accuracy and 0.96-0.97 DSC. The AO/OTA classification model demonstrated effective performance with a 0.47-0.80 F1 score and 0.69-0.88 accuracy. Additionally, the classification model had a macro average of 0.77-0.94. Performance evaluation of the models showed relatively favorable results, which can aid in early classification of pelvic fractures.
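For readers unfamiliar with the Dice Similarity Coefficient (DSC) reported here, a minimal sketch follows. It assumes binary masks stored as equally shaped 2D lists of 0/1; the function name `dice_coefficient` and the toy masks are hypothetical, and returning 1.0 for two empty masks is one common convention, not something stated in the abstract.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks (2D lists of 0/1).

    DSC = 2 * |pred AND truth| / (|pred| + |truth|)
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")

    intersection = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            pred_area += 1 if p else 0
            truth_area += 1 if t else 0

    total = pred_area + truth_area
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy masks for illustration only.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, reference)  # 2 * 2 / (3 + 3)
```

A DSC of 1 means the predicted and reference masks coincide exactly; 0 means they do not overlap at all.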
Affiliation(s)
- Seung Hwan Lee
  - Department of Trauma Surgery, Gachon University Gil Medical Center, Incheon, Republic of Korea.
  - Department of Traumatology, Gachon University College of Medicine, 38-13, Dokjeom-ro 3beon-gil, Namdong-gu, Incheon, 21565, Republic of Korea.
- Jisu Jeon
  - Department of Health Science and Technology, Gachon Advanced Institute for Health Science and Technology (GAIHST), Lee Gil Ya Cancer and Diabetes Institute, Gachon University, Incheon, Republic of Korea
- Gil Jae Lee
  - Department of Trauma Surgery, Gachon University Gil Medical Center, Incheon, Republic of Korea
  - Department of Traumatology, Gachon University College of Medicine, 38-13, Dokjeom-ro 3beon-gil, Namdong-gu, Incheon, 21565, Republic of Korea
- Jun Young Park
  - Department of Health Science and Technology, Gachon Advanced Institute for Health Science and Technology (GAIHST), Lee Gil Ya Cancer and Diabetes Institute, Gachon University, Incheon, Republic of Korea
- Young Jae Kim
  - Department of Health Science and Technology, Gachon Advanced Institute for Health Science and Technology (GAIHST), Lee Gil Ya Cancer and Diabetes Institute, Gachon University, Incheon, Republic of Korea
  - Medical Devices R&D Center, Gachon University Gil Medical Center, Incheon, Republic of Korea
  - Department of Biomedical Engineering, Pre-medical Course, Gil Medical Center, College of Medicine, Gachon University, 38-13, Dokjeom-ro 3beon-gil, Namdong-gu, Incheon, 21565, Republic of Korea
- Kwang Gi Kim
  - Department of Health Science and Technology, Gachon Advanced Institute for Health Science and Technology (GAIHST), Lee Gil Ya Cancer and Diabetes Institute, Gachon University, Incheon, Republic of Korea.
  - Medical Devices R&D Center, Gachon University Gil Medical Center, Incheon, Republic of Korea.
  - Department of Biomedical Engineering, Pre-medical Course, Gil Medical Center, College of Medicine, Gachon University, 38-13, Dokjeom-ro 3beon-gil, Namdong-gu, Incheon, 21565, Republic of Korea.
|
12
|
Alzubaidi L, Al-Dulaimi K, Salhi A, Alammar Z, Fadhel MA, Albahri AS, Alamoodi AH, Albahri OS, Hasan AF, Bai J, Gilliland L, Peng J, Branni M, Shuker T, Cutbush K, Santamaría J, Moreira C, Ouyang C, Duan Y, Manoufali M, Jomaa M, Gupta A, Abbosh A, Gu Y. Comprehensive review of deep learning in orthopaedics: Applications, challenges, trustworthiness, and fusion. Artif Intell Med 2024; 155:102935. [PMID: 39079201 DOI: 10.1016/j.artmed.2024.102935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 03/18/2024] [Accepted: 07/22/2024] [Indexed: 08/24/2024]
Abstract
Deep learning (DL) in orthopaedics has gained significant attention in recent years. Previous studies have shown that DL can be applied to a wide variety of orthopaedic tasks, including fracture detection, bone tumour diagnosis, implant recognition, and evaluation of osteoarthritis severity. The utilisation of DL is expected to increase, owing to its ability to present accurate diagnoses more efficiently than traditional methods in many scenarios. This reduces the time and cost of diagnosis for patients and orthopaedic surgeons. To our knowledge, no exclusive study has comprehensively reviewed all aspects of DL currently used in orthopaedic practice. This review addresses this knowledge gap using articles from Science Direct, Scopus, IEEE Xplore, and Web of Science between 2017 and 2023. The authors begin with the motivation for using DL in orthopaedics, including its ability to enhance diagnosis and treatment planning. The review then covers various applications of DL in orthopaedics, including fracture detection, detection of supraspinatus tears using MRI, osteoarthritis, prediction of types of arthroplasty implants, bone age assessment, and detection of joint-specific soft tissue disease. We also examine the challenges for implementing DL in orthopaedics, including the scarcity of data to train DL and the lack of interpretability, as well as possible solutions to these common pitfalls. Our work highlights the requirements to achieve trustworthiness in the outcomes generated by DL, including the need for accuracy, explainability, and fairness in the DL models. We pay particular attention to fusion techniques as one of the ways to increase trustworthiness, which have also been used to address the common multimodality in orthopaedics. Finally, we have reviewed the approval requirements set forth by the US Food and Drug Administration to enable the use of DL applications. 
As such, we aim to have this review function as a guide for researchers to develop a reliable DL application for orthopaedic tasks from scratch for use in the market.
Affiliation(s)
- Laith Alzubaidi
  - School of Mechanical, Medical, and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia; QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia; Research and Development department, Akunah Med Technology Pty Ltd Co, Brisbane, QLD 4120, Australia.
- Khamael Al-Dulaimi
  - Computer Science Department, College of Science, Al-Nahrain University, Baghdad, Baghdad 10011, Iraq; School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Asma Salhi
  - QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia; Research and Development department, Akunah Med Technology Pty Ltd Co, Brisbane, QLD 4120, Australia
- Zaenab Alammar
  - School of Computer Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Mohammed A Fadhel
  - Research and Development department, Akunah Med Technology Pty Ltd Co, Brisbane, QLD 4120, Australia
- A S Albahri
  - Technical College, Imam Ja'afar Al-Sadiq University, Baghdad, Iraq
- A H Alamoodi
  - Institute of Informatics and Computing in Energy, Universiti Tenaga Nasional, Kajang 43000, Malaysia
- O S Albahri
  - Australian Technical and Management College, Melbourne, Australia
- Amjad F Hasan
  - Faculty of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Jinshuai Bai
  - School of Mechanical, Medical, and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia; QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Luke Gilliland
  - QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia; Research and Development department, Akunah Med Technology Pty Ltd Co, Brisbane, QLD 4120, Australia
- Jing Peng
  - Research and Development department, Akunah Med Technology Pty Ltd Co, Brisbane, QLD 4120, Australia
- Marco Branni
  - QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia; Research and Development department, Akunah Med Technology Pty Ltd Co, Brisbane, QLD 4120, Australia
- Tristan Shuker
  - QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia; St Andrew's War Memorial Hospital, Brisbane, QLD 4000, Australia
- Kenneth Cutbush
  - QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia; St Andrew's War Memorial Hospital, Brisbane, QLD 4000, Australia
- Jose Santamaría
  - Department of Computer Science, University of Jaén, Jaén 23071, Spain
- Catarina Moreira
  - Data Science Institute, University of Technology Sydney, Australia
- Chun Ouyang
  - School of Information Systems, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Ye Duan
  - School of Computing, Clemson University, Clemson, 29631, SC, USA
- Mohamed Manoufali
  - CSIRO, Kensington, WA 6151, Australia; School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD 4067, Australia
- Mohammad Jomaa
  - QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia; St Andrew's War Memorial Hospital, Brisbane, QLD 4000, Australia
- Ashish Gupta
  - School of Mechanical, Medical, and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia; QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia; Research and Development department, Akunah Med Technology Pty Ltd Co, Brisbane, QLD 4120, Australia
- Amin Abbosh
  - School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD 4067, Australia
- Yuantong Gu
  - School of Mechanical, Medical, and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia; QUASR/ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia
|
13
|
Ruitenbeek HC, Oei EHG, Visser JJ, Kijowski R. Artificial intelligence in musculoskeletal imaging: realistic clinical applications in the next decade. Skeletal Radiol 2024; 53:1849-1868. [PMID: 38902420 DOI: 10.1007/s00256-024-04684-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/06/2024] [Accepted: 04/15/2024] [Indexed: 06/22/2024]
Abstract
This article will provide a perspective review of the most extensively investigated deep learning (DL) applications for musculoskeletal disease detection that have the best potential to translate into routine clinical practice over the next decade. Deep learning methods for detecting fractures, estimating pediatric bone age, calculating bone measurements such as lower extremity alignment and Cobb angle, and grading osteoarthritis on radiographs have been shown to have high diagnostic performance with many of these applications now commercially available for use in clinical practice. Many studies have also documented the feasibility of using DL methods for detecting joint pathology and characterizing bone tumors on magnetic resonance imaging (MRI). However, musculoskeletal disease detection on MRI is difficult as it requires multi-task, multi-class detection of complex abnormalities on multiple image slices with different tissue contrasts. The generalizability of DL methods for musculoskeletal disease detection on MRI is also challenging due to fluctuations in image quality caused by the wide variety of scanners and pulse sequences used in routine MRI protocols. The diagnostic performance of current DL methods for musculoskeletal disease detection must be further evaluated in well-designed prospective studies using large image datasets acquired at different institutions with different imaging parameters and imaging hardware before they can be fully implemented in clinical practice. Future studies must also investigate the true clinical benefits of current DL methods and determine whether they could enhance quality, reduce error rates, improve workflow, and decrease radiologist fatigue and burnout with all of this weighed against the costs.
Affiliation(s)
- Huibert C Ruitenbeek
  - Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Edwin H G Oei
  - Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Jacob J Visser
  - Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Richard Kijowski
  - Department of Radiology, New York University Grossman School of Medicine, 660 First Avenue, 3rd Floor, New York, NY, 10016, USA.
|
14
|
Kutbi M. Artificial Intelligence-Based Applications for Bone Fracture Detection Using Medical Images: A Systematic Review. Diagnostics (Basel) 2024; 14:1879. [PMID: 39272664 PMCID: PMC11394268 DOI: 10.3390/diagnostics14171879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Revised: 08/19/2024] [Accepted: 08/26/2024] [Indexed: 09/15/2024] Open
Abstract
Artificial intelligence (AI) is making notable advancements in the medical field, particularly in bone fracture detection. This systematic review compiles and assesses existing research on AI applications aimed at identifying bone fractures through medical imaging, encompassing studies from 2010 to 2023. It evaluates the performance of various AI models, such as convolutional neural networks (CNNs), in diagnosing bone fractures, highlighting their superior accuracy, sensitivity, and specificity compared to traditional diagnostic methods. Furthermore, the review explores the integration of advanced imaging techniques like 3D CT and MRI with AI algorithms, which has led to enhanced diagnostic accuracy and improved patient outcomes. The potential of Generative AI and Large Language Models (LLMs), such as OpenAI's GPT, to enhance diagnostic processes through synthetic data generation, comprehensive report creation, and clinical scenario simulation is also discussed. The review underscores the transformative impact of AI on diagnostic workflows and patient care, while also identifying research gaps and suggesting future research directions to enhance data quality, model robustness, and ethical considerations.
Affiliation(s)
- Mohammed Kutbi
  - College of Computing and Informatics, Saudi Electronic University, Riyadh 13316, Saudi Arabia
|
15
|
Nowroozi A, Salehi MA, Shobeiri P, Agahi S, Momtazmanesh S, Kaviani P, Kalra MK. Artificial intelligence diagnostic accuracy in fracture detection from plain radiographs and comparing it with clinicians: a systematic review and meta-analysis. Clin Radiol 2024; 79:579-588. [PMID: 38772766 DOI: 10.1016/j.crad.2024.04.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/09/2024] [Accepted: 04/15/2024] [Indexed: 05/23/2024]
Abstract
PURPOSE Fracture detection is one of the most commonly used and studied aspects of artificial intelligence (AI) in medicine. In this systematic review and meta-analysis, we aimed to summarize available literature and data regarding AI performance in fracture detection on plain radiographs and various factors affecting it. METHODS We systematically reviewed studies evaluating AI algorithms in detecting bone fractures in plain radiographs, combined their performance using meta-analysis (a bivariate regression approach), and compared it with that of clinicians. We also analyzed the factors potentially affecting algorithm performance using meta-regression. RESULTS Our analysis included 100 studies. In 83 studies with confusion matrices, AI algorithms showed a sensitivity of 91.43% and a specificity of 92.12% (Area under the summary receiver operator curve = 0.968). After adjustment and false discovery rate correction, tibia/fibula (excluding ankle) fractures were associated with higher (7.0%, p=0.004) AI sensitivity, while more recent publications (5.5%, p=0.003) and Xception architecture (6.6%, p<0.001) were associated with higher specificity. Clinicians and AI showed similar specificity in fracture identification, although AI leaned to higher sensitivity (7.6%, p=0.07). Radiologists, on the other hand, were more specific than AI overall and in several subgroups, and more sensitive to hip fractures before FDR correction. CONCLUSIONS Currently available AI aids could result in a significant improvement in care where radiologists are not readily available. Moreover, identifying factors affecting algorithm performance could guide AI development teams in their process of optimizing their products.
Affiliation(s)
- A Nowroozi
  - School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- M A Salehi
  - School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- P Shobeiri
  - School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- S Agahi
  - School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- S Momtazmanesh
  - School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- P Kaviani
  - Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- M K Kalra
  - Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA.
|
16
|
Rakhshankhah N, Abbaszadeh M, Kazemi A, Rezaei SS, Roozpeykar S, Arabfard M. Deep learning approach to femoral AVN detection in digital radiography: differentiating patients and pre-collapse stages. BMC Musculoskelet Disord 2024; 25:547. [PMID: 39010001 PMCID: PMC11251364 DOI: 10.1186/s12891-024-07669-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 07/08/2024] [Indexed: 07/17/2024] Open
Abstract
OBJECTIVE This study aimed to evaluate a new deep-learning model for diagnosing avascular necrosis of the femoral head (AVNFH) by analyzing pelvic anteroposterior digital radiography. METHODS The study sample included 1167 hips. The radiographs were independently classified into 6 stages by a radiologist using their simultaneous MRIs. The radiographs were then used to train and test the deep learning models of the project, including SVM and ANFIS layer, using the Python programming language and the TensorFlow library. In the last step, the test set of hip radiographs was provided to two independent radiologists with different work experiences to compare their diagnostic performance with that of the deep learning models using the F1 score and McNemar test analysis. RESULTS The performance of SVM for AVNFH detection (AUC = 82.88%) was slightly higher than that of less experienced radiologists (79.68%) and slightly lower than that of experienced radiologists (88.4%), without reaching significance (p-value > 0.05). SVM for pre-collapse AVNFH detection, with an AUC of 73.58%, showed significantly higher performance than less experienced radiologists (AUC = 60.70%, p-value < 0.001). On the other hand, no significant difference was noted between experienced radiologists and SVM for pre-collapse detection. The ANFIS algorithm for AVNFH detection, with an AUC of 86.60%, showed significantly higher performance than less experienced radiologists (AUC = 79.68%, p-value = 0.04). Its performance was lower than that of experienced radiologists, but the difference was not statistically significant (AUC = 88.40%, p-value = 0.20). CONCLUSIONS Our study has shed light on the remarkable capabilities of SVM and ANFIS as diagnostic tools for AVNFH detection in radiography. Their ability to achieve high accuracy with remarkable efficiency makes them promising candidates for early detection and intervention, ultimately contributing to improved patient outcomes.
Affiliation(s)
- Nima Rakhshankhah
  - Department of Radiology and Health Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Mahdi Abbaszadeh
  - Department of Orthopedic Surgery, Faculty of Medicine, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Atefeh Kazemi
  - Department of Radiology and Health Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Soroush Soltan Rezaei
  - Student Research Committee, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Saeid Roozpeykar
  - Department of Radiology and Health Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran.
- Masoud Arabfard
  - Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran.
|
17
|
Akbarian E, Mohammadi M, Tiala E, Ljungberg O, Sharif Razavian A, Magnéli M, Gordon M. Development and validation of an artificial intelligence model for the classification of hip fractures using the AO-OTA framework. Acta Orthop 2024; 95:340-347. [PMID: 38888052 PMCID: PMC11184710 DOI: 10.2340/17453674.2024.40949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 05/15/2024] [Indexed: 06/20/2024] Open
Abstract
BACKGROUND AND PURPOSE Artificial intelligence (AI) has the potential to aid in the accurate diagnosis of hip fractures and reduce the workload of clinicians. We primarily aimed to develop and validate a convolutional neural network (CNN) for the automated classification of hip fractures based on the 2018 AO-OTA classification system. The secondary aim was to incorporate the model's assessment of additional radiographic findings that often accompany such injuries. METHODS 6,361 plain radiographs of the hip taken between 2002 and 2016 at Danderyd University Hospital were used to train the CNN. A separate set of 343 radiographs representing 324 unique patients was used to test the performance of the network. Performance was evaluated using area under the curve (AUC), sensitivity, specificity, and Youden's index. RESULTS The CNN demonstrated high performance in identifying and classifying hip fracture, with AUCs ranging from 0.76 to 0.99 for different fracture categories. The AUC for hip fractures ranged from 0.86 to 0.99, for distal femur fractures from 0.76 to 0.99, and for pelvic fractures from 0.91 to 0.94. For 29 of 39 fracture categories, the AUC was ≥ 0.95. CONCLUSION We found that AI has the potential for accurate and automated classification of hip fractures based on the AO-OTA classification system. Further training and modification of the CNN may enable its use in clinical settings.
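Youden's index, listed among the evaluation measures above, is defined as J = sensitivity + specificity - 1. The sketch below shows the definition and one common use, picking the score threshold that maximizes J. It is an illustration under stated assumptions, not the authors' procedure: the function names `youden_index` and `best_threshold` and the toy scores are hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: 0 for a chance-level test, 1 for a perfect one."""
    return sensitivity + specificity - 1.0


def best_threshold(scores, labels):
    """Return (threshold, J) maximizing Youden's index.

    A case is predicted positive when its score is >= threshold.
    `labels` must contain both classes, encoded as 1 (fracture) and 0 (none).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are required")

    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        j = youden_index(tp / positives, tn / negatives)
        # Strict comparison keeps the lowest threshold among ties.
        if best is None or j > best[1]:
            best = (thr, j)
    return best


# Toy scores for illustration only.
threshold, j_value = best_threshold([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

Because J weights sensitivity and specificity equally, a threshold chosen this way may not suit settings where a missed fracture is costlier than a false alarm.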
Affiliation(s)
- Ehsan Akbarian
  - Department of Clinical Sciences, Karolinska Institutet, Danderyd University Hospital, Stockholm, Sweden.
- Mehrgan Mohammadi
  - Department of Clinical Sciences, Karolinska Institutet, Danderyd University Hospital, Stockholm, Sweden
- Emilia Tiala
  - Department of Clinical Sciences, Karolinska Institutet, Danderyd University Hospital, Stockholm, Sweden
- Oscar Ljungberg
  - Department of Clinical Sciences, Karolinska Institutet, Danderyd University Hospital, Stockholm, Sweden
- Ali Sharif Razavian
  - Department of Clinical Sciences, Karolinska Institutet, Danderyd University Hospital, Stockholm, Sweden
- Martin Magnéli
  - Department of Clinical Sciences, Karolinska Institutet, Danderyd University Hospital, Stockholm, Sweden
- Max Gordon
  - Department of Clinical Sciences, Karolinska Institutet, Danderyd University Hospital, Stockholm, Sweden
|
18
|
Lee JM, Park JY, Kim YJ, Kim KG. Deep-learning-based pelvic automatic segmentation in pelvic fractures. Sci Rep 2024; 14:12258. [PMID: 38806582 PMCID: PMC11133416 DOI: 10.1038/s41598-024-63093-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 05/24/2024] [Indexed: 05/30/2024] Open
Abstract
With the recent increase in traffic accidents, pelvic fractures are increasing; they are second only to skull fractures in terms of mortality and risk of complications. Research is actively being conducted on the treatment of intra-abdominal bleeding, the primary cause of death related to pelvic fractures. Considerable preliminary research has also been performed on segmenting tumors and organs. However, studies on clinically useful algorithms for bone and pelvic segmentation, based on developed models, are limited. In this study, we explored the potential of deep-learning models presented in previous studies to accurately segment pelvic regions in X-ray images. Data were collected from X-ray images of 940 patients aged 18 or older at Gachon University Gil Hospital from January 2015 to December 2022. To segment the pelvis, Attention U-Net, Swin U-Net, and U-Net were trained, and their results were compared and analyzed using five-fold cross-validation. The Swin U-Net model displayed relatively high performance compared to the Attention U-Net and U-Net models, achieving an average sensitivity, specificity, accuracy, and dice similarity coefficient of 96.77%, 98.50%, 98.03%, and 96.32%, respectively.
Affiliation(s)
- Jung Min Lee
  - Department of Computer Engineering, College of IT Convergence, Gachon University, Seongnam, Republic of Korea
- Jun Young Park
  - Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences and Technology, Gachon University, Incheon, Republic of Korea
- Young Jae Kim
  - Department of Computer Engineering, College of IT Convergence, Gachon University, Seongnam, Republic of Korea
  - Department of Biomedical Engineering, College of Medicine, Gachon University, Incheon, Republic of Korea
  - Medical Device R&D Center, Gachon University Gil Hospital, Incheon, Republic of Korea
- Kwang Gi Kim
  - Department of Computer Engineering, College of IT Convergence, Gachon University, Seongnam, Republic of Korea.
  - Department of Biomedical Engineering, College of Medicine, Gachon University, Incheon, Republic of Korea.
  - Medical Device R&D Center, Gachon University Gil Hospital, Incheon, Republic of Korea.
  - Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences and Technology, Gachon University, Incheon, Republic of Korea.
|
19
|
Wilhelm NJ, von Schacky CE, Lindner FJ, Feucht MJ, Ehmann Y, Pogorzelski J, Haddadin S, Neumann J, Hinterwimmer F, von Eisenhart-Rothe R, Jung M, Russe MF, Izadpanah K, Siebenlist S, Burgkart R, Rupp MC. Multicentric development and validation of a multi-scale and multi-task deep learning model for comprehensive lower extremity alignment analysis. Artif Intell Med 2024; 150:102843. [PMID: 38553152 DOI: 10.1016/j.artmed.2024.102843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 03/11/2024] [Accepted: 03/11/2024] [Indexed: 04/02/2024]
Abstract
Osteoarthritis of the knee, a widespread cause of knee disability, is commonly treated in orthopedics due to its rising prevalence. Lower extremity misalignment, pivotal in knee injury etiology and management, necessitates comprehensive mechanical alignment evaluation via frequently-requested weight-bearing long leg radiographs (LLR). Despite LLR's routine use, current analysis techniques are error-prone and time-consuming. To address this, we conducted a multicentric study to develop and validate a deep learning (DL) model for fully automated leg alignment assessment on anterior-posterior LLR, targeting enhanced reliability and efficiency. The DL model, developed using 594 patients' LLR and a 60%/10%/30% data split for training, validation, and testing, executed alignment analyses via a multi-step process, employing a detection network and nine specialized networks. It was designed to assess all vital anatomical and mechanical parameters for standard clinical leg deformity analysis and preoperative planning. Accuracy, reliability, and assessment duration were compared with three specialized orthopedic surgeons across two distinct institutional datasets (136 and 143 radiographs). The algorithm exhibited equivalent performance to the surgeons in terms of alignment accuracy (DL: 0.21 ± 0.18° to 1.06 ± 1.3° vs. OS: 0.21 ± 0.16° to 1.72 ± 1.96°), interrater reliability (ICC DL: 0.90 ± 0.05 to 1.0 ± 0.0 vs. ICC OS: 0.90 ± 0.03 to 1.0 ± 0.0), and clinically acceptable accuracy (DL: 53.9%-100% vs. OS: 30.8%-100%). Further, automated analysis significantly reduced analysis time compared to manual annotation (DL: 22 ± 0.6 s vs. OS: 101.7 ± 7 s, p ≤ 0.01). By demonstrating that our algorithm not only matches the precision of expert surgeons but also significantly outpaces them in both speed and consistency of measurements, our research underscores a pivotal advancement in harnessing AI to enhance clinical efficiency and decision-making in orthopaedics.
Affiliation(s)
- Nikolas J Wilhelm
  - Department of Orthopedics and Sports Orthopedics, Klinikum rechts der Isar, School of Medicine, Munich, Germany; Munich Institute of Robotics and Machine Intelligence, Department of Electrical and Computer Engineering, Technical University of Munich, Munich, Germany
- Claudio E von Schacky
  - Department of Radiology, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Felix J Lindner
  - Department of Orthopedic Sports Medicine, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Matthias J Feucht
  - Department of Orthopedics and Trauma Surgery, Medical Center, Faculty of Medicine, Albert-Ludwigs-University of Freiburg, Freiburg, Germany; Orthopedic Clinic Paulinenhilfe, Diakonie-Hospital, Stuttgart, Germany
- Yannick Ehmann
  - Department of Orthopedic Sports Medicine, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Jonas Pogorzelski
  - Department of Orthopedic Sports Medicine, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Sami Haddadin
  - Munich Institute of Robotics and Machine Intelligence, Department of Electrical and Computer Engineering, Technical University of Munich, Munich, Germany
- Jan Neumann
  - Department of Radiology, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Florian Hinterwimmer
  - Department of Orthopedics and Sports Orthopedics, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Rüdiger von Eisenhart-Rothe
  - Department of Orthopedics and Sports Orthopedics, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Matthias Jung
  - Department of Radiology, Medical Center, Faculty of Medicine, Albert-Ludwigs-University of Freiburg, Freiburg, Germany
- Maximilian F Russe
  - Department of Radiology, Medical Center, Faculty of Medicine, Albert-Ludwigs-University of Freiburg, Freiburg, Germany
- Kaywan Izadpanah
  - Department of Radiology, Medical Center, Faculty of Medicine, Albert-Ludwigs-University of Freiburg, Freiburg, Germany
- Sebastian Siebenlist
  - Department of Orthopedic Sports Medicine, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Rainer Burgkart
  - Department of Orthopedics and Sports Orthopedics, Klinikum rechts der Isar, School of Medicine, Munich, Germany
- Marco-Christopher Rupp
  - Department of Orthopedic Sports Medicine, Klinikum rechts der Isar, School of Medicine, Munich, Germany
|
20
|
Tieu A, Kroen E, Kadish Y, Liu Z, Patel N, Zhou A, Yilmaz A, Lee S, Deyer T. The Role of Artificial Intelligence in the Identification and Evaluation of Bone Fractures. Bioengineering (Basel) 2024; 11:338. [PMID: 38671760 PMCID: PMC11047896 DOI: 10.3390/bioengineering11040338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 03/23/2024] [Accepted: 03/26/2024] [Indexed: 04/28/2024] Open
Abstract
Artificial intelligence (AI), particularly deep learning, has made enormous strides in medical imaging analysis. In the field of musculoskeletal radiology, deep-learning models are actively being developed for the identification and evaluation of bone fractures. These methods provide numerous benefits to radiologists such as increased diagnostic accuracy and efficiency while also achieving standalone performances comparable or superior to clinician readers. Various algorithms are already commercially available for integration into clinical workflows, with the potential to improve healthcare delivery and shape the future practice of radiology. In this systematic review, we explore the performance of current AI methods in the identification and evaluation of fractures, particularly those in the ankle, wrist, hip, and ribs. We also discuss current commercially available products for fracture detection and provide an overview of the current limitations of this technology and future directions of the field.
Affiliation(s)
- Andrew Tieu
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Ezriel Kroen
  - New York Medical College, Valhalla, NY 10595, USA
- Zelong Liu
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Nikhil Patel
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Alexander Zhou
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Timothy Deyer
  - East River Medical Imaging, New York, NY 10021, USA
  - Department of Radiology, Cornell Medicine, New York, NY 10021, USA
|
21
|
Patel R, Mcconaghie G, Webb J, Laing G, Philpott M, Roach R, Wagner W, Rhee SJ, Banerjee R. Five historical innovations that have shaped modern orthopaedic surgery. J Perioper Pract 2024; 34:84-92. [PMID: 37596805 DOI: 10.1177/17504589231179302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/20/2023]
Abstract
Throughout history, many innovations have contributed to the development of modern orthopaedic surgery, improving patient outcomes and expanding the range of treatment options available to patients. This article explores five key historical innovations that have shaped modern orthopaedic surgery: X-ray imaging, bone cement, the Thomas splint, the pneumatic tourniquet, and robotic-assisted surgery. We review the development, impact, and significance of each innovation, highlighting its contribution to the field of orthopaedic surgery and its ongoing relevance in contemporary and perioperative practice.
Affiliation(s)
- Ravi Patel
  - Department of Trauma and Orthopaedics, The Princess Royal Hospital, The Shrewsbury and Telford Trust, Telford, UK
  - Department of Trauma and Orthopaedics, The Robert Jones and Agnes Hunt Orthopaedic Hospital, Oswestry, UK
- Greg Mcconaghie
  - Department of Trauma and Orthopaedics, The Robert Jones and Agnes Hunt Orthopaedic Hospital, Oswestry, UK
- Jeremy Webb
  - Department of Trauma and Orthopaedics, The Robert Jones and Agnes Hunt Orthopaedic Hospital, Oswestry, UK
- Georgina Laing
  - Department of Trauma and Orthopaedics, The Robert Jones and Agnes Hunt Orthopaedic Hospital, Oswestry, UK
- Matthew Philpott
  - Department of Trauma and Orthopaedics, The Robert Jones and Agnes Hunt Orthopaedic Hospital, Oswestry, UK
- Richard Roach
  - Department of Trauma and Orthopaedics, The Princess Royal Hospital, The Shrewsbury and Telford Trust, Telford, UK
- Wilhelm Wagner
  - Department of Trauma and Orthopaedics, The Princess Royal Hospital, The Shrewsbury and Telford Trust, Telford, UK
- Shin-Jae Rhee
  - Department of Trauma and Orthopaedics, The Princess Royal Hospital, The Shrewsbury and Telford Trust, Telford, UK
- Robin Banerjee
  - Department of Trauma and Orthopaedics, The Robert Jones and Agnes Hunt Orthopaedic Hospital, Oswestry, UK
|
22
|
Zhang Z, Ke C, Zhang Z, Chen Y, Weng H, Dong J, Hao M, Liu B, Zheng M, Li J, Ding S, Dong Y, Peng Z. Re-tear after arthroscopic rotator cuff repair can be predicted using deep learning algorithm. Front Artif Intell 2024; 7:1331853. [PMID: 38487743 PMCID: PMC10938848 DOI: 10.3389/frai.2024.1331853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 02/12/2024] [Indexed: 03/17/2024] Open
Abstract
The application of artificial intelligence technology in the medical field has become increasingly prevalent, yet there remains significant room for exploration in its deep implementation. Within the field of orthopedics, which integrates closely with AI due to its extensive data requirements, rotator cuff injuries are a commonly encountered condition in joint motion. One of the most severe complications following rotator cuff repair surgery is the recurrence of tears, which has a significant impact on both patients and healthcare professionals. To address this issue, we utilized the innovative EV-GCN algorithm to train a predictive model. We collected medical records of 1,631 patients who underwent rotator cuff repair surgery at a single center over a span of 5 years. In the end, our model successfully predicted postoperative re-tear before the surgery using 62 preoperative variables with an accuracy of 96.93%, and achieved an accuracy of 79.55% on an independent external dataset of 518 cases from other centers. This model outperforms human doctors in predicting outcomes with high accuracy. Through this methodology and research, our aim is to utilize preoperative prediction models to assist in making informed medical decisions during and after surgery, leading to improved treatment effectiveness. This research method and strategy can be applied to other medical fields, and the research findings can assist in making healthcare decisions.
Affiliation(s)
- Zhewei Zhang
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
  - Health Science Center, Ningbo University, Ningbo, China
- Chunhai Ke
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
- Zhibin Zhang
  - Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, China
  - Key Laboratory of Mobile Network Application Technology of Zhejiang Province, Ningbo University, Ningbo, China
- Yujiong Chen
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
  - Health Science Center, Ningbo University, Ningbo, China
- Hangbin Weng
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
  - Health Science Center, Ningbo University, Ningbo, China
- Jieyang Dong
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
  - Health Science Center, Ningbo University, Ningbo, China
- Mingming Hao
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
- Botao Liu
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
  - Health Science Center, Ningbo University, Ningbo, China
- Minzhe Zheng
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
- Jin Li
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
- Shaohua Ding
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
- Yihong Dong
  - Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, China
  - Key Laboratory of Mobile Network Application Technology of Zhejiang Province, Ningbo University, Ningbo, China
- Zhaoxiang Peng
  - Ningbo University affiliated Li Huili Hospital, Ningbo University, Ningbo, China
|
23
|
Magnéli M, Borjali A, Takahashi E, Axenhus M, Malchau H, Moratoglu OK, Varadarajan KM. Application of deep learning for automated diagnosis and classification of hip dysplasia on plain radiographs. BMC Musculoskelet Disord 2024; 25:117. [PMID: 38336666 PMCID: PMC10854089 DOI: 10.1186/s12891-024-07244-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Accepted: 01/30/2024] [Indexed: 02/12/2024] Open
Abstract
BACKGROUND Hip dysplasia is a condition where the acetabulum is too shallow to support the femoral head and is commonly considered a risk factor for hip osteoarthritis. The objective of this study was to develop a deep learning model to diagnose hip dysplasia from plain radiographs and classify dysplastic hips based on their severity. METHODS We collected pelvic radiographs of 571 patients from two single-center cohorts and one multicenter cohort. The radiographs were split in half to create hip radiographs (n = 1022). One orthopaedic surgeon and one resident assessed the radiographs for hip dysplasia on either side. We used the center edge (CE) angle as the primary diagnostic criterion. Hips with a CE angle < 20°, 20° to 25°, and > 25° were labeled as dysplastic, borderline, and normal, respectively. The dysplastic hips were also classified with both the Crowe and Hartofilakidis classifications of dysplasia. The dataset was divided into train, validation, and test subsets using an 80:10:10 split ratio; these subsets were used to train two deep learning models to classify images into normal, borderline, and (1) Crowe grade 1-4 or (2) Hartofilakidis grade 1-3. A VGG16 convolutional neural network (CNN) pre-trained on ImageNet was used, with layer-wise fine-tuning. RESULTS Both models struggled to distinguish between normal and borderline hips. However, they achieved high accuracy (Model 1: 92.2% and Model 2: 83.3%) in distinguishing between normal/borderline and dysplastic hips. The overall accuracy was 68% for Model 1 and 73.5% for Model 2. Most misclassifications for the Crowe and Hartofilakidis classifications were +/- 1 class from the correct class. CONCLUSIONS This pilot study shows promising results that a deep learning model can distinguish between normal and dysplastic hips with high accuracy. Future research and external validation are warranted regarding the ability of deep learning models to perform complex tasks such as identifying and classifying disorders using plain radiographs. LEVEL OF EVIDENCE Diagnostic level IV.
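The CE-angle labeling rule stated in the METHODS of this abstract can be written down as a short Python sketch. The function name `label_hip` is hypothetical, and treating both 20° and 25° as borderline is an assumption based on the abstract's wording ("< 20°, 20° to 25°, and > 25°").

```python
def label_hip(ce_angle_deg: float) -> str:
    """Label a hip from its center-edge (CE) angle in degrees.

    Thresholds follow the abstract: < 20 dysplastic, 20 to 25 borderline,
    > 25 normal. Inclusive handling of exactly 20 and 25 is an assumption.
    """
    if ce_angle_deg < 20.0:
        return "dysplastic"
    if ce_angle_deg <= 25.0:
        return "borderline"
    return "normal"


# Illustrative angles only; these are not measurements from the study.
for angle in (15.0, 22.5, 30.0):
    print(angle, label_hip(angle))
```

A rule like this produces only the reference labels; the grading into Crowe or Hartofilakidis classes described in the abstract was done separately by the assessors.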
Affiliation(s)
- Martin Magnéli
  - Department of Orthopaedic Surgery, Harvard Medical School, Boston, MA, USA
  - Department of Orthopaedic Surgery, Harris Orthopaedics Laboratory, Massachusetts General Hospital, Boston, MA, USA
  - Karolinska Institutet, Department of Clinical Sciences, Danderyd Hospital, Stockholm, Sweden
- Alireza Borjali
  - Department of Orthopaedic Surgery, Harvard Medical School, Boston, MA, USA
  - Department of Orthopaedic Surgery, Harris Orthopaedics Laboratory, Massachusetts General Hospital, Boston, MA, USA
- Eiji Takahashi
  - Department of Orthopaedic Surgery, Harvard Medical School, Boston, MA, USA
  - Department of Orthopaedic Surgery, Harris Orthopaedics Laboratory, Massachusetts General Hospital, Boston, MA, USA
  - Department of Orthopaedic Surgery, Kanazawa Medical University, Uchinada, Japan
- Michael Axenhus
  - Karolinska Institutet, Department of Clinical Sciences, Danderyd Hospital, Stockholm, Sweden
  - Department of Orthopaedic Surgery, Danderyd Hospital, Stockholm, Sweden
- Henrik Malchau
  - Department of Orthopaedic Surgery, Harris Orthopaedics Laboratory, Massachusetts General Hospital, Boston, MA, USA
  - Department of Orthopaedic Surgery, Sahlgrenska University Hospital, Gothenburg, Sweden
- Orhun K Moratoglu
  - Department of Orthopaedic Surgery, Harvard Medical School, Boston, MA, USA
  - Department of Orthopaedic Surgery, Harris Orthopaedics Laboratory, Massachusetts General Hospital, Boston, MA, USA
|
24
|
Hoy MK, Desai V, Mutasa S, Hoy RC, Gorniak R, Belair JA. Deep Learning-Assisted Identification of Femoroacetabular Impingement (FAI) on Routine Pelvic Radiographs. Journal of Imaging Informatics in Medicine 2024; 37:339-346. [PMID: 38343231 PMCID: PMC10976936 DOI: 10.1007/s10278-023-00920-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2023] [Revised: 08/08/2023] [Accepted: 08/22/2023] [Indexed: 03/02/2024]
Abstract
To use a novel deep learning system to localize the hip joints and detect findings of cam-type femoroacetabular impingement (FAI). A retrospective search of hip/pelvis radiographs obtained in patients to evaluate for FAI yielded 3050 total studies. Each hip was classified separately by the original interpreting radiologist in the following manner: 724 hips had severe cam-type FAI morphology, 962 moderate cam-type FAI morphology, 846 mild cam-type FAI morphology, and 518 hips were normal. The anteroposterior (AP) view from each study was anonymized and extracted. After localization of the hip joints by a novel convolutional neural network (CNN) based on the focal loss principle, a second CNN classified the images of the hip as cam positive, or no FAI. Accuracy was 74% for diagnosing normal vs. abnormal cam-type FAI morphology, with aggregate sensitivity and specificity of 0.821 and 0.669, respectively, at the chosen operating point. The aggregate AUC was 0.736. A deep learning system can be applied to detect FAI-related changes on single view pelvic radiographs. Deep learning is useful for quickly identifying and categorizing pathology on imaging, which may aid the interpreting radiologist.
Affiliation(s)
- Vishal Desai
  - Thomas Jefferson University, Philadelphia, PA, USA
- Robert C Hoy
  - Temple University Hospital, Philadelphia, PA, USA
|
25
|
Mathis D, Ackermann J, Günther D, Laky B, Deichsel A, Schüttler KF, Wafaisade A, Eggeling L, Kopf S, Münch L, Herbst E. Künstliche Intelligenz in der Orthopädie [Artificial intelligence in orthopaedics]. Arthroskopie 2024; 37:52-64. [DOI: 10.1007/s00142-023-00657-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/19/2023] [Indexed: 01/06/2025]
Abstract
We are in a phase of exponential growth in the use of artificial intelligence (AI). Almost 90% of AI research in orthopaedics and trauma surgery has been published in the last 3 years. In the majority of studies, AI was used for image interpretation or as a clinical decision-making tool. The most frequently studied body regions were the spine, knee, and hip. As data collection improves, so do the AI-associated possibilities for more accurate diagnostics, patient-specific treatment approaches, improved outcome prediction, and expanded training. AI offers a potential way to support physicians while maximizing the value of treatment. A basic understanding of what AI involves and how it can affect orthopaedics and patient care is essential. This article provides an overview of the areas of application of AI systems in orthopaedics and places them in the complex overall context consisting of stakeholders from politics, industry, regulatory authorities, and medicine.
|
26
|
Jung J, Dai J, Liu B, Wu Q. Artificial intelligence in fracture detection with different image modalities and data types: A systematic review and meta-analysis. PLOS Digital Health 2024; 3:e0000438. [PMID: 38289965 PMCID: PMC10826962 DOI: 10.1371/journal.pdig.0000438] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 05/03/2023] [Accepted: 12/25/2023] [Indexed: 02/01/2024]
Abstract
Artificial Intelligence (AI), encompassing Machine Learning and Deep Learning, has increasingly been applied to fracture detection using diverse imaging modalities and data types. This systematic review and meta-analysis aimed to assess the efficacy of AI in detecting fractures through various imaging modalities and data types (image, tabular, or both) and to synthesize the existing evidence related to AI-based fracture detection. Peer-reviewed studies developing and validating AI for fracture detection were identified through searches in multiple electronic databases without time limitations. A hierarchical meta-analysis model was used to calculate pooled sensitivity and specificity. A diagnostic accuracy quality assessment was performed to evaluate bias and applicability. Of the 66 eligible studies, 54 identified fractures using imaging-related data, nine using tabular data, and three using both. Vertebral fractures were the most common outcome (n = 20), followed by hip fractures (n = 18). Hip fractures exhibited the highest pooled sensitivity (92%; 95% CI: 87-96, p < 0.01) and specificity (90%; 95% CI: 85-93, p < 0.01). Pooled sensitivity and specificity using image data (92%; 95% CI: 90-94, p < 0.01; and 91%; 95% CI: 88-93, p < 0.01) were higher than those using tabular data (81%; 95% CI: 77-85, p < 0.01; and 83%; 95% CI: 76-88, p < 0.01), respectively. Radiographs demonstrated the highest pooled sensitivity (94%; 95% CI: 90-96, p < 0.01) and specificity (92%; 95% CI: 89-94, p < 0.01). Patient selection and reference standards were major concerns in assessing diagnostic accuracy for bias and applicability. AI displays high diagnostic accuracy for various fracture outcomes, indicating potential utility in healthcare systems for fracture diagnosis. However, enhanced transparency in reporting and adherence to standardized guidelines are necessary to improve the clinical applicability of AI. Review Registration: PROSPERO (CRD42021240359).
Affiliation(s)
- Jongyun Jung
  - Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Jingyuan Dai
  - Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Bowen Liu
  - Department of Mathematics and Statistics, Division of Computing, Analytics, and Mathematics, School of Science and Engineering (Bowen Liu), University of Missouri-Kansas City, Kansas City, Missouri, United States of America
- Qing Wu
  - Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
|
27
|
Kolasa K, Admassu B, Hołownia-Voloskova M, Kędzior KJ, Poirrier JE, Perni S. Systematic reviews of machine learning in healthcare: a literature review. Expert Rev Pharmacoecon Outcomes Res 2024; 24:63-115. [PMID: 37955147 DOI: 10.1080/14737167.2023.2279107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 10/31/2023] [Indexed: 11/14/2023]
Abstract
INTRODUCTION The increasing availability of data and computing power has made machine learning (ML) a viable approach to faster, more efficient healthcare delivery. METHODS A systematic literature review (SLR) of published SLRs evaluating ML applications in healthcare settings, published between 1 January 2010 and 27 March 2023, was conducted. RESULTS In total, 220 SLRs covering 10,462 ML algorithms were reviewed. The main application of AI in medicine related to clinical prediction and disease prognosis in oncology and neurology with the use of imaging data. Accuracy, specificity, and sensitivity were provided in 56%, 28%, and 25% of SLRs, respectively. Internal and external validation was reported in 53% and less than 1% of the cases, respectively. The most common modeling approach was neural networks (2,454 ML algorithms), followed by support vector machine and random forest/decision trees (1,578 and 1,522 ML algorithms, respectively). EXPERT OPINION The review indicated considerable reporting gaps in terms of ML performance and of both internal and external validation. Greater accessibility to healthcare data for developers can ensure the faster adoption of ML algorithms into clinical practice.
Affiliation(s)
- Katarzyna Kolasa
  - Division of Health Economics and Healthcare Management, Kozminski University, Warsaw, Poland
- Bisrat Admassu
  - Division of Health Economics and Healthcare Management, Kozminski University, Warsaw, Poland
|
28
|
Joo MW, Ko T, Kim MS, Lee YS, Shin SH, Chung YG, Lee HK. Development and Validation of a Convolutional Neural Network Model to Predict a Pathologic Fracture in the Proximal Femur Using Abdomen and Pelvis CT Images of Patients With Advanced Cancer. Clin Orthop Relat Res 2023; 481:2247-2256. [PMID: 37615504 PMCID: PMC10566917 DOI: 10.1097/corr.0000000000002771] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/29/2022] [Accepted: 06/14/2023] [Indexed: 08/25/2023]
Abstract
BACKGROUND Improvement in survival in patients with advanced cancer is accompanied by an increased probability of bone metastasis and related pathologic fractures (especially in the proximal femur). The few systems proposed and used to diagnose impending fractures owing to metastasis and to ultimately prevent future fractures have practical limitations; thus, novel screening tools are essential. A CT scan of the abdomen and pelvis is a standard modality for staging and follow-up in patients with cancer, and radiologic assessments of the proximal femur are possible with CT-based digitally reconstructed radiographs. Deep-learning models, such as convolutional neural networks (CNNs), may be able to predict pathologic fractures from digitally reconstructed radiographs, but to our knowledge, they have not been tested for this application. QUESTIONS/PURPOSES (1) How accurate is a CNN model for predicting a pathologic fracture in a proximal femur with metastasis using digitally reconstructed radiographs of the abdomen and pelvis CT images in patients with advanced cancer? (2) Do CNN models perform better than clinicians with varying backgrounds and experience levels in predicting a pathologic fracture on abdomen and pelvis CT images without any knowledge of the patients' histories, except for metastasis in the proximal femur? METHODS A total of 392 patients received radiation treatment of the proximal femur at three hospitals from January 2011 to December 2021. The patients had 2945 CT scans of the abdomen and pelvis for systemic evaluation and follow-up in relation to their primary cancer. In 33% of the CT scans (974), it was impossible to identify whether a pathologic fracture developed within 3 months after each CT image was acquired, and these were excluded. Finally, 1971 cases with a mean age of 59 ± 12 years were included in this study. Pathologic fractures developed within 3 months after CT in 3% (60 of 1971) of cases. A total of 47% (936 of 1971) were women. 
Sixty cases had an established pathologic fracture within 3 months after each CT scan, and another group of 1911 cases had no established pathologic fracture within 3 months after CT scan. The mean age of the cases in the former and latter groups was 64 ± 11 years and 59 ± 12 years, respectively, and 32% (19 of 60) and 53% (1016 of 1911) of cases, respectively, were female. Digitally reconstructed radiographs were generated with perspective projections of three-dimensional CT volumes onto two-dimensional planes. Then, 1557 images from one hospital were used for a training set. To verify that the deep-learning models could consistently operate even in hospitals with a different medical environment, 414 images from other hospitals were used for external validation. The number of images in the groups with and without a pathologic fracture within 3 months after each CT scan increased from 1911 to 22,932 and from 60 to 720, respectively, using data augmentation methods that are known to be an effective way to boost the performance of deep-learning models. Three CNNs (VGG16, ResNet50, and DenseNet121) were fine-tuned using digitally reconstructed radiographs. For performance measures, the area under the receiver operating characteristic curve, accuracy, sensitivity, specificity, precision, and F1 score were determined. The area under the receiver operating characteristic curve was used to evaluate three CNN models mainly, and the optimal accuracy, sensitivity, and specificity were calculated using the Youden J statistic. Accuracy refers to the proportion of fractures in the groups with and without a pathologic fracture within 3 months after each CT scan that were accurately predicted by the CNN model. Sensitivity and specificity represent the proportion of accurately predicted fractures among those with and without a pathologic fracture within 3 months after each CT scan, respectively. Precision is a measure of how few false-positives the model produces. 
The F1 score is a harmonic mean of sensitivity and precision, which have a tradeoff relationship. Gradient-weighted class activation mapping images were created to check whether the CNN model correctly focused on potential pathologic fracture regions. The CNN model with the best performance was compared with the performance of clinicians. RESULTS DenseNet121 showed the best performance in identifying pathologic fractures; the area under the receiver operating characteristic curve for DenseNet121 was larger than those for VGG16 (0.77 ± 0.07 [95% CI 0.75 to 0.79] versus 0.71 ± 0.08 [95% CI 0.69 to 0.73]; p = 0.001) and ResNet50 (0.77 ± 0.07 [95% CI 0.75 to 0.79] versus 0.72 ± 0.09 [95% CI 0.69 to 0.74]; p = 0.001). Specifically, DenseNet121 scored the highest in sensitivity (0.22 ± 0.07 [95% CI 0.20 to 0.24]), precision (0.72 ± 0.19 [95% CI 0.67 to 0.77]), and F1 score (0.34 ± 0.10 [95% CI 0.31 to 0.37]), and it focused accurately on the region with the expected pathologic fracture. Further, DenseNet121 was less likely than clinicians to mispredict cases in which there was no pathologic fracture than cases in which there was a fracture; the performance of DenseNet121 was better than clinician performance in terms of specificity (0.98 ± 0.01 [95% CI 0.98 to 0.99] versus 0.86 ± 0.09 [95% CI 0.81 to 0.91]; p = 0.01), precision (0.72 ± 0.19 [95% CI 0.67 to 0.77] versus 0.11 ± 0.10 [95% CI 0.05 to 0.17]; p = 0.0001), and F1 score (0.34 ± 0.10 [95% CI 0.31 to 0.37] versus 0.17 ± 0.15 [95% CI 0.08 to 0.26]; p = 0.0001). CONCLUSION CNN models may be able to accurately predict impending pathologic fractures from digitally reconstructed radiographs of the abdomen and pelvis CT images that clinicians may not anticipate; this can assist medical, radiation, and orthopaedic oncologists clinically. To achieve better performance, ensemble-learning models using knowledge of the patients' histories should be developed and validated. 
The code for our model is publicly available online at https://github.com/taehoonko/CNN_path_fx_prediction. LEVEL OF EVIDENCE Level III, diagnostic study.
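The performance measures defined in this abstract (accuracy, sensitivity, specificity, precision, F1 score, and the Youden J statistic) can be computed from confusion-matrix counts as in the minimal Python sketch below. It is not the authors' published code; the names `Confusion`, `summarize`, and `f1_score` are hypothetical, and the example counts are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary fracture / no-fracture task."""
    tp: int  # fractures correctly predicted
    fp: int  # non-fractures predicted as fractures
    tn: int  # non-fractures correctly predicted
    fn: int  # fractures that were missed


def _ratio(num: float, den: float) -> float:
    """Divide, returning NaN when the denominator is zero."""
    return num / den if den else float("nan")


def f1_score(sensitivity: float, precision: float) -> float:
    """Harmonic mean of sensitivity and precision."""
    return _ratio(2 * sensitivity * precision, sensitivity + precision)


def summarize(c: Confusion) -> dict:
    """Return the measures named in the abstract for one confusion matrix."""
    sens = _ratio(c.tp, c.tp + c.fn)
    spec = _ratio(c.tn, c.tn + c.fp)
    prec = _ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": sens,
        "specificity": spec,
        "precision": prec,
        "f1": f1_score(sens, prec),
        "youden_j": sens + spec - 1.0,  # J = sensitivity + specificity - 1
    }


# Hypothetical counts, not data from the study.
report = summarize(Confusion(tp=8, fp=2, tn=88, fn=2))
print(report)
```

As a rough consistency check, `f1_score(0.22, 0.72)` with the mean sensitivity and precision reported for DenseNet121 gives about 0.34, in line with the reported F1 score; the F1 of two means need not equal a mean F1 across runs, so this is only indicative.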
Affiliation(s)
- Min Wook Joo
  - Department of Orthopedic Surgery, St. Vincent’s Hospital, College of Medicine, the Catholic University of Korea, Seoul, Republic of Korea
- Taehoon Ko
  - Department of Medical Informatics, College of Medicine, the Catholic University of Korea, Seoul, Republic of Korea
- Min Seob Kim
  - The City Hall Station St. Mary’s Psychiatric Clinic, Seoul, Republic of Korea
- Yong-Suk Lee
  - Department of Orthopedic Surgery, Incheon St. Mary’s Hospital, College of Medicine, the Catholic University of Korea, Seoul, Republic of Korea
- Seung Han Shin
  - Department of Orthopedic Surgery, Seoul St. Mary’s Hospital, College of Medicine, the Catholic University of Korea, Seoul, Republic of Korea
- Yang-Guk Chung
  - Department of Orthopedic Surgery, Seoul St. Mary’s Hospital, College of Medicine, the Catholic University of Korea, Seoul, Republic of Korea
- Hong Kwon Lee
  - Department of Orthopedic Surgery, St. Vincent’s Hospital, College of Medicine, the Catholic University of Korea, Seoul, Republic of Korea
|
29
|
Shen X, Luo J, Tang X, Chen B, Qin Y, Zhou Y, Xiao J. Deep Learning Approach for Diagnosing Early Osteonecrosis of the Femoral Head Based on Magnetic Resonance Imaging. J Arthroplasty 2023; 38:2044-2050. [PMID: 36243276 DOI: 10.1016/j.arth.2022.10.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 05/19/2022] [Revised: 09/28/2022] [Accepted: 10/03/2022] [Indexed: 11/06/2022] Open
Abstract
BACKGROUND The diagnosis of early osteonecrosis of the femoral head (ONFH) based on magnetic resonance imaging (MRI) is challenging due to variability in the surgeon's experience level. This study developed an MRI-based deep learning system to detect early ONFH and evaluated its feasibility in the clinic. METHODS We retrospectively evaluated clinical MRIs of the hips that were performed in our institution from January 2019 to June 2022 and collected all MRIs diagnosed with early ONFH. An advanced convolutional neural network (CNN) was trained and optimized; then, the diagnostic performance of the CNN was evaluated according to its accuracy, sensitivity, and specificity. We also compared the CNN's performance with that of orthopaedic surgeons. RESULTS Overall, 11,061 images were retrospectively included in the present study and were divided into three datasets in a 7:2:1 ratio. The area under the receiver operating characteristic curve, accuracy, sensitivity, and specificity of the CNN model for identifying early ONFH were 0.98, 98.4%, 97.6%, and 98.6%, respectively. In our review panel, the averaged accuracy, sensitivity, and specificity for identifying ONFH were 91.7%, 87.0%, and 94.1% for attending orthopaedic surgeons; 87.1%, 84.0%, and 89.3% for resident orthopaedic surgeons; and 97.1%, 96.0%, and 97.9% for deputy chief orthopaedic surgeons, respectively. CONCLUSION The deep learning system showed performance comparable to that of deputy chief orthopaedic surgeons in identifying early ONFH. The success of deep learning diagnosis of ONFH might help less-experienced surgeons, especially in large-scale medical imaging screening and in community settings that lack consulting experts.
Affiliation(s)
- Xianyue Shen
  - Department of Orthopedics, The Second Hospital of Jilin University
- Jia Luo
  - College of Software, Jilin University
- Xiongfeng Tang
  - Department of Orthopedics, The Second Hospital of Jilin University
- Bo Chen
  - Department of Orthopedics, The Second Hospital of Jilin University
- Yanguo Qin
  - Department of Orthopedics, The Second Hospital of Jilin University
- You Zhou
  - College of Software, Jilin University
- Jianlin Xiao
  - Department of Orthopedics, China-Japan Union Hospital of Jilin University, Changchun, Jilin province, PR China
|
30
|
Ackermann J, Hoch A, Snedeker JG, Zingg PO, Esfandiari H, Fürnstahl P. Automatic 3D Postoperative Evaluation of Complex Orthopaedic Interventions. J Imaging 2023; 9:180. [PMID: 37754944 PMCID: PMC10532700 DOI: 10.3390/jimaging9090180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 08/21/2023] [Accepted: 08/27/2023] [Indexed: 09/28/2023] Open
Abstract
In clinical practice, image-based postoperative evaluation is still performed without state-of-the-art computer methods, as these are not sufficiently automated. In this study we propose a fully automatic 3D postoperative outcome quantification method for the relevant steps of orthopaedic interventions on the example of Periacetabular Osteotomy of Ganz (PAO). A typical orthopaedic intervention involves cutting bone, anatomy manipulation and repositioning as well as implant placement. Our method includes a segmentation based deep learning approach for detection and quantification of the cuts. Furthermore, anatomy repositioning was quantified through a multi-step registration method, which entailed a coarse alignment of the pre- and postoperative CT images followed by a fine fragment alignment of the repositioned anatomy. Implant (i.e., screw) position was identified by 3D Hough transform for line detection combined with fast voxel traversal based on ray tracing. The feasibility of our approach was investigated on 27 interventions and compared against manually performed 3D outcome evaluations. The results show that our method can accurately assess the quality and accuracy of the surgery. Our evaluation of the fragment repositioning showed a cumulative error for the coarse and fine alignment of 2.1 mm. Our evaluation of screw placement accuracy resulted in a distance error of 1.32 mm for screw head location and an angular deviation of 1.1° for screw axis. As a next step we will explore generalisation capabilities by applying the method to different interventions.
Affiliation(s)
- Joëlle Ackermann
  - Research in Orthopedic Computer Science, Balgrist University Hospital, University of Zurich, 8008 Zurich, Switzerland
  - Laboratory for Orthopaedic Biomechanics, ETH Zurich, 8093 Zurich, Switzerland
- Armando Hoch
  - Department of Orthopedics, Balgrist University Hospital, University of Zurich, 8008 Zurich, Switzerland
- Jess Gerrit Snedeker
  - Laboratory for Orthopaedic Biomechanics, ETH Zurich, 8093 Zurich, Switzerland
  - Department of Orthopedics, Balgrist University Hospital, University of Zurich, 8008 Zurich, Switzerland
- Patrick Oliver Zingg
  - Department of Orthopedics, Balgrist University Hospital, University of Zurich, 8008 Zurich, Switzerland
- Hooman Esfandiari
  - Research in Orthopedic Computer Science, Balgrist University Hospital, University of Zurich, 8008 Zurich, Switzerland
- Philipp Fürnstahl
  - Research in Orthopedic Computer Science, Balgrist University Hospital, University of Zurich, 8008 Zurich, Switzerland
|
31
|
Magnéli M, Ling P, Gislén J, Fagrell J, Demir Y, Arverud ED, Hallberg K, Salomonsson B, Gordon M. Deep learning classification of shoulder fractures on plain radiographs of the humerus, scapula and clavicle. PLoS One 2023; 18:e0289808. [PMID: 37647274 PMCID: PMC10468075 DOI: 10.1371/journal.pone.0289808] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/01/2023] [Accepted: 07/26/2023] [Indexed: 09/01/2023] Open
Abstract
In this study, we present a deep learning model for fracture classification on shoulder radiographs using a convolutional neural network (CNN). The primary aim was to evaluate the classification performance of the CNN for proximal humeral fractures (PHF) based on the AO/OTA classification system. Secondary objectives included evaluating the model's performance for diaphyseal humerus, clavicle, and scapula fractures. The training dataset consisted of 6,172 examinations, including 2-7 radiographs per examination. The overall area under the curve (AUC) for fracture classification was 0.89, indicating good performance. For PHF classification, 12 out of 16 classes achieved an AUC of 0.90 or greater. Additionally, the CNN model had excellent overall AUC for diaphyseal humerus fractures (0.97) and clavicle fractures (0.96), and good AUC for scapula fractures (0.87). Despite the limitations of the study, such as the reliance on ground truth labels provided by students with limited radiographic assessment experience, our findings are in concordance with previous studies, further consolidating CNNs as potent fracture classifiers in plain radiographs. The inclusion of multiple radiographs with different views from each examination, as well as the generally unselected nature of the sample, contributed to the overall generalizability of the study. This is the fifth study published by our group on AI in orthopaedic radiographs, and this work has consistently shown promising results. The next challenge for the orthopaedic research community will be to transfer these results from the research setting into clinical practice. External validation of the CNN model should be conducted in the future before it is considered for use in a clinical setting.
Affiliation(s)
- Martin Magnéli
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Petter Ling
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Jacob Gislén
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Johan Fagrell
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Yilmaz Demir
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Erica Domeij Arverud
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Kristofer Hallberg
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Björn Salomonsson
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Max Gordon
  - Department of Clinical Sciences at Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
|
32
|
Martin RK, Wastvedt S, Pareek A, Persson A, Visnes H, Fenstad AM, Moatshe G, Wolfson J, Lind M, Engebretsen L. Ceiling Effect of the Combined Norwegian and Danish Knee Ligament Registers Limits Anterior Cruciate Ligament Reconstruction Outcome Prediction. Am J Sports Med 2023; 51:2324-2332. [PMID: 37289071 DOI: 10.1177/03635465231177905] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/09/2023]
Abstract
BACKGROUND Clinical tools based on machine learning analysis now exist for outcome prediction after primary anterior cruciate ligament reconstruction (ACLR). Relying partly on data volume, the general principle is that more data may lead to improved model accuracy. PURPOSE/HYPOTHESIS The purpose was to apply machine learning to a combined data set from the Norwegian and Danish knee ligament registers (NKLR and DKRR, respectively), with the aim of producing an algorithm that can predict revision surgery with improved accuracy relative to a previously published model developed using only the NKLR. The hypothesis was that the additional patient data would result in an algorithm that is more accurate. STUDY DESIGN Cohort study; Level of evidence, 3. METHODS Machine learning analysis was performed on combined data from the NKLR and DKRR. The primary outcome was the probability of revision ACLR within 1, 2, and 5 years. Data were split randomly into training sets (75%) and test sets (25%). There were 4 machine learning models examined: Cox lasso, random survival forest, gradient boosting, and super learner. Concordance and calibration were calculated for all 4 models. RESULTS The data set included 62,955 patients in which 5% underwent a revision surgical procedure with a mean follow-up of 7.6 ± 4.5 years. The 3 nonparametric models (random survival forest, gradient boosting, and super learner) performed best, demonstrating moderate concordance (0.67 [95% CI, 0.64-0.70]), and were well calibrated at 1 and 2 years. Model performance was similar to that of the previously published model (NKLR-only model: concordance, 0.67-0.69; well calibrated). CONCLUSION Machine learning analysis of the combined NKLR and DKRR enabled prediction of the revision ACLR risk with moderate accuracy. 
However, the resulting algorithms were less user-friendly and did not demonstrate superior accuracy in comparison with the previously developed model based on patients from the NKLR alone, despite the analysis of nearly 63,000 patients. This ceiling effect suggests that simply adding more patients to current national knee ligament registers is unlikely to improve predictive capability and may prompt future changes to increase variable inclusion.
Affiliation(s)
- R Kyle Martin
  - Department of Orthopedic Surgery, University of Minnesota, Minneapolis, Minnesota, USA
  - Department of Orthopedics, CentraCare, St Cloud, Minnesota, USA
- Solvejg Wastvedt
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Ayoosh Pareek
  - Department of Orthopedic Surgery, Hospital for Special Surgery, New York, New York, USA
- Andreas Persson
  - Department of Orthopaedic Surgery, Oslo University Hospital Ullevål, Oslo, Norway
  - Oslo Sports Trauma Research Center, Norwegian School of Sport Sciences, Oslo, Norway
  - Norwegian Knee Ligament Register, Haukeland University Hospital, Bergen, Norway
- Håvard Visnes
  - Norwegian Knee Ligament Register, Haukeland University Hospital, Bergen, Norway
- Anne Marie Fenstad
  - Norwegian Knee Ligament Register, Haukeland University Hospital, Bergen, Norway
- Gilbert Moatshe
  - Department of Orthopaedic Surgery, Oslo University Hospital Ullevål, Oslo, Norway
  - Oslo Sports Trauma Research Center, Norwegian School of Sport Sciences, Oslo, Norway
- Julian Wolfson
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Lars Engebretsen
  - Department of Orthopaedic Surgery, Oslo University Hospital Ullevål, Oslo, Norway
  - Oslo Sports Trauma Research Center, Norwegian School of Sport Sciences, Oslo, Norway
|
33
|
Gasmi I, Calinghen A, Parienti JJ, Belloy F, Fohlen A, Pelage JP. Comparison of diagnostic performance of a deep learning algorithm, emergency physicians, junior radiologists and senior radiologists in the detection of appendicular fractures in children. Pediatr Radiol 2023; 53:1675-1684. [PMID: 36877239 DOI: 10.1007/s00247-023-05621-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/30/2021] [Revised: 11/21/2022] [Accepted: 01/30/2023] [Indexed: 03/07/2023]
Abstract
BACKGROUND Advances have been made in the use of artificial intelligence (AI) in the field of diagnostic imaging, particularly in the detection of fractures on conventional radiographs. Few studies have looked at the detection of fractures in the pediatric population. The anatomical variations and evolution according to the child's age require specific studies of this population. Failure to diagnose fractures early in children may lead to serious consequences for growth. OBJECTIVE To evaluate the performance of an AI algorithm based on deep neural networks in detecting traumatic appendicular fractures in a pediatric population. To compare sensitivity, specificity, positive predictive value and negative predictive value of different readers and the AI algorithm. MATERIALS AND METHODS This retrospective study conducted on 878 patients younger than 18 years of age evaluated conventional radiographs obtained after recent non-life-threatening trauma. All radiographs of the shoulder, arm, elbow, forearm, wrist, hand, leg, knee, ankle and foot were evaluated. The diagnostic performance of a consensus of radiology experts in pediatric imaging (reference standard) was compared with those of pediatric radiologists, emergency physicians, senior residents and junior residents. The predictions made by the AI algorithm and the annotations made by the different physicians were compared. RESULTS The algorithm predicted 174 fractures out of 182, corresponding to a sensitivity of 95.6%, a specificity of 91.64% and a negative predictive value of 98.76%. The algorithm's sensitivity was close to those of pediatric radiologists (98.35%) and senior residents (95.05%) and was above those of emergency physicians (81.87%) and junior residents (90.1%). The algorithm identified 3 (1.6%) fractures not initially seen by pediatric radiologists. CONCLUSION This study suggests that deep learning algorithms can be useful in improving the detection of fractures in children.
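The sensitivity reported above follows directly from the fracture counts in the abstract. The short sketch below reproduces that arithmetic. The function name `sensitivity` is an illustrative assumption. Only the counts stated in the abstract (174 detected out of 182 fractures) are used, since the abstract does not give the counts behind the specificity and negative predictive value.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual fractures that were detected: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Counts stated in the abstract: 174 of 182 fractures were predicted.
detected, total_fractures = 174, 182
missed = total_fractures - detected

print(f"{100 * sensitivity(detected, missed):.1f}%")  # prints 95.6%
```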
Affiliation(s)
- Idriss Gasmi
  - Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
- Arvin Calinghen
  - Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
- Jean-Jacques Parienti
  - GRAM 2.0 EA2656 UNICAEN Normandie, University Hospital, Caen, France
  - Department of Clinical Research, Caen University Hospital, Caen, France
- Frederique Belloy
  - Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
- Audrey Fohlen
  - Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
  - UNICAEN CEA CNRS ISTCT- CERVOxy, Normandie University, 14000, Caen, France
- Jean-Pierre Pelage
  - Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
  - UNICAEN CEA CNRS ISTCT- CERVOxy, Normandie University, 14000, Caen, France
|
34
|
Lin DJ, Schwier M, Geiger B, Raithel E, von Busch H, Fritz J, Kline M, Brooks M, Dunham K, Shukla M, Alaia EF, Samim M, Joshi V, Walter WR, Ellermann JM, Ilaslan H, Rubin D, Winalski CS, Recht MP. Deep Learning Diagnosis and Classification of Rotator Cuff Tears on Shoulder MRI. Invest Radiol 2023; 58:405-412. [PMID: 36728041 DOI: 10.1097/rli.0000000000000951] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 02/03/2023]
Abstract
BACKGROUND Detection of rotator cuff tears, a common cause of shoulder disability, can be time-consuming and subject to reader variability. Deep learning (DL) has the potential to increase radiologist accuracy and consistency. PURPOSE The aim of this study was to develop a prototype DL model for detection and classification of rotator cuff tears on shoulder magnetic resonance imaging into no tear, partial-thickness tear, or full-thickness tear. MATERIALS AND METHODS This Health Insurance Portability and Accountability Act-compliant, institutional review board-approved study included a total of 11,925 noncontrast shoulder magnetic resonance imaging scans from 2 institutions, with 11,405 for development and 520 dedicated for final testing. A DL ensemble algorithm was developed that used 4 series as input from each examination: fluid-sensitive sequences in 3 planes and a sagittal oblique T1-weighted sequence. Radiology reports served as ground truth for training with categories of no tear, partial tear, or full-thickness tear. A multireader study was conducted for the test set ground truth, which was determined by the majority vote of 3 readers per case. The ensemble comprised 4 parallel 3D ResNet50 convolutional neural network architectures trained via transfer learning and then adapted to the targeted domain. The final tear-type prediction was determined as the class with the highest probability, after averaging the class probabilities of the 4 individual models. RESULTS The AUC overall for supraspinatus, infraspinatus, and subscapularis tendon tears was 0.93, 0.89, and 0.90, respectively. The model performed best for full-thickness supraspinatus, infraspinatus, and subscapularis tears with AUCs of 0.98, 0.99, and 0.95, respectively. 
Multisequence input demonstrated higher AUCs than single-sequence input for infraspinatus and subscapularis tendon tears, whereas coronal oblique fluid-sensitive and multisequence input showed similar AUCs for supraspinatus tendon tears. Model accuracy for tear types and overall accuracy were similar to that of the clinical readers. CONCLUSIONS Deep learning diagnosis of rotator cuff tears is feasible with excellent diagnostic performance, particularly for full-thickness tears, with model accuracy similar to subspecialty-trained musculoskeletal radiologists.
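The abstract above states that the final tear type is the class with the highest probability after averaging the class probabilities of the 4 individual models. A minimal sketch of that averaging step follows. The function name `ensemble_predict` and the probability values are hypothetical and chosen only for illustration. The sketch does not reproduce the authors' 3D ResNet50 models or their outputs.

```python
from statistics import fmean

# Class labels as described in the abstract.
CLASSES = ("no tear", "partial-thickness tear", "full-thickness tear")


def ensemble_predict(model_probs):
    """Average per-class probabilities across models, then take the argmax.

    model_probs: one list of class probabilities per model, ordered as CLASSES.
    Returns the predicted label and the averaged probabilities.
    """
    n_classes = len(CLASSES)
    if not model_probs or any(len(p) != n_classes for p in model_probs):
        raise ValueError("each model must supply one probability per class")
    averaged = [fmean([p[c] for p in model_probs]) for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: averaged[c])
    return CLASSES[best], averaged


# Hypothetical outputs of 4 models for a single examination.
example_probs = [
    [0.10, 0.60, 0.30],
    [0.05, 0.35, 0.60],
    [0.10, 0.30, 0.60],
    [0.15, 0.40, 0.45],
]
label, averaged = ensemble_predict(example_probs)
print(label)  # prints full-thickness tear
```

Averaging probabilities rather than counting votes lets a confident model outweigh an uncertain one, although the abstract does not say why the authors chose this scheme.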
Affiliation(s)
- Dana J Lin
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Jan Fritz
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Mitchell Kline
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Michael Brooks
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Kevin Dunham
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Mehool Shukla
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Erin F Alaia
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Mohammad Samim
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Vivek Joshi
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- William R Walter
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
- Jutta M Ellermann
  - Department of Radiology, Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN
- Michael P Recht
  - From the Department of Radiology, NYU Grossman School of Medicine, New York, NY
|
35
|
Soydan Z, Saglam Y, Key S, Kati YA, Taskiran M, Kiymet S, Salturk T, Aydin AS, Bilgili F, Sen C. An AI based classifier model for lateral pillar classification of Legg-Calve-Perthes. Sci Rep 2023; 13:6870. [PMID: 37106026 PMCID: PMC10140055 DOI: 10.1038/s41598-023-34176-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/17/2022] [Accepted: 04/25/2023] [Indexed: 04/29/2023] Open
Abstract
We aimed to compare doctors with a convolutional neural network (CNN) that we had trained using our own method for the Lateral Pillar Classification (LPC) of Legg-Calve-Perthes Disease (LCPD). Artificial intelligence (AI) applications in medicine frequently require thousands of training samples. Since we did not have enough real patient radiographs to train a CNN, we devised a novel method to obtain training data. We trained the CNN model with data we created by modifying normal hip radiographs. No real patient radiographs were used during the training phase. We tested the CNN model on 81 hips with LCPD. First, we assessed the interobserver reliability of the whole system and then the reliability of the CNN alone. Second, the consensus list was used to compare the results of 11 doctors and the CNN model. Percentage agreement and interobserver analysis revealed that the CNN had good reliability (ICC = 0.868). The CNN achieved a classification performance of 76.54% and outperformed 9 out of 11 doctors. The CNN trained with this method can thus provide better results than most of the doctors tested. In the future, as training data evolves and improves, we anticipate that AI will perform significantly better than physicians.
Affiliation(s)
- Zafer Soydan
  - Orthopedics and Traumatology, Bhtclinic İstanbul Tema Hastanesi, Nisantası University, Atakent Mh 4. Cadde No 36 PC, 34307, Kucukcekmece, Istanbul, Turkey
- Yavuz Saglam
  - Orthopedics and Traumatology, Istanbul University Istanbul Faculty of Medicine, Istanbul, Turkey
- Sefa Key
  - Orthopedics and Traumatology, Bingol State Hospital, Bingol Merkez, Turkey
- Yusuf Alper Kati
  - Orthopedics and Traumatology, Antalya Egitim ve Arastirma Hastanesi, Antalya, Turkey
- Murat Taskiran
  - Department of Electronics and Communication Engineering, Yildiz Technical University, Istanbul, Turkey
- Seyfullah Kiymet
  - Department of Electronics and Communication Engineering, Yildiz Technical University, Istanbul, Turkey
- Tuba Salturk
  - Department of Informatics, Yildiz Technical University, Istanbul, Turkey
- Ahmet Serhat Aydin
  - Orthopedics and Traumatology, Istanbul University Istanbul Faculty of Medicine, Istanbul, Turkey
- Fuat Bilgili
  - Orthopedics and Traumatology, Istanbul University Istanbul Faculty of Medicine, Istanbul, Turkey
- Cengiz Sen
  - Orthopedics and Traumatology, Istanbul University Istanbul Faculty of Medicine, Istanbul, Turkey
|
36
|
Kim MS, Kim JJ, Kang KH, Lee JH, In Y. Detection of Prosthetic Loosening in Hip and Knee Arthroplasty Using Machine Learning: A Systematic Review and Meta-Analysis. Medicina (Kaunas) 2023; 59:782. [PMID: 37109740 PMCID: PMC10141023 DOI: 10.3390/medicina59040782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Revised: 04/02/2023] [Accepted: 04/11/2023] [Indexed: 04/29/2023]
Abstract
Background: prosthetic loosening after hip and knee arthroplasty is one of the most common causes of joint arthroplasty failure and revision surgery. Diagnosis of prosthetic loosening is a difficult problem and, in many cases, loosening is not clearly diagnosed until accurately confirmed during surgery. The purpose of this study is to conduct a systematic review and meta-analysis to demonstrate the analysis and performance of machine learning in diagnosing prosthetic loosening after total hip arthroplasty (THA) and total knee arthroplasty (TKA). Materials and Methods: three comprehensive databases, including MEDLINE, EMBASE, and the Cochrane Library, were searched for studies that evaluated the detection accuracy of loosening around arthroplasty implants using machine learning. Data extraction, risk of bias assessment, and meta-analysis were performed. Results: five studies were included in the meta-analysis. All studies were retrospective studies. In total, data from 2013 patients with 3236 images were assessed; these data involved 2442 cases (75.5%) with THAs and 794 cases (24.5%) with TKAs. The most common and best-performing machine learning algorithm was DenseNet. In one study, a novel stacking approach using a random forest showed similar performance to DenseNet. The pooled sensitivity across studies was 0.92 (95% CI 0.84-0.97), the pooled specificity was 0.95 (95% CI 0.93-0.96), and the pooled diagnostic odds ratio was 194.09 (95% CI 61.60-611.57). The I2 statistics for sensitivity and specificity were 96% and 62%, respectively, showing that there was significant heterogeneity. The summary receiver operating characteristics curve indicated the sensitivity and specificity, as did the prediction regions, with an AUC of 0.9853. Conclusions: the performance of machine learning using plain radiography showed promising results with good accuracy, sensitivity, and specificity in the detection of loosening around THAs and TKAs. 
Machine learning can be incorporated into prosthetic loosening screening programs.
Affiliation(s)
- Man-Soo Kim
  - Department of Orthopaedic Surgery, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, 222, Banpo-daero, Seocho-gu, Seoul 06591, Republic of Korea
- Jae-Jung Kim
  - Department of Orthopaedic Surgery, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, 222, Banpo-daero, Seocho-gu, Seoul 06591, Republic of Korea
- Ki-Ho Kang
  - Department of Orthopaedic Surgery, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, 222, Banpo-daero, Seocho-gu, Seoul 06591, Republic of Korea
- Jeong-Han Lee
  - Department of Orthopaedic Surgery, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, 222, Banpo-daero, Seocho-gu, Seoul 06591, Republic of Korea
- Yong In
  - Department of Orthopaedic Surgery, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, 222, Banpo-daero, Seocho-gu, Seoul 06591, Republic of Korea
|
37
|
Anttila TT, Karjalainen TV, Mäkelä TO, Waris EM, Lindfors NC, Leminen MM, Ryhänen JO. Detecting Distal Radius Fractures Using a Segmentation-Based Deep Learning Model. J Digit Imaging 2023; 36:679-687. [PMID: 36542269 PMCID: PMC10039188 DOI: 10.1007/s10278-022-00741-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 09/07/2022] [Revised: 11/08/2022] [Accepted: 11/09/2022] [Indexed: 12/24/2022] Open
Abstract
Deep learning algorithms can be used to classify medical images. In distal radius fracture treatment, fracture detection and radiographic assessment of fracture displacement are critical steps. The aim of this study was to use pixel-level annotations of fractures to develop a deep learning model for precise distal radius fracture detection. We randomly divided 3785 consecutive emergency wrist radiograph examinations from six hospitals to a training set (3399 examinations) and test set (386 examinations). The training set was used to develop the deep learning model and the test set to assess its validity. The consensus of three hand surgeons was used as the gold standard for the test set. The area under the ROC curve was 0.97 (CI 0.95-0.98) and 0.95 (CI 0.92-0.98) for examinations without a cast. Fractures were identified with higher accuracy in the postero-anterior radiographs than in the lateral radiographs. Our deep learning model performed well in our multi-hospital and multi-radiograph system manufacturer settings. Thus, segmentation-based deep learning models may provide additional benefit. Further research is needed with algorithm comparison and external validation.
Affiliation(s)
- Turkka T Anttila
  - Musculoskeletal and Plastic Surgery, Department of Hand Surgery, University of Helsinki and Helsinki University Hospital, Topeliuksenkatu 5B, Helsinki, 00260, Finland
- Teemu V Karjalainen
  - Department of Orthopedics, Traumatology and Hand Surgery, Central Finland Hospital, Jyvaskyla, Finland
- Teemu O Mäkelä
  - Medical Imaging Center, Radiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
  - Department of Physics, University of Helsinki, Helsinki, Finland
- Eero M Waris
  - Musculoskeletal and Plastic Surgery, Department of Hand Surgery, University of Helsinki and Helsinki University Hospital, Topeliuksenkatu 5B, Helsinki, 00260, Finland
- Nina C Lindfors
  - Musculoskeletal and Plastic Surgery, Department of Hand Surgery, University of Helsinki and Helsinki University Hospital, Topeliuksenkatu 5B, Helsinki, 00260, Finland
- Miika M Leminen
  - Analytics and AI Development Services, IT Department, Helsinki University Hospital, Helsinki, Finland
  - Department of Otorhinolaryngology and Phoniatrics, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Jorma O Ryhänen
  - Musculoskeletal and Plastic Surgery, Department of Hand Surgery, University of Helsinki and Helsinki University Hospital, Topeliuksenkatu 5B, Helsinki, 00260, Finland
|
38
|
Xu SM, Dong D, Li W, Bai T, Zhu MZ, Gu GS. Deep learning-assisted diagnosis of femoral trochlear dysplasia based on magnetic resonance imaging measurements. World J Clin Cases 2023; 11:1477-1487. [PMID: 36926411 PMCID: PMC10011995 DOI: 10.12998/wjcc.v11.i7.1477] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/27/2022] [Revised: 01/27/2023] [Accepted: 02/13/2023] [Indexed: 03/02/2023] Open
Abstract
BACKGROUND Femoral trochlear dysplasia (FTD) is an important risk factor for patellar instability. The Dejour classification is widely used at present and relies on standard lateral X-rays, which are not commonly obtained in clinical work. Therefore, magnetic resonance imaging (MRI) has become the first choice for the diagnosis of FTD. However, manual measurement is tedious, time-consuming, and prone to considerable variability.
AIM To use artificial intelligence (AI) to assist in diagnosing FTD on MRI images and to evaluate its reliability.
METHODS We searched 464 knee MRI cases between January 2019 and December 2020, including FTD (n = 202) and normal trochlea (n = 252). This paper adopts a heatmap regression network to detect the key points. For the final evaluation, several metrics (accuracy, sensitivity, specificity, etc.) were calculated.
RESULTS The accuracy, sensitivity, specificity, positive predictive value and negative predictive value of the AI model ranged from 0.74 to 0.96. All values were superior to those of junior and intermediate doctors and similar to those of senior doctors. In addition, diagnostic time was much shorter than that of junior and intermediate doctors.
CONCLUSION The diagnosis of FTD on knee MRI can be aided by AI and can be achieved with a high level of accuracy.
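The heatmap regression step named in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it only assumes the common convention that a key-point network outputs one heatmap per landmark and that the predicted landmark is the position of the heatmap maximum. The function name `decode_keypoint` and the toy values are hypothetical.

```python
from typing import List, Tuple


def decode_keypoint(heatmap: List[List[float]]) -> Tuple[int, int]:
    """Return (row, col) of the highest-scoring pixel in a 2D heatmap."""
    best_row, best_col = 0, 0
    best_score = heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_row, best_col, best_score = r, c, score
    return best_row, best_col


# Toy 3x4 heatmap (hypothetical values) with a single peak at row 1, column 2.
toy_heatmap = [
    [0.0, 0.1, 0.2, 0.0],
    [0.1, 0.3, 0.9, 0.2],
    [0.0, 0.1, 0.3, 0.1],
]
keypoint = decode_keypoint(toy_heatmap)
```

Anatomical measurements would then be computed from the decoded landmark coordinates, which is where a tool of this kind could save reading time.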
Affiliation(s)
- Sheng-Ming Xu
- Department of Orthopedic Surgery, The First Hospital of Jilin University, Changchun 130000, Jilin Province, China
| | - Dong Dong
- Department of Radiology, The First Hospital of Jilin University, Changchun 130000, Jilin Province, China
| | - Wei Li
- Department of Orthopedic Surgery, The First Hospital of Jilin University, Changchun 130000, Jilin Province, China
| | - Tian Bai
- College of Computer Science and Technology, Jilin University, Changchun 130000, Jilin Province, China
| | - Ming-Zhu Zhu
- College of Computer Science and Technology, Jilin University, Changchun 130000, Jilin Province, China
| | - Gui-Shan Gu
- Department of Orthopedic Surgery, The First Hospital of Jilin University, Changchun 130000, Jilin Province, China
|
39
|
Lex JR, Di Michele J, Koucheki R, Pincus D, Whyne C, Ravi B. Artificial Intelligence for Hip Fracture Detection and Outcome Prediction: A Systematic Review and Meta-analysis. JAMA Netw Open 2023; 6:e233391. [PMID: 36930153 PMCID: PMC10024206 DOI: 10.1001/jamanetworkopen.2023.3391] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 03/18/2023] Open
Abstract
IMPORTANCE Artificial intelligence (AI) enables powerful models for establishment of clinical diagnostic and prognostic tools for hip fractures; however the performance and potential impact of these newly developed algorithms are currently unknown. OBJECTIVE To evaluate the performance of AI algorithms designed to diagnose hip fractures on radiographs and predict postoperative clinical outcomes following hip fracture surgery relative to current practices. DATA SOURCES A systematic review of the literature was performed using the MEDLINE, Embase, and Cochrane Library databases for all articles published from database inception to January 23, 2023. A manual reference search of included articles was also undertaken to identify any additional relevant articles. STUDY SELECTION Studies developing machine learning (ML) models for the diagnosis of hip fractures from hip or pelvic radiographs or to predict any postoperative patient outcome following hip fracture surgery were included. DATA EXTRACTION AND SYNTHESIS This study followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses and was registered with PROSPERO. Eligible full-text articles were evaluated and relevant data extracted independently using a template data extraction form. For studies that predicted postoperative outcomes, the performance of traditional predictive statistical models, either multivariable logistic or linear regression, was recorded and compared with the performance of the best ML model on the same out-of-sample data set. MAIN OUTCOMES AND MEASURES Diagnostic accuracy of AI models was compared with the diagnostic accuracy of expert clinicians using odds ratios (ORs) with 95% CIs. Areas under the curve for postoperative outcome prediction between traditional statistical models (multivariable linear or logistic regression) and ML models were compared. 
RESULTS Of 39 studies that met all criteria and were included in this analysis, 18 (46.2%) used AI models to diagnose hip fractures on plain radiographs and 21 (53.8%) used AI models to predict patient outcomes following hip fracture surgery. A total of 39 598 plain radiographs and 714 939 hip fractures were used for training, validating, and testing ML models specific to diagnosis and postoperative outcome prediction, respectively. Mortality and length of hospital stay were the most predicted outcomes. On pooled data analysis, compared with clinicians, the OR for diagnostic error of ML models was 0.79 (95% CI, 0.48-1.31; P = .36; I² = 60%) for hip fracture radiographs. For the ML models, the mean (SD) sensitivity was 89.3% (8.5%), specificity was 87.5% (9.9%), and F1 score was 0.90 (0.06). The mean area under the curve for mortality prediction was 0.84 with ML models compared with 0.79 for alternative controls (P = .09). CONCLUSIONS AND RELEVANCE The findings of this systematic review and meta-analysis suggest that the potential applications of AI to aid with diagnosis from hip radiographs are promising. The performance of AI in diagnosing hip fractures was comparable with that of expert radiologists and surgeons. However, current implementations of AI for outcome prediction do not seem to provide substantial benefit over traditional multivariable predictive statistics.
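To make the odds ratio comparison concrete, here is a minimal sketch of an odds ratio with a Wald-type 95% CI for a single 2x2 table. It is a simplification: the review pooled ORs across studies, which this sketch does not attempt. The function name `odds_ratio_with_ci` and the counts are hypothetical and are not taken from the review.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald-type CI for one 2x2 table.

    a = diagnostic errors by the ML model, b = correct reads by the ML model,
    c = diagnostic errors by clinicians,   d = correct reads by clinicians.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for this simple estimator")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); the CI is built on the log scale.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_value, ci_low, ci_high = odds_ratio_with_ci(a=10, b=90, c=20, d=80)
```

An interval that contains 1, such as the pooled 0.48-1.31 reported above, indicates no statistically significant difference in diagnostic error between the two groups.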
Affiliation(s)
- Johnathan R. Lex
- Division of Orthopaedic Surgery, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada
- Orthopaedics Biomechanics Laboratory, Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - Joseph Di Michele
- Division of Orthopaedic Surgery, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
| | - Robert Koucheki
- Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada
- Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Daniel Pincus
- Division of Orthopaedic Surgery, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Division of Orthopaedic Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
| | - Cari Whyne
- Orthopaedics Biomechanics Laboratory, Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - Bheeshma Ravi
- Division of Orthopaedic Surgery, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Division of Orthopaedic Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
|
40
|
Anderson PG, Baum GL, Keathley N, Sicular S, Venkatesh S, Sharma A, Daluiski A, Potter H, Hotchkiss R, Lindsey RV, Jones RM. Deep Learning Assistance Closes the Accuracy Gap in Fracture Detection Across Clinician Types. Clin Orthop Relat Res 2023; 481:580-588. [PMID: 36083847 PMCID: PMC9928835 DOI: 10.1097/corr.0000000000002385] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/04/2022] [Accepted: 08/05/2022] [Indexed: 01/31/2023]
Abstract
BACKGROUND Missed fractures are the most common diagnostic errors in musculoskeletal imaging and can result in treatment delays and preventable morbidity. Deep learning, a subfield of artificial intelligence, can be used to accurately detect fractures by training algorithms to emulate the judgments of expert clinicians. Deep learning systems that detect fractures are often limited to specific anatomic regions and require regulatory approval to be used in practice. Once these hurdles are overcome, deep learning systems have the potential to improve clinician diagnostic accuracy and patient care. QUESTIONS/PURPOSES This study aimed to evaluate whether a Food and Drug Administration-cleared deep learning system that identifies fractures in adult musculoskeletal radiographs would improve diagnostic accuracy for fracture detection across different types of clinicians. Specifically, this study asked: (1) What are the trends in musculoskeletal radiograph interpretation by different clinician types in the publicly available Medicare claims data? (2) Does the deep learning system improve clinician accuracy in diagnosing fractures on radiographs and, if so, is there a greater benefit for clinicians with limited training in musculoskeletal imaging? METHODS We used the publicly available Medicare Part B Physician/Supplier Procedure Summary data provided by the Centers for Medicare & Medicaid Services to determine the trends in musculoskeletal radiograph interpretation by clinician type. In addition, we conducted a multiple-reader, multiple-case study to assess whether clinician accuracy in diagnosing fractures on radiographs was superior when aided by the deep learning system compared with when unaided. 
Twenty-four clinicians (radiologists, orthopaedic surgeons, physician assistants, primary care physicians, and emergency medicine physicians) with a median (range) of 16 years (2 to 37) of experience postresidency each assessed 175 unique musculoskeletal radiographic cases under aided and unaided conditions (4200 total case-physician pairs per condition). These cases were comprised of radiographs from 12 different anatomic regions (ankle, clavicle, elbow, femur, forearm, hip, humerus, knee, pelvis, shoulder, tibia and fibula, and wrist) and were randomly selected from 12 hospitals and healthcare centers. The gold standard for fracture diagnosis was the majority opinion of three US board-certified orthopaedic surgeons or radiologists who independently interpreted the case. The clinicians' diagnostic accuracy was determined by the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, sensitivity, and specificity. Secondary analyses evaluated the fracture miss rate (1-sensitivity) by clinicians with and without extensive training in musculoskeletal imaging. RESULTS Medicare claims data revealed that physician assistants showed the greatest increase in interpretation of musculoskeletal radiographs within the analyzed time period (2012 to 2018), although clinicians with extensive training in imaging (radiologists and orthopaedic surgeons) still interpreted the majority of the musculoskeletal radiographs. Clinicians aided by the deep learning system had higher accuracy diagnosing fractures in radiographs compared with when unaided (unaided AUC: 0.90 [95% CI 0.89 to 0.92]; aided AUC: 0.94 [95% CI 0.93 to 0.95]; difference in least square mean per the Dorfman, Berbaum, Metz model AUC: 0.04 [95% CI 0.01 to 0.07]; p < 0.01). 
Clinician sensitivity increased when aided compared with when unaided (aided: 90% [95% CI 88% to 92%]; unaided: 82% [95% CI 79% to 84%]), and specificity increased when aided compared with when unaided (aided: 92% [95% CI 91% to 93%]; unaided: 89% [95% CI 88% to 90%]). Clinicians with limited training in musculoskeletal imaging missed a higher percentage of fractures when unaided compared with radiologists (miss rate for clinicians with limited imaging training: 20% [95% CI 17% to 24%]; miss rate for radiologists: 14% [95% CI 9% to 19%]). However, when assisted by the deep learning system, clinicians with limited training in musculoskeletal imaging reduced their fracture miss rate, resulting in a similar miss rate to radiologists (miss rate for clinicians with limited imaging training: 9% [95% CI 7% to 12%]; miss rate for radiologists: 10% [95% CI 6% to 15%]). CONCLUSION Clinicians were more accurate at diagnosing fractures when aided by the deep learning system, particularly those clinicians with limited training in musculoskeletal image interpretation. Reducing the number of missed fractures may allow for improved patient care and increased patient mobility. LEVEL OF EVIDENCE Level III, diagnostic study.
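The miss rate used in the secondary analyses is defined in the abstract as 1 - sensitivity. A short sketch of that arithmetic, applied to the pooled sensitivities reported above for all clinicians, follows; the function name `miss_rate` is hypothetical.

```python
def miss_rate(sensitivity: float) -> float:
    """Fracture miss rate, defined in the abstract as 1 - sensitivity."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be a proportion between 0 and 1")
    return 1.0 - sensitivity


# Pooled sensitivities reported in the abstract (all clinicians).
unaided_miss = miss_rate(0.82)
aided_miss = miss_rate(0.90)
print(f"unaided: {unaided_miss:.0%}, aided: {aided_miss:.0%}")
```

Under that definition the pooled miss rate falls from roughly 18% unaided to roughly 10% aided. The subgroup miss rates quoted in the abstract come from the study's own subgroup data and cannot be derived from these pooled figures.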
Affiliation(s)
| | | | | | - Serge Sicular
- Imagen Technologies, New York, NY, USA
- The Mount Sinai Hospital, New York, NY, USA
|
41
|
Orthopedic surgeons’ attitudes and expectations toward artificial intelligence: A national survey study. JOURNAL OF SURGERY AND MEDICINE 2023. [DOI: 10.28982/josam.7709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023] Open
Abstract
Background/Aim: There is a lack of understanding of artificial intelligence (AI) among orthopedic surgeons regarding how it can be used in their clinical practices. This study aimed to evaluate the attitudes of orthopedic surgeons regarding the application of AI in their practices.
Methods: A cross-sectional study was conducted in Turkey among 189 orthopedic surgeons between November 2021 and February 2022. An electronic survey was designed using the SurveyMonkey platform. The questionnaire included six subsections related to AI usefulness in clinical practice and participants’ knowledge about the topic. It also surveyed their acceptance level of learning, concerns about the potential risks of AI, and implementation of this technology into their daily practice.
Results: A total of 33.9% of the participants indicated that they were familiar with the concept of AI, while 82.5% planned to learn about artificial intelligence in the coming years. Most of the surgeons (68.3%) reported not using AI in their daily practice. The activities of orthopedic associations focused on AI were insufficient according to 77.2% of participants. Orthopedic surgeons expressed concern that future AI involvement could lead to an insensitive and nonempathic attitude toward the patient (53.5%). A majority of respondents (80.4%) indicated that AI was most feasible in extremity reconstruction. Pelvic fractures were identified as the region where an AI system is most needed for fracture classification (68.7%).
Conclusion: Most of the respondents did not use AI in their daily clinical practice; however, almost all surgeons had plans to learn about artificial intelligence in the future. There was a need to improve orthopedic associations’ activities focusing on artificial intelligence. Furthermore, new research including the medical ethics issues of the field will be needed to allay the surgeons’ worries. The classification system of pelvic fractures and sub-branches of orthopedic extremity reconstruction were the most feasible areas for AI systems. We believe that this study will serve as a guide for all branches of orthopedic medicine.
|
42
|
Hu H, Xu W, Jiang T, Cheng Y, Tao X, Liu W, Jian M, Li K, Wang G. Expert-Level Immunofixation Electrophoresis Image Recognition based on Explainable and Generalizable Deep Learning. Clin Chem 2023; 69:130-139. [PMID: 36544350 DOI: 10.1093/clinchem/hvac190] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/09/2022] [Accepted: 10/03/2022] [Indexed: 12/24/2022]
Abstract
BACKGROUND Immunofixation electrophoresis (IFE) is important for diagnosis of plasma cell disorders (PCDs). Manual analysis of IFE images is time-consuming and potentially subjective. An artificial intelligence (AI) system for automatic and accurate IFE image recognition is desirable. METHODS In total, 12 703 expert-annotated IFE images (9182 from a new IFE imaging system and 3521 from an old one) were used to develop and test an AI system that was an ensemble of 3 deep neural networks. The model takes an IFE image as input and predicts the presence of 8 basic patterns (IgA-κ, IgA-λ, IgG-κ, IgG-λ, IgM-κ, IgM-λ, light chain κ and λ) and their combinations. Score-based class activation maps (Score-CAMs) were used for visual explanation of the model's prediction. RESULTS The AI model achieved an average accuracy, sensitivity, and specificity of 99.82%, 93.17%, and 99.93%, respectively, for detection of the 8 basic patterns, which outperformed 4 junior experts with 1 year's experience and was comparable to a senior expert with 5 years' experience. The Score-CAMs gave a reasonable visual explanation of the prediction by highlighting the target-aligned regions in the bands and indicating potentially unreliable predictions. When trained with only the new system images, the model's performance was still higher than that of junior experts on both the new and old IFE systems, with average accuracy of 99.91% and 99.81%, respectively. CONCLUSIONS Our AI system achieved human-level performance in automatic recognition of IFE images, with high explainability and generalizability. It has the potential to improve the efficiency and reliability of diagnosis of PCDs.
Affiliation(s)
- Honghua Hu
- Department of Laboratory Medicine and Sichuan Provincial Key Laboratory for Human Disease Gene Study, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 610072, China
| | - Wei Xu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
| | - Ting Jiang
- Department of Laboratory Medicine, Tianfu New Area People's Hospital, Chengdu 610213, China
| | - Yuheng Cheng
- Department of Laboratory Medicine and Sichuan Provincial Key Laboratory for Human Disease Gene Study, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 610072, China
| | - Xiaoyan Tao
- Department of Laboratory Medicine and Sichuan Provincial Key Laboratory for Human Disease Gene Study, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 610072, China
| | - Wenna Liu
- Department of Laboratory Medicine and Sichuan Provincial Key Laboratory for Human Disease Gene Study, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 610072, China
| | - Meiling Jian
- Department of Laboratory Medicine and Sichuan Provincial Key Laboratory for Human Disease Gene Study, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 610072, China
| | - Kang Li
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
| | - Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
|
43
|
Üreten K, Maraş Y, Duran S, Gök K. Deep learning methods in the diagnosis of sacroiliitis from plain pelvic radiographs. Mod Rheumatol 2023; 33:202-206. [PMID: 34888699 DOI: 10.1093/mr/roab124] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/13/2021] [Revised: 11/14/2021] [Accepted: 12/06/2021] [Indexed: 01/05/2023]
Abstract
OBJECTIVES The aim of this study is to develop a computer-aided diagnosis method to assist physicians in evaluating sacroiliac radiographs. METHODS Convolutional neural networks, a deep learning method, were used in this retrospective study. Transfer learning was implemented with pre-trained VGG-16, ResNet-101 and Inception-v3 networks. Normal pelvic radiographs (n = 290) and pelvic radiographs with sacroiliitis (n = 295) were used for the training of networks. RESULTS The training results were evaluated with the criteria of accuracy, sensitivity, specificity and precision calculated from the confusion matrix, and AUC (area under the ROC curve) calculated from the ROC (receiver operating characteristic) curve. The pre-trained VGG-16 model yielded accuracy, sensitivity, specificity, precision and AUC figures of 89.9%, 90.9%, 88.9%, 88.9% and 0.96 with test images, respectively. These results were 84.3%, 91.9%, 78.8%, 75.6% and 0.92 with pre-trained ResNet-101, and 82.0%, 79.6%, 85.0%, 86.7% and 0.90 with pre-trained Inception-v3, respectively. CONCLUSIONS Successful results were obtained with all three models in this study, in which transfer learning was applied with pre-trained VGG-16, ResNet-101 and Inception-v3 networks. This method can assist clinicians in the diagnosis of sacroiliitis, provide them with a second objective interpretation and also reduce the need for advanced imaging methods such as magnetic resonance imaging.
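The four criteria derived from the confusion matrix can be written out as a short sketch. The class name `ConfusionMatrix` and the counts below are hypothetical; they are chosen for easy arithmetic and are not the study's data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """Binary confusion matrix counts (positive = sacroiliitis)."""
    tp: int  # diseased, predicted diseased
    fp: int  # normal, predicted diseased
    tn: int  # normal, predicted normal
    fn: int  # diseased, predicted normal

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)


# Hypothetical test-set counts for illustration only.
cm = ConfusionMatrix(tp=45, fp=10, tn=40, fn=5)
metrics = {
    "accuracy": cm.accuracy(),
    "sensitivity": cm.sensitivity(),
    "specificity": cm.specificity(),
    "precision": cm.precision(),
}
```

AUC is not included because it depends on the ranking of continuous model scores across all thresholds rather than on a single confusion matrix.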
Affiliation(s)
- Kemal Üreten
- Department of Rheumatology, Faculty of Medicine, Kırıkkale University, Ankara, Turkey
- Computer Engineering Department, MSc, Çankaya University, Ankara, Turkey
| | - Yüksel Maraş
- Department of Rheumatology, Ankara City Hospital, Ankara, Turkey
| | - Semra Duran
- Department of Radiology, Ankara City Hospital, Ankara, Turkey
| | - Kevser Gök
- Department of Rheumatology, Ankara City Hospital, Ankara, Turkey
|
44
|
Rashid T, Zia MS, Najam-ur-Rehman, Meraj T, Rauf HT, Kadry S. A Minority Class Balanced Approach Using the DCNN-LSTM Method to Detect Human Wrist Fracture. Life (Basel) 2023; 13:133. [PMID: 36676082 PMCID: PMC9861673 DOI: 10.3390/life13010133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 12/19/2022] [Accepted: 12/27/2022] [Indexed: 01/06/2023] Open
Abstract
Hospital emergency departments receive a large number of patients with wrist fractures. For the clinical diagnosis of a suspected fracture, X-ray imaging is the major screening tool. Wrist fracture is a significant global health concern for children, adolescents, and the elderly. A missed diagnosis of wrist fracture on medical imaging can have significant consequences for patients, resulting in delayed treatment and poor functional recovery. Therefore, an automated diagnostic tool is needed that can precisely diagnose wrist fractures and serve as a second opinion for doctors. In this research, a fused deep learning model combining a convolutional neural network (CNN) and long short-term memory (LSTM) is proposed to detect wrist fractures from X-ray images. It offers doctors a second opinion based on computer vision to reduce the number of missed fractures. The dataset, acquired from Mendeley, comprises 192 wrist X-ray images. In this framework, image pre-processing is applied first; data augmentation is then used to address class imbalance by generating rotated oversamples of minority-class images during training; and the pre-processed and augmented, normalized images are fed into a 28-layer dilated CNN (DCNN) to extract deep features. These features are then fed to the proposed LSTM network to distinguish fractured wrists from normal ones. The experimental results of the DCNN-LSTM with and without augmentation are compared with those of other deep learning models. The proposed work is also compared with existing algorithms in terms of accuracy, sensitivity, specificity, precision, F1-score, and kappa. The results show that the DCNN-LSTM fusion achieves higher accuracy and has high potential for use as a second opinion in medical applications.
Affiliation(s)
- Tooba Rashid
- Department of Computer Science, The University of Lahore, Chenab Campus, Gujrat 50700, Pakistan
| | - Muhammad Sultan Zia
- Department of Computer Science, The University of Chenab, Gujrat 50700, Pakistan
| | - Najam-ur-Rehman
- Department of Human Resource Section, University of Gujrat, Gujrat 50700, Pakistan
| | - Talha Meraj
- Department of Computer Science, COMSATS University Islamabad-Wah Campus, Wah Cantt 47040, Pakistan
| | - Hafiz Tayyab Rauf
- Centre for Smart Systems, AI and Cybersecurity, Staffordshire University, Stoke-on-Trent ST4 2DE, UK
| | - Seifedine Kadry
- Department of Applied Data Science, Noroff University College, 4612 Kristiansand, Norway
- Artificial Intelligence Research Center (AIRC), Ajman University, Ajman 346, United Arab Emirates
- Department of Electrical and Computer Engineering, Lebanese American University, Byblos P.O. Box 13-5053, Lebanon
|
45
|
Beyaz S, Yayli SB, Kılıc E, Doktur U. The ensemble artificial intelligence (AI) method: Detection of hip fractures in AP pelvis plain radiographs by majority voting using a multi-center dataset. Digit Health 2023; 9:20552076231216549. [PMID: 38033522 PMCID: PMC10685786 DOI: 10.1177/20552076231216549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 11/07/2023] [Indexed: 12/02/2023] Open
Abstract
Introduction This article was undertaken to explore the potential of AI in enhancing the diagnostic accuracy and efficiency in identifying hip fractures using X-ray radiographs. In the study, we trained three distinct deep learning models, and we utilized majority voting to evaluate their outcomes, aiming to yield the most reliable and precise diagnoses of hip fractures from X-ray radiographs. Methods An initial study was conducted of 10,849 AP pelvis X-rays obtained from five hospitals affiliated with Başkent University. Two expert orthopedic surgeons initially labeled 2,291 radiographs as fractures and 8,558 as non-fractures. The algorithm was trained on 6,943 (64%) radiographs, validated on 1,736 (16%) radiographs, and tested on 2,170 (20%) radiographs, ensuring an even distribution of fracture presence, age, and gender. We employed three advanced deep learning architectures, Xception (Model A), EfficientNet (Model B), and NfNet (Model C), with a final decision aggregated through a majority voting technique (Model D). Results For each model, we achieved the following metrics. For Model A: F1 Score 0.895, Accuracy 0.956, Specificity 0.973, Sensitivity 0.893. For Model B: F1 Score 0.900, Accuracy 0.960, Specificity 0.991, Sensitivity 0.845. For Model C: F1 Score 0.919, Accuracy 0.966, Specificity 0.984, Sensitivity 0.899. For Model D: F1 Score 0.929, Accuracy 0.971, Specificity 0.991, Sensitivity 0.897. We concluded that Model D (majority voting) achieved the best results in terms of the F1 score, accuracy, and specificity values. Conclusions Our study demonstrates that the results obtained by aggregating the decisions of multiple models through voting, rather than relying solely on the decision of a single algorithm, are more consistent. The practical application of these algorithms will be difficult due to ethical, legal, and confidentiality issues, despite the theoretical success achieved. 
Developing successful algorithms and methodologies should not be viewed as the ultimate goal; it is important to understand how these algorithms will be used in real-life situations. In order to achieve more consistent results, feedback from clinical practice will be helpful.
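The majority voting step (Model D) can be sketched as follows. This is an illustration of hard voting over binary labels under stated assumptions, not the authors' code: the function name `majority_vote` and the toy predictions are hypothetical, and the abstract does not say whether votes were taken on hard labels or on thresholded probabilities.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model labels (one inner sequence per model) by majority vote.

    With an odd number of models and binary labels, ties cannot occur.
    """
    n_cases = len(predictions[0])
    if any(len(model_preds) != n_cases for model_preds in predictions):
        raise ValueError("every model must predict the same number of cases")
    ensemble = []
    for case_votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        label, _count = Counter(case_votes).most_common(1)[0]
        ensemble.append(label)
    return ensemble


# Toy predictions for four radiographs (1 = fracture, 0 = no fracture).
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [0, 1, 1, 0]
model_d = majority_vote([model_a, model_b, model_c])
```

In the toy data each single model disagrees with the other two on one case, and the vote overrides that isolated disagreement, which is the mechanism behind the more consistent results the authors describe.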
Affiliation(s)
- Salih Beyaz
- Başkent University Adana Dr. Turgut Noyan Research and Training Centre, Orthopedics and Traumatology Department, Adana, Türkiye
| | - Sahika Betul Yayli
- Turkcell Technology, Artificial Intelligence & Digital Analytic Solutions, İstanbul, Türkiye
| | - Ersin Kılıc
- Turkcell Technology, Artificial Intelligence & Digital Analytic Solutions, İstanbul, Türkiye
| | - Ugur Doktur
- Turkcell Technology, Artificial Intelligence & Digital Analytic Solutions, İstanbul, Türkiye
|
46
|
Hinz M, Lutter C, Mueller-Rath R, Niemeyer P, Miltner O, Tischer T. The German Arthroscopy Registry DART: what has happened after 5 years? Knee Surg Sports Traumatol Arthrosc 2023; 31:102-109. [PMID: 36153780 PMCID: PMC9510517 DOI: 10.1007/s00167-022-07152-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/27/2022] [Accepted: 08/30/2022] [Indexed: 01/25/2023]
Abstract
PURPOSE The German Arthroscopy Registry (DART) was initiated in 2017 with the aim of collecting real-life data of patients undergoing knee, shoulder, hip or ankle surgery. The purpose of this study was to present an overview of the current status and the collected data thus far. METHODS Data entered between 11/2017 and 01/2022 were analyzed. The number of cases (each case is defined as a single operation with or without concomitant procedures) entered for each joint, follow-up rates and trends between different age groups (18-29 years, 30-44 years, 45-64 years, ≥ 65 years) and across genders, and quality of life improvement (pre- vs. 1 year postoperative EQ visual analogue scale [EQ-VAS]) for frequently performed procedures (medial meniscus repair [MMR] vs. rotator cuff repair [RCR] vs. microfracturing of the talus [MFX-T]) were investigated. RESULTS Overall, 6651 cases were entered into DART, forming three distinct modules classified by joint (5370 knee, 1053 shoulder and 228 ankle cases). The most commonly entered procedures were: knee: partial medial meniscectomy (n = 2089), chondroplasty (n = 1389), anterior cruciate ligament reconstruction with hamstring autograft (n = 880); shoulder: subacromial decompression (n = 631), bursectomy (n = 385), RCR (n = 359); ankle: partial synovectomy (n = 117), tibial osteophyte resection (n = 72), loose body removal (n = 48). In the knee and shoulder modules, middle-aged patients were the predominant age group, whereas in the ankle module, the youngest age group was the most frequent one. The two oldest age groups had the highest 1-year follow-up rates across all modules. In the knee and shoulder modules, 1-year follow-up rates were higher in female patients, whereas follow-up rates were higher in male patients in the ankle module. 
From pre- to 1-year postoperative, MFX-T (EQ-VAS: 50.0 [25-75% interquartile range: 31.8-71.5] to 75.0 [54.3-84.3]; ∆ + 25.0) led to a comparably larger improvement in quality of life than did MMR (EQ-VAS: 70.0 [50.0-80.0] to 85.0 [70.0-94.0]; ∆ + 15.0) or RCR (EQ-VAS: 67.0 [50.0-80.0] to 85.0 [70.0-95.0]; ∆ + 18.0). CONCLUSION DART has been sufficiently established and collects high-quality patient-related data with satisfactory follow-up allowing for a comprehensive analysis of the collected data. The current focus lies on improving patient enrolment and follow-up rates as well as initiating the hip module.
Affiliation(s)
- Maximilian Hinz
- Department of Sports Orthopaedics, Technical University of Munich, Ismaninger Street 22, 81675, Munich, Germany.
| | - Christoph Lutter
- Department of Orthopaedics, Rostock University Medical Center, Rostock, Germany
| | | | | | | | - Thomas Tischer
- Department of Orthopaedics, Rostock University Medical Center, Rostock, Germany; Department of Orthopaedic and Traumatologic Surgery, Waldkrankenhaus, Erlangen, Germany
|
47
|
Performance of a deep convolutional neural network for MRI-based vertebral body measurements and insufficiency fracture detection. Eur Radiol 2022; 33:3188-3199. [PMID: 36576545 PMCID: PMC10121505 DOI: 10.1007/s00330-022-09354-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2022] [Revised: 09/23/2022] [Accepted: 11/29/2022] [Indexed: 12/29/2022]
Abstract
OBJECTIVES The aim is to validate the performance of a deep convolutional neural network (DCNN) for vertebral body measurements and insufficiency fracture detection on lumbar spine MRI. METHODS This retrospective analysis included 1000 vertebral bodies in 200 patients (age 75.2 ± 9.8 years) who underwent lumbar spine MRI at multiple institutions. 160/200 patients had ≥ one vertebral body insufficiency fracture, 40/200 had no fracture. The performance of the DCNN and that of two fellowship-trained musculoskeletal radiologists in vertebral body measurements (anterior/posterior height, extent of endplate concavity, vertebral angle) and evaluation for insufficiency fractures were compared. Statistics included (a) interobserver reliability metrics using intraclass correlation coefficient (ICC), kappa statistics, and Bland-Altman analysis, and (b) diagnostic performance metrics (sensitivity, specificity, accuracy). A statistically significant difference was accepted if the 95% confidence intervals did not overlap. RESULTS The inter-reader agreement between radiologists and the DCNN was excellent for vertebral body measurements, with ICC values of > 0.94 for anterior and posterior vertebral height and vertebral angle, and good to excellent for superior and inferior endplate concavity with ICC values of 0.79-0.85. The performance of the DCNN in fracture detection yielded a sensitivity of 0.941 (0.903-0.968), specificity of 0.969 (0.954-0.980), and accuracy of 0.962 (0.948-0.973). The diagnostic performance of the DCNN was independent of the radiological institution (accuracy 0.964 vs. 0.960), type of MRI scanner (accuracy 0.957 vs. 0.964), and magnetic field strength (accuracy 0.966 vs. 0.957). CONCLUSIONS A DCNN can achieve high diagnostic performance in vertebral body measurements and insufficiency fracture detection on heterogeneous lumbar spine MRI. 
KEY POINTS • A DCNN has the potential for high diagnostic performance in measuring vertebral bodies and detecting insufficiency fractures of the lumbar spine.
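The diagnostic metrics and the non-overlap criterion named in this abstract can be illustrated with a minimal sketch. It assumes a Wilson score interval for the 95% confidence intervals; the abstract does not state which interval method the authors used. The confusion-matrix counts in the example are hypothetical and are not taken from the study.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method; z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (centre - half, centre + half)


def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and accuracy, each as (estimate, interval)."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
        "accuracy": ((tp + tn) / total, wilson_interval(tp + tn, total)),
    }


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Hypothetical counts for two readers on the same cases.
    reader_a = diagnostic_metrics(tp=90, fp=20, tn=180, fn=10)
    reader_b = diagnostic_metrics(tp=85, fp=25, tn=175, fn=15)
    for name in ("sensitivity", "specificity", "accuracy"):
        overlap = intervals_overlap(reader_a[name][1], reader_b[name][1])
        print(name, "significantly different:", not overlap)
```

Under the criterion described in the abstract, two estimates count as significantly different only when `intervals_overlap` returns `False`. This is a conservative rule compared with a direct test of the difference.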
|
48
|
Farhadi F, Barnes MR, Sugito HR, Sin JM, Henderson ER, Levy JJ. Applications of artificial intelligence in orthopaedic surgery. Front Med Technol 2022; 4:995526. [PMID: 36590152 PMCID: PMC9797865 DOI: 10.3389/fmedt.2022.995526] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 07/16/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022]
Abstract
The practice of medicine is rapidly transforming as a result of technological breakthroughs. Artificial intelligence (AI) systems are becoming increasingly relevant in medicine and orthopaedic surgery as a result of the nearly exponential growth in computer processing power, cloud-based computing, and the development and refinement of medical-task-specific software algorithms. Because technologies such as medical imaging bring high sensitivity, specificity, and positive/negative prognostic value to the management of orthopaedic disorders, the field is particularly ripe for machine-based integration of imaging studies, among other applications. Through this review, we seek to promote awareness in the orthopaedics community of the current accomplishments and projected uses of AI and machine learning (ML) as described in the literature. We summarize the current state of the art in the use of ML and AI in five key orthopaedic disciplines: joint reconstruction, spine, orthopaedic oncology, trauma, and sports medicine.
Affiliation(s)
- Faraz Farhadi: Geisel School of Medicine, Dartmouth College, Hanover, NH, United States; Radiology and Imaging Sciences, National Institutes of Health (NIH), Bethesda, United States
- Matthew R. Barnes: Geisel School of Medicine, Dartmouth College, Hanover, NH, United States
- Harun R. Sugito: Geisel School of Medicine, Dartmouth College, Hanover, NH, United States
- Jessica M. Sin: Department of Radiology, Dartmouth Health, Lebanon, United States
- Eric R. Henderson: Department of Orthopaedics, Dartmouth Health, Lebanon, United States
- Joshua J. Levy: Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH, United States
|
49
|
Zech JR, Santomartino SM, Yi PH. Artificial Intelligence (AI) for Fracture Diagnosis: An Overview of Current Products and Considerations for Clinical Adoption, From the AJR Special Series on AI Applications. AJR Am J Roentgenol 2022; 219:869-878. [PMID: 35731103 DOI: 10.2214/ajr.22.27873] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 12/14/2022]
Abstract
Fractures are common injuries that can be difficult to diagnose, with missed fractures accounting for most misdiagnoses in the emergency department. Artificial intelligence (AI) and, specifically, deep learning have shown a strong ability to accurately detect fractures and augment the performance of radiologists in proof-of-concept research settings. Although the number of real-world AI products available for clinical use continues to increase, guidance for practicing radiologists in the adoption of this new technology is limited. This review describes how AI and deep learning algorithms can help radiologists to better diagnose fractures. The article also provides an overview of commercially available U.S. FDA-cleared AI tools for fracture detection as well as considerations for the clinical adoption of these tools by radiology practices.
Affiliation(s)
- John R Zech: Department of Radiology, Columbia University Irving Medical Center/New York-Presbyterian Hospital, New York, NY
- Samantha M Santomartino: Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland Medical Intelligent Imaging (UM2ii) Center, University of Maryland School of Medicine, 670 W Baltimore St, First Fl, Rm 1172, Baltimore, MD 21201
- Paul H Yi: Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland Medical Intelligent Imaging (UM2ii) Center, University of Maryland School of Medicine, 670 W Baltimore St, First Fl, Rm 1172, Baltimore, MD 21201
|
50
|
Yang L, Gao S, Li P, Shi J, Zhou F. Recognition and Segmentation of Individual Bone Fragments with a Deep Learning Approach in CT Scans of Complex Intertrochanteric Fractures: A Retrospective Study. J Digit Imaging 2022; 35:1681-1689. [PMID: 35711073 PMCID: PMC9712885 DOI: 10.1007/s10278-022-00669-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/24/2021] [Revised: 05/04/2022] [Accepted: 06/07/2022] [Indexed: 10/18/2022]
Abstract
The characteristics of bone fragments are the main factors influencing the choice of treatment in intertrochanteric fractures. This study aimed to develop a deep learning algorithm that recognizes and segments individual fragments in CT images of complex intertrochanteric fractures for orthopedic surgeons. This retrospective study was based on 160 hip CT scans (43,510 images) of complex fractures of three types according to the Evans-Jensen classification: 40 cases of type 3 (IIA) fractures, 80 cases of type 4 (IIB) fractures, and 40 cases of type 5 (III) fractures. The images were randomly split into two groups to construct a training set of 120 CT scans (32,045 images) and a testing set of 40 CT scans (11,465 images). The deep learning model was built as a cascaded architecture composed of one convolutional neural network (CNN) that locates the fracture ROI and another CNN that recognizes and segments individual fragments within the ROI. The accuracy of object detection and the Dice coefficient of segmentation of individual fragments were used to evaluate model performance. The model yielded an average accuracy of 89.4% for individual fragment recognition and an average Dice coefficient of 90.5% for segmentation in CT images. The results demonstrated the feasibility of recognizing and segmenting individual fragments in complex intertrochanteric fractures with a deep learning approach. Altogether, these promising results suggest the potential of our model to be applied to many clinical scenarios.
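The Dice coefficient used here as the segmentation metric can be sketched as follows. This is a minimal illustration for two binary masks given as nested lists, not the authors' evaluation code; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    total = sum(flat_pred) + sum(flat_truth)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    # Hypothetical 2x3 masks: 2 overlapping pixels, 3 foreground pixels in each mask.
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(round(dice_coefficient(predicted, reference), 3))  # prints 0.667
```

A value of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels. The 90.5% reported in the abstract corresponds to 0.905 on this scale.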
Affiliation(s)
- Lv Yang: Department of Orthopedics, Peking University Third Hospital, Beijing, China
- Shan Gao: Department of Orthopedics, Peking University Third Hospital, Beijing, China
- Pengfei Li: Department of Orthopedics, Peking University Third Hospital, Beijing, China
- Jiancheng Shi: Department of Radiology, Peking University Third Hospital, Yanqing Hospital, Beijing, China
- Fang Zhou: Department of Orthopedics, Peking University Third Hospital, Beijing, China
|