Hung TNK, Vy VPT, Tri NM, Hoang LN, Tuan LV, Ho QT, Le NQK, Kang JH. Automatic Detection of Meniscus Tears Using Backbone Convolutional Neural Networks on Knee MRI. J Magn Reson Imaging 2023;57:740-749. [PMID: 35648374 DOI: 10.1002/jmri.28284]
[Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/10/2022] [Revised: 05/21/2022] [Accepted: 05/23/2022] [Indexed: 01/19/2023]
Abstract
BACKGROUND
Timely diagnosis of meniscus injuries is key for preventing knee joint dysfunction and improving patient outcomes because it decreases morbidity and facilitates treatment planning.
PURPOSE
To train and evaluate a deep learning model for automated detection of meniscus tears on knee magnetic resonance imaging (MRI).
STUDY TYPE
Bicentric retrospective study.
SUBJECTS
In total, 584 knee MRI studies, divided among training (n = 234), testing (n = 200), and external validation (n = 150) data sets, were used in this study. The public data set MRNet was used as a second external validation data set to evaluate the performance of the model.
SEQUENCE
3 T. Coronal and sagittal images from T1-weighted proton density (PD) fast spin-echo (FSE) with fat saturation and T2-weighted FSE with fat saturation sequences.
ASSESSMENT
The detection system for meniscus tear was based on the improved YOLOv4 model with Darknet-53 as the backbone. The performance of the model was also compared with that of three radiologists of varying levels of experience. The determination of the presence of a meniscus tear from surgery reports was used as the ground truth for the images.
STATISTICAL TESTS
Sensitivity, specificity, prevalence, positive predictive value, negative predictive value, accuracy, and receiver operating characteristic curve were used to evaluate the performance of the detection model. Two-way analysis of variance, Wilcoxon signed-rank test, and Tukey's multiple tests were used to evaluate differences in performance between the model and radiologists.
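As a rough illustration of how the listed diagnostic metrics relate to one another, the sketch below derives them from the four cells of a 2 x 2 confusion matrix. The class name, function name, and counts are hypothetical and chosen only for the example; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2 x 2 confusion matrix (hypothetical container)."""
    tp: int  # tear present, model says tear
    fp: int  # tear absent, model says tear
    tn: int  # tear absent, model says no tear
    fn: int  # tear present, model says no tear


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard diagnostic metrics as fractions in [0, 1].

    Assumes every denominator is non-zero, i.e. there is at least one
    positive case, one negative case, one positive call, and one
    negative call.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),   # true positive rate
        "specificity": c.tn / (c.tn + c.fp),   # true negative rate
        "prevalence": (c.tp + c.fn) / total,   # share of true tears
        "ppv": c.tp / (c.tp + c.fp),           # positive predictive value
        "npv": c.tn / (c.tn + c.fn),           # negative predictive value
        "accuracy": (c.tp + c.tn) / total,     # share of correct calls
    }


# Illustrative counts only; these numbers are NOT from the study.
example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Positive and negative predictive values depend on prevalence, so they can shift between data sets even when sensitivity and specificity stay the same.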
RESULTS
The overall accuracies for detecting meniscus tears using our model on the internal testing, internal validation, and external validation data sets were 95.4%, 95.8%, and 78.8%, respectively. One radiologist had significantly lower performance than our model in detecting meniscal tears (accuracy: 0.9025 ± 0.093 vs. 0.9580 ± 0.025).
DATA CONCLUSION
The proposed model had high sensitivity, specificity, and accuracy for detecting meniscus tears on knee MRIs.
EVIDENCE LEVEL
3
TECHNICAL EFFICACY
Stage 2.