Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Bayliss L, Jones LD. The role of artificial intelligence and machine learning in predicting orthopaedic outcomes. Bone Joint J 2019;101-B:1476-1478. [DOI: 10.1302/0301-620x.101b12.bjj-2019-0850.r1] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 11/05/2022]

Cited by Other Article(s)

Lu T, Lu M, Liu H, Song D, Wang Z, Guo Y, Fang Y, Chen Q, Li T. Establishment of a prognostic model for gastric cancer patients who underwent radical gastrectomy using machine learning: a two-center study. Front Oncol 2024;13:1282042. [PMID: 38665864 PMCID: PMC11043579 DOI: 10.3389/fonc.2023.1282042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 12/21/2023] [Indexed: 04/28/2024] Open

Abstract

Objective

Gastric cancer is a prevalent gastrointestinal malignancy worldwide. In this study, a prognostic model was developed for gastric cancer patients who underwent radical gastrectomy using machine learning, employing advanced computational techniques to investigate postoperative mortality risk factors in such patients.

Methods

Data of 295 patients with gastric cancer who underwent radical gastrectomy at the Department of General Surgery of Affiliated Hospital of Xuzhou Medical University (Xuzhou, China) between March 2016 and November 2019 were retrospectively analyzed as the training group. Additionally, 109 patients who underwent radical gastrectomy at the Department of General Surgery Affiliated to Jining First People's Hospital (Jining, China) were included for external validation. Four machine learning models, including logistic regression (LR), decision tree (DT), random forest (RF), and gradient boosting machine (GBM), were utilized. Model performance was assessed by comparing the area under the curve (AUC) for each model. An LR-based nomogram model was constructed to assess patients' clinical prognosis.

Results

Lasso regression identified eight associated factors: age, sex, maximum tumor diameter, nerve or vascular invasion, TNM stage, gastrectomy type, lymphocyte count, and carcinoembryonic antigen (CEA) level. The performance of these models was evaluated using the AUC. In the training group, the AUC values were 0.795, 0.759, 0.873, and 0.853 for LR, DT, RF, and GBM, respectively. In the validation group, the AUC values were 0.734, 0.708, 0.746, and 0.707 for LR, DT, RF, and GBM, respectively. The nomogram model, constructed based on LR, demonstrated excellent clinical prognostic evaluation capabilities.

Conclusion

Machine learning algorithms are robust performance assessment tools for evaluating the prognosis of gastric cancer patients who have undergone radical gastrectomy. The LR-based nomogram model can aid clinicians in making more reliable clinical decisions.

Ghadirinejad K, Milimonfared R, Taylor M, Solomon LB, Graves S, Pratt N, de Steiger R, Hashemi R. Supervised machine learning for the prediction of post-operative clinical outcomes of hip and knee replacements: a review. ANZ J Surg 2024. [PMID: 38597170 DOI: 10.1111/ans.19003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 02/28/2024] [Accepted: 03/27/2024] [Indexed: 04/11/2024]

Tang Y, Liu Y, Du Z, Wang Z, Pan S. Prediction of coronary artery lesions in children with Kawasaki syndrome based on machine learning. BMC Pediatr 2024;24:158. [PMID: 38443868 PMCID: PMC10916227 DOI: 10.1186/s12887-024-04608-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 01/31/2024] [Indexed: 03/07/2024] Open

Abstract

OBJECTIVE

Kawasaki syndrome (KS) is an acute vasculitis that affects children < 5 years of age and leads to coronary artery lesions (CAL) in about 20-25% of untreated cases. Machine learning (ML) is a branch of artificial intelligence (AI) that integrates complex data sets on a large scale and uses huge data to predict future events. The purpose of the present study was to use ML to present the model for early risk assessment of CAL in children with KS by different algorithms.

METHODS

A total of 158 children were enrolled from Women and Children's Hospital, Qingdao University, and split 70%/30% into training and test sets for modeling and validation. Several classifiers were constructed, including random forest (RF), logistic regression (LR), and eXtreme Gradient Boosting (XGBoost). The data were preprocessed before the classifiers were applied. To avoid overfitting, 5-fold cross-validation was used across all the data.

RESULTS

The area under the curve (AUC) of the RF model was 0.925 according to the validation of the test set. The average accuracy was 0.930 (95% CI, 0.905 to 0.956). The AUC of the LR model was 0.888 and the average accuracy was 0.893 (95% CI, 0.837 to 0.950). The AUC of the XGBoost model was 0.879 and the average accuracy was 0.935 (95% CI, 0.891 to 0.980).

CONCLUSION

The RF algorithm was used in the present study to construct an effective prediction model for CAL, with an accuracy of 0.930 and an AUC of 0.925. The novel model established by ML may help guide clinicians in the initial decision to pursue more aggressive anti-inflammatory therapy. Because of the limited external validation and the regional characteristics of the population, additional research is required before the model is applied in the clinic.

Lu T, Lu M, Wu D, Ding YY, Liu HN, Li TT, Song DQ. Predictive value of machine learning models for lymph node metastasis in gastric cancer: A two-center study. World J Gastrointest Surg 2024;16:85-94. [PMID: 38328326 PMCID: PMC10845275 DOI: 10.4240/wjgs.v16.i1.85] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 11/24/2023] [Accepted: 12/21/2023] [Indexed: 01/25/2024] Open

Abstract

BACKGROUND

Gastric cancer is one of the most common malignant tumors in the digestive system, ranking sixth in incidence and fourth in mortality worldwide. Since 42.5% of metastatic lymph nodes in gastric cancer belong to nodule type and peripheral type, the application of imaging diagnosis is restricted.

AIM

To establish models for predicting the risk of lymph node metastasis in gastric cancer patients using machine learning (ML) algorithms and to evaluate their predictive performance in clinical practice.

METHODS

Data of a total of 369 patients who underwent radical gastrectomy at the Department of General Surgery of Affiliated Hospital of Xuzhou Medical University (Xuzhou, China) from March 2016 to November 2019 were collected and retrospectively analyzed as the training group. In addition, data of 123 patients who underwent radical gastrectomy at the Department of General Surgery of Jining First People's Hospital (Jining, China) were collected and analyzed as the verification group. Seven ML models, including decision tree, random forest, support vector machine (SVM), gradient boosting machine, naive Bayes, neural network, and logistic regression, were developed to evaluate the occurrence of lymph node metastasis in patients with gastric cancer. The ML models were established following ten cross-validation iterations using the training dataset, and subsequently, each model was assessed using the test dataset. The models' performance was evaluated by comparing the area under the receiver operating characteristic curve of each model.

RESULTS

Among the seven ML models, except for SVM, the other ones exhibited higher accuracy and reliability, and the influences of various risk factors on the models are intuitive.

CONCLUSION

The ML models developed exhibit strong predictive capabilities for lymph node metastasis in gastric cancer, which can aid in personalized clinical diagnosis and treatment.

Lu T, Fang Y, Liu H, Chen C, Li T, Lu M, Song D. Comparison of Machine Learning and Logic Regression Algorithms for Predicting Lymph Node Metastasis in Patients with Gastric Cancer: A two-Center Study. Technol Cancer Res Treat 2024;23:15330338231222331. [PMID: 38190617 PMCID: PMC10775719 DOI: 10.1177/15330338231222331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 11/01/2023] [Accepted: 11/20/2023] [Indexed: 01/10/2024] Open

Zhang Z, Luo Y, Zhang C, Wang X, Zhang T, Zhang G. Prediction of gap balancing based on 2-D radiography in total knee arthroplasty for knee osteoarthritis patients. ARTHROPLASTY 2023;5:60. [PMID: 37968740 PMCID: PMC10652581 DOI: 10.1186/s42836-023-00218-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Accepted: 10/13/2023] [Indexed: 11/17/2023] Open

Karlin EA, Lin CC, Meftah M, Slover JD, Schwarzkopf R. The Impact of Machine Learning on Total Joint Arthroplasty Patient Outcomes: A Systemic Review. J Arthroplasty 2023;38:2085-2095. [PMID: 36441039 DOI: 10.1016/j.arth.2022.10.039] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/28/2022] [Revised: 10/19/2022] [Accepted: 10/24/2022] [Indexed: 11/27/2022] Open

Abstract

BACKGROUND

Supervised machine learning techniques have been increasingly applied to predict patient outcomes after hip and knee arthroplasty procedures. The purpose of this study was to systematically review the applications of supervised machine learning techniques to predict patient outcomes after primary total hip and knee arthroplasty.

METHODS

A comprehensive literature search using the electronic databases MEDLINE, EMBASE, Cochrane Central Register of Controlled Trials, and Cochrane Database of Systematic Reviews was conducted in July of 2021. The inclusion criteria were studies that utilized supervised machine learning techniques to predict patient outcomes after primary total hip or knee arthroplasty.

RESULTS

Search criteria yielded n = 30 relevant studies. Topics of study included patient complications (n = 6), readmissions (n = 1), revision (n = 2), patient-reported outcome measures (n = 4), patient satisfaction (n = 4), inpatient status and length of stay (LOS) (n = 9), opioid usage (n = 3), and patient function (n = 1). Studies involved TKA (n = 12), THA (n = 11), or a combination (n = 7). Less than 35% of predictive outcomes had an area under the receiver operating characteristic curve (AUC) in the excellent or outstanding range. Additionally, only 9 of the studies found improvement over logistic regression, and only 9 studies were externally validated.

CONCLUSION

Supervised machine learning algorithms are powerful tools that have been increasingly applied to predict patient outcomes after total hip and knee arthroplasty. However, these algorithms should be evaluated in the context of prognostic accuracy, comparison to traditional statistical techniques for outcome prediction, and application to populations outside the training set. While machine learning algorithms have been received with considerable interest, they should be critically assessed and validated prior to clinical adoption.

Johnson QJ, Jabal MS, Arguello AM, Lu Y, Jurgensmeier K, Levy BA, Camp CL, Krych AJ. Machine learning can accurately predict risk factors for all-cause reoperation after ACLR: creating a clinical tool to improve patient counseling and outcomes. Knee Surg Sports Traumatol Arthrosc 2023;31:4099-4108. [PMID: 37414947 DOI: 10.1007/s00167-023-07497-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/23/2023] [Accepted: 06/16/2023] [Indexed: 07/08/2023]

Abstract

PURPOSE

Identifying predictive factors for all-cause reoperation after anterior cruciate ligament reconstruction could inform clinical decision making and improve risk mitigation. The primary purposes of this study are to (1) determine the incidence of all-cause reoperation after anterior cruciate ligament reconstruction, (2) identify predictors of reoperation after anterior cruciate ligament reconstruction using machine learning methodology, and (3) compare the predictive capacity of the machine learning methods to that of traditional logistic regression.

METHODS

A longitudinal geographical database was utilized to identify patients with a diagnosis of new anterior cruciate ligament injury. Eight machine learning models were appraised on their ability to predict all-cause reoperation after anterior cruciate ligament reconstruction. Model performance was evaluated via area under the receiver operating characteristics curve. To explore modeling interpretability and radiomic feature influence on the predictions, we utilized a game-theory-based method through SHapley Additive exPlanations.

RESULTS

A total of 1400 patients underwent anterior cruciate ligament reconstruction with a mean postoperative follow-up of 9 years. Two hundred and eighteen patients (16%) underwent a reoperation after anterior cruciate ligament reconstruction; 6% of these reoperations were revision ACL reconstructions. SHapley Additive exPlanations plots identified the following risk factors as predictive for all-cause reoperation: diagnosis of systemic inflammatory disease, distal tear location, concomitant medial collateral ligament repair, higher visual analog scale pain score prior to surgery, hamstring autograft, tibial fixation via radial expansion device, younger age at initial injury, and concomitant meniscal repair. Pertinent negatives, when compared to previous studies, included sex and timing of surgery. XGBoost was the best-performing model (area under the receiver operating characteristics curve of 0.77) and outperformed logistic regression in this regard.

CONCLUSIONS

All-cause reoperation after anterior cruciate ligament reconstruction occurred at a rate of 16%. Machine learning models outperformed traditional statistics and identified diagnosis of systemic inflammatory disease, distal tear location, concomitant medial collateral ligament repair, higher visual analog scale pain score prior to surgery, hamstring autograft, tibial fixation via radial expansion device, younger age at initial injury, and concomitant meniscal repair as predictive risk factors for reoperation. Pertinent negatives, when compared to previous studies, included sex and timing of surgery. These models will allow surgeons to tabulate individualized risk for future reoperation for patients undergoing anterior cruciate ligament reconstruction.

LEVEL OF EVIDENCE

III.

Sedigh A, Townsend C, Khawam SM, Vaccaro AR, Carreras BN, Beredjiklian PK, Rivlin M. Remote fit wrist braces through artificial intelligence. Prosthet Orthot Int 2023;47:434-439. [PMID: 37068013 DOI: 10.1097/pxr.0000000000000233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Accepted: 01/18/2023] [Indexed: 04/18/2023]

Abstract

INTRODUCTION

Access to skilled orthotist or hand therapy care may be hindered by multiple factors, such as geography or availability. This study evaluated the accuracy of fitting a prefabricated wrist splint using an app on a smart device. We hypothesize that remote brace fitting by artificial intelligence (AI) can accurately determine the brace size the patient needs without in-person fitting.

METHODS

Healthy volunteers were recruited to fit wrist braces. Two standardized, calibrated images of each subject, captured by the smart device, were loaded into the machine learning software (AI). Hand features were then extracted, calibrated, and measured; the application calculated the correct splint size and compared it with the splint chosen by the subject to improve its own accuracy. As a control (control 1), the subjects independently selected the best brace fit from an array of available splints. Subject selection was recorded and compared with the AI fit splint. As the second method of fitting (control 2), we compared the manufacturer recommended brace size (based on measured wrist circumference and provided sizing chart/insert brochure) with the AI fit splint.

RESULTS

A total of 54 volunteers were included. Thirty-two splints predicted by the algorithm matched the exact size chosen by each subject, yielding 70% accuracy with a standard deviation of 10% (p < 0.001). The accuracy increased to 90% with 5% standard deviation if the splints were predicted within the next size category. Fit by manufacturer sizing chart was only 33% in agreement with participant selection.

CONCLUSION

Remote brace fitting using an AI prediction model may be an acceptable alternative to current standards because it can accurately predict wrist splint size. As more subjects were analyzed, the AI algorithm became more accurate at predicting proper brace fit. In addition, AI fit braces are more than twice as accurate as relying on the manufacturer sizing chart.

Ma Y, Lu Q, Yuan F, Chen H. Comparison of the effectiveness of different machine learning algorithms in predicting new fractures after PKP for osteoporotic vertebral compression fractures. J Orthop Surg Res 2023;18:62. [PMID: 36683045 PMCID: PMC9869614 DOI: 10.1186/s13018-023-03551-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/16/2022] [Accepted: 01/19/2023] [Indexed: 01/24/2023] Open

Abstract

BACKGROUND

The use of machine learning has the potential to estimate the probability of a second classification event more accurately than traditional statistical methods, and few previous studies on predicting new fractures after osteoporotic vertebral compression fractures (OVCFs) have focussed on this point. The aim of this study was to explore whether several different machine learning models could produce better predictions than logistic regression models and to select an optimal model.

METHODS

A retrospective analysis of 529 patients who underwent percutaneous kyphoplasty (PKP) for OVCFs at our institution between June 2017 and June 2020 was performed. The patient data were used to create machine learning (including decision trees (DT), random forests (RF), support vector machines (SVM), gradient boosting machines (GBM), neural networks (NNET), and regularized discriminant analysis (RDA)) and logistic regression models (LR) to estimate the probability of new fractures occurring after surgery. The dataset was divided into a training set (75%) and a test set (25%), and machine learning models were built in the training set after ten cross-validations, after which each model was evaluated in the test set, and model performance was assessed by comparing the area under the curve (AUC) of each model.

RESULTS

Among the six machine learning algorithms, only DT [0.775 (95% CI 0.728-0.822)] had a lower AUC than LR [0.831 (95% CI 0.783-0.878)]; RF [0.953 (95% CI 0.927-0.980)], GBM [0.941 (95% CI 0.911-0.971)], SVM [0.869 (95% CI 0.827-0.910)], NNET [0.869 (95% CI 0.826-0.912)], and RDA [0.890 (95% CI 0.851-0.929)] all outperformed LR.

CONCLUSIONS

For prediction of the probability of new fracture after PKP, machine learning algorithms outperformed logistic regression, with random forest having the strongest predictive power.
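The data partitioning described in the methods of this abstract (a 75%/25% training/test split, with cross-validation inside the training set) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: it assumes that "ten cross-validations" means 10-fold cross-validation, and the function names, the seed, and the rounding of the test-set size are all hypothetical choices made for the example.

```python
import random


def train_test_split_indices(n, test_fraction=0.25, seed=0):
    """Shuffle the indices 0..n-1 and split them into (train, test) lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=10):
    """Yield (training, validation) index lists for each of k folds."""
    # Deal the indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# 529 patients, as reported in the abstract; the test set is held out
# and never touched during cross-validation.
train_idx, test_idx = train_test_split_indices(529, test_fraction=0.25)
for fold_train, fold_val in k_fold(train_idx, k=10):
    pass  # fit a candidate model on fold_train, score it on fold_val
```

Keeping the test indices out of the cross-validation loop is what lets the final AUC comparison on the test set act as an independent check of each model.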

Scala A, Borrelli A, Improta G. Predictive analysis of lower limb fractures in the orthopedic complex operative unit using artificial intelligence: the case study of AOU Ruggi. Sci Rep 2022;12:22153. [PMID: 36550192 PMCID: PMC9780352 DOI: 10.1038/s41598-022-26667-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022] Open

Haddad FS. Looking back over the past year. Bone Joint J 2022;104-B:1279-1280. [DOI: 10.1302/0301-620x.104b12.bjj-2022-1161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]

Lu Y, Jurgensmeier K, Till SE, Reinholz AK, Saris DBF, Camp CL, Krych AJ. Early ACLR and Risk and Timing of Secondary Meniscal Injury Compared With Delayed ACLR or Nonoperative Treatment: A Time-to-Event Analysis Using Machine Learning. Am J Sports Med 2022;50:3544-3556. [PMID: 36178166 PMCID: PMC10075196 DOI: 10.1177/03635465221124258] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/31/2023]

Abstract

BACKGROUND

Surgical and nonoperative management of anterior cruciate ligament (ACL) injuries seek to mitigate the risk of knee instability and secondary meniscal injury. However, the associated risk and timing of secondary meniscal tears have not been completely elucidated.

PURPOSE

To compare risk and timing of secondary meniscal injury between patients receiving nonoperative management, delayed ACL reconstruction (ACLR), and early ACLR using a machine learning survival analysis.

STUDY DESIGN

Cohort study; Level of evidence, 3.

METHODS

A geographic database was used to identify and review records of patients with a diagnosis of ACL rupture between 1990 and 2016 with minimum 2-year follow-up. Patients undergoing ACLR were matched 1:1 with nonoperatively treated controls. Rate and time to secondary meniscal tear were compared using random survival forest algorithms; independent models were developed and internally validated for predicting injury-free duration in both cohorts. Performance was measured using out-of-bag c-statistic, calibration, and Brier score. Model interpretability was enhanced using global variable importance and partial dependence curves.

RESULTS

The study included 1369 patients who underwent ACLR and 294 patients who had nonoperative treatment. After matching, no significant differences in rates of secondary meniscal tear were found (P = .09); subgroup analysis revealed the shortest periods of meniscal survival in patients undergoing delayed ACLR. The random survival forest algorithm achieved excellent predictive performance for the ACLR cohort, with an out-of-bag c-statistic of 0.80 and a Brier score of 0.11. Significant variables for risk of meniscal tear for the ACLR cohort included time to return to sports or activity ≤350 days, time to surgery ≥50 days, age at injury ≤40 years, and high-impact or rotational landing sports, whereas those in the nonoperative cohort model included time to RTS ≤200 days, visual analog scale pain score >3 at consultation, hypermobility, and noncontact sports.

CONCLUSION

Delayed ACLR demonstrated the greatest long-term risk of meniscal injury compared with nonoperative treatment or early ACLR. Risk factors for decreased meniscal survival after ACLR included increased time to surgery, shorter time to return to sports or activity, older age at injury, and involvement in high-impact or rotational landing sports. Pending careful external validation, these models may be deployed in the clinical space to provide real-time insights and enhance decision making.

Lu Y, Labott JR, Salmons Iv HI, Gross BD, Barlow JD, Sanchez-Sotelo J, Camp CL. Identifying modifiable and nonmodifiable cost drivers of ambulatory rotator cuff repair: a machine learning analysis. J Shoulder Elbow Surg 2022;31:2262-2273. [PMID: 35562029 DOI: 10.1016/j.jse.2022.04.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/12/2022] [Revised: 03/25/2022] [Accepted: 04/09/2022] [Indexed: 02/01/2023]

Abstract

INTRODUCTION

Implementing novel tools that identify contributors to the cost of orthopedic procedures can help hospitals maximize efficiency, minimize waste, improve surgical decision-making, and practice value-based care. The purpose of this study was to develop and internally validate a machine learning algorithm to identify key drivers of total charges after ambulatory arthroscopic rotator cuff repair and compare its performance with a state-of-the-art statistical learning model.

METHODS

A retrospective review of the New York State Ambulatory Surgery and Services Database was performed to identify patients who underwent elective outpatient rotator cuff repair (RCR) from 2015 to 2016. Initial models were constructed using patient characteristics (age, gender, insurance status, patient income, Elixhauser Comorbidity Index) as well as intraoperative variables (concomitant procedures and services, operative time). These were subsequently entered into 5 separate machine learning algorithms and a generalized additive model using natural splines. Global variable importance and partial dependence curves were constructed to identify the greatest contributors to cost.

RESULTS

A total of 33,976 patients undergoing ambulatory RCR were included. Median total charges after ambulatory RCR were $16,017 (interquartile range: $11,009-$22,510). The ensemble model outperformed the generalized additive model and demonstrated the best performance on internal validation (root mean squared error: $7112, 95% confidence interval: 7036-7188; logarithmic root mean squared error: 0.354, 95% confidence interval: 0.336-0.373, R²: 0.53), and identified major drivers of total charges after RCR as increasing operating room time, patient income level, number of anchors used, use of local infiltration anesthesia/peripheral nerve blocks, non-White race/ethnicity, and concurrent distal clavicle excision. The model was integrated into a web-based open-access application capable of providing individual predictions and explanations on a case-by-case basis.

CONCLUSION

This study developed an ensemble supervised machine learning algorithm that outperformed a sophisticated statistical learning model in predicting total charges after ambulatory RCR. Important contributors to total charges included operating room time, duration of care, number of anchors used, type of anesthesia, concomitant distal clavicle excision, community characteristics, and patient demographic factors. Generation of a patient-specific payment schedule based on the Agency for Healthcare Research and Quality risk of mortality highlighted the financial risk assumed by physicians in flat episodic reimbursement schedules given variable patient comorbidities and the importance of an accurate prediction algorithm to appropriately reward high-value care at low costs.

Artificial Intelligence in Orthopedic Radiography Analysis: A Narrative Review. Diagnostics (Basel) 2022;12:diagnostics12092235. [PMID: 36140636 PMCID: PMC9498096 DOI: 10.3390/diagnostics12092235] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/09/2022] [Revised: 09/12/2022] [Accepted: 09/13/2022] [Indexed: 11/17/2022] Open

Alsoof D, McDonald CL, Kuris EO, Daniels AH. Machine Learning for the Orthopaedic Surgeon: Uses and Limitations. J Bone Joint Surg Am 2022;104:1586-1594. [PMID: 35383655 DOI: 10.2106/jbjs.21.01305] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 02/04/2023]

Prijs J, Liao Z, Ashkani-Esfahani S, Olczak J, Gordon M, Jayakumar P, Jutte PC, Jaarsma RL, IJpma FFA, Doornberg JN. Artificial intelligence and computer vision in orthopaedic trauma : the why, what, and how. Bone Joint J 2022;104-B:911-914. [PMID: 35909378 DOI: 10.1302/0301-620x.104b8.bjj-2022-0119.r1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/05/2022]

Bulstra AEJ. A Machine Learning Algorithm to Estimate the Probability of a True Scaphoid Fracture After Wrist Trauma. J Hand Surg Am 2022;47:709-718. [PMID: 35667955 DOI: 10.1016/j.jhsa.2022.02.023] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/25/2021] [Revised: 01/12/2022] [Accepted: 02/23/2022] [Indexed: 02/02/2023]

Abstract

PURPOSE

To identify predictors of a true scaphoid fracture among patients with radial wrist pain following acute trauma, train 5 machine learning (ML) algorithms in predicting scaphoid fracture probability, and design a decision rule to initiate advanced imaging in high-risk patients.

METHODS

Two prospective cohorts including 422 patients with radial wrist pain following wrist trauma were combined. There were 117 scaphoid fractures (28%) confirmed on computed tomography, magnetic resonance imaging, or radiographs. Eighteen fractures (15%) were occult. Predictors of a scaphoid fracture were identified among demographics, mechanism of injury and examination maneuvers. Five ML-algorithms were trained in calculating scaphoid fracture probability. ML-algorithms were assessed on ability to discriminate between patients with and without a fracture (area under the receiver operating characteristic curve), agreement between observed and predicted probabilities (calibration), and overall performance (Brier score). The best performing ML-algorithm was incorporated into a probability calculator. A decision rule was proposed to initiate advanced imaging among patients with negative radiographs.

RESULTS

Pain over the scaphoid on ulnar deviation, sex, age, and mechanism of injury were most strongly associated with a true scaphoid fracture. The best performing ML-algorithm yielded an area under the receiver operating characteristic curve, calibration slope, intercept, and Brier score of 0.77, 0.84, -0.01 and 0.159, respectively. The ML-derived decision rule proposes to initiate advanced imaging in patients with radial-sided wrist pain, negative radiographs, and a fracture probability of ≥10%. When applied to our cohort, this would yield 100% sensitivity, 38% specificity, and would have reduced the number of patients undergoing advanced imaging by 36% without missing a fracture.

CONCLUSIONS

The ML-algorithm accurately calculated scaphoid fracture probability based on scaphoid pain on ulnar deviation, sex, age, and mechanism of injury. The ML-decision rule may reduce the number of patients undergoing advanced imaging by a third with a small risk of missing a fracture. External validation is required before implementation.

TYPE OF STUDY/LEVEL OF EVIDENCE

Diagnostic II.

Vigdorchik JM, Jang SJ, Taunton MJ, Haddad FS. Deep learning in orthopaedic research : weighing idealism against realism. Bone Joint J 2022;104-B:909-910. [PMID: 35909380 DOI: 10.1302/0301-620x.104b8.bjj-2022-0416] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/05/2022]

Brandenburg LS, Berger L, Schwarz SJ, Meine H, Weingart JV, Steybe D, Spies BC, Burkhardt F, Schlager S, Metzger MC. Reconstruction of dental roots for implant planning purposes: a feasibility study. Int J Comput Assist Radiol Surg 2022;17:1957-1968. [PMID: 35902422 PMCID: PMC9468133 DOI: 10.1007/s11548-022-02716-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Accepted: 07/04/2022] [Indexed: 11/27/2022]

Abstract

Purpose

Modern virtual implant planning is a time-consuming procedure, requiring a careful assessment of prosthetic and anatomical factors within a three-dimensional dataset. In order to facilitate the planning process and provide additional information, this study examines a statistical shape model (SSM) to compute the course of dental roots based on a surface scan.

Material and methods

Plaster models of orthognathic patients were scanned and superimposed with three-dimensional data of a cone-beam computer tomography (CBCT). Based on the open-source software “R”, including the packages Morpho, mesheR, Rvcg and RvtkStatismo, an SSM was generated to estimate the tooth axes. The accuracy of the calculated tooth axes was determined using a leave-one-out cross-validation. The deviation of tooth axis prediction in terms of angle or horizontal shift is described with mean and standard deviation. The planning dataset of an implant surgery patient was additionally analyzed using the SSM.
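The accuracy measure described above, the angle between an estimated and an actual tooth axis evaluated under leave-one-out cross-validation, can be sketched in Python. This is a minimal illustration, assuming each axis is represented as a directed 3D vector; the function names are hypothetical and the statistical shape model itself (built by the authors in R) is not reproduced.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def leave_one_out(n):
    """Yield (training_indices, held_out_index) for each of n datasets."""
    for held_out in range(n):
        yield [i for i in range(n) if i != held_out], held_out


# With 71 datasets, each model is built on 70 and evaluated on the remaining one.
folds = list(leave_one_out(71))
```

In each fold the model would be fitted on the training indices and `angle_deg` applied to the predicted and measured axis of the held-out dataset; the mean and standard deviation of those angles give the reported deviation.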

Results

71 datasets were included in this study. The mean angle between the estimated tooth-axis and the actual tooth-axis was 7.5 ± 4.3° in the upper jaw and 6.7 ± 3.8° in the lower jaw. The horizontal deviation between the tooth axis and estimated axis was 1.3 ± 0.8 mm close to the cementoenamel junction, and 0.7 ± 0.5 mm in the apical third of the root. Results for models with one missing tooth did not differ significantly. In the clinical dataset, the SSM could give a reasonable aid for implant positioning.

Conclusions

With the presented SSM, the approximate course of dental roots can be predicted based on a surface scan. There was no difference in predicting the tooth axis of existent or missing teeth. In a clinical context, the estimated tooth axes of missing teeth could serve as a reference for implant positioning. However, more training data are needed to improve accuracy.

Supplementary Information

The online version contains supplementary material available at 10.1007/s11548-022-02716-x.

Loos NL, Hoogendam L, Souer JS, Slijper HP, Andrinopoulou ER, Coppieters MW, Selles RW. Machine Learning Can be Used to Predict Function but Not Pain After Surgery for Thumb Carpometacarpal Osteoarthritis. Clin Orthop Relat Res 2022;480:1271-1284. [PMID: 35042837 PMCID: PMC9191288 DOI: 10.1097/corr.0000000000002105] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/21/2021] [Accepted: 12/13/2021] [Indexed: 01/31/2023]

Abstract

BACKGROUND

Surgery for thumb carpometacarpal osteoarthritis is offered to patients who do not benefit from nonoperative treatment. Although surgery is generally successful in reducing symptoms, not all patients benefit. Predicting clinical improvement after surgery could provide decision support and enhance preoperative patient selection.

QUESTIONS/PURPOSES

This study aimed to develop and validate prediction models for clinically important improvement in (1) pain and (2) hand function 12 months after surgery for thumb carpometacarpal osteoarthritis.

METHODS

Between November 2011 and June 2020, 2653 patients were surgically treated for thumb carpometacarpal osteoarthritis. Patient-reported outcome measures were used to preoperatively assess pain, hand function, and satisfaction with hand function, as well as the general mental health of patients and mindset toward their condition. Patient characteristics, medical history, patient-reported symptom severity, and patient-reported mindset were considered as possible predictors. Patients who had incomplete Michigan Hand outcomes Questionnaires at baseline or 12 months postsurgery were excluded, as these scores were used to determine clinical improvement. The Michigan Hand outcomes Questionnaire provides subscores for pain and hand function. Scores range from 0 to 100, with higher scores indicating less pain and better hand function. An improvement of at least the minimum clinically important difference (MCID) of 14.4 for the pain score and 11.7 for the function score was considered "clinically relevant." These values were derived from previous reports that provided triangulated estimates of two anchor-based and one distribution-based MCID. Data collection resulted in a dataset of 1489 patients for the pain model and 1469 patients for the hand function model. The data were split into training (60%), validation (20%), and test (20%) datasets. The training dataset was used to select the predictive variables and to train our models. The performance of all models was evaluated in the validation dataset, after which one model was selected for further evaluation. Performance of this final model was evaluated on the test dataset. We trained the models using logistic regression, random forest, and gradient boosting machines and compared their performance. We chose these algorithms because of their relative simplicity, which makes them easier to implement and interpret.

Model performance was assessed using discriminative ability and qualitative visual inspection of calibration curves. Discrimination was measured using area under the curve (AUC) and is a measure of how well the model can differentiate between the outcomes (improvement or no improvement), with an AUC of 0.5 being equal to chance. Calibration is a measure of the agreement between the predicted probabilities and the observed frequencies and was assessed by visual inspection of calibration curves. We selected the model with the most promising performance for clinical implementation (that is, good model performance and a low number of predictors) for further evaluation in the test dataset.
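The 60%/20%/20% split described in the methods can be sketched in Python as below. The abstract does not state how patients were assigned to the three datasets, so the seeded random shuffle, the rounding of group sizes, and the function name are assumptions made for illustration only.

```python
import random


def split_60_20_20(records, seed=0):
    """Shuffle records and split them into training, validation, and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, so no record is dropped
    return train, validation, test


# 1469 patients were available for the hand function model.
train, validation, test = split_60_20_20(range(1469))
```

Keeping the test set untouched until a single model has been chosen on the validation set, as the authors describe, is what allows the final AUC to serve as an estimate of performance on unseen patients.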

RESULTS

For pain, the random forest model showed the most promising results based on discrimination, calibration, and number of predictors in the validation dataset. In the test dataset, this pain model had a poor AUC (0.59) and poor calibration. For function, the gradient boosting machine showed the most promising results in the validation dataset. This model had a good AUC (0.74) and good calibration in the test dataset. The baseline Michigan Hand outcomes Questionnaire hand function score was the only predictor in the model. For the hand function model, we made a web application that can be accessed via https://analyse.equipezorgbedrijven.nl/shiny/cmc1-prediction-model-Eng/.

CONCLUSION

We developed a promising model that may allow clinicians to predict the chance of functional improvement in an individual patient undergoing surgery for thumb carpometacarpal osteoarthritis, which would thereby help in the decision-making process. However, caution is warranted because our model has not been externally validated. Unfortunately, the performance of the prediction model for pain is insufficient for application in clinical practice.

LEVEL OF EVIDENCE

Level III, therapeutic study.

Kokkotis C, Moustakidis S, Tsatalas T, Ntakolia C, Chalatsis G, Konstadakos S, Hantes ME, Giakas G, Tsaopoulos D. Leveraging explainable machine learning to identify gait biomechanical parameters associated with anterior cruciate ligament injury. Sci Rep 2022;12:6647. [PMID: 35459787 PMCID: PMC9026057 DOI: 10.1038/s41598-022-10666-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/13/2021] [Accepted: 04/11/2022] [Indexed: 11/09/2022] Open

Wang KY, Puvanesarajah V, Raad M, Barry K, Srikumaran U, Thakkar SC. The BTK Safety Score: A Novel Scoring System for Risk Stratifying Patients Undergoing Simultaneous Bilateral Total Knee Arthroplasty. J Knee Surg 2022;36:702-709. [PMID: 34979584 DOI: 10.1055/s-0041-1741000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]

Abstract

Selection of appropriate candidates for simultaneous bilateral total knee arthroplasty (si-BTKA) is crucial for minimizing postoperative complications. The aim of this study was to develop a scoring system for identifying patients who may be appropriate for si-BTKA. Patients who underwent si-BTKA were identified in the National Surgical Quality Improvement Program database. Patients who experienced a major 30-day complication were identified as high-risk patients for si-BTKA who potentially would have benefitted from staged bilateral total knee arthroplasty. Major complications included deep wound infection, pneumonia, renal insufficiency or failure, cerebrovascular accident, cardiac arrest, myocardial infarction, pulmonary embolism, sepsis, or death. The predictive model was trained on a randomly selected 70% of the dataset and validated on the remaining 30%. The scoring system was compared against the American Society of Anesthesiologists (ASA) score, the Charlson Comorbidity Index (CCI), and legacy risk-stratification measures, using the area under the curve (AUC) statistic. A total of 4,630 patients undergoing si-BTKA were included in our cohort. In our model, patients are assigned points based on the following risk factors: +1 for age ≥ 75, +2 for age ≥ 82, +1 for body mass index (BMI) ≥ 34, +2 for BMI ≥ 42, +1 for hypertension requiring medication, +1 for pulmonary disease (chronic obstructive pulmonary disease or dyspnea), and +3 for end-stage renal disease. The scoring system exhibited an AUC of 0.816, which was significantly higher than the AUC of ASA (0.545; p < 0.001) and CCI (0.599; p < 0.001). The BTK Safety Score developed and validated in our study can be used by surgeons and perioperative teams to risk stratify patients undergoing si-BTKA. Future work is needed to assess this scoring system's ability to predict long-term functional outcomes.
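The point assignments listed in the abstract can be written as a small Python function. This is a sketch only: the abstract does not say whether the age and BMI tiers are cumulative, so the code assumes they are mutually exclusive (a patient aged 82 or older receives 2 points, not 3). The function and parameter names are hypothetical, and no risk cut-off is applied because the abstract gives none.

```python
def btk_safety_score(age, bmi, hypertension_on_medication,
                     pulmonary_disease, end_stage_renal_disease):
    """Sum the risk-factor points listed in the abstract (tiers assumed exclusive)."""
    score = 0
    # Age: the higher tier replaces the lower one (assumption).
    if age >= 82:
        score += 2
    elif age >= 75:
        score += 1
    # BMI: the higher tier replaces the lower one (assumption).
    if bmi >= 42:
        score += 2
    elif bmi >= 34:
        score += 1
    score += 1 if hypertension_on_medication else 0
    score += 1 if pulmonary_disease else 0  # COPD or dyspnea
    score += 3 if end_stage_renal_disease else 0
    return score


# Hypothetical patient: 80 years old, BMI 35, on antihypertensive medication.
example_score = btk_safety_score(80, 35, True, False, False)
```

Under these assumptions the score ranges from 0 to 9, with end-stage renal disease carrying the largest single weight.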

Hoogendam L, Bakx JAC, Souer JS, Slijper HP, Andrinopoulou ER, Selles RW. Predicting Clinically Relevant Patient-Reported Symptom Improvement After Carpal Tunnel Release: A Machine Learning Approach. Neurosurgery 2022;90:106-113. [PMID: 34982877 DOI: 10.1227/neu.0000000000001749] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/26/2021] [Accepted: 08/21/2021] [Indexed: 01/01/2023] Open

Lu Y, Pareek A, Wilbur RR, Leland DP, Krych AJ, Camp CL. Understanding Anterior Shoulder Instability Through Machine Learning: New Models That Predict Recurrence, Progression to Surgery, and Development of Arthritis. Orthop J Sports Med 2021;9:23259671211053326. [PMID: 34888391 PMCID: PMC8649098 DOI: 10.1177/23259671211053326] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/15/2021] [Accepted: 08/02/2021] [Indexed: 01/06/2023] Open

Abstract

Background

Management of anterior shoulder instability (ASI) aims to reduce risk of future recurrence and prevent complications via nonoperative and surgical management. Machine learning may be able to reliably provide predictions to improve decision making for this condition.

Purpose

To develop and internally validate a machine-learning model to predict the following outcomes after ASI: (1) recurrent instability, (2) progression to surgery, and (3) the development of symptomatic osteoarthritis (OA) over long-term follow-up.

Study Design

Cohort study (prognosis); Level of evidence, 2.

Methods

An established geographic database of >500,000 patients was used to identify 654 patients aged <40 years with an initial diagnosis of ASI between 1994 and 2016; the mean follow-up was 11.1 years. Medical records were reviewed to obtain patient information, and models were generated to predict the outcomes of interest. Five candidate algorithms were trained in the development of each of the models, as well as an additional ensemble of the algorithms. Performance of the algorithms was assessed using discrimination, calibration, and decision curve analysis.

Results

Of the 654 included patients, 443 (67.7%) experienced multiple instability events, 228 (34.9%) underwent surgery, and 39 (5.9%) developed symptomatic OA. The ensemble gradient-boosted machines achieved the best performances based on discrimination (via area under the receiver operating characteristic curve [AUC]: AUC_recurrence = 0.86, AUC_surgery = 0.76, AUC_OA = 0.78), calibration, decision curve analysis, and Brier score (Brier_recurrence = 0.138, Brier_surgery = 0.185, Brier_OA = 0.05). For demonstration purposes, models were integrated into a single web-based open-access application able to provide predictions and explanations for practitioners and researchers.

Conclusion

After identification of key features, including time from initial instability, age at initial instability, sports involvement, and radiographic findings, machine-learning models were developed that effectively and reliably predicted recurrent instability, progression to surgery, and the development of OA in patients with ASI. After careful external validation, these models can be incorporated into open-access digital applications to inform patients, clinicians, and researchers regarding quantifiable risks of relevant outcomes in the clinic.

Stirling PHC, Strelzow JA, Doornberg JN, White TO, McQueen MM, Duckworth AD. Diagnosis of Suspected Scaphoid Fractures. JBJS Rev 2021;9:01874474-202112000-00001. [PMID: 34879033 DOI: 10.2106/jbjs.rvw.20.00247] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022]

Farrow L, Zhong M, Ashcroft GP, Anderson L, Meek RMD. Interpretation and reporting of predictive or diagnostic machine-learning research in Trauma & Orthopaedics. Bone Joint J 2021;103-B:1754-1758. [PMID: 34847720 DOI: 10.1302/0301-620x.103b12.bjj-2021-0851.r1] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 11/05/2022]

Owusu-Akyaw KA, Bido J, Warner T, Rodeo SA, Williams RJ. SF-36 Physical Component Score Is Predictive of Achieving a Clinically Meaningful Improvement after Osteochondral Allograft Transplantation of the Femur. Cartilage 2021;13:853S-859S. [PMID: 32940050 PMCID: PMC8808818 DOI: 10.1177/1947603520958132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open

Federer SJ, Jones GG. Artificial intelligence in orthopaedics: A scoping review. PLoS One 2021;16:e0260471. [PMID: 34813611 PMCID: PMC8610245 DOI: 10.1371/journal.pone.0260471] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/19/2021] [Accepted: 11/11/2021] [Indexed: 11/19/2022] Open

Artificial Neural Networks Predict 30-Day Mortality After Hip Fracture: Insights From Machine Learning. J Am Acad Orthop Surg 2021;29:977-983. [PMID: 33315645 DOI: 10.5435/jaaos-d-20-00429] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/23/2020] [Accepted: 08/14/2020] [Indexed: 02/01/2023] Open

Patient Factors That Matter in Predicting Hip Arthroplasty Outcomes: A Machine-Learning Approach. J Arthroplasty 2021;36:2024-2032. [PMID: 33558044 DOI: 10.1016/j.arth.2020.12.038] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 10/29/2020] [Revised: 12/09/2020] [Accepted: 12/22/2020] [Indexed: 02/02/2023] Open

Abstract

BACKGROUND

Despite the success of total hip arthroplasty (THA), approximately 10%-15% of patients will be dissatisfied with their outcome. Identifying patients at risk of not achieving meaningful gains postoperatively is critical to pre-surgical counseling and clinical decision support. Machine learning has shown promise in creating predictive models. This study used a machine-learning model to identify patient-specific variables that predict the postoperative functional outcome in THA.

METHODS

A prospective longitudinal cohort of 160 consecutive patients undergoing total hip replacement for the treatment of degenerative arthritis completed self-reported measures preoperatively and at 3 months postoperatively. Using four types of independent variables (patient demographics, patient-reported health, cognitive appraisal processes and surgical approach), a machine-learning model utilizing Least Absolute Shrinkage Selection Operator (LASSO) was constructed to predict postoperative Hip Disability and Osteoarthritis Outcome Score (HOOS) at 3 months.

RESULTS

The most predictive independent variables of postoperative HOOS were cognitive appraisal processes. Variables that predicted a worse HOOS consisted of frequent thoughts of work (β = -0.34), frequent comparison to healthier peers (β = -0.26), increased body mass index (β = -0.17), increased medical comorbidities (β = -0.19), and the anterior surgical approach (β = -0.15). Variables that predicted a better HOOS consisted of employment at the time of surgery (β = 0.17), and thoughts related to family interaction (β = 0.12), trying not to complain (β = 0.13), and helping others (β = 0.22).

CONCLUSIONS

This clinical prediction model in THA revealed that the factors most predictive of outcome were cognitive appraisal processes, demonstrating their importance to outcome-based research.

LEVEL OF EVIDENCE

Prognostic Level 1.

Machine Learning Algorithms Predict Clinically Significant Improvements in Satisfaction After Hip Arthroscopy. Arthroscopy 2021;37:1143-1151. [PMID: 33359160 DOI: 10.1016/j.arthro.2020.11.027] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 03/27/2020] [Revised: 11/05/2020] [Accepted: 11/06/2020] [Indexed: 02/06/2023]

Does Artificial Intelligence Outperform Natural Intelligence in Interpreting Musculoskeletal Radiological Studies? A Systematic Review. Clin Orthop Relat Res 2020;478:2751-2764. [PMID: 32740477 PMCID: PMC7899420 DOI: 10.1097/corr.0000000000001360] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/31/2023]

Abstract

BACKGROUND

Machine learning (ML) is a subdomain of artificial intelligence that enables computers to abstract patterns from data without explicit programming. A myriad of impactful ML applications already exists in orthopaedics ranging from predicting infections after surgery to diagnostic imaging. However, no systematic reviews that we know of have compared, in particular, the performance of ML models with that of clinicians in musculoskeletal imaging to provide an up-to-date summary regarding the extent of applying ML to imaging diagnoses. By doing so, this review delves into where current ML developments stand in aiding orthopaedists in assessing musculoskeletal images.

QUESTIONS/PURPOSES

This systematic review aimed (1) to compare performance of ML models versus clinicians in detecting, differentiating, or classifying orthopaedic abnormalities on imaging by (A) accuracy, sensitivity, and specificity, (B) input features (for example, plain radiographs, MRI scans, ultrasound), (C) clinician specialties, and (2) to compare the performance of clinician-aided versus unaided ML models.

METHODS

A systematic review was performed in PubMed, Embase, and the Cochrane Library for studies published up to October 1, 2019, using synonyms for machine learning and all potential orthopaedic specialties. We included all studies that compared ML models head-to-head against clinicians in the binary detection of abnormalities in musculoskeletal images. After screening 6531 studies, we ultimately included 12 studies. We conducted quality assessment using the Methodological Index for Non-randomized Studies (MINORS) checklist. All 12 studies were of comparable quality, and they all clearly included six of the eight critical appraisal items (study aim, input feature, ground truth, ML versus human comparison, performance metric, and ML model description). This justified summarizing the findings in a quantitative form by calculating the median absolute improvement of the ML models compared with clinicians for the following metrics of performance: accuracy, sensitivity, and specificity.
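The summary statistic described above, the median absolute improvement of ML models over clinicians with its interquartile range, can be sketched in Python. "Absolute" is read here as a difference in percentage points rather than an absolute value, which is an assumption; the quartile method of `statistics.quantiles` may also differ from the one the authors used. The function name and the toy values are hypothetical.

```python
import statistics


def median_absolute_improvement(ml_values, clinician_values):
    """Median and quartiles (Q1, Q3) of paired ML-minus-clinician differences."""
    differences = [ml - c for ml, c in zip(ml_values, clinician_values)]
    median = statistics.median(differences)
    q1, _, q3 = statistics.quantiles(differences, n=4)
    return median, (q1, q3)


# Hypothetical accuracies in percent for five head-to-head comparisons.
ml_accuracy = [90, 85, 80, 95, 70]
clinician_accuracy = [88, 86, 75, 90, 70]

median_gain, (q1, q3) = median_absolute_improvement(ml_accuracy, clinician_accuracy)
```

A positive median favours the ML models, and an interquartile range that spans zero indicates that clinicians did better in a substantial share of comparisons.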

RESULTS

ML models provided, in aggregate, only very slight improvements in diagnostic accuracy and sensitivity compared with clinicians working alone and were on par in specificity (3% (interquartile range [IQR] -2.0% to 7.5%), 0.06% (IQR -0.03 to 0.14), and 0.00 (IQR -0.048 to 0.048), respectively). Inputs used by the ML models were plain radiographs (n = 8), MRI scans (n = 3), and ultrasound examinations (n = 1). Overall, ML models outperformed clinicians more when interpreting plain radiographs than when interpreting MRIs (17 of 34 and 3 of 16 performance comparisons, respectively). Orthopaedists and radiologists performed similarly to ML models, while ML models mostly outperformed other clinicians (outperformance in 7 of 19, 7 of 23, and 6 of 10 performance comparisons, respectively). Two studies evaluated the performance of clinicians aided and unaided by ML models; both demonstrated considerable improvements in ML-aided clinician performance by reporting a 47% decrease of misinterpretation rate (95% confidence interval [CI] 37 to 54; p < 0.001) and a mean increase in specificity of 0.048 (95% CI 0.029 to 0.068; p < 0.001) in detecting abnormalities on musculoskeletal images.

CONCLUSIONS

At present, ML models have comparable performance to clinicians in assessing musculoskeletal images. ML models may enhance the performance of clinicians as a technical supplement rather than as a replacement for clinical intelligence. Future ML-related studies should emphasize how ML models can complement clinicians, instead of determining the overall superiority of one versus the other. This can be accomplished by improving transparent reporting, diminishing bias, determining the feasibility of implementation in the clinical setting, and appropriately tempering conclusions.

LEVEL OF EVIDENCE

Level III, diagnostic study.

Curtin P, Conway A, Martin L, Lin E, Jayakumar P, Swart E. Compilation and Analysis of Web-Based Orthopedic Personalized Predictive Tools: A Scoping Review. J Pers Med 2020;10:E223. [PMID: 33198106 PMCID: PMC7712817 DOI: 10.3390/jpm10040223] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/09/2020] [Revised: 10/27/2020] [Accepted: 11/10/2020] [Indexed: 12/15/2022] Open

Zhang SC, Sun J, Liu CB, Fang JH, Xie HT, Ning B. Clinical application of artificial intelligence-assisted diagnosis using anteroposterior pelvic radiographs in children with developmental dysplasia of the hip. Bone Joint J 2020;102-B:1574-1581. [PMID: 33135455 DOI: 10.1302/0301-620x.102b11.bjj-2020-0712.r2] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/05/2022]

Abstract

AIMS

The diagnosis of developmental dysplasia of the hip (DDH) is challenging owing to extensive variation in paediatric pelvic anatomy. Artificial intelligence (AI) may represent an effective diagnostic tool for DDH. Here, we aimed to develop an anteroposterior pelvic radiograph deep learning system for diagnosing DDH in children and analyze the feasibility of its application.

METHODS

In total, 10,219 anteroposterior pelvic radiographs were retrospectively collected from April 2014 to December 2018. Clinicians labelled each radiograph using a uniform standard method. Radiographs were grouped according to age and into 'dislocation' (dislocation and subluxation) and 'non-dislocation' (normal cases and those with dysplasia of the acetabulum) groups based on clinical diagnosis. The deep learning system was trained and optimized using 9,081 radiographs; 1,138 test radiographs were then used to compare the diagnoses made by the deep learning system and clinicians. The accuracy of the deep learning system was determined using a receiver operating characteristic curve, and the consistency of acetabular index measurements was evaluated using Bland-Altman plots.

RESULTS

In all, 1,138 patients (242 males; 896 females; mean age 1.5 years (SD 1.79; 0 to 10)) were included in this study. The area under the receiver operating characteristic curve, sensitivity, and specificity of the deep learning system for diagnosing hip dislocation were 0.975, 276/289 (95.5%), and 1,978/1,987 (99.5%), respectively. Compared with clinical diagnoses, the Bland-Altman 95% limits of agreement for acetabular index, as determined by the deep learning system from the radiographs of non-dislocated and dislocated hips, were -3.27° to 2.94° and -7.36° to 5.36°, respectively (p < 0.001).
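The reported proportions can be checked with a few lines of Python, and the same sketch shows how Bland-Altman 95% limits of agreement are conventionally computed (mean difference ± 1.96 standard deviations). The function names are hypothetical, and the limits function is a generic illustration rather than the authors' analysis. The denominators sum to 2,276, twice the 1,138 test radiographs, which appears consistent with each hip being counted separately.

```python
import statistics


def proportion_pct(numerator, denominator):
    """Proportion expressed as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


def bland_altman_limits(method_a, method_b):
    """95% limits of agreement: mean difference +/- 1.96 sample standard deviations."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


sensitivity = proportion_pct(276, 289)     # dislocated hips correctly identified
specificity = proportion_pct(1978, 1987)   # non-dislocated hips correctly identified
hips_evaluated = 289 + 1987
```

Narrower limits of agreement indicate closer agreement between the two measurement methods, which is why the interval for non-dislocated hips suggests better agreement than the interval for dislocated hips.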

CONCLUSION

The deep learning system was highly consistent, more convenient, and more effective for diagnosing DDH compared with clinician-led diagnoses. Deep learning systems should be considered for analysis of anteroposterior pelvic radiographs when diagnosing DDH. The deep learning system will improve the current artificially complicated screening referral process. Cite this article: Bone Joint J 2020;102-B(11):1574-1581.
