1
Srinivas S, Young AJ. Machine Learning and Artificial Intelligence in Surgical Research. Surg Clin North Am 2023; 103:299-316. [PMID: 36948720 DOI: 10.1016/j.suc.2022.11.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/24/2023]
Abstract
Machine learning, a subtype of artificial intelligence, is an emerging field of surgical research dedicated to predictive modeling. From its inception, machine learning has been of interest in medical and surgical research. Built on traditional research metrics for optimal success, avenues of research include diagnostics, prognosis, operative timing, and surgical education, in a variety of surgical subspecialties. Machine learning represents an exciting and developing future in the world of surgical research that will allow for more personalized and comprehensive medical care.
Affiliation(s)
- Shruthi Srinivas
- Department of Surgery, The Ohio State University, 370 West 9th Avenue, Columbus, OH 43210, USA
- Andrew J Young
- Division of Trauma, Critical Care, and Burn, The Ohio State University, 181 Taylor Avenue, Suite 1102K, Columbus, OH 43203, USA.
2
Enodien B, Taha-Mehlitz S, Saad B, Nasser M, Frey DM, Taha A. The development of machine learning in bariatric surgery. Front Surg 2023; 10:1102711. [PMID: 36911599 PMCID: PMC9998495 DOI: 10.3389/fsurg.2023.1102711] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/19/2022] [Accepted: 02/08/2023] [Indexed: 03/14/2023] Open
Abstract
Background Machine learning (ML) is an approach to data analysis that automates analytical model building. The significance of ML stems from its potential to evaluate big data and achieve quicker and more accurate outcomes. ML has recently witnessed increased adoption in the medical domain. Bariatric surgery, otherwise referred to as weight loss surgery, comprises the series of procedures performed on people with obesity. This systematic scoping review aims to explore the development of ML in bariatric surgery. Methods The study used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR). A comprehensive literature search was performed of several databases, including PubMed, Cochrane, and IEEE, and of search engines, namely Google Scholar. Eligible studies included journals published from 2016 to the current date. The PRESS checklist was used to evaluate the consistency demonstrated during the process. Results A total of seventeen articles qualified for inclusion in the study. Of the included studies, sixteen concentrated on the role of ML algorithms in prediction, while one addressed ML's diagnostic capacity. Most articles (n = 15) were journal publications, whereas the rest (n = 2) were papers from conference proceedings. Most included reports were from the United States (n = 6). Most studies addressed neural networks, with convolutional neural networks the most prevalent. Also, the data used in most articles (n = 13) were derived from hospital databases, with very few articles (n = 4) collecting original data via observation. Conclusions This study indicates that ML has numerous benefits in bariatric surgery; however, its current application is limited. The evidence suggests that bariatric surgeons can benefit from ML algorithms, since they will facilitate the prediction and evaluation of patient outcomes. ML approaches also enhance work processes by making data categorization and analysis easier. However, further large multicenter studies are required to validate results internally and externally, as well as to explore and address limitations of ML application in bariatric surgery.
Affiliation(s)
- Bassey Enodien
- Department of Surgery, GZO-Hospital, Wetzikon, Switzerland
- Stephanie Taha-Mehlitz
- Clarunis, University Centre for Gastrointestinal and Liver Diseases, St. Clara Hospital and University Hospital, Basel, Switzerland
- Baraa Saad
- School of Medicine, St George's University of London, London, United Kingdom
- Maya Nasser
- School of Medicine, St George's University of London, London, United Kingdom
- Daniel M Frey
- Department of Biomedical Engineering, Faculty of Medicine, University of Basel, Allschwil, Switzerland
- Anas Taha
- Clarunis, University Centre for Gastrointestinal and Liver Diseases, St. Clara Hospital and University Hospital, Basel, Switzerland
- Department of Biomedical Engineering, Faculty of Medicine, University of Basel, Allschwil, Switzerland
3
Devana SK, Shah AA, Lee C, Jensen AR, Cheung E, van der Schaar M, SooHoo NF. Development of a Machine Learning Algorithm for Prediction of Complications and Unplanned Readmission Following Primary Anatomic Total Shoulder Replacements. J Shoulder Elb Arthroplast 2022; 6:24715492221075444. [PMID: 35669619 PMCID: PMC9163721 DOI: 10.1177/24715492221075444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2021] [Revised: 12/23/2021] [Accepted: 01/05/2022] [Indexed: 11/16/2022] Open
Abstract
Background The demand for and incidence of anatomic total shoulder arthroplasty (aTSA) procedures are projected to increase substantially over the next decade. There is a paucity of accurate risk prediction models, which would be of great utility in minimizing morbidity and costs associated with major post-operative complications. Machine learning is a powerful predictive modeling tool and has become increasingly popular, especially in orthopedics. We aimed to build an ML model for prediction of major complications and readmission following primary aTSA. Methods A large California administrative database was retrospectively reviewed for all adults undergoing primary aTSA between 2015 and 2017. The primary outcome was any major complication or readmission following aTSA. A wide scope of standard ML benchmarks, including logistic regression (LR), XGBoost, Gradient Boosting, AdaBoost, and Random Forest, were employed to determine their power to predict outcomes. Additionally, patient features important to the prediction models were identified. Results There were a total of 10,302 aTSAs, with 598 (5.8%) having at least one major post-operative complication or readmission. XGBoost had the highest discriminative power (area under the receiver operating characteristic curve [AUROC] of 0.689) of the 5 ML benchmarks, with an area under the precision-recall curve (AUPRC) of 0.207. History of implant complication, severe chronic kidney disease, teaching hospital status, coronary artery disease, and male sex were the most important features for the performance of XGBoost. In addition, XGBoost identified teaching hospital status and male sex as markedly more important predictors of outcomes compared to LR models. Conclusion We report a well-calibrated XGBoost ML algorithm for predicting major complications and 30-day readmission following aTSA. History of prior implant complication was the most important patient feature for XGBoost performance, a novel patient feature that surgeons should consider when counseling patients.
Affiliation(s)
- Sai K Devana
- David Geffen School of Medicine UCLA, Los Angeles, CA
- Akash A Shah
- David Geffen School of Medicine UCLA, Los Angeles, CA
- Edward Cheung
- David Geffen School of Medicine UCLA, Los Angeles, CA
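The abstract above reports discrimination as AUROC and AUPRC. As a reading aid, here is a minimal sketch of what AUROC measures: the fraction of (positive, negative) pairs in which the positive case receives the higher score. The function name `auroc` and the toy labels and scores are hypothetical illustrations, not data or code from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = complication or readmission, 0 = none).
toy_labels = [0, 0, 1, 0, 1, 0]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]
toy_auroc = auroc(toy_labels, toy_scores)

# Event rate taken from the abstract: 598 of 10,302 procedures.
# A classifier with no skill has an expected AUPRC close to this prevalence,
# which gives some context for reading a reported AUPRC.
prevalence = 598 / 10_302
```

The pairwise form is quadratic in the number of cases and is meant only to make the definition concrete; rank-based implementations are the usual choice for large datasets.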
4
Shahi N, Shahi AK, Phillips R, Shirek G, Lindberg DM, Moulton SL. Using deep learning and natural language processing models to detect child physical abuse. J Pediatr Surg 2021; 56:2326-32. [PMID: 33838900 DOI: 10.1016/j.jpedsurg.2021.03.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/04/2020] [Revised: 03/02/2021] [Accepted: 03/14/2021] [Indexed: 11/24/2022]
Abstract
BACKGROUND The recognition of child physical abuse can be challenging and often requires a multidisciplinary assessment. Deep learning models, based on clinical characteristics, laboratory studies, and imaging findings, were developed to facilitate unbiased identification of children who may have been abused. METHODS Level 1 pediatric trauma center registry data from 1/1/2010-1/31/2020 were queried for abused children and matched participants with non-abusive trauma. Observations were de-identified and divided into training and validation sets. Model 1 used patient demographics (age, gender, and insurance type) and clinical characteristics (vital signs, shock index pediatric age-adjusted, Glasgow Coma Score, lactate, base deficit, and international normalized ratio). Model 2 used the same features as Model 1, but with the text of the radiology reports of head computed tomography, brain MRIs, and skeletal surveys. Google's latest BERT Natural Language Processing (NLP) model, which was pre-trained on a large corpus, was used for fine-tuning Model 2. Accuracy, sensitivity, specificity, F1 scores, and positive predictive values were used to assess performance. RESULTS Of 1,312 patients, 737 (56.2%) were abused. Model 1 had an accuracy of 86.3%, sensitivity of 87.2%, specificity of 85.1%, F1 score of 0.86, and positive predictive value (PPV) of 88.7% for the validation set, with an area under the receiver operating characteristic curve (ROC AUC) of 0.86. NLP-based Model 2 had an accuracy of 93.4%, sensitivity of 92.5%, specificity of 94.6%, F1 score of 0.93, and PPV of 95.9% for the validation set, with a ROC AUC of 0.94. Most features had weak individual correlations with abuse (r < 0.3). CONCLUSIONS Deep learning models accurately distinguished child physical abuse from non-abuse, and NLP further improved the accuracy of the models. Such models could be developed to run in real time in the electronic medical record and alert clinicians when certain criteria are met, which would prompt them to pursue the diagnosis of abuse. LEVEL OF EVIDENCE: III. STUDY TYPE: Diagnostic.
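The performance measures named in the abstract above (accuracy, sensitivity, specificity, F1 score, and PPV) all derive from the four cells of a confusion matrix. The sketch below shows those relationships; the function name `classification_metrics` and the counts passed to it are hypothetical and are not the study's validation-set counts, which the abstract does not report.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Derive common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases that were flagged (recall)
    specificity = tn / (tn + fp)   # share of non-cases correctly left unflagged
    ppv = tp / (tp + fp)           # share of flagged cases that were true (precision)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f1": f1,
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
example = classification_metrics(tp=90, fp=10, tn=80, fn=20)
```

Unlike sensitivity and specificity, PPV depends on how common the condition is in the evaluated sample, so a PPV reported on a matched cohort may not carry over to a population with a different prevalence.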
5
Henn J, Buness A, Schmid M, Kalff JC, Matthaei H. Machine learning to guide clinical decision-making in abdominal surgery: a systematic literature review. Langenbecks Arch Surg 2021. [PMID: 34716472 DOI: 10.1007/s00423-021-02348-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/05/2021] [Accepted: 10/03/2021] [Indexed: 12/16/2022]
Abstract
PURPOSE An indication for surgical therapy includes balancing benefits against risk, which remains a key task in all surgical disciplines. Decisions are oftentimes based on clinical experience while guidelines lack evidence-based background. Various medical fields have capitalized on the application of machine learning (ML), and preliminary research suggests promising implications for surgeons' workflow. Hence, we evaluated ML's contemporary and possible future role in clinical decision-making (CDM), focusing on abdominal surgery. METHODS Using the PICO framework, relevant keywords and research questions were identified. Following the PRISMA guidelines, a systematic search of the PubMed database was conducted. Results were filtered by distinct criteria, and the full text of selected articles was reviewed manually. RESULTS The literature search returned 4,396 articles, of which 47 matched the search criteria. The mean number of patients included was 55,843. A total of eight distinct ML techniques were evaluated, and most authors used AUROC to compare ML predictions with conventional CDM routines. Most authors (N = 30/47, 63.8%) stated ML's superiority in the prediction of benefits and risks of surgery. The identification of highly relevant parameters to be integrated into algorithms, allowing a more precise prognosis, was emphasized as the main advantage of ML in CDM. CONCLUSIONS A potential value of ML for surgical decision-making was demonstrated in several scientific articles. However, the low number of publications, with only a few collaborative studies between surgeons and computer scientists, underscores the early phase of this highly promising field. Interdisciplinary research initiatives combining existing clinical datasets and emerging techniques of data processing are likely to improve CDM in abdominal surgery in the future.
6
Devana SK, Shah AA, Lee C, Roney AR, van der Schaar M, SooHoo NF. A Novel, Potentially Universal Machine Learning Algorithm to Predict Complications in Total Knee Arthroplasty. Arthroplast Today 2021; 10:135-143. [PMID: 34401416 PMCID: PMC8349766 DOI: 10.1016/j.artd.2021.06.020] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/24/2021] [Revised: 06/23/2021] [Accepted: 06/25/2021] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND There remains a lack of accurate and validated outcome-prediction models in total knee arthroplasty (TKA). While machine learning (ML) is a powerful predictive tool, determining the proper algorithm to apply across diverse data sets is challenging. AutoPrognosis (AP) is a novel method that uses an automated ML framework to incorporate the best-performing stages of prognostic modeling into a single well-calibrated algorithm. We aimed to compare various ML methods to AP in predictive performance for complications after TKA. METHODS Thirty-eight preoperative patient demographics and clinical features from all primary TKAs performed at California-licensed hospitals between 2015 and 2017 were evaluated as predictors of major complications after TKA. Traditional logistic regression (LR), various other ML methods (XGBoost, Gradient Boosting, AdaBoost, and Random Forest), and AP were used for model building to determine discriminative power (area under the receiver operating characteristic curve), calibration (Brier score), and feature importance. RESULTS Between 2015 and 2017, there were a total of 156,750 TKAs with 1109 (0.7%) total major complications. AP had the highest discriminative performance, with an area under the receiver operating characteristic curve of 0.679, compared with LR, XGBoost, Gradient Boosting, AdaBoost, and Random Forest (0.617, 0.601, 0.662, 0.657, and 0.545, respectively). AP (Brier score 0.007) had similar calibration to the other ML methods (0.006, 0.006, 0.022, 0.007, and 0.008, respectively). The variables that are most important for AP differ from those that are most important for LR. CONCLUSION Compared to conventional ML algorithms, AP has superior discriminative ability with similar calibration and suggests nonlinear relationships between variables in outcomes of TKA.
Affiliation(s)
- Sai K. Devana
- Department of Orthopaedic Surgery, University of California, Los Angeles, USA
- Akash A. Shah
- Department of Orthopaedic Surgery, University of California, Los Angeles, USA
- Changhee Lee
- Department of Electrical and Computer Engineering, University of California, Los Angeles, USA
- Andrew R. Roney
- Department of Orthopaedic Surgery, University of California, Los Angeles, USA
- Mihaela van der Schaar
- Department of Electrical and Computer Engineering, University of California, Los Angeles, USA
- Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- The Alan Turing Institute, London, UK
- Nelson F. SooHoo
- Department of Orthopaedic Surgery, University of California, Los Angeles, USA
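The abstract above uses the Brier score as its calibration measure. As a hedged illustration of how that score behaves when the outcome is rare, the sketch below computes the Brier score of a hypothetical model that assigns every patient the overall event rate. Only the two counts (1109 complications among 156,750 TKAs) come from the abstract; the function name `brier_score` and the constant-prediction baseline are illustrative assumptions, not part of the study.

```python
from typing import Sequence


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and observed outcome."""
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Counts reported in the abstract.
n_total = 156_750
n_events = 1_109
base_rate = n_events / n_total

# Hypothetical baseline: predict the base rate for every patient.
labels = [1] * n_events + [0] * (n_total - n_events)
constant_brier = brier_score(labels, [base_rate] * n_total)

# For a constant prediction equal to the base rate, the Brier score
# reduces algebraically to base_rate * (1 - base_rate).
closed_form = base_rate * (1 - base_rate)
```

Under these assumptions the baseline comes out near 0.007, which suggests that Brier scores for rare outcomes are best read alongside a discrimination measure such as AUROC rather than on their own.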
7
Khan K, Ramsahai E. Maintaining proper health records improves machine learning predictions for novel 2019-nCoV. BMC Med Inform Decis Mak 2021; 21:172. [PMID: 34044839 PMCID: PMC8159067 DOI: 10.1186/s12911-021-01537-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2020] [Accepted: 05/23/2021] [Indexed: 11/19/2022] Open
Abstract
Background An ongoing outbreak of a novel coronavirus (2019-nCoV) pneumonia continues to affect the whole world, including major countries such as China, USA, Italy, France and the United Kingdom. We present outcome (‘recovered’, ‘isolated’ or ‘death’) risk estimates of 2019-nCoV over ‘early’ datasets. A major consideration is the likelihood of death for patients with 2019-nCoV. Method Accounting for the impact of the variations in the reporting rate of 2019-nCoV, we used machine learning techniques (AdaBoost, bagging, extra-trees, decision trees and k-nearest neighbour classifiers) on two 2019-nCoV datasets obtained from Kaggle on March 30, 2020. We used ‘country’, ‘age’ and ‘gender’ as features to predict outcome for both datasets. We included the patient’s ‘disease’ history (only present in the second dataset) to predict the outcome for the second dataset. Results The use of a patient’s ‘disease’ history improves the prediction of ‘death’ by more than sevenfold. The models ignoring a patient’s ‘disease’ history performed poorly in test predictions. Conclusion Our findings indicate the potential of using a patient’s ‘disease’ history as part of the feature set in machine learning techniques to improve 2019-nCoV predictions. This development can have a positive effect on predictive patient treatment and can result in easing currently overburdened healthcare systems worldwide, especially with the increasing prevalence of second and third wave re-infections in some countries.
Affiliation(s)
- Koffka Khan
- Department of Computing and Information Technology, The University of the West Indies, St. Augustine, Trinidad and Tobago.
- Emilie Ramsahai
- UWI School of Business & Applied Studies Ltd (UWI-ROYTEC), 136-138 Henry Street, 24105, Port of Spain, Trinidad and Tobago
8
Seib CD, Roose JP, Hubbard AE, Suh I. Ensemble machine learning for the prediction of patient-level outcomes following thyroidectomy. Am J Surg 2020; 222:347-353. [PMID: 33339618 DOI: 10.1016/j.amjsurg.2020.11.055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/01/2020] [Revised: 10/17/2020] [Accepted: 11/25/2020] [Indexed: 11/25/2022]
Abstract
BACKGROUND Accurate prediction of thyroidectomy complications is necessary to inform treatment decisions. Ensemble machine learning provides one approach to improve prediction. METHODS We applied the Super Learner (SL) algorithm to the 2016-2018 thyroidectomy-specific NSQIP database to predict complications following thyroidectomy. Cross-validation was used to assess model discrimination and precision. RESULTS For the 17,987 patients undergoing thyroidectomy, rates of recurrent laryngeal nerve injury, post-operative hypocalcemia prior to discharge, post-operative hypocalcemia within 30 days, and neck hematoma were 6.1%, 6.4%, 9.0%, and 1.8%, respectively. SL improved prediction of thyroidectomy-specific outcomes when compared with benchmark logistic regression approaches. For post-operative hypocalcemia prior to discharge, SL improved the cross-validated AUROC to 0.72 (95% CI 0.70-0.74) compared to 0.70 (95% CI 0.68-0.72; p < 0.001) when using a manually curated logistic regression algorithm. CONCLUSION Ensemble machine learning modestly improves prediction for thyroidectomy-specific outcomes. SL holds promise to provide more accurate patient-level risk prediction to inform treatment decisions.
Affiliation(s)
- Carolyn D Seib
- Stanford-Surgery Policy Improvement Research and Education Center (S-SPIRE), Department of Surgery, Stanford University School of Medicine, Stanford, CA, United States; Division of General Surgery, Palo Alto Veterans Affairs Health Care System, United States.
- James P Roose
- University of California, Berkeley, Division of Biostatistics, Berkeley, United States
- Alan E Hubbard
- University of California, Berkeley, Division of Biostatistics, Berkeley, United States
- Insoo Suh
- University of California, San Francisco, Section of Endocrine Surgery, San Francisco, United States
9
Cao Y, Montgomery S, Ottosson J, Näslund E, Stenberg E. Deep Learning Neural Networks to Predict Serious Complications After Bariatric Surgery: Analysis of Scandinavian Obesity Surgery Registry Data. JMIR Med Inform 2020; 8:e15992. [PMID: 32383681 PMCID: PMC7244994 DOI: 10.2196/15992] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/25/2019] [Revised: 01/07/2020] [Accepted: 02/07/2020] [Indexed: 12/13/2022] Open
Abstract
Background Obesity is one of today’s most visible public health problems worldwide. Although modern bariatric surgery is ostensibly considered safe, serious complications and mortality still occur in some patients. Objective This study aimed to explore whether serious postoperative complications of bariatric surgery recorded in a national quality registry can be predicted preoperatively using deep learning methods. Methods Patients who were registered in the Scandinavian Obesity Surgery Registry (SOReg) between 2010 and 2015 were included in this study. The patients who underwent a bariatric procedure between 2010 and 2014 were used as training data, and those who underwent a bariatric procedure in 2015 were used as test data. Postoperative complications were graded according to the Clavien-Dindo classification, and complications requiring intervention under general anesthesia or resulting in organ failure or death were considered serious. Three supervised deep learning neural networks were applied and compared in our study: multilayer perceptron (MLP), convolutional neural network (CNN), and recurrent neural network (RNN). The synthetic minority oversampling technique (SMOTE) was used to artificially augment the patients with serious complications. The performances of the neural networks were evaluated using accuracy, sensitivity, specificity, Matthews correlation coefficient, and area under the receiver operating characteristic curve. Results In total, 37,811 and 6250 patients were used as the training data and test data, with incidence rates of serious complication of 3.2% (1220/37,811) and 3.0% (188/6250), respectively. When trained using the SMOTE data, the MLP appeared to have a desirable performance, with an area under curve (AUC) of 0.84 (95% CI 0.83-0.85). However, its performance was low for the test data, with an AUC of 0.54 (95% CI 0.53-0.55). The performance of CNN was similar to that of MLP. 
It generated AUCs of 0.79 (95% CI 0.78-0.80) and 0.57 (95% CI 0.59-0.61) for the SMOTE data and test data, respectively. Compared with the MLP and CNN, the RNN showed worse performance, with AUCs of 0.65 (95% CI 0.64-0.66) and 0.55 (95% CI 0.53-0.57) for the SMOTE data and test data, respectively. Conclusions MLP and CNN showed improved, but limited, ability for predicting the postoperative serious complications after bariatric surgery in the Scandinavian Obesity Surgery Registry data. However, the overfitting issue is still apparent and needs to be overcome by incorporating intra- and perioperative information.
Affiliation(s)
- Yang Cao
- Clinical Epidemiology and Biostatistics, School of Medical Sciences, Örebro University, Örebro, Sweden
- Scott Montgomery
- Clinical Epidemiology and Biostatistics, School of Medical Sciences, Örebro University, Örebro, Sweden
- Clinical Epidemiology Division, Department of Medicine, Karolinska Institutet, Stockholm, Sweden
- Department of Epidemiology and Public Health, University College London, London, United Kingdom
- Johan Ottosson
- Department of Surgery, Faculty of Medicine and Health, Örebro University, Örebro, Sweden
- Erik Näslund
- Division of Surgery, Department of Clinical Sciences, Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Erik Stenberg
- Department of Surgery, Faculty of Medicine and Health, Örebro University, Örebro, Sweden
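The abstract above relies on the synthetic minority oversampling technique (SMOTE) to augment the rare class. The following is a minimal sketch of the core idea only: each synthetic case is placed at a random point on the line segment between a minority-class case and one of its nearest minority-class neighbours. The function name `smote_sketch`, its parameters, and the toy points are hypothetical; this is not the authors' code, and maintained implementations exist in libraries such as imbalanced-learn.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def smote_sketch(minority: List[Point], n_new: int, k: int = 3, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic minority points by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic: List[List[float]] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class (for example, scaled age and BMI).
toy_minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(toy_minority, n_new=6)
```

Because every synthetic point is a blend of existing minority cases, oversampling adds no new information about unseen patients, which is one plausible reason for performance on oversampled training data to exceed performance on an untouched test set. Oversampling is normally applied to the training split only.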
10
Parimbelli E, Szymon W, O'Sullivan D, Kingwell S, Michalowski W, Michalowski M. How Do Spinal Surgeons Perceive The Impact of Factors Used in Post-Surgical Complication Risk Scores? AMIA Annu Symp Proc 2020; 2019:699-706. [PMID: 32308865 PMCID: PMC7153101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
When deciding about surgical treatment options, an important aspect of the decision-making process is the potential risk of complications. A risk assessment performed by a spinal surgeon is based on their knowledge of the best available evidence and on their own clinical experience. The objective of this work is to demonstrate the differences in the way spine surgeons perceive the importance of attributes used to calculate the risk of post-operative complications and to quantify the differences by building individual formal models of risk perceptions. We employ a preference-learning method, ROR-UTADIS, to build surgeon-specific additive value functions for risk of complications. Comparing these functions enables the identification and discussion of differences among personal perceptions of risk factors. Our results show that surgeons differ in how they perceive factors including primary diagnosis, type of surgery, patient's age, body mass index, and presence of comorbidities.
Affiliation(s)
- Wilk Szymon
- University of Ottawa, Ottawa, ON, Canada
- Poznan University of Technology, Poznan, Poland
11
Quddusi A, Eversdijk HAJ, Klukowska AM, de Wispelaere MP, Kernbach JM, Schröder ML, Staartjes VE. External validation of a prediction model for pain and functional outcome after elective lumbar spinal fusion. Eur Spine J 2020; 29:374-83. [PMID: 31641905 DOI: 10.1007/s00586-019-06189-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/19/2019] [Revised: 09/16/2019] [Accepted: 10/13/2019] [Indexed: 12/23/2022]
Abstract
OBJECTIVE Patient-reported outcome measures following elective lumbar fusion surgery demonstrate major heterogeneity. Individualized prediction tools can provide valuable insights for shared decision-making. We externally validated the spine surgical care and outcomes assessment programme/comparative effectiveness translational network (SCOAP-CERTAIN) model for prediction of 12-month minimum clinically important difference in Oswestry Disability Index (ODI) and in numeric rating scales for back (NRS-BP) and leg pain (NRS-LP) after elective lumbar fusion. METHODS Data from a prospective registry were obtained. We calculated the area under the curve (AUC), calibration slope and intercept, and Hosmer-Lemeshow values to estimate discrimination and calibration of the models. RESULTS We included 100 patients, with an average age of 50.4 ± 11.4 years. For 12-month ODI, AUC was 0.71, while the calibration intercept and slope were 1.08 and 0.95, respectively. For NRS-BP, AUC was 0.72, with a calibration intercept of 1.02 and slope of 0.74. For NRS-LP, AUC was 0.83, with a calibration intercept of 1.08 and slope of 0.95. Sensitivity ranged from 0.64 to 1.00, while specificity ranged from 0.38 to 0.65. A lack of fit was found for all three models based on Hosmer-Lemeshow testing. CONCLUSIONS The SCOAP-CERTAIN tool can accurately predict which patients will achieve favourable outcomes. However, the predicted probabilities reported by the tool, which are the most valuable in clinical practice, do not correspond well to the true probability of a favourable outcome. We suggest that any prediction tool should first be externally validated before it is applied in routine clinical practice.
12
Fontana MA, Lyman S, Sarker GK, Padgett DE, MacLean CH. Can Machine Learning Algorithms Predict Which Patients Will Achieve Minimally Clinically Important Differences From Total Joint Arthroplasty? Clin Orthop Relat Res 2019; 477:1267-1279. [PMID: 31094833 PMCID: PMC6554103 DOI: 10.1097/corr.0000000000000687] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Received: 09/14/2018] [Accepted: 01/30/2019] [Indexed: 02/06/2023]
Abstract
BACKGROUND Identifying patients at risk of not achieving meaningful gains in long-term postsurgical patient-reported outcome measures (PROMs) is important for improving patient monitoring and facilitating presurgical decision support. Machine learning may help automatically select and weigh many predictors to create models that maximize predictive power. However, these techniques are underused among studies of total joint arthroplasty (TJA) patients, particularly those exploring changes in postsurgical PROMs. QUESTIONS/PURPOSES: (1) To evaluate whether machine learning algorithms, applied to hospital registry data, could predict patients who would not achieve a minimally clinically important difference (MCID) in four PROMs 2 years after TJA; (2) to explore how predictive ability changes as more information is included in modeling; and (3) to identify which variables drive the predictive power of these models. METHODS Data from a single, high-volume institution's TJA registry were used for this study. We identified 7239 hip and 6480 knee TJAs between 2007 and 2012 for which patients had completed both baseline and 2-year follow-up surveys for at least one PROM (among 19,187 TJAs in our registry and 43,313 total TJAs). In all, 12,203 registry TJAs had valid SF-36 physical component scores (PCS) and mental component scores (MCS) at baseline and 2 years; 7085 and 6205 had valid Hip and Knee Disability and Osteoarthritis Outcome Scores for joint replacement (HOOS JR and KOOS JR scores), respectively. Supervised machine learning refers to a class of algorithms that learn a mapping from inputs to an output based on many input-output examples. We trained three of the most popular such algorithms (logistic least absolute shrinkage and selection operator (LASSO), random forest, and linear support vector machine) to predict 2-year postsurgical MCIDs.
We incrementally considered predictors available at four time points: (1) before the decision to have surgery, (2) before surgery, (3) before discharge, and (4) immediately after discharge. We evaluated the performance of each model using area under the receiver operating characteristic (AUROC) statistics on a validation sample composed of a random 20% subsample of TJAs excluded from modeling. We also considered abbreviated models that only used baseline PROMs and procedure as predictors (to isolate their predictive power). We further directly evaluated which variables were ranked by each model as most predictive of 2-year MCIDs. RESULTS The three machine learning algorithms performed in the poor-to-good range for predicting 2-year MCIDs, with AUROCs ranging from 0.60 to 0.89. They performed virtually identically for a given PROM and time point. AUROCs for the logistic LASSO models for predicting SF-36 PCS 2-year MCIDs at the four time points were: 0.69, 0.78, 0.78, and 0.78, respectively; for SF-36 MCS 2-year MCIDs, AUROCs were: 0.63, 0.89, 0.89, and 0.88; for HOOS JR 2-year MCIDs: 0.67, 0.78, 0.77, and 0.77; for KOOS JR 2-year MCIDs: 0.61, 0.75, 0.75, and 0.75. Before-surgery models performed in the fair-to-good range and consistently ranked the associated baseline PROM as among the most important predictors. Abbreviated LASSO models performed worse than the full before-surgery models, though they retained much of the predictive power of the full before-surgery models. CONCLUSIONS Machine learning has the potential to improve clinical decision-making and patient care by helping to prioritize resources for postsurgical monitoring and informing presurgical discussions of likely outcomes of TJA. Applied to presurgical registry data, such models can predict, with fair-to-good ability, 2-year postsurgical MCIDs. Although we report all parameters of our best-performing models, they cannot simply be applied off-the-shelf without proper testing. 
Our analyses indicate that machine learning holds much promise for predicting orthopaedic outcomes. LEVEL OF EVIDENCE: Level III, diagnostic study.
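The evaluation step described in this abstract (a random 20% holdout sample scored by AUROC) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names `holdout_split` and `auroc` are hypothetical, the toy labels and scores are made up, and the models themselves (LASSO, random forest, linear SVM) are not reproduced here.

```python
import random


def holdout_split(n_rows, holdout_fraction=0.2, seed=0):
    """Return (train_idx, holdout_idx); the holdout is a random subsample."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_holdout = int(round(n_rows * holdout_fraction))
    return sorted(indices[n_holdout:]), sorted(indices[:n_holdout])


def auroc(labels, scores):
    """Probability that a random positive case outscores a random negative case.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = did not achieve the MCID, 0 = achieved it.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical model-predicted risks
print(auroc(toy_labels, toy_scores))

train_idx, holdout_idx = holdout_split(100)
print(len(train_idx), len(holdout_idx))
```

The pairwise definition is quadratic in the number of cases, which is fine for illustration; on registry-scale data a rank-based computation or an established library routine would be the more practical choice.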
Affiliation(s)
- Mark Alan Fontana
- M. A. Fontana, S. Lyman, G. K. Sarker, D. E. Padgett, C. H. MacLean, Hospital for Special Surgery, Center for the Advancement of Value in Musculoskeletal Care, New York, NY, USA; M. A. Fontana, S. Lyman, Weill Cornell Medical College, Department of Healthcare Policy and Research, New York, NY, USA.
|
13
|
Cao Y, Fang X, Ottosson J, Näslund E, Stenberg E. A Comparative Study of Machine Learning Algorithms in Predicting Severe Complications after Bariatric Surgery. J Clin Med 2019; 8:jcm8050668. [PMID: 31083643 PMCID: PMC6571760 DOI: 10.3390/jcm8050668] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 04/03/2019] [Revised: 05/08/2019] [Accepted: 05/10/2019] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Severe obesity is a global public health threat of growing proportions. Accurate models to predict severe postoperative complications could be of value in the preoperative assessment of potential candidates for bariatric surgery. So far, traditional statistical methods have failed to produce high accuracy. We aimed to find a useful machine learning (ML) algorithm to predict the risk of severe complications after bariatric surgery. METHODS We trained and compared 29 supervised ML algorithms using information from 37,811 patients who underwent a bariatric surgical procedure between 2010 and 2014 in Sweden. The algorithms were then tested on 6250 patients operated on in 2015. We applied the synthetic minority oversampling technique to address the issue that only 3% of patients experienced severe complications. RESULTS Most of the ML algorithms showed high accuracy (>90%) and specificity (>90%) in both the training and test data. However, none of the algorithms achieved an acceptable sensitivity in the test data. We also tried to tune the hyperparameters of the algorithms to maximize sensitivity, but have not yet identified one with a sensitivity high enough for use in clinical practice in bariatric surgery. However, a minor, but perceptible, improvement in deep neural network (NN) ML was found. CONCLUSION In predicting severe postoperative complications among bariatric surgery patients, ensemble algorithms outperform base algorithms. Compared with other ML algorithms, deep NN has the potential to improve accuracy and deserves further investigation. The oversampling technique should be considered in the context of imbalanced data, where the number of outcomes of interest is relatively small.
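The combination reported here (accuracy and specificity above 90% with poor sensitivity) is what a rare outcome tends to produce, and it is the motivation for oversampling. The sketch below is illustrative only and is not the authors' pipeline: `confusion_metrics` and `smote_oversample` are hypothetical names, the data are made up, and the oversampler is a simplified version of the synthetic minority oversampling idea (interpolating between a minority case and one of its nearest minority neighbours).

```python
import math
import random


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = complication)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points.

    Each point lies on the segment between a randomly chosen minority sample
    and one of its k nearest minority neighbours (Euclidean distance).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Made-up cohort with a 3% complication rate and a model that never flags anyone.
y_true = [1] * 30 + [0] * 970
y_pred = [0] * 1000
print(confusion_metrics(y_true, y_pred))  # high accuracy, zero sensitivity

# Made-up two-feature minority cases, oversampled with synthetic points.
minority_cases = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(smote_oversample(minority_cases, n_new=4, k=2))
```

Oversampling is normally applied to the training data only, so that the test set keeps the real outcome prevalence; that convention is an assumption here, since the abstract does not spell it out.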
Affiliation(s)
- Yang Cao
- Clinical Epidemiology and Biostatistics, School of Medical Sciences, Örebro University, Örebro, Sweden.
- Xin Fang
- Unit of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden.
- Johan Ottosson
- Department of Surgery, Faculty of Medicine and Health, Örebro University, Örebro, Sweden.
- Erik Näslund
- Division of Surgery, Department of Clinical Sciences, Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden.
- Erik Stenberg
- Department of Surgery, Faculty of Medicine and Health, Örebro University, Örebro, Sweden.
|
14
|
Devine B. Concordium 2016: Data and Knowledge Transforming Health. eGEMs 2017; 5:9. [PMID: 29881751 PMCID: PMC5983075 DOI: 10.13063/2327-9214.1306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Introduction and Context: Concordium 2016 celebrated the potential for data and knowledge to transform health. Through a series of plenaries, presentations, workshops and demonstrations, the conference highlighted projects among four themes: effectiveness and outcomes research, health care analytics and operations, public and population health, and quality improvement. Papers in the Special Issue: The eight papers that make up this special issue of eGEMs provide exemplars of solutions to the Big Data problems faced in today’s healthcare environment. Cross-Cutting Elements and Overlapping Themes: Several of the papers contain elements of multiple overlapping themes. We integrate these into five overlapping themes: telehealth, user-centered design/usability, clinic workflow, patient-centered care, and population health management through prediction modeling and risk adjustment. Conclusion and Future Directions: Leveraging all types of Big Data to improve health and healthcare is a monumental effort that will require the work of numerous stakeholders and will unfold incrementally over time. This collection of eight papers reflects the current state of the art. Concordium 2017 will take a different form, inviting a small set of leaders in the field to focus on the next round of exciting and provocative research currently underway to improve the nation’s health.
Affiliation(s)
- Beth Devine
- Department of Pharmaceutical Outcomes Research and Policy Program, Department of Health Services, Department of Biomedical Informatics, Department of Surgery at the University of Washington
|