1
Boal MWE, Anastasiou D, Tesfai F, Ghamrawi W, Mazomenos E, Curtis N, Collins JW, Sridhar A, Kelly J, Stoyanov D, Francis NK. Evaluation of objective tools and artificial intelligence in robotic surgery technical skills assessment: a systematic review. Br J Surg 2024; 111:znad331. PMID: 37951600; PMCID: PMC10771126; DOI: 10.1093/bjs/znad331.
Abstract
BACKGROUND There is a need to standardize training in robotic surgery, including objective assessment for accreditation. This systematic review aimed to identify objective tools for technical skills assessment, providing evaluation statuses to guide research and inform implementation into training curricula. METHODS A systematic literature search was conducted in accordance with the PRISMA guidelines. Ovid Embase/Medline, PubMed and Web of Science were searched. Inclusion criterion: robotic surgery technical skills tools. Exclusion criteria: non-technical skills, or laparoscopic/open skills only. Manual tools and automated performance metrics (APMs) were analysed using Messick's concept of validity and the Oxford Centre for Evidence-Based Medicine (OCEBM) Levels of Evidence and Recommendation (LoR). A bespoke tool was used to analyse artificial intelligence (AI) studies. The Modified Downs-Black checklist was used to assess risk of bias. RESULTS Two hundred and forty-seven studies were analysed, identifying 8 global rating scales, 26 procedure-/task-specific tools, 3 main error-based methods, 10 simulators, 28 studies analysing APMs, and 53 AI studies. The Global Evaluative Assessment of Robotic Skills and the da Vinci Skills Simulator were the most evaluated tools, at LoR 1 (OCEBM). Three procedure-specific tools, three error-based methods, and one non-simulator APM study reached LoR 2. AI models estimated outcomes (skill or clinical), demonstrating superior accuracy in the laboratory, where 60 per cent of methods reported accuracies over 90 per cent, compared with real surgery, where accuracies ranged from 67 to 100 per cent. CONCLUSIONS Manual and automated assessment tools for robotic surgery are not well validated and require further evaluation before use in accreditation processes. PROSPERO registration ID: CRD42022304901.
Affiliation(s)
- Matthew W E Boal
- The Griffin Institute, Northwick Park & St Mark's Hospital, London, UK
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- Dimitrios Anastasiou
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Medical Physics and Biomedical Engineering, UCL, London, UK
- Freweini Tesfai
- The Griffin Institute, Northwick Park & St Mark's Hospital, London, UK
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Walaa Ghamrawi
- The Griffin Institute, Northwick Park & St Mark's Hospital, London, UK
- Evangelos Mazomenos
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Medical Physics and Biomedical Engineering, UCL, London, UK
- Nathan Curtis
- Department of General Surgery, Dorset County Hospital NHS Foundation Trust, Dorchester, UK
- Justin W Collins
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
- Ashwin Sridhar
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
- John Kelly
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
- Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Computer Science, UCL, London, UK
- Nader K Francis
- The Griffin Institute, Northwick Park & St Mark's Hospital, London, UK
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- Yeovil District Hospital, Somerset Foundation NHS Trust, Yeovil, Somerset, UK
2
Atroshchenko GV, Navarra E, Valdis M, Sandoval E, Hashemi N, Cerny S, Pereda D, Palmen M, Bjerrum F, Bruun NH, Tolsgaard MG. Simulation-based assessment of robotic cardiac surgery skills: an international multicenter, cross-specialty trial. JTCVS Open 2023; 16:619-627. PMID: 38204726; PMCID: PMC10775167; DOI: 10.1016/j.xjon.2023.10.029.
Abstract
Objective This study aimed to investigate the validity of simulation-based assessment of robotic-assisted cardiac surgery skills using a wet lab model, focusing on a time-based score (TBS) and a modified Global Evaluative Assessment of Robotic Skills (mGEARS) score. Methods We tested 3 wet lab tasks (atrial closure, mitral annular stitches, and internal thoracic artery [ITA] dissection) with both experienced robotic cardiac surgeons and novices from multiple European centers. The tasks were assessed using 2 tools: the TBS and the mGEARS score. Reliability, internal consistency, and the ability to discriminate between different levels of competence were evaluated. Results The results demonstrated high internal consistency for all 3 tasks using the mGEARS assessment tool. The mGEARS score and TBS could reliably discriminate between different levels of competence for the atrial closure and mitral stitches tasks but not for the ITA dissection task. A generalizability study also revealed that it was feasible to assess competency on the atrial closure and mitral stitches tasks using mGEARS but not on the ITA dissection task. Pass/fail scores were established for each task using both the TBS and mGEARS assessment tools. Conclusions The study provides sufficient evidence for using TBS and mGEARS scores to evaluate robotic-assisted cardiac surgery skills in wet lab settings for intracardiac tasks. Combining both assessment tools enhances the evaluation of proficiency in robotic cardiac surgery, paving the way for standardized, evidence-based preclinical training and credentialing. Clinical trial registry number: NCT05043064.
Affiliation(s)
- Gennady V. Atroshchenko
- Department of Cardiothoracic Surgery, Aalborg University Hospital, Aalborg, Denmark
- ROCnord Robotic Centre Aalborg, Aalborg University Hospital, Aalborg, Denmark
- Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
- Emiliano Navarra
- Department of Cardiac Surgery, Ospedale Sant'Andrea, “Sapienza” University of Rome, Rome, Italy
- Matthew Valdis
- Division of Cardiac Surgery, Department of Surgery, Western University, London Health Sciences Center, London, Ontario, Canada
- Elena Sandoval
- Department of Cardiovascular Surgery, Hospital Clínic, Barcelona, Spain
- Nasseh Hashemi
- Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
- Nordsim, Aalborg University Hospital, Aalborg, Denmark
- Stepan Cerny
- Department of Cardiac Surgery, Na Homolce Hospital, Prague, Czech Republic
- Daniel Pereda
- Department of Cardiovascular Surgery, Hospital Clínic, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Enfermedades Cardiovasculares (CIBERCV), Madrid, Spain
- Meindert Palmen
- Department of Cardiothoracic Surgery, Leiden University Medical Center, Leiden, The Netherlands
- Department of Cardiothoracic Surgery, Amsterdam University Medical Center, Amsterdam, The Netherlands
- Flemming Bjerrum
- Department of Gastrointestinal and Hepatic Diseases, Copenhagen University Hospital–Herlev and Gentofte, Herlev, Denmark
- Copenhagen Academy for Medical Education and Simulation (CAMES), Rigshospitalet, Denmark
- Niels Henrik Bruun
- Unit of Clinical Biostatistics, Aalborg University Hospital, Aalborg, Denmark
- Martin G. Tolsgaard
- Copenhagen Academy for Medical Education and Simulation (CAMES), Rigshospitalet, Denmark
- Department of Obstetrics, Copenhagen University Hospital Rigshospitalet, Denmark
- Department of Medicine, University of Copenhagen, Denmark
3
Kutana S, Bitner DP, Addison P, Chung PJ, Talamini MA, Filicori F. Objective assessment of robotic surgical skills: review of literature and future directions. Surg Endosc 2022; 36:3698-3707. PMID: 35229215; DOI: 10.1007/s00464-022-09134-9.
Abstract
BACKGROUND Evaluation of robotic surgical skill has become increasingly important as robotic approaches to common surgeries are more widely adopted. However, these evaluations currently lack standardization. In this paper, we aimed to review the literature on robotic surgical skill evaluation. METHODS A review of the literature on robotic surgical skill evaluation over the past ten years was performed, and representative studies are presented. RESULTS Studies of reliability and validity in robotic surgical evaluation fall into two main assessment categories: manual and automatic. Manual assessments have been shown to be valid but are typically time-consuming and costly. Automatic evaluation and simulation are similarly valid and simpler to implement. Initial reports on skill evaluation using artificial intelligence platforms show validity. Few data on evaluation methods of surgical skill connect directly to patient outcomes. CONCLUSION As evaluation in surgery begins to incorporate robotic skills, a simultaneous shift from manual to automatic evaluation may occur, given the ease of implementing these technologies. Robotic platforms offer the unique benefit of providing more objective data streams, including kinematic data, which allow precise instrument tracking in the operative field. Such data streams will likely be incrementally incorporated into performance evaluations. Similarly, with advances in artificial intelligence, machine evaluation of human technical skill will likely form the next wave of surgical evaluation.
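The kinematic data streams discussed in this abstract lend themselves to simple objective metrics. As an illustrative sketch only (not any specific robotic platform's API), two commonly reported measures, path length and economy of motion, can be computed from a recorded sequence of instrument-tip positions:

```python
import math

def path_length(positions):
    """Total 3-D distance travelled by an instrument tip: the sum of
    straight-line distances between consecutive tracked positions."""
    return sum(math.dist(p, q) for p, q in zip(positions, positions[1:]))

def economy_of_motion(positions):
    """Ratio of net displacement (start to end) to actual path length.
    1.0 means a perfectly straight, efficient movement; lower values
    indicate wandering or redundant motion."""
    total = path_length(positions)
    if total == 0:
        return 0.0
    return math.dist(positions[0], positions[-1]) / total

# Hypothetical tracked positions (metres) illustrating the metrics
tip = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(path_length(tip), economy_of_motion(tip))
```

How such metrics map onto rated skill is exactly the open validation question the review raises; the functions above only show what a kinematic data stream makes computable.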
Affiliation(s)
- Saratu Kutana
- Intraoperative Performance Analytics Laboratory (IPAL), Department of General Surgery, Northwell Health, Lenox Hill Hospital, 186 E. 76th Street, 1st Floor, New York, NY, 10021, USA
- Daniel P Bitner
- Intraoperative Performance Analytics Laboratory (IPAL), Department of General Surgery, Northwell Health, Lenox Hill Hospital, 186 E. 76th Street, 1st Floor, New York, NY, 10021, USA
- Poppy Addison
- Intraoperative Performance Analytics Laboratory (IPAL), Department of General Surgery, Northwell Health, Lenox Hill Hospital, 186 E. 76th Street, 1st Floor, New York, NY, 10021, USA
- Paul J Chung
- Intraoperative Performance Analytics Laboratory (IPAL), Department of General Surgery, Northwell Health, Lenox Hill Hospital, 186 E. 76th Street, 1st Floor, New York, NY, 10021, USA
- Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Mark A Talamini
- Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Filippo Filicori
- Intraoperative Performance Analytics Laboratory (IPAL), Department of General Surgery, Northwell Health, Lenox Hill Hospital, 186 E. 76th Street, 1st Floor, New York, NY, 10021, USA
- Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
4
Abstract
OBJECTIVE This systematic review aims to examine the use of standard-setting methods in the context of simulation-based training of surgical procedures. SUMMARY OF BACKGROUND Simulation-based training is increasingly used in surgical education. However, it is important to determine which level of competency trainees must reach during simulation-based training before operating on patients. Therefore, pass/fail standards must be established using systematic, transparent, and valid methods. METHODS A systematic literature search was conducted in four databases (Ovid MEDLINE, Embase, Web of Science, and Cochrane Library). Original studies investigating simulation-based assessment of surgical procedures with application of a standard-setting method were included. Quality of evidence was appraised using GRADE. RESULTS Of 24,299 studies identified by the searches, 232 met the inclusion criteria. Publications using already established standard settings were excluded (n = 70), resulting in 162 original studies included in the final analyses. Most studies described how the standard setting was determined (n = 147, 91%), and most used the mean or median performance score of experienced surgeons (n = 65, 40%). We found considerable differences across most of the studies regarding study design, set-up, and expert-level classification. The studies were appraised as providing low to moderate quality of evidence. CONCLUSION Surgical education is shifting towards competency-based education, and simulation-based training is increasingly used for skills acquisition and assessment. Most studies consider and describe how standard settings are established, using more or less structured methods, but for current and future educational programs a critical approach is needed so that learners receive a fair, valid, and reliable assessment.
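The most common standard-setting approach this review identifies, benchmarking against the mean performance of experienced surgeons, can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular study; the `slack_sd` tolerance (how far below the expert mean a trainee may fall and still pass) is an assumed parameter introduced here for demonstration:

```python
import statistics

def expert_benchmark_cutoff(expert_scores, slack_sd=1.0):
    """Pass/fail cutoff derived from experienced surgeons' scores:
    the expert mean minus `slack_sd` standard deviations.
    With slack_sd=0 the cutoff is simply the expert mean score."""
    mean = statistics.mean(expert_scores)
    sd = statistics.stdev(expert_scores)
    return mean - slack_sd * sd

# Hypothetical expert scores on a 0-100 simulator metric
print(expert_benchmark_cutoff([65, 70, 75]))  # expert mean 70, sd 5
```

The review's point is precisely that the choice of benchmark (mean vs median, and how much tolerance to allow) varies widely across studies and is often set with little structure, so any such cutoff needs explicit justification.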
5
Olsen RG, Bjerrum F, Konge L, Jepsen JV, Azawi NH, Bube SH. Validation of a novel simulation-based test in robot-assisted radical prostatectomy. J Endourol 2021; 35:1265-1272. PMID: 33530867; DOI: 10.1089/end.2020.0986.
Abstract
Purpose: To investigate validity evidence for a simulator-based test in robot-assisted radical prostatectomy (RARP). Materials and Methods: The test consisted of three modules on the RobotiX Mentor VR simulator: Bladder Neck Dissection, Neurovascular Bundle Dissection, and Ureterovesical Anastomosis. Validity evidence was investigated using Messick's framework by including doctors with different levels of RARP experience: novices (who had assisted in RARP), intermediates (robotic surgeons, but not RARP surgeons), and experienced surgeons (RARP surgeons). The simulator metrics were analyzed, and Cronbach's alpha and generalizability theory were used to explore reliability. Intergroup comparisons were made with mixed-model, repeated-measures analysis of variance, and the correlation between the number of robotic procedures and the mean test score was examined. A pass/fail score was established using the contrasting groups method. Results: Ten novices, 11 intermediates, and 6 experienced RARP surgeons were included. Six metrics could discriminate between groups and showed acceptable internal consistency reliability (Cronbach's alpha = 0.49, p < 0.001). Test-retest reliability was 0.75, 0.85, and 0.90 for one, two, and three repetitions of the test, respectively. The six metrics were combined into a simulator score that could discriminate between all three groups: p = 0.002, p < 0.001, and p = 0.029 for novices vs intermediates, novices vs experienced, and intermediates vs experienced, respectively. The total number of robotic operations and the mean score of the three repetitions were significantly correlated (Pearson's r = 0.74, p < 0.001). Conclusion: This study provides validity evidence for a simulator-based test in RARP. We determined a pass/fail level that can be used to ensure competency before proceeding to supervised clinical training.
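The contrasting groups method used in this study sets the pass/fail score where the novice and expert score distributions intersect. A minimal sketch, assuming normally distributed scores in each group (the authors' exact implementation may differ in detail):

```python
import math
import statistics

def contrasting_groups_cutoff(novice_scores, expert_scores):
    """Pass/fail cutoff at the intersection of two normal curves fitted
    to novice and expert score samples (contrasting groups method)."""
    m1, s1 = statistics.mean(novice_scores), statistics.stdev(novice_scores)
    m2, s2 = statistics.mean(expert_scores), statistics.stdev(expert_scores)
    if math.isclose(s1, s2):
        # Equal spread: the curves cross exactly at the midpoint of the means
        return (m1 + m2) / 2
    # Otherwise solve pdf1(x) = pdf2(x), i.e. the quadratic
    # (x-m1)^2/s1^2 - (x-m2)^2/s2^2 + 2*ln(s1/s2) = 0
    a = 1 / s1**2 - 1 / s2**2
    b = 2 * (m2 / s2**2 - m1 / s1**2)
    c = m1**2 / s1**2 - m2**2 / s2**2 + 2 * math.log(s1 / s2)
    disc = math.sqrt(b**2 - 4 * a * c)
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    # Keep the crossing point that lies between the two group means
    return next(x for x in roots if min(m1, m2) <= x <= max(m1, m2))

# Hypothetical simulator scores for the two contrasting groups
print(contrasting_groups_cutoff([35, 38, 40, 42, 45], [65, 68, 70, 72, 75]))
```

A candidate scoring above the cutoff is more likely to have come from the expert distribution than the novice one; with the illustrative equal-spread samples above the cutoff falls midway between the group means.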
Affiliation(s)
- Rikke Groth Olsen
- Copenhagen Academy for Medical Education and Simulation (CAMES), Copenhagen, Denmark
- Flemming Bjerrum
- Copenhagen Academy for Medical Education and Simulation (CAMES), Copenhagen, Denmark
- Department of Surgery, Herlev/Gentofte Hospital, Herlev, Denmark
- Lars Konge
- Copenhagen Academy for Medical Education and Simulation (CAMES), Copenhagen, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jan Viberg Jepsen
- Copenhagen Academy for Medical Education and Simulation (CAMES), Copenhagen, Denmark
- Department of Urology, Herlev/Gentofte Hospital, Herlev, Denmark
- Nessn H Azawi
- Department of Surgery, Herlev/Gentofte Hospital, Herlev, Denmark
- Department of Urology, Zealand University Hospital, Roskilde, Denmark
- Sarah Hjartbro Bube
- Copenhagen Academy for Medical Education and Simulation (CAMES), Copenhagen, Denmark
- Department of Urology, Zealand University Hospital, Roskilde, Denmark
6
DeStephano CC, Nitsche JF, Heckman MG, Banks E, Hur HC. ACOG Simulation Working Group: a needs assessment of simulation training in OB/GYN residencies and recommendations for future research. J Surg Educ 2020; 77:661-670. PMID: 31859227; DOI: 10.1016/j.jsurg.2019.12.002.
Abstract
OBJECTIVE To evaluate the current availability of and needs for simulation training among obstetrics/gynecology (OB/GYN) residency programs. DESIGN Cross-sectional survey. SETTING Accreditation Council for Graduate Medical Education-accredited OB/GYN residency programs in the United States. PARTICIPANTS Residency program directors, gynecology simulation faculty, obstetrics simulation faculty, and fourth-year residents. RESULTS Of 673 invited participants, 251 (37.3%) completed the survey. Among the survey responses, OB procedures were more broadly represented in simulation teaching than GYN procedures: 8 (50%) of 16 OB procedures versus 4 (18.2%) of 22 GYN procedures had simulation teaching. Among the simulated procedures, a majority of residents and faculty reported that simulation teaching was available for operative vaginal delivery, postpartum hemorrhage, shoulder dystocia, perineal laceration repair, conventional laparoscopic procedures, and robotic surgery. There were significant differences between residents' and faculty's perceptions of the availability of and needs for simulated procedures, with a minority of residents having knowledge of Council on Resident Education in Obstetrics and Gynecology (47.2%) and American College of Obstetricians and Gynecologists (27.8%) simulation tools, compared to the majority of faculty (84.7% and 72.1%, respectively). More than 80% of trainees and faculty felt the average graduating resident could perform vaginal, laparoscopic, and abdominal hysterectomies independently. CONCLUSIONS Simulation is now widely available for both gynecologic and obstetric procedures, but there remains tremendous heterogeneity between programs and the perceptions of residents, program directors, and faculty. The variations in simulation training and readiness for performing different procedures following residency support the need for objective, validated assessments of actual performance to better guide resident learning and faculty teaching efforts.
Affiliation(s)
- Joshua F Nitsche
- Wake Forest School of Medicine Department of OB/GYN, Winston-Salem, North Carolina
- Michael G Heckman
- Mayo Clinic Department of Surgical Gynecology, Jacksonville, Florida
- Mayo Clinic Division of Biomedical Statistics and Informatics, Jacksonville, Florida
- Erika Banks
- Department of Obstetrics and Gynecology and Women's Health, Albert Einstein College of Medicine, New York, New York
- Hye-Chun Hur
- Division of Gynecologic Specialty Surgery, Department of Obstetrics and Gynecology, New York Presbyterian Hospital, Columbia University Medical Center, New York, New York