1. Lavanchy JL, Ramesh S, Dall'Alba D, Gonzalez C, Fiorini P, Müller-Stich BP, Nett PC, Marescaux J, Mutter D, Padoy N. Challenges in multi-centric generalization: phase and step recognition in Roux-en-Y gastric bypass surgery. Int J Comput Assist Radiol Surg 2024. PMID: 38761319. DOI: 10.1007/s11548-024-03166-3.
Abstract
PURPOSE Most studies on surgical activity recognition with artificial intelligence (AI) have focused on recognizing a single type of activity from small, mono-centric surgical video datasets; whether such models generalize to other centers remains speculative. METHODS In this work, we introduce a large multi-centric, multi-activity dataset of 140 surgical videos (MultiBypass140) of laparoscopic Roux-en-Y gastric bypass (LRYGB) surgeries performed at two medical centers: the University Hospital of Strasbourg, France (StrasBypass70) and Inselspital, Bern University Hospital, Switzerland (BernBypass70). The dataset has been fully annotated with phases and steps by two board-certified surgeons. Furthermore, we assess generalizability and benchmark different deep learning models for phase and step recognition in 7 experimental setups: (1) training and evaluation on BernBypass70; (2) training and evaluation on StrasBypass70; (3) training and evaluation on the joint MultiBypass140 dataset; (4) training on BernBypass70, evaluation on StrasBypass70; (5) training on StrasBypass70, evaluation on BernBypass70; and training on MultiBypass140 with (6) evaluation on BernBypass70 and (7) evaluation on StrasBypass70. RESULTS The models' performance is markedly influenced by the training data. The worst results were obtained in experiments (4) and (5), confirming the limited generalization capabilities of models trained on mono-centric data. The use of multi-centric training data, experiments (6) and (7), improves the generalization capabilities of the models, bringing them beyond the level of independent mono-centric training and evaluation (experiments (1) and (2)). CONCLUSION MultiBypass140 shows considerable variation in surgical technique and workflow of LRYGB procedures between centers. Accordingly, the generalization experiments reveal a remarkable difference in model performance.
These results highlight the importance of multi-centric datasets for AI model generalization to account for variance in surgical technique and workflows. The dataset and code are publicly available at https://github.com/CAMMA-public/MultiBypass140.
Affiliation(s)
- Joël L Lavanchy: University Digestive Health Care Center - Clarunis, 4002, Basel, Switzerland; Department of Biomedical Engineering, University of Basel, 4123, Allschwil, Switzerland; Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France
- Sanat Ramesh: Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France; ICube, University of Strasbourg, CNRS, 67000, Strasbourg, France; Altair Robotics Lab, University of Verona, 37134, Verona, Italy
- Diego Dall'Alba: Altair Robotics Lab, University of Verona, 37134, Verona, Italy
- Cristians Gonzalez: Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France; University Hospital of Strasbourg, 67000, Strasbourg, France
- Paolo Fiorini: Altair Robotics Lab, University of Verona, 37134, Verona, Italy
- Beat P Müller-Stich: University Digestive Health Care Center - Clarunis, 4002, Basel, Switzerland; Department of Biomedical Engineering, University of Basel, 4123, Allschwil, Switzerland
- Philipp C Nett: Department of Visceral Surgery and Medicine, Inselspital Bern University Hospital, 3010, Bern, Switzerland
- Didier Mutter: Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France; University Hospital of Strasbourg, 67000, Strasbourg, France
- Nicolas Padoy: Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France; ICube, University of Strasbourg, CNRS, 67000, Strasbourg, France
2. Knudsen JE, Ghaffar U, Ma R, Hung AJ. Clinical applications of artificial intelligence in robotic surgery. J Robot Surg 2024; 18:102. PMID: 38427094. PMCID: PMC10907451. DOI: 10.1007/s11701-024-01867-0.
Abstract
Artificial intelligence (AI) is revolutionizing nearly every aspect of modern life. In the medical field, robotic surgery is the sector with some of the most innovative and impactful advancements. In this narrative review, we outline recent contributions of AI to the field of robotic surgery with a particular focus on intraoperative enhancement. AI modeling gives surgeons access to advanced intraoperative metrics such as force and tactile measurements, enhances detection of positive surgical margins, and even allows complete automation of certain steps in surgical procedures. AI is also revolutionizing the field of surgical education. AI modeling applied to intraoperative surgical video feeds and instrument kinematics data allows the generation of automated skills assessments. AI also shows promise for the generation and delivery of highly specialized intraoperative surgical feedback for training surgeons. Although the adoption and integration of AI show promise in robotic surgery, they raise important, complex ethical questions. Frameworks for thinking through the ethical dilemmas raised by AI are outlined in this review. AI enhancement of robotic surgery is among the most groundbreaking research happening today, and the studies outlined in this review represent some of the most exciting innovations of recent years.
Affiliation(s)
- J Everett Knudsen: Keck School of Medicine, University of Southern California, Los Angeles, USA
- Runzhuo Ma: Cedars-Sinai Medical Center, Los Angeles, USA
3. Zhai Y, Chen Z, Zheng Z, Wang X, Yan X, Liu X, Yin J, Wang J, Zhang J. Artificial intelligence for automatic surgical phase recognition of laparoscopic gastrectomy in gastric cancer. Int J Comput Assist Radiol Surg 2024; 19:345-353. PMID: 37914911. DOI: 10.1007/s11548-023-03027-5.
Abstract
PURPOSE This study aimed to classify the phases of laparoscopic gastric cancer surgery, to develop a transformer-based artificial intelligence (AI) model for automatic surgical phase recognition, and to evaluate the model's performance on laparoscopic gastric cancer surgical videos. METHODS One hundred patients who underwent laparoscopic surgery for gastric cancer were included in this study. All surgical videos were labeled and classified into eight phases (P0: preparation; P1: separation of the greater gastric curvature; P2: separation of the distal stomach; P3: separation of the lesser gastric curvature; P4: dissection of the superior margin of the pancreas; P5: separation of the proximal stomach; P6: digestive tract reconstruction; P7: end of operation). This study proposed an AI phase recognition model consisting of a convolutional neural network-based visual feature extractor and a temporal relational transformer. RESULTS A visual and temporal relationship network was proposed to automatically perform accurate surgical phase prediction. The average duration of the surgical videos was 9114 ± 2571 s, and the longest phase was P1 (3388 s). The final accuracy, F1, recall, and precision were 90.128%, 87.04%, 87.04%, and 87.32%, respectively. The phase with the highest recognition accuracy was P1, and that with the lowest accuracy was P2. CONCLUSION An AI model based on convolutional and transformer networks was developed in this study. This model can accurately identify the phases of laparoscopic surgery for gastric cancer, and AI can serve as an analytical tool for gastric cancer surgical videos.
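Per-phase accuracy figures of the kind reported above (highest for P1, lowest for P2) can be reproduced from frame-level predictions with a simple tally. The sketch below is illustrative only, not the authors' evaluation code, and the phase labels and toy predictions are hypothetical:

```python
from collections import defaultdict

def per_phase_accuracy(y_true, y_pred):
    """Fraction of frames of each ground-truth phase that were predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {phase: correct[phase] / total[phase] for phase in total}

# Toy frame-level labels for three of the eight phases
truth = ["P0", "P0", "P1", "P1", "P1", "P2", "P2"]
pred  = ["P0", "P1", "P1", "P1", "P1", "P2", "P1"]
acc = per_phase_accuracy(truth, pred)  # {"P0": 0.5, "P1": 1.0, "P2": 0.5}
```

Averaging the per-phase values would give a macro accuracy; the overall frame accuracy weights phases by their duration instead.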
Affiliation(s)
- Yuhao Zhai: Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, 95 Yong'an Road, Xicheng District, Beijing, China; State Key Lab of Digestive Health, Beijing, China
- Zhen Chen: Centre for Artificial Intelligence and Robotics (CAIR), Hong Kong Institute of Science and Innovation, Chinese Academy of Sciences, Hong Kong SAR, China
- Zhi Zheng: Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, 95 Yong'an Road, Xicheng District, Beijing, China; State Key Lab of Digestive Health, Beijing, China
- Xi Wang: Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, 95 Yong'an Road, Xicheng District, Beijing, China; State Key Lab of Digestive Health, Beijing, China
- Xiaosheng Yan: Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, 95 Yong'an Road, Xicheng District, Beijing, China; State Key Lab of Digestive Health, Beijing, China
- Xiaoye Liu: Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, 95 Yong'an Road, Xicheng District, Beijing, China; State Key Lab of Digestive Health, Beijing, China
- Jie Yin: Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, 95 Yong'an Road, Xicheng District, Beijing, China; State Key Lab of Digestive Health, Beijing, China
- Jinqiao Wang: Centre for Artificial Intelligence and Robotics (CAIR), Hong Kong Institute of Science and Innovation, Chinese Academy of Sciences, Hong Kong SAR, China; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 95 Zhongguancun East Road, Haidian District, Beijing, China; Wuhan AI Research, Wuhan, China
- Jun Zhang: Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, 95 Yong'an Road, Xicheng District, Beijing, China; State Key Lab of Digestive Health, Beijing, China
4. Kostiuchik G, Sharan L, Mayer B, Wolf I, Preim B, Engelhardt S. Surgical phase and instrument recognition: how to identify appropriate dataset splits. Int J Comput Assist Radiol Surg 2024. PMID: 38285380. DOI: 10.1007/s11548-024-03063-9.
Abstract
PURPOSE Machine learning approaches can only be reliably evaluated if the training, validation, and test data splits are representative and not affected by the absence of classes. Surgical workflow and instrument recognition are two tasks complicated in this regard by heavy data imbalances, which result from the differing lengths of phases and their potentially erratic occurrence. Furthermore, sub-properties like instrument (co-)occurrence are usually not explicitly considered when defining a split. METHODS We present a publicly available data visualization tool that enables interactive exploration of dataset partitions for surgical phase and instrument recognition. The application focuses on visualizing the occurrence of phases, phase transitions, instruments, and instrument combinations across sets. In particular, it facilitates the assessment of dataset splits and the identification of sub-optimal ones. RESULTS We analyzed the datasets Cholec80, CATARACTS, CaDIS, M2CAI-workflow, and M2CAI-tool using the proposed application and were able to uncover phase transitions, individual instruments, and combinations of surgical instruments that were not represented in one of the sets. Addressing these issues, we identified possible improvements to the splits using our tool. A user study with ten participants demonstrated that they were able to successfully solve a selection of data exploration tasks. CONCLUSION With highly unbalanced class distributions, special care should be taken when selecting a dataset split, because it can greatly influence the assessment of machine learning approaches. Our interactive tool allows better splits to be determined, improving current practices in the field. The live application is available at https://cardio-ai.github.io/endovis-ml/.
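The kind of gap this tool surfaces visually (a phase transition or instrument combination that occurs in the data but never in one split) can also be checked programmatically. A minimal sketch, assuming each split is represented as a set of observed items (the split names and transition labels below are hypothetical):

```python
def missing_items(splits):
    """For each dataset split, list the items (e.g. phase transitions or
    instrument combinations) that occur somewhere in the data but not in that split."""
    universe = set().union(*splits.values())
    return {name: sorted(universe - items) for name, items in splits.items()}

# Hypothetical phase-transition sets per split
splits = {
    "train": {("prep", "dissection"), ("dissection", "closure")},
    "val":   {("prep", "dissection")},
    "test":  {("prep", "dissection"), ("dissection", "closure")},
}
gaps = missing_items(splits)
# gaps["val"] == [("dissection", "closure")]: this transition is never validated on
```

The same function applies unchanged to instrument sets or instrument co-occurrence pairs, since it only relies on set membership.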
Affiliation(s)
- Georgii Kostiuchik: Department of Cardiac Surgery, Heidelberg University Hospital, Heidelberg, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany
- Lalith Sharan: Department of Cardiac Surgery, Heidelberg University Hospital, Heidelberg, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany
- Benedikt Mayer: Department of Simulation and Graphics, University of Magdeburg, Magdeburg, Germany
- Ivo Wolf: Department of Computer Science, Mannheim University of Applied Sciences, Mannheim, Germany
- Bernhard Preim: Department of Simulation and Graphics, University of Magdeburg, Magdeburg, Germany
- Sandy Engelhardt: Department of Cardiac Surgery, Heidelberg University Hospital, Heidelberg, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany
5. Demir KC, Schieber H, Weise T, Roth D, May M, Maier A, Yang SH. Deep Learning in Surgical Workflow Analysis: A Review of Phase and Step Recognition. IEEE J Biomed Health Inform 2023; 27:5405-5417. PMID: 37665700. DOI: 10.1109/jbhi.2023.3311628.
Abstract
OBJECTIVE In the last two decades, there has been growing interest in exploring surgical procedures with statistical models that analyze operations at different semantic levels. This information is necessary for developing context-aware intelligent systems that can assist physicians during operations, evaluate procedures afterward, or help the management team utilize the operating room effectively. The objective is to extract reliable patterns from surgical data for robust estimation of the surgical activities performed during operations. The purpose of this article is to review state-of-the-art deep learning methods published after 2018 for analyzing surgical workflows, with a focus on phase and step recognition. METHODS Three databases, IEEE Xplore, Scopus, and PubMed, were searched, and additional studies were added through a manual search. After the database search, 343 studies were screened and a total of 44 were selected for this review. CONCLUSION The use of temporal information is essential for identifying the next surgical action. Contemporary methods mainly use RNNs, hierarchical CNNs, and Transformers to preserve long-range temporal relations. The lack of large, publicly available datasets for various procedures is a major challenge for the development of new and robust models. While supervised learning strategies are used to show proof of concept, self-supervised, semi-supervised, and active learning methods are used to mitigate the dependency on annotated data. SIGNIFICANCE The present study provides a comprehensive review of recent methods in surgical workflow analysis, summarizes commonly used architectures and datasets, and discusses open challenges.
6. Zhang J, Zhou S, Wang Y, Shi S, Wan C, Zhao H, Cai X, Ding H. Laparoscopic Image-Based Critical Action Recognition and Anticipation With Explainable Features. IEEE J Biomed Health Inform 2023; 27:5393-5404. PMID: 37603480. DOI: 10.1109/jbhi.2023.3306818.
Abstract
Surgical workflow analysis integrates perception, comprehension, and prediction of the surgical workflow, helping real-time surgical support systems provide proper guidance and assistance to surgeons. This article promotes the idea of critical actions: the essential surgical actions that advance the operation toward completion. Fine-grained workflow analysis involves recognizing the current critical action and previewing the movement tendency of instruments in the early stage of a critical action. To this end, we propose a framework that incorporates operational experience to improve the robustness and interpretability of action recognition in in-vivo situations. High-dimensional images are mapped into a low-dimensional, experience-based, explainable feature space, and critical action recognition is achieved through a hierarchical classification structure. To forecast an instrument's motion tendency, we model motion primitives in a polar coordinate system (PCS) to represent patterns of complex trajectories. Given the variance across laparoscopies, an adaptive pattern recognition (APR) method, which adapts to uncertain trajectories by modifying model parameters, is designed to improve prediction accuracy. Validations on an in-vivo dataset show that our framework fulfills these surgical awareness tasks with exceptional accuracy and real-time performance.
7. Tao R, Zou X, Zheng G. LAST: LAtent Space-Constrained Transformers for Automatic Surgical Phase Recognition and Tool Presence Detection. IEEE Trans Med Imaging 2023; 42:3256-3268. PMID: 37227905. DOI: 10.1109/tmi.2023.3279838.
Abstract
When developing context-aware systems, automatic surgical phase recognition and tool presence detection are two essential tasks. Previous attempts to develop methods for both tasks exist, but the majority of existing methods use a frame-level loss function (e.g., cross-entropy) that does not fully leverage the underlying semantic structure of a surgery, leading to sub-optimal results. In this paper, we propose multi-task learning-based LAtent Space-constrained Transformers, referred to as LAST, for automatic surgical phase recognition and tool presence detection. Our design features a two-branch transformer architecture with a novel and generic way of leveraging video-level semantic information during network training: a non-linear, compact representation of the underlying semantic structure of surgical videos is learned through a transformer variational autoencoder (VAE), and models are encouraged to follow the learned statistical distributions. In other words, LAST is structure-aware and favors predictions that lie on the extracted low-dimensional data manifold. Validated on two public cholecystectomy datasets, Cholec80 and M2cai16, our method achieves better results than other state-of-the-art methods. Specifically, on the Cholec80 dataset, it achieves an average accuracy of 93.12 ± 4.71%, an average precision of 89.25 ± 5.49%, an average recall of 90.10 ± 5.45%, and an average Jaccard of 81.11 ± 7.62% for phase recognition, and an average mAP of 95.15 ± 3.87% for tool presence detection. Similarly superior performance is observed when LAST is applied to the M2cai16 dataset.
8. Cao J, Yip HC, Chen Y, Scheppach M, Luo X, Yang H, Cheng MK, Long Y, Jin Y, Chiu PWY, Yam Y, Meng HML, Dou Q. Intelligent surgical workflow recognition for endoscopic submucosal dissection with real-time animal study. Nat Commun 2023; 14:6676. PMID: 37865629. PMCID: PMC10590425. DOI: 10.1038/s41467-023-42451-8.
Abstract
Recent advances in artificial intelligence have achieved human-level performance on several tasks; however, AI-enabled cognitive assistance for therapeutic procedures has not been fully explored or pre-clinically validated. Here we propose AI-Endo, an intelligent surgical workflow recognition suite for endoscopic submucosal dissection (ESD). AI-Endo is trained on high-quality ESD cases from an expert endoscopist, spanning a decade and comprising 201,026 labeled frames. The learned model demonstrates outstanding performance on validation data, including cases from relatively junior endoscopists of various skill levels, procedures conducted with different endoscopy systems and therapeutic techniques, and cohorts from international multi-centers. Furthermore, we integrate AI-Endo with the Olympus endoscopic system and validate the AI-enabled cognitive assistance system in animal studies during live ESD training sessions. Dedicated analysis of the surgical phase recognition results is summarized in an automatically generated report for skill assessment.
Affiliation(s)
- Jianfeng Cao: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Hon-Chi Yip: Department of Surgery, The Chinese University of Hong Kong, Hong Kong, China
- Yueyao Chen: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Markus Scheppach: Internal Medicine III-Gastroenterology, University Hospital of Augsburg, Augsburg, Germany
- Xiaobei Luo: Guangdong Provincial Key Laboratory of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Hongzheng Yang: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Ming Kit Cheng: Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Yonghao Long: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Yueming Jin: Department of Biomedical Engineering, National University of Singapore, Singapore
- Philip Wai-Yan Chiu: Multi-scale Medical Robotics Center and The Chinese University of Hong Kong, Hong Kong, China
- Yeung Yam: Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong, China; Multi-scale Medical Robotics Center and The Chinese University of Hong Kong, Hong Kong, China; Centre for Perceptual and Interactive Intelligence and The Chinese University of Hong Kong, Hong Kong, China
- Helen Mei-Ling Meng: Centre for Perceptual and Interactive Intelligence and The Chinese University of Hong Kong, Hong Kong, China
- Qi Dou: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
9. Ramesh S, Dall'Alba D, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Fiorini P, Padoy N. Weakly Supervised Temporal Convolutional Networks for Fine-Grained Surgical Activity Recognition. IEEE Trans Med Imaging 2023; 42:2592-2602. PMID: 37030859. DOI: 10.1109/tmi.2023.3262847.
Abstract
Automatic recognition of fine-grained surgical activities, called steps, is a challenging but crucial task for intelligent intra-operative computer assistance. The development of current vision-based activity recognition methods relies heavily on a high volume of manually annotated data. This data is difficult and time-consuming to generate and requires domain-specific knowledge. In this work, we propose to use coarser and easier-to-annotate activity labels, namely phases, as weak supervision to learn step recognition with fewer step annotated videos. We introduce a step-phase dependency loss to exploit the weak supervision signal. We then employ a Single-Stage Temporal Convolutional Network (SS-TCN) with a ResNet-50 backbone, trained in an end-to-end fashion from weakly annotated videos, for temporal activity segmentation and recognition. We extensively evaluate and show the effectiveness of the proposed method on a large video dataset consisting of 40 laparoscopic gastric bypass procedures and the public benchmark CATARACTS containing 50 cataract surgeries.
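The step-phase dependency loss itself is not spelled out in this abstract, but the underlying idea, that a predicted step should be consistent with the phase it belongs to, can be illustrated by masking step scores with a phase-to-step map. This is a hedged sketch of that consistency idea only, not the paper's actual loss; the phase names, step names, and mapping below are hypothetical:

```python
def restrict_steps_to_phase(step_scores, phase, phase_to_steps):
    """Zero out steps incompatible with the given phase and renormalize.
    Illustrates phase-step consistency, not the paper's actual training loss."""
    allowed = phase_to_steps[phase]
    masked = {s: (v if s in allowed else 0.0) for s, v in step_scores.items()}
    z = sum(masked.values())
    if z == 0.0:
        return masked  # no allowed step received any score; leave unchanged
    return {s: v / z for s, v in masked.items()}

# Hypothetical phase-to-step mapping and raw step scores
phase_to_steps = {"dissection": {"expose", "clip"}, "closure": {"suture"}}
scores = {"expose": 0.3, "clip": 0.3, "suture": 0.4}
probs = restrict_steps_to_phase(scores, "dissection", phase_to_steps)
# probs == {"expose": 0.5, "clip": 0.5, "suture": 0.0}
```

In a weakly supervised setting, only the phase label would be annotated; a constraint like this lets the coarse label shape the fine-grained step predictions.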
10. Ramesh S, Dall'Alba D, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Fiorini P, Padoy N. TRandAugment: temporal random augmentation strategy for surgical activity recognition from videos. Int J Comput Assist Radiol Surg 2023; 18:1665-1672. PMID: 36944845. PMCID: PMC10491694. DOI: 10.1007/s11548-023-02864-8.
Abstract
PURPOSE Automatic recognition of surgical activities from intraoperative surgical videos is crucial for developing intelligent support systems for computer-assisted interventions. Current state-of-the-art recognition methods are based on deep learning, where data augmentation has shown the potential to improve generalization. This has spurred work on automated and simplified augmentation strategies for image classification and object detection on datasets of still images. Extending such augmentation methods to videos is not straightforward, as the temporal dimension must be considered. Furthermore, surgical videos pose additional challenges, being composed of multiple, interconnected, long-duration activities. METHODS This work proposes a new simplified augmentation method, called TRandAugment, specifically designed for long surgical videos. It treats each video as an assembly of temporal segments and applies consistent but random transformations to each segment. The proposed augmentation method is used to train an end-to-end spatiotemporal model consisting of a CNN (ResNet50) followed by a TCN. RESULTS The effectiveness of the proposed method is demonstrated on two surgical video datasets, Bypass40 and CATARACTS, and two tasks, surgical phase and step recognition. TRandAugment adds a performance boost of 1-6% over previous state-of-the-art methods that use manually designed augmentations. CONCLUSION This work presents a simplified and automated augmentation method for long surgical videos. The proposed method has been validated on different datasets and tasks, indicating the importance of devising temporal augmentation methods for long surgical videos.
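The core mechanism described, splitting a long video into temporal segments and applying one randomly drawn transformation consistently within each segment, can be sketched as follows. This is a minimal illustration of the idea, not the paper's implementation: the "frames" are integers and the "transforms" are integer operations standing in for image operations:

```python
import random

def trandaugment_like(frames, transforms, n_segments=3, seed=None):
    """Apply one randomly chosen transform per temporal segment,
    consistently to every frame within that segment."""
    rng = random.Random(seed)
    n = len(frames)
    out = []
    for i in range(n_segments):
        lo = i * n // n_segments
        hi = (i + 1) * n // n_segments
        op = rng.choice(transforms)        # same op for the whole segment
        out.extend(op(f) for f in frames[lo:hi])
    return out

# Toy "frames" and placeholder "transforms"
frames = list(range(6))
transforms = [lambda f: f + 10, lambda f: -f]
aug = trandaugment_like(frames, transforms, n_segments=3, seed=0)
```

Consistency within a segment matters because frame-wise independent augmentation would destroy the temporal coherence that the downstream TCN relies on.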
Affiliation(s)
- Sanat Ramesh: Altair Robotics Lab, University of Verona, 37134, Verona, Italy; ICube, University of Strasbourg, CNRS, 67000, Strasbourg, France
- Diego Dall'Alba: Altair Robotics Lab, University of Verona, 37134, Verona, Italy
- Cristians Gonzalez: University Hospital of Strasbourg, 67000, Strasbourg, France; Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France
- Tong Yu: ICube, University of Strasbourg, CNRS, 67000, Strasbourg, France
- Pietro Mascagni: Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France; Fondazione Policlinico Universitario Agostino Gemelli IRCCS, 00168, Rome, Italy
- Didier Mutter: University Hospital of Strasbourg, 67000, Strasbourg, France; IRCAD, 67000, Strasbourg, France; Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France
- Paolo Fiorini: Altair Robotics Lab, University of Verona, 37134, Verona, Italy
- Nicolas Padoy: ICube, University of Strasbourg, CNRS, 67000, Strasbourg, France; Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France
11. Yu T, Mascagni P, Verde J, Marescaux J, Mutter D, Padoy N. Live laparoscopic video retrieval with compressed uncertainty. Med Image Anal 2023; 88:102866. PMID: 37356320. DOI: 10.1016/j.media.2023.102866.
Abstract
Searching through large volumes of medical data to retrieve relevant information is a challenging yet crucial task for clinical care. However, the most common approach to retrieval, keyword-based text search, is severely limited when dealing with complex media formats. Content-based retrieval offers a way to overcome this limitation by using rich media as the query itself. Surgical video-to-video retrieval in particular is a new and largely unexplored research problem with high clinical value, especially in the real-time case: with real-time video hashing, search can be performed directly inside the operating room. Hashing converts large data entries into compact binary arrays, or hashes, enabling large-scale search operations at a very fast rate. However, due to fluctuations over the course of a video, not all bits in a given hash are equally reliable. In this work, we propose a method capable of mitigating this uncertainty while maintaining a light computational footprint. We present superior retrieval results (a 3-4% gain in top-10 mean average precision) on a multi-task evaluation protocol for surgery, covering cholecystectomy phases, bypass phases, and, from an entirely new dataset introduced here, surgical events across six different surgery types. Success on this multi-task benchmark shows the generalizability of our approach for surgical video retrieval.
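Hash-based retrieval of the kind described, including down-weighting bits known to be unreliable, can be sketched in a few lines. This is an illustrative sketch only, not the paper's method: the 8-bit hashes, video IDs, and weights below are toy values:

```python
def weighted_hamming(a, b, weights):
    """Hamming distance where each differing bit i contributes weights[i];
    unreliable bits get small weights so they influence the ranking less."""
    diff = a ^ b
    return sum(w for i, w in enumerate(weights) if (diff >> i) & 1)

def retrieve(query, database, weights, top_k=2):
    """Rank stored video hashes by weighted Hamming distance to the query hash."""
    ranked = sorted(database, key=lambda vid: weighted_hamming(query, database[vid], weights))
    return ranked[:top_k]

weights = [1.0] * 6 + [0.1, 0.1]   # treat the two highest bits as unreliable
database = {"vidA": 0b00001111, "vidB": 0b11001111, "vidC": 0b00110000}
top = retrieve(0b00001111, database, weights)  # ["vidA", "vidB"]
```

Plain Hamming distance over compact binary codes is what makes real-time, in-room search feasible; the per-bit weights show one simple way uncertainty could be folded into the ranking.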
Affiliation(s)
- Tong Yu: ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, France
- Pietro Mascagni: ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, France; Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
- Didier Mutter: IHU Strasbourg, France; University Hospital of Strasbourg, France
- Nicolas Padoy: ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, France
12. Lavanchy JL, Vardazaryan A, Mascagni P, Mutter D, Padoy N. Preserving privacy in surgical video analysis using a deep learning classifier to identify out-of-body scenes in endoscopic videos. Sci Rep 2023; 13:9235. PMID: 37286660. DOI: 10.1038/s41598-023-36453-1.
Abstract
Surgical video analysis facilitates education and research. However, video recordings of endoscopic surgeries can contain privacy-sensitive information, especially if the endoscopic camera is moved out of the patient's body and out-of-body scenes are recorded. Identification of out-of-body scenes in endoscopic videos is therefore of major importance for preserving the privacy of patients and operating room staff. This study developed and validated a deep learning model for identifying out-of-body images in endoscopic videos. The model was trained and evaluated on an internal dataset covering 12 different types of laparoscopic and robotic surgeries and externally validated on two independent multicentric test datasets of laparoscopic gastric bypass and cholecystectomy surgeries. Model performance was evaluated against human ground-truth annotations using the area under the receiver operating characteristic curve (ROC AUC). The internal dataset, consisting of 356,267 images from 48 videos, and the two multicentric test datasets, consisting of 54,385 and 58,349 images from 10 and 20 videos, respectively, were annotated. The model identified out-of-body images with 99.97% ROC AUC on the internal test dataset. Mean ± standard deviation ROC AUC was 99.94 ± 0.07% on the multicentric gastric bypass dataset and 99.71 ± 0.40% on the multicentric cholecystectomy dataset. The model reliably identifies out-of-body images in endoscopic videos and is publicly shared, facilitating privacy preservation in surgical video analysis.
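ROC AUC as used in this evaluation equals the probability that a randomly chosen out-of-body frame scores higher than a randomly chosen in-body frame, which the Mann-Whitney rank formulation computes directly. A minimal sketch with toy per-frame scores, not the study's evaluation code:

```python
def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney formulation: the probability that a
    positive example outranks a negative one, counting ties as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy per-frame out-of-body probabilities (label 1 = out-of-body ground truth)
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.1]
auc = roc_auc(labels, scores)  # 1.0: every positive outranks every negative
```

The pairwise loop is O(n²) and only suitable for illustration; production code would use a sorted-rank implementation or a library routine.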
Collapse
Affiliation(s)
- Joël L Lavanchy
- IHU Strasbourg, 1 Place de l'Hôpital, 67091, Strasbourg Cedex, France.
- Department of Visceral Surgery and Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland.
- Division of Surgery, Clarunis-University Center for Gastrointestinal and Liver Diseases, St Clara and University Hospital of Basel, Basel, Switzerland.
| | - Armine Vardazaryan
- IHU Strasbourg, 1 Place de l'Hôpital, 67091, Strasbourg Cedex, France
- ICube, University of Strasbourg, CNRS, Strasbourg, France
| | - Pietro Mascagni
- IHU Strasbourg, 1 Place de l'Hôpital, 67091, Strasbourg Cedex, France
- Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
| | - Didier Mutter
- IHU Strasbourg, 1 Place de l'Hôpital, 67091, Strasbourg Cedex, France
- University Hospital of Strasbourg, Strasbourg, France
| | - Nicolas Padoy
- IHU Strasbourg, 1 Place de l'Hôpital, 67091, Strasbourg Cedex, France
- ICube, University of Strasbourg, CNRS, Strasbourg, France
| |
Collapse
|
13
|
Nyangoh Timoh K, Huaulme A, Cleary K, Zaheer MA, Lavoué V, Donoho D, Jannin P. A systematic review of annotation for surgical process model analysis in minimally invasive surgery based on video. Surg Endosc 2023:10.1007/s00464-023-10041-w. [PMID: 37157035 DOI: 10.1007/s00464-023-10041-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2022] [Accepted: 03/25/2023] [Indexed: 05/10/2023]
Abstract
BACKGROUND Annotated data are foundational to applications of supervised machine learning. However, there seems to be a lack of common language used in the field of surgical data science. The aim of this study is to review the process of annotation and semantics used in the creation of SPM for minimally invasive surgery videos. METHODS For this systematic review, we reviewed articles indexed in the MEDLINE database from January 2000 until March 2022. We selected articles using surgical video annotations to describe a surgical process model in the field of minimally invasive surgery. We excluded studies focusing on instrument detection or recognition of anatomical areas only. The risk of bias was evaluated with the Newcastle Ottawa Quality assessment tool. Data from the studies were visually presented in table using the SPIDER tool. RESULTS Of the 2806 articles identified, 34 were selected for review. Twenty-two were in the field of digestive surgery, six in ophthalmologic surgery only, one in neurosurgery, three in gynecologic surgery, and two in mixed fields. Thirty-one studies (88.2%) were dedicated to phase, step, or action recognition and mainly relied on a very simple formalization (29, 85.2%). Clinical information in the datasets was lacking for studies using available public datasets. The process of annotation for surgical process model was lacking and poorly described, and description of the surgical procedures was highly variable between studies. CONCLUSION Surgical video annotation lacks a rigorous and reproducible framework. This leads to difficulties in sharing videos between institutions and hospitals because of the different languages used. There is a need to develop and use common ontology to improve libraries of annotated surgical videos.
Collapse
Affiliation(s)
- Krystel Nyangoh Timoh
- Department of Gynecology and Obstetrics and Human Reproduction, CHU Rennes, Rennes, France.
- INSERM, LTSI - UMR 1099, University Rennes 1, Rennes, France.
- Laboratoire d'Anatomie et d'Organogenèse, Faculté de Médecine, Centre Hospitalier Universitaire de Rennes, 2 Avenue du Professeur Léon Bernard, 35043, Rennes Cedex, France.
- Department of Obstetrics and Gynecology, Rennes Hospital, Rennes, France.
| | - Arnaud Huaulme
- INSERM, LTSI - UMR 1099, University Rennes 1, Rennes, France
| | - Kevin Cleary
- Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, Washington, DC, 20010, USA
| | - Myra A Zaheer
- George Washington University School of Medicine and Health Sciences, Washington, DC, USA
| | - Vincent Lavoué
- Department of Gynecology and Obstetrics and Human Reproduction, CHU Rennes, Rennes, France
| | - Dan Donoho
- Division of Neurosurgery, Center for Neuroscience, Children's National Hospital, Washington, DC, 20010, USA
| | - Pierre Jannin
- INSERM, LTSI - UMR 1099, University Rennes 1, Rennes, France
| |
Collapse
|
14
|
Sharma S, Nwoye CI, Mutter D, Padoy N. Rendezvous in time: an attention-based temporal fusion approach for surgical triplet recognition. Int J Comput Assist Radiol Surg 2023:10.1007/s11548-023-02914-1. [PMID: 37097518 DOI: 10.1007/s11548-023-02914-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2023] [Accepted: 04/07/2023] [Indexed: 04/26/2023]
Abstract
PURPOSE One of the recent advances in surgical AI is the recognition of surgical activities as triplets of [Formula: see text]instrument, verb, target[Formula: see text]. Albeit providing detailed information for computer-assisted intervention, current triplet recognition approaches rely only on single-frame features. Exploiting the temporal cues from earlier frames would improve the recognition of surgical action triplets from videos. METHODS In this paper, we propose Rendezvous in Time (RiT)-a deep learning model that extends the state-of-the-art model, Rendezvous, with temporal modeling. Focusing more on the verbs, our RiT explores the connectedness of current and past frames to learn temporal attention-based features for enhanced triplet recognition. RESULTS We validate our proposal on the challenging surgical triplet dataset, CholecT45, demonstrating an improved recognition of the verb and triplet along with other interactions involving the verb such as [Formula: see text]instrument, verb[Formula: see text]. Qualitative results show that the RiT produces smoother predictions for most triplet instances than the state-of-the-arts. CONCLUSION We present a novel attention-based approach that leverages the temporal fusion of video frames to model the evolution of surgical actions and exploit their benefits for surgical triplet recognition.
Collapse
Affiliation(s)
- Saurav Sharma
- ICube, University of Strasbourg, CNRS, Strasbourg, France.
| | | | - Didier Mutter
- IHU Strasbourg, Strasbourg, France
- University Hospital of Strasbourg, Strasbourg, France
| | - Nicolas Padoy
- ICube, University of Strasbourg, CNRS, Strasbourg, France
- IHU Strasbourg, Strasbourg, France
| |
Collapse
|
15
|
Zhang B, Goel B, Sarhan MH, Goel VK, Abukhalil R, Kalesan B, Stottler N, Petculescu S. Surgical workflow recognition with temporal convolution and transformer for action segmentation. Int J Comput Assist Radiol Surg 2023; 18:785-794. [PMID: 36542253 DOI: 10.1007/s11548-022-02811-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2022] [Accepted: 12/09/2022] [Indexed: 12/24/2022]
Abstract
PURPOSE Automatic surgical workflow recognition enabled by computer vision algorithms plays a key role in enhancing the learning experience of surgeons. It also supports building context-aware systems that allow better surgical planning and decision making which may in turn improve outcomes. Utilizing temporal information is crucial for recognizing context; hence, various recent approaches use recurrent neural networks or transformers to recognize actions. METHODS We design and implement a two-stage method for surgical workflow recognition. We utilize R(2+1)D for video clip modeling in the first stage. We propose Action Segmentation Temporal Convolutional Transformer (ASTCFormer) network for full video modeling in the second stage. ASTCFormer utilizes action segmentation transformers (ASFormers) and temporal convolutional networks (TCNs) to build a temporally aware surgical workflow recognition system. RESULTS We compare the proposed ASTCFormer with recurrent neural networks, multi-stage TCN, and ASFormer approaches. The comparison is done on a dataset comprised of 207 robotic and laparoscopic cholecystectomy surgical videos annotated for 7 surgical phases. The proposed method outperforms the compared methods achieving a [Formula: see text] relative improvement in the average segmental F1-score over the state-of-the-art ASFormer method. Moreover, our proposed method achieves state-of-the-art results on the publicly available Cholec80 dataset. CONCLUSION The improvement in the results when using the proposed method suggests that temporal context could be better captured when adding information from TCN to the ASFormer paradigm. This addition leads to better surgical workflow recognition.
Collapse
Affiliation(s)
- Bokai Zhang
- Johnson & Johnson MedTech, 1100 Olive Way, Suite 1100, Seattle, 98101, WA, USA.
| | - Bharti Goel
- Johnson & Johnson MedTech, 5490 Great America Pkwy, Santa Clara, CA, 95054, USA
| | - Mohammad Hasan Sarhan
- Johnson & Johnson MedTech, Robert-Koch-Straße 1, 22851, Norderstedt, Schleswig-Holstein, Germany
| | - Varun Kejriwal Goel
- Johnson & Johnson MedTech, 5490 Great America Pkwy, Santa Clara, CA, 95054, USA
| | - Rami Abukhalil
- Johnson & Johnson MedTech, 5490 Great America Pkwy, Santa Clara, CA, 95054, USA
| | - Bindu Kalesan
- Johnson & Johnson MedTech, 5490 Great America Pkwy, Santa Clara, CA, 95054, USA
| | - Natalie Stottler
- Johnson & Johnson MedTech, 1100 Olive Way, Suite 1100, Seattle, 98101, WA, USA
| | - Svetlana Petculescu
- Johnson & Johnson MedTech, 1100 Olive Way, Suite 1100, Seattle, 98101, WA, USA
| |
Collapse
|
16
|
Lavanchy JL, Gonzalez C, Kassem H, Nett PC, Mutter D, Padoy N. Proposal and multicentric validation of a laparoscopic Roux-en-Y gastric bypass surgery ontology. Surg Endosc 2023; 37:2070-2077. [PMID: 36289088 PMCID: PMC10017621 DOI: 10.1007/s00464-022-09745-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2022] [Accepted: 10/14/2022] [Indexed: 11/30/2022]
Abstract
BACKGROUND Phase and step annotation in surgical videos is a prerequisite for surgical scene understanding and for downstream tasks like intraoperative feedback or assistance. However, most ontologies are applied on small monocentric datasets and lack external validation. To overcome these limitations an ontology for phases and steps of laparoscopic Roux-en-Y gastric bypass (LRYGB) is proposed and validated on a multicentric dataset in terms of inter- and intra-rater reliability (inter-/intra-RR). METHODS The proposed LRYGB ontology consists of 12 phase and 46 step definitions that are hierarchically structured. Two board certified surgeons (raters) with > 10 years of clinical experience applied the proposed ontology on two datasets: (1) StraBypass40 consists of 40 LRYGB videos from Nouvel Hôpital Civil, Strasbourg, France and (2) BernBypass70 consists of 70 LRYGB videos from Inselspital, Bern University Hospital, Bern, Switzerland. To assess inter-RR the two raters' annotations of ten randomly chosen videos from StraBypass40 and BernBypass70 each, were compared. To assess intra-RR ten randomly chosen videos were annotated twice by the same rater and annotations were compared. Inter-RR was calculated using Cohen's kappa. Additionally, for inter- and intra-RR accuracy, precision, recall, F1-score, and application dependent metrics were applied. RESULTS The mean ± SD video duration was 108 ± 33 min and 75 ± 21 min in StraBypass40 and BernBypass70, respectively. The proposed ontology shows an inter-RR of 96.8 ± 2.7% for phases and 85.4 ± 6.0% for steps on StraBypass40 and 94.9 ± 5.8% for phases and 76.1 ± 13.9% for steps on BernBypass70. The overall Cohen's kappa of inter-RR was 95.9 ± 4.3% for phases and 80.8 ± 10.0% for steps. Intra-RR showed an accuracy of 98.4 ± 1.1% for phases and 88.1 ± 8.1% for steps. CONCLUSION The proposed ontology shows an excellent inter- and intra-RR and should therefore be implemented routinely in phase and step annotation of LRYGB.
Collapse
Affiliation(s)
- Joël L Lavanchy
- IHU Strasbourg, 1 Place de l'Hôpital, 67000, Strasbourg, France.
- Department of Visceral Surgery and Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland.
| | - Cristians Gonzalez
- IHU Strasbourg, 1 Place de l'Hôpital, 67000, Strasbourg, France
- University Hospital of Strasbourg, Strasbourg, France
| | - Hasan Kassem
- ICube, CNRS, University of Strasbourg, Strasbourg, France
| | - Philipp C Nett
- Department of Visceral Surgery and Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
| | - Didier Mutter
- IHU Strasbourg, 1 Place de l'Hôpital, 67000, Strasbourg, France
- University Hospital of Strasbourg, Strasbourg, France
| | - Nicolas Padoy
- IHU Strasbourg, 1 Place de l'Hôpital, 67000, Strasbourg, France
- ICube, CNRS, University of Strasbourg, Strasbourg, France
| |
Collapse
|
17
|
Zhao Y, Wang X, Che T, Bao G, Li S. Multi-task deep learning for medical image computing and analysis: A review. Comput Biol Med 2023; 153:106496. [PMID: 36634599 DOI: 10.1016/j.compbiomed.2022.106496] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2022] [Revised: 12/06/2022] [Accepted: 12/27/2022] [Indexed: 12/29/2022]
Abstract
The renaissance of deep learning has provided promising solutions to various tasks. While conventional deep learning models are constructed for a single specific task, multi-task deep learning (MTDL) that is capable to simultaneously accomplish at least two tasks has attracted research attention. MTDL is a joint learning paradigm that harnesses the inherent correlation of multiple related tasks to achieve reciprocal benefits in improving performance, enhancing generalizability, and reducing the overall computational cost. This review focuses on the advanced applications of MTDL for medical image computing and analysis. We first summarize four popular MTDL network architectures (i.e., cascaded, parallel, interacted, and hybrid). Then, we review the representative MTDL-based networks for eight application areas, including the brain, eye, chest, cardiac, abdomen, musculoskeletal, pathology, and other human body regions. While MTDL-based medical image processing has been flourishing and demonstrating outstanding performance in many tasks, in the meanwhile, there are performance gaps in some tasks, and accordingly we perceive the open challenges and the perspective trends. For instance, in the 2018 Ischemic Stroke Lesion Segmentation challenge, the reported top dice score of 0.51 and top recall of 0.55 achieved by the cascaded MTDL model indicate further research efforts in high demand to escalate the performance of current models.
Collapse
Affiliation(s)
- Yan Zhao
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
| | - Xiuying Wang
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia.
| | - Tongtong Che
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
| | - Guoqing Bao
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
| | - Shuyu Li
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China.
| |
Collapse
|
18
|
Fer D, Zhang B, Abukhalil R, Goel V, Goel B, Barker J, Kalesan B, Barragan I, Gaddis ML, Kilroy PG. An artificial intelligence model that automatically labels roux-en-Y gastric bypasses, a comparison to trained surgeon annotators. Surg Endosc 2023:10.1007/s00464-023-09870-6. [PMID: 36658282 DOI: 10.1007/s00464-023-09870-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2022] [Accepted: 01/04/2023] [Indexed: 01/21/2023]
Abstract
INTRODUCTION Artificial intelligence (AI) can automate certain tasks to improve data collection. Models have been created to annotate the steps of Roux-en-Y Gastric Bypass (RYGB). However, model performance has not been compared with individual surgeon annotator performance. We developed a model that automatically labels RYGB steps and compares its performance to surgeons. METHODS AND PROCEDURES 545 videos (17 surgeons) of laparoscopic RYGB procedures were collected. An annotation guide (12 steps, 52 tasks) was developed. Steps were annotated by 11 surgeons. Each video was annotated by two surgeons and a third reconciled the differences. A convolutional AI model was trained to identify steps and compared with manual annotation. For modeling, we used 390 videos for training, 95 for validation, and 60 for testing. The performance comparison between AI model versus manual annotation was performed using ANOVA (Analysis of Variance) in a subset of 60 testing videos. We assessed the performance of the model at each step and poor performance was defined (F1-score < 80%). RESULTS The convolutional model identified 12 steps in the RYGB architecture. Model performance varied at each step [F1 > 90% for 7, and > 80% for 2]. The reconciled manual annotation data (F1 > 80% for > 5 steps) performed better than trainee's (F1 > 80% for 2-5 steps for 4 annotators, and < 2 steps for 4 annotators). In testing subset, certain steps had low performance, indicating potential ambiguities in surgical landmarks. Additionally, some videos were easier to annotate than others, suggesting variability. After controlling for variability, the AI algorithm was comparable to the manual (p < 0.0001). CONCLUSION AI can be used to identify surgical landmarks in RYGB comparable to the manual process. AI was more accurate to recognize some landmarks more accurately than surgeons. This technology has the potential to improve surgical training by assessing the learning curves of surgeons at scale.
Collapse
Affiliation(s)
- Danyal Fer
- University of California, San Francisco-East Bay, General Surgery, Oakland, CA, USA.,Johnson & Johnson MedTech, New Brunswick, NJ, USA
| | - Bokai Zhang
- Johnson & Johnson MedTech, New Brunswick, NJ, USA
| | - Rami Abukhalil
- Johnson & Johnson MedTech, New Brunswick, NJ, USA. .,, 5490 Great America Parkway, Santa Clara, CA, 95054, USA.
| | - Varun Goel
- University of California, San Francisco-East Bay, General Surgery, Oakland, CA, USA.,Johnson & Johnson MedTech, New Brunswick, NJ, USA
| | - Bharti Goel
- Johnson & Johnson MedTech, New Brunswick, NJ, USA
| | | | | | | | | | | |
Collapse
|
19
|
Bombieri M, Rospocher M, Ponzetto SP, Fiorini P. Machine understanding surgical actions from intervention procedure textbooks. Comput Biol Med 2023; 152:106415. [PMID: 36527782 DOI: 10.1016/j.compbiomed.2022.106415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2022] [Revised: 11/23/2022] [Accepted: 12/04/2022] [Indexed: 12/12/2022]
Abstract
The automatic extraction of procedural surgical knowledge from surgery manuals, academic papers or other high-quality textual resources, is of the utmost importance to develop knowledge-based clinical decision support systems, to automatically execute some procedure's step or to summarize the procedural information, spread throughout the texts, in a structured form usable as a study resource by medical students. In this work, we propose a first benchmark on extracting detailed surgical actions from available intervention procedure textbooks and papers. We frame the problem as a Semantic Role Labeling task. Exploiting a manually annotated dataset, we apply different Transformer-based information extraction methods. Starting from RoBERTa and BioMedRoBERTa pre-trained language models, we first investigate a zero-shot scenario and compare the obtained results with a full fine-tuning setting. We then introduce a new ad-hoc surgical language model, named SurgicBERTa, pre-trained on a large collection of surgical materials, and we compare it with the previous ones. In the assessment, we explore different dataset splits (one in-domain and two out-of-domain) and we investigate also the effectiveness of the approach in a few-shot learning scenario. Performance is evaluated on three correlated sub-tasks: predicate disambiguation, semantic argument disambiguation and predicate-argument disambiguation. Results show that the fine-tuning of a pre-trained domain-specific language model achieves the highest performance on all splits and on all sub-tasks. All models are publicly released.
Collapse
Affiliation(s)
- Marco Bombieri
- Department of Computer Science, University of Verona, Verona, Italy.
| | - Marco Rospocher
- Department of Foreign Languages and Literatures, University of Verona, Verona, Italy
| | | | - Paolo Fiorini
- Department of Computer Science, University of Verona, Verona, Italy
| |
Collapse
|
20
|
Zhang B, Sturgeon D, Shankar AR, Goel VK, Barker J, Ghanem A, Lee P, Milecky M, Stottler N, Petculescu S. Surgical instrument recognition for instrument usage documentation and surgical video library indexing. COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING: IMAGING & VISUALIZATION 2022. [DOI: 10.1080/21681163.2022.2152371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Affiliation(s)
- Bokai Zhang
- Digital Solutions, Johnson & Johnson MedTech, Seattle, WA, USA
| | - Darrick Sturgeon
- Digital Solutions, Johnson & Johnson MedTech, Santa Clara, CA, USA
| | | | | | - Jocelyn Barker
- Digital Solutions, Johnson & Johnson MedTech, Santa Clara, CA, USA
| | - Amer Ghanem
- Digital Solutions, Johnson & Johnson MedTech, Seattle, WA, USA
| | - Philip Lee
- Digital Solutions, Johnson & Johnson MedTech, Santa Clara, CA, USA
| | - Meghan Milecky
- Digital Solutions, Johnson & Johnson MedTech, Seattle, WA, USA
| | | | | |
Collapse
|
21
|
Quero G, Mascagni P, Kolbinger FR, Fiorillo C, De Sio D, Longo F, Schena CA, Laterza V, Rosa F, Menghi R, Papa V, Tondolo V, Cina C, Distler M, Weitz J, Speidel S, Padoy N, Alfieri S. Artificial Intelligence in Colorectal Cancer Surgery: Present and Future Perspectives. Cancers (Basel) 2022; 14:cancers14153803. [PMID: 35954466 PMCID: PMC9367568 DOI: 10.3390/cancers14153803] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2022] [Revised: 07/29/2022] [Accepted: 08/03/2022] [Indexed: 02/05/2023] Open
Abstract
Artificial intelligence (AI) and computer vision (CV) are beginning to impact medicine. While evidence on the clinical value of AI-based solutions for the screening and staging of colorectal cancer (CRC) is mounting, CV and AI applications to enhance the surgical treatment of CRC are still in their early stage. This manuscript introduces key AI concepts to a surgical audience, illustrates fundamental steps to develop CV for surgical applications, and provides a comprehensive overview on the state-of-the-art of AI applications for the treatment of CRC. Notably, studies show that AI can be trained to automatically recognize surgical phases and actions with high accuracy even in complex colorectal procedures such as transanal total mesorectal excision (TaTME). In addition, AI models were trained to interpret fluorescent signals and recognize correct dissection planes during total mesorectal excision (TME), suggesting CV as a potentially valuable tool for intraoperative decision-making and guidance. Finally, AI could have a role in surgical training, providing automatic surgical skills assessment in the operating room. While promising, these proofs of concept require further development, validation in multi-institutional data, and clinical studies to confirm AI as a valuable tool to enhance CRC treatment.
Collapse
Affiliation(s)
- Giuseppe Quero
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
- Faculty of Medicine, Università Cattolica del Sacro Cuore di Roma, Largo Francesco Vito 1, 00168 Rome, Italy
| | - Pietro Mascagni
- Faculty of Medicine, Università Cattolica del Sacro Cuore di Roma, Largo Francesco Vito 1, 00168 Rome, Italy
- Institute of Image-Guided Surgery, IHU-Strasbourg, 67000 Strasbourg, France
| | - Fiona R. Kolbinger
- Department for Visceral, Thoracic and Vascular Surgery, University Hospital and Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Claudio Fiorillo
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
- Correspondence: ; Tel.: +39-333-8747996
| | - Davide De Sio
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
| | - Fabio Longo
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
| | - Carlo Alberto Schena
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
- Faculty of Medicine, Università Cattolica del Sacro Cuore di Roma, Largo Francesco Vito 1, 00168 Rome, Italy
| | - Vito Laterza
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
- Faculty of Medicine, Università Cattolica del Sacro Cuore di Roma, Largo Francesco Vito 1, 00168 Rome, Italy
| | - Fausto Rosa
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
- Faculty of Medicine, Università Cattolica del Sacro Cuore di Roma, Largo Francesco Vito 1, 00168 Rome, Italy
| | - Roberta Menghi
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
- Faculty of Medicine, Università Cattolica del Sacro Cuore di Roma, Largo Francesco Vito 1, 00168 Rome, Italy
| | - Valerio Papa
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
- Faculty of Medicine, Università Cattolica del Sacro Cuore di Roma, Largo Francesco Vito 1, 00168 Rome, Italy
| | - Vincenzo Tondolo
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
| | - Caterina Cina
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
| | - Marius Distler
- Department for Visceral, Thoracic and Vascular Surgery, University Hospital and Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Juergen Weitz
- Department for Visceral, Thoracic and Vascular Surgery, University Hospital and Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Stefanie Speidel
- National Center for Tumor Diseases (NCT), Partner Site Dresden, 01307 Dresden, Germany
| | - Nicolas Padoy
- Institute of Image-Guided Surgery, IHU-Strasbourg, 67000 Strasbourg, France
- ICube, Centre National de la Recherche Scientifique (CNRS), University of Strasbourg, 67000 Strasbourg, France
| | - Sergio Alfieri
- Digestive Surgery Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Largo Agostino Gemelli 8, 00168 Rome, Italy
- Faculty of Medicine, Università Cattolica del Sacro Cuore di Roma, Largo Francesco Vito 1, 00168 Rome, Italy
| |
Collapse
|
22
|
Surgical Tool Datasets for Machine Learning Research: A Survey. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01640-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
AbstractThis paper is a comprehensive survey of datasets for surgical tool detection and related surgical data science and machine learning techniques and algorithms. The survey offers a high level perspective of current research in this area, analyses the taxonomy of approaches adopted by researchers using surgical tool datasets, and addresses key areas of research, such as the datasets used, evaluation metrics applied and deep learning techniques utilised. Our presentation and taxonomy provides a framework that facilitates greater understanding of current work, and highlights the challenges and opportunities for further innovative and useful research.
Collapse
|
23
|
Hybrid Spatiotemporal Contrastive Representation Learning for Content-Based Surgical Video Retrieval. ELECTRONICS 2022. [DOI: 10.3390/electronics11091353] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
In the medical field, due to their economic and clinical benefits, there is a growing interest in minimally invasive surgeries and microscopic surgeries. These types of surgeries are often recorded during operations, and these recordings have become a key resource for education, patient disease analysis, surgical error analysis, and surgical skill assessment. However, manual searching in this collection of long-term surgical videos is an extremely labor-intensive and long-term task, requiring an effective content-based video analysis system. In this regard, previous methods for surgical video retrieval are based on handcrafted features which do not represent the video effectively. On the other hand, deep learning-based solutions were found to be effective in both surgical image and video analysis, where CNN-, LSTM- and CNN-LSTM-based methods were proposed in most surgical video analysis tasks. In this paper, we propose a hybrid spatiotemporal embedding method to enhance spatiotemporal representations using an adaptive fusion layer on top of the LSTM and temporal causal convolutional modules. To learn surgical video representations, we propose exploring the supervised contrastive learning approach to leverage label information in addition to augmented versions. By validating our approach to a video retrieval task on two datasets, Surgical Actions 160 and Cataract-101, we significantly improve on previous results in terms of mean average precision, 30.012 ± 1.778 vs. 22.54 ± 1.557 for Surgical Actions 160 and 81.134 ± 1.28 vs. 33.18 ± 1.311 for Cataract-101. We also validate the proposed method’s suitability for surgical phase recognition task using the benchmark Cholec80 surgical dataset, where our approach outperforms (with 90.2% accuracy) the state of the art.
Collapse
|
24
|
Das A, Bano S, Vasconcelos F, Khan DZ, Marcus HJ, Stoyanov D. Reducing Prediction volatility in the surgical workflow recognition of endoscopic pituitary surgery. Int J Comput Assist Radiol Surg 2022; 17:1445-1452. [PMID: 35362848 PMCID: PMC9307536 DOI: 10.1007/s11548-022-02599-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2022] [Accepted: 03/08/2022] [Indexed: 11/25/2022]
Abstract
Purpose: Workflow recognition can aid surgeons before an operation when used as a training tool, during an operation by increasing operating room efficiency, and after an operation in the completion of operation notes. Although several methods have been applied to this task, they have been tested on few surgical datasets. Therefore, their generalisability is not well tested, particularly for surgical approaches utilising smaller working spaces, which are susceptible to occlusion and necessitate frequent withdrawal of the endoscope. This leads to rapidly changing predictions, which reduces the clinical confidence of the methods and hence limits their suitability for clinical translation. Methods: Firstly, the optimal neural network is found using established methods, using endoscopic pituitary surgery as an exemplar. Then, prediction volatility is formally defined as a new evaluation metric, serving as a proxy for uncertainty, and two temporal smoothing functions are created. The first (modal, M_n) mode-averages over the previous n predictions, and the second (threshold, T_n) ensures a class is only changed after being continuously predicted for n predictions. Both functions are independently applied to the predictions of the optimal network. Results: The methods are evaluated on a 50-video dataset using fivefold cross-validation, and the optimised evaluation metric is the weighted F1 score. The optimal model is ResNet-50+LSTM, achieving 0.84 in 3-phase classification and 0.74 in 7-step classification. Applying threshold smoothing further improves these results, achieving 0.86 in 3-phase classification and 0.75 in 7-step classification, while also drastically reducing the prediction volatility. Conclusion: The results confirm the established methods generalise to endoscopic pituitary surgery, and show simple temporal smoothing not only reduces prediction volatility but actively improves performance.
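The two smoothing functions follow directly from their definitions: M_n takes the mode of a sliding window of the last n predictions, and T_n switches class only after a new class has been predicted n times in a row. A minimal Python sketch (function names are illustrative, not from the paper):

```python
from collections import Counter, deque

def modal_smooth(preds, n):
    """M_n: replace each prediction with the mode of the last n predictions."""
    window, out = deque(maxlen=n), []
    for p in preds:
        window.append(p)
        out.append(Counter(window).most_common(1)[0][0])
    return out

def threshold_smooth(preds, n):
    """T_n: only switch class after it has been predicted n times in a row."""
    out, current, candidate, run = [], preds[0], preds[0], 0
    for p in preds:
        if p == candidate:
            run += 1
        else:
            candidate, run = p, 1
        if candidate != current and run >= n:
            current = candidate
        out.append(current)
    return out

preds = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1]
print(modal_smooth(preds, 3))      # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
print(threshold_smooth(preds, 2))  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
```

Both functions suppress the isolated class flips that the abstract identifies as the source of prediction volatility.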
Collapse
Affiliation(s)
- Adrito Das
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, United Kingdom.
| | - Sophia Bano
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, United Kingdom
| | - Francisco Vasconcelos
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, United Kingdom
| | - Danyal Z Khan
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, United Kingdom
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, United Kingdom
| | - Hani J Marcus
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, United Kingdom
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, United Kingdom
| | - Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, United Kingdom
| |
Collapse
|
25
|
Nwoye CI, Yu T, Gonzalez C, Seeliger B, Mascagni P, Mutter D, Marescaux J, Padoy N. Rendezvous: attention mechanisms for the recognition of surgical action triplets in endoscopic videos. Med Image Anal 2022; 78:102433. [DOI: 10.1016/j.media.2022.102433] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2021] [Revised: 02/25/2022] [Accepted: 03/21/2022] [Indexed: 10/18/2022]
|
26
|
Video-based fully automatic assessment of open surgery suturing skills. Int J Comput Assist Radiol Surg 2022; 17:437-448. [PMID: 35103921 PMCID: PMC8805431 DOI: 10.1007/s11548-022-02559-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2021] [Accepted: 01/03/2022] [Indexed: 01/09/2023]
Abstract
Purpose The goal of this study was to develop a new, reliable open surgery suturing simulation system for training medical students in situations where resources are limited or in a home setting. Specifically, we developed an algorithm for localizing tools and hands and identifying the interactions between them based on simple webcam video data, calculating motion metrics for the assessment of surgical skill. Methods Twenty-five participants performed multiple suturing tasks using our simulator. The YOLO network was modified into a multi-task network for tool localization and tool–hand interaction detection. This was accomplished by splitting the YOLO detection heads so that they supported both tasks with minimal addition to computational run-time. Based on the output of the system, motion metrics were calculated, including traditional metrics such as time and path length as well as new metrics assessing the technique participants use for holding the tools. Results The dual-task network's performance was similar to that of two separate networks, while its computational load was only slightly greater than that of a single network. In addition, the motion metrics showed significant differences between experts and novices. Conclusion While video capture is an essential part of minimally invasive surgery, it is not an integral component of open surgery. Thus, new algorithms focusing on the unique challenges that open surgery videos present are required. In this study, a dual-task network was developed to solve both a localization task and a hand–tool interaction task. The dual network may easily be expanded to a multi-task network, which may be useful for images with multiple layers and for evaluating the interaction between these different layers. Supplementary Information The online version contains supplementary material available at 10.1007/s11548-022-02559-6.
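Of the motion metrics mentioned above, path length is the classic example: the total distance travelled by a tracked tool tip over the task. A minimal sketch of how it is computed from a sequence of detected positions (the coordinates are made up for illustration):

```python
import math

def path_length(track):
    """Total Euclidean distance travelled along a sequence of (x, y) positions."""
    return sum(math.dist(a, b) for a, b in zip(track, track[1:]))

# Hypothetical tool-tip track in pixel coordinates
track = [(0, 0), (3, 4), (3, 4), (6, 8)]
print(path_length(track))  # 10.0
```

Shorter path lengths for the same task typically indicate more economical, expert-like motion, which is consistent with the expert/novice differences the study reports.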
Collapse
|
27
|
Becker M, Dai J, Chang AL, Feyaerts D, Stelzer IA, Zhang M, Berson E, Saarunya G, De Francesco D, Espinosa C, Kim Y, Marić I, Mataraso S, Payrovnaziri SN, Phongpreecha T, Ravindra NG, Shome S, Tan Y, Thuraiappah M, Xue L, Mayo JA, Quaintance CC, Laborde A, King LS, Dhabhar FS, Gotlib IH, Wong RJ, Angst MS, Shaw GM, Stevenson DK, Gaudilliere B, Aghaeepour N. Revealing the impact of lifestyle stressors on the risk of adverse pregnancy outcomes with multitask machine learning. Front Pediatr 2022; 10:933266. [PMID: 36582513 PMCID: PMC9793100 DOI: 10.3389/fped.2022.933266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/30/2022] [Accepted: 11/14/2022] [Indexed: 12/15/2022] Open
Abstract
UNLABELLED Psychosocial and stress-related factors (PSFs), defined as internal or external stimuli that induce biological changes, are potentially modifiable factors and accessible targets for interventions associated with adverse pregnancy outcomes (APOs). Although individual APOs have been shown to be connected to PSFs, they are biologically interconnected, relatively infrequent, and therefore challenging to model. In this context, multi-task machine learning (MML) is an ideal tool for exploring the interconnectedness of APOs on the one hand and for building on joint combinatorial outcomes to increase predictive power on the other. Additionally, by integrating single-cell immunological profiling of the underlying biological processes, the effects of stress-based therapeutics may become measurable, facilitating the development of precision medicine approaches. OBJECTIVES The primary objectives were to jointly model multiple APOs and their connection to stress early in pregnancy, and to explore the underlying biology to guide the development of accessible and measurable interventions. MATERIALS AND METHODS In a prospective cohort study, PSFs were assessed during the first trimester with an extensive self-administered questionnaire for 200 women. We used MML to simultaneously model and predict APOs (severe preeclampsia, superimposed preeclampsia, gestational diabetes, and early gestational age) as well as several risk factors (BMI, diabetes, hypertension) for these patients based on PSFs. Strongly interrelated stressors were categorized to identify potential therapeutic targets. Furthermore, for a subset of 14 women, we modeled the connection of PSFs, via the maternal immune system, to APOs by building corresponding ML models based on an extensive single-cell immune dataset generated by mass cytometry by time of flight (CyTOF).
RESULTS Jointly modeling APOs in an MML setting significantly increased modeling capabilities and yielded a highly predictive integrated model of APOs, underscoring their interconnectedness. Most APOs were associated with mental health, life stress, and perceived health risks. Biologically, stressors were associated with specific immune characteristics revolving around CD4/CD8 T cells. Immune characteristics predicted from stress were in turn found to be associated with APOs. CONCLUSIONS Elucidating the connections among stress, multiple APOs simultaneously, and immune characteristics has the potential to facilitate the implementation of ML-based, individualized, integrative models of pregnancy in clinical decision making. The modifiable nature of stressors may enable the development of accessible interventions, with success tracked through immune characteristics.
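The multi-task setup described here, several binary outcomes predicted jointly from one shared representation, can be sketched as follows. This is a generic illustration on synthetic stand-in data, not the study's actual model; all shapes, learning-rate values, and the tanh trunk are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))                      # stand-in questionnaire features
Y = (rng.standard_normal((200, 3)) > 0).astype(float)   # 3 stand-in binary outcomes

# Shared linear trunk + one logistic head per outcome, trained jointly:
# the trunk gradient sums the error signals from every task.
W_shared = rng.standard_normal((10, 5)) * 0.1
W_heads = rng.standard_normal((5, 3)) * 0.1
lr = 0.1
for _ in range(200):
    H = np.tanh(X @ W_shared)                 # shared representation
    P = 1 / (1 + np.exp(-(H @ W_heads)))      # per-task probabilities
    G = (P - Y) / len(X)                      # gradient of summed BCE losses
    grad_heads = H.T @ G
    grad_shared = X.T @ ((G @ W_heads.T) * (1 - H**2))
    W_heads -= lr * grad_heads
    W_shared -= lr * grad_shared
```

Because the trunk is shared, signal from frequent outcomes can improve the representation used for rarer ones, which is the motivation the abstract gives for modeling infrequent APOs jointly.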
Collapse
Affiliation(s)
- Martin Becker
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States.,Chair for Intelligent Data Analytics, Institute for Visual and Analytic Computing, Department of Computer Science and Electrical Engineering, University of Rostock, Rostock, Germany
| | - Jennifer Dai
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Alan L Chang
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Dorien Feyaerts
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States
| | - Ina A Stelzer
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States
| | - Miao Zhang
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Eloise Berson
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Pathology, Stanford University, Palo Alto, CA, United States
| | - Geetha Saarunya
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Davide De Francesco
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Camilo Espinosa
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Yeasul Kim
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Ivana Marić
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Samson Mataraso
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Seyedeh Neelufar Payrovnaziri
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Thanaphong Phongpreecha
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States.,Department of Pathology, Stanford University, Palo Alto, CA, United States
| | - Neal G Ravindra
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Sayane Shome
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Yuqi Tan
- Department of Microbiology & Immunology, Stanford University, Palo Alto, CA, United States.,Baxter Laboratory for Stem Cell Biology, Stanford University, Palo Alto, CA, United States
| | - Melan Thuraiappah
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Lei Xue
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Jonathan A Mayo
- Department of Pediatrics, Stanford University, Palo Alto, CA, United States
| | | | - Ana Laborde
- Department of Pediatrics, Stanford University, Palo Alto, CA, United States
| | - Lucy S King
- Department of Psychology, Stanford University, Palo Alto, CA, United States
| | - Firdaus S Dhabhar
- Department of Psychiatry & Behavioral Science, University of Miami, Miami, FL, United States.,Department of Microbiology & Immunology, University of Miami, Miami, FL, United States.,Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, United States.,Miller School of Medicine, University of Miami, Miami, FL, United States
| | - Ian H Gotlib
- Department of Psychology, Stanford University, Palo Alto, CA, United States
| | - Ronald J Wong
- Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| | - Martin S Angst
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States
| | - Gary M Shaw
- Department of Pediatrics, Stanford University, Palo Alto, CA, United States
| | - David K Stevenson
- Department of Pediatrics, Stanford University, Palo Alto, CA, United States
| | - Brice Gaudilliere
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States
| | - Nima Aghaeepour
- Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University, Palo Alto, CA, United States.,Department of Pediatrics, Stanford University, Palo Alto, CA, United States.,Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States
| |
Collapse
|
28
|
An Interaction-Based Bayesian Network Framework for Surgical Workflow Segmentation. Int J Environ Res Public Health 2021; 18:ijerph18126401. [PMID: 34199188 PMCID: PMC8296226 DOI: 10.3390/ijerph18126401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/03/2021] [Revised: 06/03/2021] [Accepted: 06/08/2021] [Indexed: 11/25/2022]
Abstract
Recognizing and segmenting surgical workflow is important for assessing surgical skills as well as hospital effectiveness, and plays a crucial role in maintaining and improving surgical and healthcare systems. Most evidence supporting this remains signal-, video-, and/or image-based. Furthermore, causal evidence of the interaction between surgical staff remains challenging to gather and is largely absent. Here, we collected real-time movement data of the surgical staff during a neurosurgical procedure to explore cooperation networks among different surgical roles, namely surgeon, assistant nurse, scrub nurse, and anesthetist, and to segment surgical workflows to further assess surgical effectiveness. We installed a zone position system (ZPS) in an operating room (OR) to effectively record high-frequency, high-resolution movements of all surgical staff. Measuring individual interactions in a closed, small area is difficult, and surgical workflow classification carries uncertainties associated with the surgical staff in terms of their varied training and operating skills, with patients in terms of their initial states and biological differences, and with surgical procedures in terms of their complexities. We proposed an interaction-based framework to recognize the surgical workflow and integrated a Bayesian network (BN) to address these uncertainty issues. Our results suggest that the proposed BN method performs well, with an accuracy of 70%. Furthermore, it semantically explains the interaction and cooperation among surgical staff.
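The core of a Bayesian-network approach like this is computing a posterior over the latent workflow phase from observed staff-interaction evidence. A toy sketch of that inference step by enumeration (the phase names, prior, and likelihoods are invented for illustration and are not from the paper):

```python
# Tiny two-node Bayesian network: workflow Phase -> observed staff Interaction.
# The posterior over the phase is obtained with Bayes' rule by enumeration.
prior = {"incision": 0.3, "resection": 0.5, "closing": 0.2}
# P(interaction level | phase), levels: "low" / "high"
likelihood = {
    "incision":  {"low": 0.7, "high": 0.3},
    "resection": {"low": 0.2, "high": 0.8},
    "closing":   {"low": 0.6, "high": 0.4},
}

def posterior(observed):
    """P(phase | observed interaction level), normalized over all phases."""
    joint = {ph: prior[ph] * likelihood[ph][observed] for ph in prior}
    z = sum(joint.values())
    return {ph: p / z for ph, p in joint.items()}

print(posterior("high"))  # "resection" becomes the most probable phase
```

A real system would chain such updates over time and over evidence from several staff roles, but the normalization step is the same.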
Collapse
|