1
Villani FP, Fiorentino MC, Federici L, Piazza C, Frontoni E, Paderno A, Moccia S. A Deep-Learning Approach for Vocal Fold Pose Estimation in Videoendoscopy. J Imaging Inform Med 2025. PMID: 39939476; DOI: 10.1007/s10278-025-01431-8.
Abstract
Accurate vocal fold (VF) pose estimation is crucial for diagnosing larynx diseases that can eventually lead to VF paralysis. Videoendoscopic examination is used to assess VF motility, usually by estimating the change in the anterior glottic angle (AGA). This is a subjective and time-consuming procedure requiring extensive expertise. This research proposes a deep learning framework to estimate VF pose from laryngoscopy frames acquired in actual clinical practice. The framework performs heatmap regression relying on three anatomically relevant keypoints as a prior for AGA computation; the AGA is then estimated from the coordinates of the predicted points. The framework is assessed on a newly collected dataset of 471 laryngoscopy frames from 124 patients, 28 of whom have cancer. It was tested in various configurations and compared with other state-of-the-art approaches (direct keypoint regression and glottal segmentation) for both pose estimation and AGA evaluation. The proposed framework obtained the lowest root mean square error (RMSE) on the three keypoints (5.09, 6.56, and 6.40 pixels, respectively) among all models tested for VF pose estimation. For AGA evaluation, heatmap regression also reached the lowest mean average error (MAE) of 5.87°. These results show that keypoint heatmap regression enables VF pose estimation with a small error, overcoming drawbacks of state-of-the-art algorithms, especially in challenging images such as those from pathologic subjects or with noise and occlusion.
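To make the final step concrete, here is a minimal sketch of computing the AGA from the three predicted keypoint coordinates, assuming the angle is measured at the anterior commissure between the two folds (the function name and example coordinates are illustrative, not the paper's code):

```python
import numpy as np

def anterior_glottic_angle(anterior, left, right):
    """Angle (degrees) at the anterior commissure between the two vocal folds.

    `anterior`, `left`, `right` are (x, y) keypoint coordinates, e.g. the
    argmax locations of the three predicted heatmaps.
    """
    v1 = np.asarray(left, dtype=float) - np.asarray(anterior, dtype=float)
    v2 = np.asarray(right, dtype=float) - np.asarray(anterior, dtype=float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Example: keypoints predicted on a 256x256 frame (hypothetical values).
print(anterior_glottic_angle((128, 40), (90, 200), (166, 200)))  # ~26.7 deg
```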
Affiliation(s)
- Francesca Pia Villani
- Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy
- Lorenzo Federici
- Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy
- Cesare Piazza
- Department of Otolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, University of Brescia, Brescia, Italy
- Emanuele Frontoni
- Department of Political Sciences, Communication and International Relations, Università degli Studi di Macerata, Macerata, Italy
- Alberto Paderno
- Department of Otolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, University of Brescia, Brescia, Italy
- Sara Moccia
- Department of Innovative Technologies in Medicine and Dentistry, Università degli Studi "G. d'Annunzio", Chieti-Pescara, Italy
2
Magro M, Covallero N, Gambaro E, Ruffaldi E, De Momi E. A dual-instrument Kalman-based tracker to enhance robustness of microsurgical tools tracking. Int J Comput Assist Radiol Surg 2024; 19:2351-2362. PMID: 39133431; DOI: 10.1007/s11548-024-03246-4.
Abstract
PURPOSE The integration of a surgical robotic instrument tracking module within optical microscopes holds the potential to advance microsurgery practice, as it facilitates automated camera movements, thereby augmenting the surgeon's capability in executing surgical procedures. METHODS In the present work, an innovative detection backbone based on a spatial attention module is implemented to enhance the detection accuracy of small objects within the image. Additionally, we introduce a robust data association technique, capable of re-tracking surgical instruments, based on knowledge of the dual-instrument robotic system, the Intersection over Union (IoU) metric, and a Kalman filter. RESULTS The effectiveness of this pipeline was evaluated on a dataset comprising ten manually annotated videos of anastomosis procedures involving either animal or phantom vessels, using the Symani® Surgical System, a dedicated robotic platform designed for microsurgery. The multiple object tracking precision (MOTP) and the multiple object tracking accuracy (MOTA) are used to evaluate the performance of the proposed approach, and a new metric is computed to demonstrate the efficacy in stabilizing the tracking result across video frames. An average MOTP of 74±0.06% and a MOTA of 99±0.03% over the test videos were found. CONCLUSION These results confirm the potential of the proposed approach in enhancing precision and reliability in microsurgical instrument tracking. Thus, the integration of attention mechanisms and a tailored data association module could be a solid basis for automating the motion of optical microscopes.
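A minimal sketch of the IoU-based association step described above, assuming greedy matching between Kalman-predicted boxes and new detections (the paper's exact matching rules are not reproduced):

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def associate(predicted, detections, min_iou=0.3):
    """Greedily match Kalman-predicted boxes to new detections by IoU.

    Returns {track_index: detection_index}; unmatched tracks keep coasting
    on the Kalman prediction until a detection re-appears (re-tracking).
    """
    pairs = sorted(((iou(p, d), ti, di)
                    for ti, p in enumerate(predicted)
                    for di, d in enumerate(detections)), reverse=True)
    matches, used_t, used_d = {}, set(), set()
    for score, ti, di in pairs:
        if score < min_iou:
            break
        if ti not in used_t and di not in used_d:
            matches[ti] = di
            used_t.add(ti); used_d.add(di)
    return matches
```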
Affiliation(s)
- Mattia Magro
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Medical Microinstruments, Inc., Wilmington, USA
- Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
3
Liu Y, Hayashi Y, Oda M, Kitasaka T, Mori K. YOLOv7-RepFPN: Improving real-time performance of laparoscopic tool detection on embedded systems. Healthc Technol Lett 2024; 11:157-166. PMID: 38638498; PMCID: PMC11022232; DOI: 10.1049/htl2.12072.
Abstract
This study focuses on enhancing the inference speed of laparoscopic tool detection on embedded devices. Laparoscopy, a minimally invasive surgical technique, markedly reduces patient recovery times and postoperative complications. Real-time laparoscopic tool detection supports laparoscopy by providing information for surgical navigation, and its implementation on embedded devices is gaining interest due to the portability, network independence, and scalability of such devices. However, embedded devices often face computational resource limitations that can hinder inference speed. To mitigate this concern, the work introduces a two-fold modification to the YOLOv7 model: the number of feature channels is halved and RepBlock is integrated, yielding the YOLOv7-RepFPN model. This configuration leads to a significant reduction in computational complexity. Additionally, the focal EIoU (efficient intersection over union) loss function is employed for bounding box regression. Experimental results on an embedded device demonstrate that for frame-by-frame laparoscopic tool detection, the proposed YOLOv7-RepFPN achieved an mAP of 88.2% (with IoU set to 0.5) on a custom dataset based on EndoVis17, and an inference speed of 62.9 FPS. Contrasting with the original YOLOv7, which achieved an 89.3% mAP and 41.8 FPS under identical conditions, the methodology enhances speed by 21.1 FPS while maintaining detection accuracy, underscoring the effectiveness of the approach.
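The focal EIoU loss named here follows the published focal-EIoU formulation (1 - IoU plus center, width, and height penalties normalised by the smallest enclosing box, re-weighted by IoU to the power gamma); the PyTorch sketch below implements that formulation, with gamma chosen as an assumed value rather than the paper's setting:

```python
import torch

def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-7):
    """Focal EIoU loss for boxes in (x1, y1, x2, y2) format, shape (N, 4)."""
    x1 = torch.max(pred[:, 0], target[:, 0]); y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2]); y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Smallest enclosing box around each pred/target pair.
    ex1 = torch.min(pred[:, 0], target[:, 0]); ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2]); ey2 = torch.max(pred[:, 3], target[:, 3])
    cw, ch = ex2 - ex1, ey2 - ey1

    # Center, width and height distance penalties.
    dc = ((pred[:, 0] + pred[:, 2]) - (target[:, 0] + target[:, 2])) ** 2 / 4 \
       + ((pred[:, 1] + pred[:, 3]) - (target[:, 1] + target[:, 3])) ** 2 / 4
    dw = ((pred[:, 2] - pred[:, 0]) - (target[:, 2] - target[:, 0])) ** 2
    dh = ((pred[:, 3] - pred[:, 1]) - (target[:, 3] - target[:, 1])) ** 2

    eiou = 1 - iou + dc / (cw**2 + ch**2 + eps) + dw / (cw**2 + eps) + dh / (ch**2 + eps)
    return (iou.detach().pow(gamma) * eiou).mean()  # focal re-weighting by IoU**gamma
```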
Affiliation(s)
- Yuzhang Liu
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan
- Yuichiro Hayashi
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan
- Masahiro Oda
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan
- Information and Communications, Nagoya University, Nagoya, Aichi, Japan
- Takayuki Kitasaka
- Department of Information Science, Aichi Institute of Technology, Nagoya, Aichi, Japan
- Kensaku Mori
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan
- Information and Communications, Nagoya University, Nagoya, Aichi, Japan
- Research Center of Medical Bigdata, National Institute of Informatics, Tokyo, Japan
4
Smithmaitrie P, Khaonualsri M, Sae-Lim W, Wangkulangkul P, Jearanai S, Cheewatanakornkul S. Development of deep learning framework for anatomical landmark detection and guided dissection line during laparoscopic cholecystectomy. Heliyon 2024; 10:e25210. PMID: 38327394; PMCID: PMC10847946; DOI: 10.1016/j.heliyon.2024.e25210.
Abstract
Background Bile duct injuries during laparoscopic cholecystectomy can arise from misinterpretation of biliary anatomy, leading to dissection in improper areas. The integration of a deep learning framework into laparoscopic procedures offers the potential for real-time anatomical landmark recognition, ensuring accurate dissection. The objective of this study is to develop a deep learning framework that can precisely identify anatomical landmarks, including Rouviere's sulcus and the liver base of segment IV, and provide a guided dissection line during laparoscopic cholecystectomy. Methods We retrospectively collected 40 laparoscopic cholecystectomy videos and extracted 80 images from each video to establish the dataset. Three surgeons annotated the bounding boxes of anatomical landmarks on a total of 3200 images. The YOLOv7 model was trained to detect Rouviere's sulcus and the liver base of segment IV as anatomical landmarks. Additionally, a guided dissection line was generated between these two landmarks by the proposed algorithm. To evaluate the performance of the detection model, mean average precision (mAP), precision, and recall were calculated. Furthermore, the accuracy of the guided dissection line was evaluated by three surgeons. The performance of the detection model was compared to the scaled-YOLOv4 and YOLOv5 models. Finally, the proposed framework was deployed in the operating room for real-time detection and visualization. Results The overall performance of the YOLOv7 model was 98.1% on the validation set and 91.3% on the testing set. Surgeons accepted the visualization of the guided dissection line at a rate of 95.71%. In the operating room, the trained model accurately identified the anatomical landmarks and generated the guided dissection line in real time. Conclusions The proposed framework effectively identifies anatomical landmarks and generates a guided dissection line in real time during laparoscopic cholecystectomy. This research underscores the potential of deep learning models as computer-assisted tools in surgery, providing an assistive tool for surgeons.
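Since the paper's line-generation algorithm is not spelled out in the abstract, the sketch below illustrates one plausible reading: connecting the centers of the two detected landmark boxes with OpenCV (box coordinates are hypothetical):

```python
import cv2
import numpy as np

def draw_guided_line(frame, rouviere_box, segment4_box):
    """Draw a line between the centers of the two detected landmarks.

    Boxes are (x1, y1, x2, y2) as returned by a YOLO-style detector;
    connecting the box centers is an assumption for illustration only.
    """
    center = lambda b: (int((b[0] + b[2]) / 2), int((b[1] + b[3]) / 2))
    p1, p2 = center(rouviere_box), center(segment4_box)
    cv2.rectangle(frame, rouviere_box[:2], rouviere_box[2:], (0, 255, 0), 2)
    cv2.rectangle(frame, segment4_box[:2], segment4_box[2:], (0, 255, 0), 2)
    cv2.line(frame, p1, p2, (0, 0, 255), 2)
    return frame

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder laparoscopic frame
draw_guided_line(frame, (100, 300, 180, 360), (400, 120, 480, 180))
```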
Affiliation(s)
- Pruittikorn Smithmaitrie
- Department of Mechanical and Mechatronics Engineering, Faculty of Engineering, Prince of Songkla University, Thailand
- Methasit Khaonualsri
- Department of Mechanical and Mechatronics Engineering, Faculty of Engineering, Prince of Songkla University, Thailand
- Wannipa Sae-Lim
- Department of Computer Science, Faculty of Science, Prince of Songkla University, Thailand
- Piyanun Wangkulangkul
- Minimally Invasive Surgery Unit, Department of Surgery, Faculty of Medicine, Prince of Songkla University, Thailand
- Supakool Jearanai
- Minimally Invasive Surgery Unit, Department of Surgery, Faculty of Medicine, Prince of Songkla University, Thailand
- Siripong Cheewatanakornkul
- Minimally Invasive Surgery Unit, Department of Surgery, Faculty of Medicine, Prince of Songkla University, Thailand
5
Chen Z, Cruciani L, Lievore E, Fontana M, De Cobelli O, Musi G, Ferrigno G, De Momi E. Spatio-temporal layers based intra-operative stereo depth estimation network via hierarchical prediction and progressive training. Comput Methods Programs Biomed 2024; 244:107937. PMID: 38006707; DOI: 10.1016/j.cmpb.2023.107937.
Abstract
BACKGROUND AND OBJECTIVE The safety of robotic surgery can be enhanced through augmented vision or artificial constraints on the robot's motion, and intra-operative depth estimation is the cornerstone of these applications because it provides precise position information of surgical scenes in 3D space. High-quality depth estimation of endoscopic scenes remains an open issue, and the development of deep learning provides more possibilities and potential to address it. METHODS In this paper, a deep learning-based approach is proposed to recover 3D information of intra-operative scenes. To this aim, a fully 3D encoder-decoder network integrating spatio-temporal layers is designed; it adopts hierarchical prediction and progressive learning to enhance prediction accuracy and shorten training time. RESULTS Our network achieves a depth estimation accuracy of MAE 2.55±1.51 mm and RMSE 5.23±1.40 mm using 8 surgical videos with a resolution of 1280×1024, performing better than six other state-of-the-art methods trained on the same data. CONCLUSIONS Our network achieves promising depth estimation performance in intra-operative scenes using stereo images, allowing integration into robot-assisted surgery to enhance safety.
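A short sketch of how the reported MAE and RMSE can be computed from predicted and ground-truth depth maps; this is a plausible reconstruction of the evaluation, not the authors' code:

```python
import numpy as np

def depth_errors(pred, gt, valid=None):
    """MAE and RMSE (same unit as the maps, e.g. mm) over valid pixels.

    `valid` masks out pixels without ground-truth depth.
    """
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    if valid is None:
        valid = gt > 0
    diff = pred[valid] - gt[valid]
    return np.abs(diff).mean(), np.sqrt((diff ** 2).mean())

# Dummy 1280x1024 maps just to exercise the function.
mae, rmse = depth_errors(np.random.rand(1024, 1280) * 50, np.random.rand(1024, 1280) * 50)
```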
Affiliation(s)
- Ziyang Chen
- Politecnico di Milano, Department of Electronics, Information and Bioengineering, Milano, 20133, Italy
- Laura Cruciani
- Politecnico di Milano, Department of Electronics, Information and Bioengineering, Milano, 20133, Italy
- Elena Lievore
- European Institute of Oncology, Department of Urology, IRCCS, Milan, 20141, Italy
- Matteo Fontana
- European Institute of Oncology, Department of Urology, IRCCS, Milan, 20141, Italy
- Ottavio De Cobelli
- European Institute of Oncology, Department of Urology, IRCCS, Milan, 20141, Italy; University of Milan, Department of Oncology and Onco-haematology, Faculty of Medicine and Surgery, Milan, Italy
- Gennaro Musi
- European Institute of Oncology, Department of Urology, IRCCS, Milan, 20141, Italy; University of Milan, Department of Oncology and Onco-haematology, Faculty of Medicine and Surgery, Milan, Italy
- Giancarlo Ferrigno
- Politecnico di Milano, Department of Electronics, Information and Bioengineering, Milano, 20133, Italy
- Elena De Momi
- Politecnico di Milano, Department of Electronics, Information and Bioengineering, Milano, 20133, Italy; European Institute of Oncology, Department of Urology, IRCCS, Milan, 20141, Italy
6
Bordbar M, Helfroush MS, Danyali H, Ejtehadi F. Wireless capsule endoscopy multiclass classification using three-dimensional deep convolutional neural network model. Biomed Eng Online 2023; 22:124. PMID: 38098015; PMCID: PMC10722702; DOI: 10.1186/s12938-023-01186-9.
Abstract
BACKGROUND Wireless capsule endoscopy (WCE) is a patient-friendly and non-invasive technology that scans the whole gastrointestinal tract, including difficult-to-access regions like the small bowel. A major drawback of this technology is that the visual inspection of the large number of video frames produced during each examination makes the physician's diagnostic process tedious and prone to error. Several computer-aided diagnosis (CAD) systems, such as deep network models, have been developed for the automatic recognition of abnormalities in WCE frames. Nevertheless, most of these studies have only focused on spatial information within individual WCE frames, missing the crucial temporal information within consecutive frames. METHODS In this article, an automatic multiclass classification system based on a three-dimensional deep convolutional neural network (3D-CNN) is proposed, which utilizes spatiotemporal information to facilitate the WCE diagnosis process. The 3D-CNN model is fed with a series of sequential WCE frames, in contrast to the two-dimensional (2D) model, which treats frames as independent. Moreover, the proposed 3D deep model is compared with several pre-trained networks. The proposed models are trained and evaluated with 29 subject WCE videos (14,691 frames before augmentation). The performance advantages of 3D-CNN over 2D-CNN and pre-trained networks are verified in terms of sensitivity, specificity, and accuracy. RESULTS 3D-CNN outperforms the 2D technique in all evaluation metrics (sensitivity: 98.92% vs. 98.05%, specificity: 99.50% vs. 86.94%, accuracy: 99.20% vs. 92.60%). CONCLUSION A novel 3D-CNN model for lesion detection in WCE frames is proposed in this study, and the results indicate its superior performance over 2D-CNN and several well-known pre-trained classifier networks. The proposed 3D-CNN model uses the rich temporal information in adjacent frames, as well as spatial data, to develop an accurate and efficient model.
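A minimal PyTorch sketch of a 3D-CNN consuming clips of consecutive frames, as the abstract describes; layer sizes, clip length, and class count are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class WCE3DCNN(nn.Module):
    """Minimal 3D-CNN over a clip of consecutive WCE frames.

    Input: (batch, 3, T, H, W); 3D convolutions mix spatial and temporal axes.
    """
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),                     # spatial pooling only
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((2, 2, 2)),                     # temporal + spatial pooling
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

logits = WCE3DCNN()(torch.randn(2, 3, 8, 112, 112))  # two 8-frame clips
```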
Affiliation(s)
- Mehrdokht Bordbar
- Department of Electrical Engineering, Shiraz University of Technology, Shiraz, Iran
- Habibollah Danyali
- Department of Electrical Engineering, Shiraz University of Technology, Shiraz, Iran
- Fardad Ejtehadi
- Department of Internal Medicine, Gastroenterohepatology Research Center, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
7
Villani FP, Paderno A, Fiorentino MC, Casella A, Piazza C, Moccia S. Classifying Vocal Folds Fixation from Endoscopic Videos with Machine Learning. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-4. PMID: 38082565; DOI: 10.1109/embc40787.2023.10340017.
Abstract
Vocal fold motility evaluation is paramount both in the assessment of functional deficits and in the accurate staging of neoplastic disease of the glottis. Diagnostic endoscopy, and in particular videoendoscopy, is nowadays the method through which motility is estimated. The clinical diagnosis, however, relies on the examination of the videoendoscopic frames, which is a subjective and expertise-dependent task. Hence, a more rigorous, objective, reliable, and repeatable method is needed. To support clinicians, this paper proposes a machine learning (ML) approach for vocal cord motility classification. From the endoscopic videos of 186 patients with either preserved vocal cord motility or fixation, a dataset of 558 images covering the two classes was extracted. Subsequently, a number of features were retrieved from the images and used to train and test four well-grounded ML classifiers. On the test set, the best performance was achieved using XGBoost, with precision = 0.82, recall = 0.82, F1 score = 0.82, and accuracy = 0.82. After comparing the most relevant ML models, we believe that this approach could provide precise and reliable support to clinical evaluation. Clinical Relevance: This research represents an important advancement in the state of the art of computer-assisted otolaryngology, towards an effective tool for motility assessment in clinical practice.
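A hedged sketch of the final classification stage with XGBoost, using stand-in features and labels; the handcrafted features and hyperparameters actually used are not specified in the abstract:

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical stand-in: one feature vector per endoscopic image,
# label 0 = preserved motility, 1 = fixation.
X = np.random.rand(558, 64)
y = np.random.randint(0, 2, 558)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

clf = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))  # precision / recall / F1
```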
8
Louis N, Zhou L, Yule SJ, Dias RD, Manojlovich M, Pagani FD, Likosky DS, Corso JJ. Temporally guided articulated hand pose tracking in surgical videos. Int J Comput Assist Radiol Surg 2023; 18:117-125. PMID: 36190616; PMCID: PMC9883342; DOI: 10.1007/s11548-022-02761-6.
Abstract
PURPOSE Articulated hand pose tracking is an under-explored problem that carries the potential for use in an extensive number of applications, especially in the medical domain. With a robust and accurate tracking system on surgical videos, the motion dynamics and movement patterns of the hands can be captured and analyzed for many rich tasks. METHODS In this work, we propose a novel hand pose estimation model, CondPose, which improves detection and tracking accuracy by incorporating a pose prior into its prediction. We show improvements over state-of-the-art methods, which provide frame-wise independent predictions, by following a temporally guided approach that effectively leverages past predictions. RESULTS We collect Surgical Hands, the first dataset that provides multi-instance articulated hand pose annotations for videos. Our dataset provides over 8.1k annotated hand poses from publicly available surgical videos, together with bounding boxes, pose annotations, and tracking IDs to enable multi-instance tracking. When evaluated on Surgical Hands, our method outperforms the state-of-the-art approach on mean Average Precision, which measures pose estimation accuracy, and Multiple Object Tracking Accuracy, which assesses pose tracking performance. CONCLUSION In comparison to a frame-wise independent strategy, we show greater performance in detecting and tracking hand poses and a more substantial impact on localization accuracy. This has positive implications in generating more accurate representations of hands in the scene to be used for targeted downstream tasks.
Affiliation(s)
- Steven J. Yule
- Clinical Surgery, University of Edinburgh, Edinburgh, Scotland, UK
- Roger D. Dias
- Emergency Medicine, Harvard Medical School, Boston, MA, USA
9
De Simone B, Chouillard E, Gumbs AA, Loftus TJ, Kaafarani H, Catena F. Artificial intelligence in surgery: the emergency surgeon's perspective (the ARIES project). Discov Health Syst 2022; 1:9. PMID: 37521114; PMCID: PMC9734362; DOI: 10.1007/s44250-022-00014-6.
Abstract
Artificial intelligence (AI) has been developed and implemented in healthcare with the potential to reduce health, social, and economic inequities, help actualize universal health coverage, and improve health outcomes on a global scale. The application of AI in emergency surgery settings could improve clinical practice and operating room management by promoting consistent, high-quality decision making, while preserving the importance of bedside assessment and human intuition as well as respect for human rights and equitable surgical care; however, ethical and legal issues are slowing surgeons' enthusiasm. Emergency surgeons are aware that prioritizing education, increasing the availability of advanced AI technologies for emergency and trauma surgery, and funding research projects that use AI to provide decision support in the operating room are crucial to creating 'intelligent' emergency surgery.
Affiliation(s)
- Belinda De Simone
- Department of Emergency, Digestive and Metabolic Minimally Invasive Surgery, Poissy and St Germain en Laye Hospitals, Poissy, France
- Elie Chouillard
- Department of Emergency, Digestive and Metabolic Minimally Invasive Surgery, Poissy and St Germain en Laye Hospitals, Poissy, France
- Andrew A. Gumbs
- Department of Emergency, Digestive and Metabolic Minimally Invasive Surgery, Poissy and St Germain en Laye Hospitals, Poissy, France
- Tyler J. Loftus
- Department of Surgery, University of Florida Health, Gainesville, USA
- Haytham Kaafarani
- Division of Trauma, Emergency Surgery and Surgical Critical Care, Massachusetts General Hospital, Boston, USA
- Fausto Catena
- Department of Emergency and General Surgery, Level I Trauma Center, Bufalini Hospital, Cesena, Italy
10
Zhang B, Sturgeon D, Shankar AR, Goel VK, Barker J, Ghanem A, Lee P, Milecky M, Stottler N, Petculescu S. Surgical instrument recognition for instrument usage documentation and surgical video library indexing. Comput Methods Biomech Biomed Eng Imaging Vis 2022. DOI: 10.1080/21681163.2022.2152371.
Affiliation(s)
- Bokai Zhang
- Digital Solutions, Johnson & Johnson MedTech, Seattle, WA, USA
- Darrick Sturgeon
- Digital Solutions, Johnson & Johnson MedTech, Santa Clara, CA, USA
- Jocelyn Barker
- Digital Solutions, Johnson & Johnson MedTech, Santa Clara, CA, USA
- Amer Ghanem
- Digital Solutions, Johnson & Johnson MedTech, Seattle, WA, USA
- Philip Lee
- Digital Solutions, Johnson & Johnson MedTech, Santa Clara, CA, USA
- Meghan Milecky
- Digital Solutions, Johnson & Johnson MedTech, Seattle, WA, USA
11
Owen D, Grammatikopoulou M, Luengo I, Stoyanov D. Automated identification of critical structures in laparoscopic cholecystectomy. Int J Comput Assist Radiol Surg 2022; 17:2173-2181. PMID: 36272018; DOI: 10.1007/s11548-022-02771-4.
Abstract
PURPOSE Bile duct injury is a significant problem in laparoscopic cholecystectomy and can have grave consequences for patient outcomes. Automatic identification of the critical structures (cystic duct and cystic artery) could potentially reduce complications during surgery by helping the surgeon establish the Critical View of Safety, and may eventually even provide real-time intra-operative guidance. METHODS A computer vision model was trained to identify the critical structures. Label relaxation enabled the model to cope with ambiguous spatial extent and high annotation variability. Pseudo-label self-supervision allowed the model to use unlabelled data, which can be particularly beneficial when labelled training data is scarce. Intrinsic variability in annotations was assessed across several annotators, quantifying the extent of annotation ambiguity and setting a baseline for model accuracy. RESULTS Using 3050 labelled and 3682 unlabelled cholecystectomy frames, the model achieved an IoU of 65% and a presence detection F1 score of 75%. Inter-annotator IoU agreement was 70%, demonstrating that the model was, on average, near human-level agreement on this dataset. The model's outputs were validated by three expert surgeons, who confirmed that they were accurate and promising for future usage. CONCLUSION Identification of critical structures can achieve high accuracy and is a promising step towards computer-assisted intervention, in addition to potential applications in analytics and education. High accuracy and surgeon approval are maintained when detecting the structures separately as distinct classes. Future work will focus on guaranteeing safe identification of critical anatomy, including the bile duct, and validating the performance of automated approaches.
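A minimal sketch of pseudo-label self-supervision as described: confident predictions on unlabelled frames become training targets. The confidence threshold and the ignore-index scheme are assumptions, not the paper's values:

```python
import torch

def pseudo_label(model, unlabelled_batch, threshold=0.9):
    """Generate segmentation pseudo-labels from confident predictions.

    Pixels whose foreground probability clears `threshold` become positive
    targets, very low-probability pixels become negative, and everything
    else is marked -1 so a masked loss can ignore it during training.
    """
    model.eval()
    with torch.no_grad():
        prob = torch.sigmoid(model(unlabelled_batch))  # assumes logit output
    target = torch.full_like(prob, -1.0)
    target[prob >= threshold] = 1.0
    target[prob <= 1 - threshold] = 0.0
    return target  # train with a loss that masks out the -1 entries
```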
Affiliation(s)
- Danail Stoyanov
- Digital Surgery, Medtronic, London, UK
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, UK
12
Surgical Tool Datasets for Machine Learning Research: A Survey. Int J Comput Vis 2022. DOI: 10.1007/s11263-022-01640-6.
Abstract
This paper is a comprehensive survey of datasets for surgical tool detection and related surgical data science and machine learning techniques and algorithms. The survey offers a high-level perspective of current research in this area, analyses the taxonomy of approaches adopted by researchers using surgical tool datasets, and addresses key areas of research, such as the datasets used, the evaluation metrics applied, and the deep learning techniques utilised. Our presentation and taxonomy provide a framework that facilitates greater understanding of current work and highlights the challenges and opportunities for further innovative and useful research.
13
Tiryaki ME, Demir SO, Sitti M. Deep Learning-based 3D Magnetic Microrobot Tracking using 2D MR Images. IEEE Robot Autom Lett 2022. DOI: 10.1109/lra.2022.3179509.
Affiliation(s)
- Mehmet Efe Tiryaki
- Physical Intelligence Department, Max Planck Institute for Intelligent Systems, Stuttgart, Germany
- Sinan Ozgun Demir
- Physical Intelligence Department, Max Planck Institute for Intelligent Systems, Stuttgart, Germany
- Metin Sitti
- Physical Intelligence Department, Max Planck Institute for Intelligent Systems, Stuttgart, Germany
14
Xue Y, Liu S, Li Y, Wang P, Qian X. A new weakly supervised strategy for surgical tool detection. Knowl Based Syst 2022. DOI: 10.1016/j.knosys.2021.107860.
15
AIM in Medical Robotics. Artif Intell Med 2022. DOI: 10.1007/978-3-030-64573-1_64.
16
AIM in Endoscopy Procedures. Artif Intell Med 2022. DOI: 10.1007/978-3-030-64573-1_164.
17
Moglia A, Georgiou K, Georgiou E, Satava RM, Cuschieri A. A systematic review on artificial intelligence in robot-assisted surgery. Int J Surg 2021; 95:106151. PMID: 34695601; DOI: 10.1016/j.ijsu.2021.106151.
Abstract
BACKGROUND Despite the extensive published literature on the significant potential of artificial intelligence (AI), there are no reports on its efficacy in improving patient safety in robot-assisted surgery (RAS). The purposes of this work are to systematically review the published literature on AI in RAS and to identify and discuss current limitations and challenges. MATERIALS AND METHODS A literature search was conducted on PubMed, Web of Science, Scopus, and IEEE Xplore according to the PRISMA 2020 statement. Eligible articles were peer-reviewed studies published in English from January 1, 2016 to December 31, 2020. AMSTAR 2 was used for quality assessment. Risk of bias was evaluated with the Newcastle-Ottawa quality assessment tool. Data from the studies were presented in tables using the SPIDER tool. RESULTS Thirty-five publications, representing 3436 patients, met the search criteria and were included in the analysis. The selected reports concern: motion analysis (n = 17), urology (n = 12), gynecology (n = 1), other specialties (n = 1), training (n = 3), and tissue retraction (n = 1). Precision for surgical tool detection varied from 76.0% to 90.6%. Mean absolute error on prediction of urinary continence after robot-assisted radical prostatectomy (RARP) ranged from 85.9 to 134.7 days. Accuracy on prediction of length of stay after RARP was 88.5%. Accuracy on recognition of the next surgical task during robot-assisted partial nephrectomy (RAPN) reached 75.7%. CONCLUSION The reviewed studies were of low quality. The findings are limited by the small size of the datasets, and comparison between studies on the same topic was restricted by the heterogeneity of algorithms and datasets. There is no proof that AI can currently identify the critical tasks of RAS operations that determine patient outcome. There is an urgent need for studies on large datasets and for external validation of the AI algorithms used. Furthermore, the results should be transparent and meaningful to surgeons, enabling them to inform patients in layman's terms. REGISTRATION Review Registry Unique Identifying Number: reviewregistry1225.
Affiliation(s)
- Andrea Moglia
- EndoCAS, Center for Computer Assisted Surgery, University of Pisa, 56124 Pisa, Italy; 1st Propaedeutic Surgical Unit, Hippocrateion Athens General Hospital, Athens Medical School, National and Kapodistrian University of Athens, Greece; MPLSC, Athens Medical School, National and Kapodistrian University of Athens, Greece; Department of Surgery, University of Washington Medical Center, Seattle, WA, United States; Scuola Superiore Sant'Anna of Pisa, 56214 Pisa, Italy; Institute for Medical Science and Technology, University of Dundee, Dundee DD2 1FD, United Kingdom
18
Ravigopal SR, Nayar NU, Desai JP. Towards Real-time Pose Estimation of the Mitral Valve Robot under C-arm X-ray Fluoroscopy. IEEE Trans Med Robot Bionics 2021; 3:928-935. PMID: 35756715; PMCID: PMC9232099; DOI: 10.1109/tmrb.2021.3122351.
Abstract
Mitral regurgitation (MR) is a condition caused by a deformity in the mitral valve leading to the backflow of blood into the left atrium. MR can be treated through a minimally invasive procedure, and our lab is currently developing a robot that could potentially be used to treat MR. The robot would carry a clip that latches onto the valve's leaflets and closes them to minimize leakage. Accurate localization of the robot is needed to navigate the clip to the leaflets successfully. This paper discusses algorithms used to track the clip's position and orientation in real time under C-arm fluoroscopy. Positions are found through a deep learning semantic segmentation framework, and the pose is found by calculating the bending and rotational angles. The robot's bending angle and the clip's rotational angle are found through an equivalent ellipse algorithm and an SVM classifier, respectively, and were validated against orientations obtained from an electromagnetic tracker. The bending angle calculation has an average error of 7.7°, and the rotational angle classification accuracy is 76% across five classes. Execution times are within 100 ms; hence, this could be a promising approach for real-time pose estimation.
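A sketch of the equivalent-ellipse step using OpenCV, assuming a binary segmentation mask of the robot body is available; the mapping from ellipse orientation to the robot's bending angle is simplified here:

```python
import cv2
import numpy as np

def bending_angle(mask):
    """Orientation of the segmented robot body via an equivalent ellipse.

    `mask` is a binary segmentation of the device from the fluoroscopy
    frame; cv2.fitEllipse returns the ellipse angle in degrees.
    (OpenCV 4 API; the fitted contour must contain at least 5 points.)
    """
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)
    (_, _), (_, _), angle = cv2.fitEllipse(largest)
    return angle
```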
Affiliation(s)
- Sharan R Ravigopal
- Medical Robotics and Automation (RoboMed) Laboratory, Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Namrata U Nayar
- Medical Robotics and Automation (RoboMed) Laboratory, Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Jaydev P Desai
- Medical Robotics and Automation (RoboMed) Laboratory, Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
19
Wang J, Jin Y, Cai S, Xu H, Heng PA, Qin J, Wang L. Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network. Med Image Anal 2021; 75:102291. PMID: 34753019; DOI: 10.1016/j.media.2021.102291.
Abstract
We propose a novel shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection (ESD) surgery. This task is of great clinical significance but extremely challenging due to bleeding, lighting reflection, and motion blur in the complicated surgical environment. Compared with existing solutions, which either neglect geometric relationships among targeting objects or capture the relationships by using complicated aggregation schemes, the proposed network is capable of achieving satisfactory accuracy while maintaining real-time performance by taking full advantage of the spatial relations among landmarks. We first devise an algorithm to automatically generate relation keypoint heatmaps, which are able to intuitively represent the prior knowledge of spatial relations among landmarks without using any extra manual annotation efforts. We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process. While one scheme introduces pixel-level regularization by multi-task learning, the other integrates global-level regularization by harnessing a newly designed grouped consistency evaluator, which adds relation constraints to the proposed network in an adversarial manner. Both schemes are beneficial to the model in training, and can be readily unloaded in inference to achieve real-time detection. We establish a large in-house dataset of ESD surgery for esophageal cancer to validate the effectiveness of our proposed method. Extensive experimental results demonstrate that our approach outperforms state-of-the-art methods in terms of accuracy and efficiency, achieving better detection results faster. Promising results on two downstream applications further corroborate the great potential of our method in ESD clinical practice.
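A sketch of how relation keypoint heatmaps can be generated automatically from landmark coordinates, approximating the idea of encoding pairwise spatial relations without extra annotation (the paper's exact generation algorithm may differ):

```python
import numpy as np

def gaussian_heatmap(shape, center, sigma=4.0):
    """2D Gaussian peaked at a landmark location (standard keypoint target)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - center[0]) ** 2 + (ys - center[1]) ** 2) / (2 * sigma ** 2))

def relation_heatmap(shape, kp_a, kp_b, n=8, sigma=4.0):
    """Relation heatmap: Gaussians sampled along the segment joining two
    landmarks, so the target encodes their spatial relation."""
    pts = [tuple(np.asarray(kp_a, float) + t * (np.asarray(kp_b, float) - np.asarray(kp_a, float)))
           for t in np.linspace(0, 1, n)]
    return np.max([gaussian_heatmap(shape, p, sigma) for p in pts], axis=0)

# Hypothetical landmark pair on a 128x128 target map.
target = relation_heatmap((128, 128), (30, 40), (100, 90))
```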
Affiliation(s)
- Jiacheng Wang
- Department of Computer Science at School of Informatics, Xiamen University, Xiamen 361005, China
- Yueming Jin
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
- Shuntian Cai
- Department of Gastroenterology, Zhongshan Hospital affiliated to Xiamen University, Xiamen, China
- Hongzhi Xu
- Department of Gastroenterology, Zhongshan Hospital affiliated to Xiamen University, Xiamen, China
- Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
- Jing Qin
- Center for Smart Health, School of Nursing, The Hong Kong Polytechnic University, Hong Kong
- Liansheng Wang
- Department of Computer Science at School of Informatics, Xiamen University, Xiamen 361005, China
20
Xue Y, Li Y, Liu S, Wang P, Qian X. Oriented Localization of Surgical Tools by Location Encoding. IEEE Trans Biomed Eng 2021; 69:1469-1480. PMID: 34652994; DOI: 10.1109/tbme.2021.3120430.
Abstract
Surgical tool localization is the foundation of a series of advanced surgical functions, e.g., image-guided surgical navigation. In precise scenarios like surgical tool localization, sophisticated tools and sensitive tissues can be quite close, which requires a higher localization accuracy than general object localization; it is also meaningful to know the orientation of tools. To achieve these goals, this paper proposes a Compressive Sensing based Location Encoding (CSLE) scheme, which formulates the task of surgical tool localization in pixel space as a task of vector regression in encoding space. With this scheme, the method is also able to capture the orientation of surgical tools rather than simply outputting horizontal bounding boxes. To prevent gradient vanishing, a novel back-propagation rule for sparse reconstruction is derived. The back-propagation rule is applicable to different implementations of sparse reconstruction and renders the entire network end-to-end trainable. Finally, the proposed approach gives more accurate bounding boxes as well as the orientation of tools, achieving state-of-the-art performance compared with 9 competitive oriented and non-oriented localization methods (RRD, RefineDet, etc.) on a mainstream surgical image dataset: m2cai16-tool-locations. A range of experiments supports our claim that regression in CSLE space performs better than traditional bounding box detection in pixel space.
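To make the encoding idea concrete, the sketch below demonstrates compressive sensing recovery of a sparse location vector from a compact random projection; the grid size, sensing matrix, and recovery solver are illustrative choices, not the paper's CSLE implementation:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n, m = 1024, 128                 # pixel-space dimension, encoding dimension
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random sensing matrix

x = np.zeros(n)                  # sparse location vector (e.g. tool corners
x[[200, 231, 840, 871]] = 1.0    # on a flattened 32x32 grid; hypothetical)
y = A @ x                        # compact encoding a network could regress

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=4, fit_intercept=False).fit(A, y)
recovered = np.flatnonzero(np.abs(omp.coef_) > 0.5)
print(recovered)                 # -> [200 231 840 871]
```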
21
Chen IHA, Ghazi A, Sridhar A, Stoyanov D, Slack M, Kelly JD, Collins JW. Evolving robotic surgery training and improving patient safety, with the integration of novel technologies. World J Urol 2021; 39:2883-2893. PMID: 33156361; PMCID: PMC8405494; DOI: 10.1007/s00345-020-03467-7.
Abstract
INTRODUCTION Robot-assisted surgery is being increasingly adopted by multiple surgical specialties. There is evidence of inherent risks in utilising new technologies that are unfamiliar early in the learning curve. The development of standardised and validated training programmes is crucial to deliver a safe introduction. In this review, we aim to evaluate the current evidence and the opportunities to integrate novel technologies into modern digitalised robotic training curricula. METHODS A systematic literature review of the current evidence for novel technologies in surgical training was conducted online, and relevant publications and information were identified. We evaluated how these technologies could further enable digitalisation of training. RESULTS Overall, the quality of available studies was found to be low, with the current evidence consisting largely of expert opinion, consensus statements, and small qualitative studies. The review identified that several novel technologies are already being utilised in robotic surgery training. There is also a trend towards standardised, validated robotic training curricula. Currently, the majority of validated curricula do not incorporate novel technologies, and training is delivered with more traditional methods, including centralisation of training services in wet laboratories with access to cadavers and dedicated training robots. CONCLUSIONS Improvements to training standards and understanding of performance data have good potential to significantly lower complication rates in patients. Digitalisation automates data collection and brings data together for analysis. Machine learning has the potential to deliver automated performance feedback for trainees. Digitalised training aims to build on the current gold standards and to further improve the 'continuum of training' by integrating PBP training, 3D-printed models, telementoring, telemetry, and machine learning.
Affiliation(s)
- I-Hsuan Alan Chen
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, University College London, London, UK
- Department of Surgery, Division of Urology, Kaohsiung Veterans General Hospital, No. 386, Dazhong 1st Rd., Zuoying District, Kaohsiung, 81362, Taiwan
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, UK
- Ahmed Ghazi
- Department of Urology, Simulation Innovation Laboratory, University of Rochester, New York, USA
- Ashwin Sridhar
- Division of Uro-Oncology, University College London Hospital, London, UK
- Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, UK
- John D Kelly
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, University College London, London, UK
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, UK
- Division of Uro-Oncology, University College London Hospital, London, UK
- Justin W Collins
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, University College London, London, UK
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, UK
- Division of Uro-Oncology, University College London Hospital, London, UK
22
Battaglia E, Boehm J, Zheng Y, Jamieson AR, Gahan J, Majewicz Fey A. Rethinking Autonomous Surgery: Focusing on Enhancement over Autonomy. Eur Urol Focus 2021; 7:696-705. PMID: 34246619; PMCID: PMC10394949; DOI: 10.1016/j.euf.2021.06.009.
Abstract
CONTEXT As robot-assisted surgery is increasingly used in surgical care, the engineering research effort towards surgical automation has also increased significantly. Automation promises to enhance surgical outcomes, offload mundane or repetitive tasks, and improve workflow. However, we must ask an important question: should autonomous surgery be our long-term goal? OBJECTIVE To provide an overview of the engineering requirements for automating control systems, summarize technical challenges in automated robotic surgery, and review sensing and modeling techniques to capture real-time human behaviors for integration into the robotic control loop for enhanced shared or collaborative control. EVIDENCE ACQUISITION We performed a nonsystematic search of the English language literature up to March 25, 2021. We included original studies related to automation in robot-assisted laparoscopic surgery and human-centered sensing and modeling. EVIDENCE SYNTHESIS We identified four comprehensive review papers that present techniques for automating portions of surgical tasks. Sixteen studies relate to human-centered sensing technologies and 23 to computer vision and/or advanced artificial intelligence or machine learning methods for skill assessment. Twenty-two studies evaluate or review the role of haptic or adaptive guidance during some learning task, with only a few applied to robotic surgery. Finally, only three studies discuss the role of some form of training in patient outcomes and none evaluated the effects of full or semi-autonomy on patient outcomes. CONCLUSIONS Rather than focusing on autonomy, which eliminates the surgeon from the loop, research centered on more fully understanding the surgeon's behaviors, goals, and limitations could facilitate a superior class of collaborative surgical robots that could be more effective and intelligent than automation alone. PATIENT SUMMARY We reviewed the literature for studies on automation in surgical robotics and on modeling of human behavior in human-machine interaction. The main application is to enhance the ability of surgical robotic systems to collaborate more effectively and intelligently with human surgeon operators.
Affiliation(s)
- Edoardo Battaglia
- Department of Mechanical Engineering, University of Texas at Austin, Austin, TX, USA
- Jacob Boehm
- Department of Mechanical Engineering, University of Texas at Austin, Austin, TX, USA
- Yi Zheng
- Department of Mechanical Engineering, University of Texas at Austin, Austin, TX, USA
- Andrew R Jamieson
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX, USA
- Jeffrey Gahan
- Department of Urology, UT Southwestern Medical Center, Dallas, TX, USA
- Ann Majewicz Fey
- Department of Mechanical Engineering, University of Texas at Austin, Austin, TX, USA
23
Tip estimation approach for concentric tube robots using 2D ultrasound images and kinematic model. Med Biol Eng Comput 2021; 59:1461-1473. PMID: 34156603; DOI: 10.1007/s11517-021-02369-z.
Abstract
The concentric tube robot (CTR) is an efficient approach for minimally invasive surgery (MIS) and diagnosis due to its small size and high dexterity. To manipulate the robot accurately and safely inside the human body, tip position and shape information need to be well measured. In this paper, we propose a tip estimation method based on 2D ultrasound images with the help of the forward kinematic model of the CTR. The forward kinematic model helps to provide a fast ultrasound scanning path and to narrow the region of interest in ultrasound images. For each tube, only three scan positions are needed, using the kinematic model prediction as prior knowledge. After that, a curve fitting method is used for shape reconstruction, while the tip position is estimated based on the constraints of the tube's structure and length. This method provides the advantage that only three scan positions are needed for estimating the tip of each telescoping section. Moreover, no structural modification is needed on the robot, which makes it an appropriate approach for existing flexible surgical robots. Experimental results verified the feasibility of the proposed method, with a tip estimation error of 0.59 mm.
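A simplified sketch of the curve-fitting and length-constraint idea for one tube: fit a curve through the three scanned points and walk along it until the tube's known length is consumed (all values are hypothetical, and the paper's exact fitting model may differ):

```python
import numpy as np

def estimate_tip(scan_pts, tube_length):
    """Fit a quadratic through three ultrasound scan points of one tube and
    return the point where the cumulative arc length equals the tube length.
    """
    scan_pts = np.asarray(scan_pts, float)     # (3, 2): (z, x) per scan plane
    coeffs = np.polyfit(scan_pts[:, 0], scan_pts[:, 1], 2)  # exact fit, 3 pts
    z = np.linspace(scan_pts[0, 0], scan_pts[0, 0] + 2 * tube_length, 5000)
    x = np.polyval(coeffs, z)
    seg = np.hypot(np.diff(z), np.diff(x))     # piecewise segment lengths
    idx = np.searchsorted(np.cumsum(seg), tube_length)
    return z[idx], x[idx]

# Hypothetical scan points (mm) along the tube and a 60 mm tube length.
print(estimate_tip([(0, 0), (10, 1.5), (20, 4.0)], 60.0))
```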
24
Penso M, Moccia S, Scafuri S, Muscogiuri G, Pontone G, Pepi M, Caiani EG. Automated left and right ventricular chamber segmentation in cardiac magnetic resonance images using dense fully convolutional neural network. Comput Methods Programs Biomed 2021; 204:106059. PMID: 33812305; DOI: 10.1016/j.cmpb.2021.106059.
Abstract
BACKGROUND AND OBJECTIVE Segmentation of the left ventricular (LV) myocardium (Myo) and right ventricular (RV) endocardium on cine cardiac magnetic resonance (CMR) images represents an essential step for cardiac-function evaluation and diagnosis. In order to have a common reference for comparing segmentation algorithms, several CMR image datasets have been made available, but in general they do not include the most apical and basal slices, and/or gold standard tracing is limited to only one of the two ventricles, thus not fully corresponding to real clinical practice. Our aim was to develop a deep learning (DL) approach for automated segmentation of both RV and LV chambers from short-axis (SAX) CMR images, reporting separately the performance for basal slices, together with the applied criterion of choice. METHOD A retrospectively selected database (DB1) of 210 cine sequences (3 pathology groups) was considered: images (GE, 1.5 T) were acquired at Centro Cardiologico Monzino (Milan, Italy), and end-diastolic (ED) and end-systolic (ES) frames were manually segmented (gold standard, GS). Automatic ED and ES RV and LV segmentation was performed with a U-Net-inspired architecture, in which skip connections were redesigned by introducing dense blocks to alleviate the semantic gap between the U-Net encoder and decoder. The proposed architecture was trained including: A) the basal slices where the Myo surrounded the LV for at least 50%, plus all other slices; B) only the slices where the Myo completely surrounded the LV. To evaluate the clinical relevance of the proposed architecture in a practical use-case scenario, a graphical user interface was developed to allow clinicians to revise, and correct when needed, the automatic segmentation. Additionally, to assess generalizability, an analysis of CMR images obtained in 12 healthy volunteers (DB2) with different equipment (Siemens, 3T) and settings was performed. RESULTS The proposed architecture outperformed the original U-Net. Comparing the performance on DB1 between the two criteria, no significant differences were measured when considering all slices together, but differences were present when only basal slices were examined. Automatic and manually adjusted segmentations performed similarly compared to the GS (bias ± 95% LoA): LVEDV -1±12 ml, LVESV -1±14 ml, RVEDV 6±12 ml, RVESV 6±14 ml, ED LV mass 6±26 g, ES LV mass 5±26 g. Also, generalizability showed very similar performance, with Dice scores of 0.944 (LV), 0.908 (RV), and 0.852 (Myo) on DB1, and 0.940 (LV), 0.880 (RV), and 0.856 (Myo) on DB2. CONCLUSIONS Our results support the potential of DL methods for accurate LV and RV contour segmentation and the advantages of dense skip connections in alleviating the semantic gap generated when high-level features are concatenated with lower-level features. The evaluation on our dataset, considering separately the performance on basal and apical slices, reveals the potential of DL approaches for fast, accurate, and reliable automated cardiac segmentation in a real clinical setting.
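For reference, the Dice similarity coefficient used to report the per-structure performance above can be computed from binary masks as follows (a standard definition, not the authors' code):

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice similarity coefficient between two binary masks (e.g. LV, RV, Myo)."""
    pred, gt = np.asarray(pred).astype(bool), np.asarray(gt).astype(bool)
    return (2.0 * np.logical_and(pred, gt).sum() + eps) / (pred.sum() + gt.sum() + eps)

print(dice(np.ones((4, 4)), np.ones((4, 4))))  # -> 1.0 for identical masks
```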
Affiliation(s)
- Marco Penso
- Department of Cardiovascular Imaging, Centro Cardiologico Monzino IRCCS, Milan, Italy
- Sara Moccia
- Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy; The BioRobotics Institute, Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
- Stefano Scafuri
- Department of Cardiovascular Imaging, Centro Cardiologico Monzino IRCCS, Milan, Italy
- Giuseppe Muscogiuri
- Department of Cardiovascular Imaging, Centro Cardiologico Monzino IRCCS, Milan, Italy
- Gianluca Pontone
- Department of Cardiovascular Imaging, Centro Cardiologico Monzino IRCCS, Milan, Italy
- Mauro Pepi
- Department of Cardiovascular Imaging, Centro Cardiologico Monzino IRCCS, Milan, Italy
- Enrico Gianluca Caiani
- Department of Electronics, Information and Biomedical Engineering, Politecnico di Milano, Milan, Italy; Consiglio Nazionale delle Ricerche, Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni, Milan, Italy
25
Robu M, Kadkhodamohammadi A, Luengo I, Stoyanov D. Towards real-time multiple surgical tool tracking. Comput Methods Biomech Biomed Eng Imaging Vis 2021. DOI: 10.1080/21681163.2020.1835553.
Affiliation(s)
- Maria Robu
- Digital Surgery, Medtronic Company, London, UK
- Danail Stoyanov
- Digital Surgery, Medtronic Company, London, UK
- University College London, London, UK
26
Lazo JF, Marzullo A, Moccia S, Catellani M, Rosa B, de Mathelin M, De Momi E. Using spatial-temporal ensembles of convolutional neural networks for lumen segmentation in ureteroscopy. Int J Comput Assist Radiol Surg 2021; 16:915-922. PMID: 33909264; PMCID: PMC8166718; DOI: 10.1007/s11548-021-02376-3.
Abstract
Purpose Ureteroscopy is an efficient endoscopic minimally invasive technique for the diagnosis and treatment of upper tract urothelial carcinoma. During ureteroscopy, the automatic segmentation of the hollow lumen is of primary importance, since it indicates the path that the endoscope should follow. In order to obtain an accurate segmentation of the hollow lumen, this paper presents an automatic method based on convolutional neural networks (CNNs). Methods The proposed method is based on an ensemble of 4 parallel CNNs to simultaneously process single- and multi-frame information. Of these, two architectures are taken as core models, namely a U-Net based on residual blocks (m1) and Mask-RCNN (m2), which are fed with single still frames I(t). The other two models (M1, M2) are modifications of the former ones, consisting of the addition of a stage that makes use of 3D convolutions to process temporal information. M1 and M2 are fed with triplets of frames (I(t-1), I(t), I(t+1)) to produce the segmentation for I(t). Results The proposed method was evaluated using a custom dataset of 11 videos (2673 frames) collected and manually annotated from 6 patients. We obtain a Dice similarity coefficient of 0.80, outperforming previous state-of-the-art methods. Conclusion The obtained results show that spatial-temporal information can be effectively exploited by the ensemble model to improve hollow lumen segmentation in ureteroscopic images. The method is also effective in the presence of poor visibility, occasional bleeding, or specular reflections. Supplementary Information The online version contains supplementary material available at 10.1007/s11548-021-02376-3.
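Fusing the four models' outputs can be sketched compactly. The snippet below is a hypothetical PyTorch illustration assuming simple probability averaging over the ensemble; the fusion rule, the function name ensemble_lumen_mask, and the tensor layout are assumptions, not details taken from the paper.

```python
import torch

def ensemble_lumen_mask(triplet, single_models, temporal_models, thresh=0.5):
    """Fuse per-model lumen probabilities into one mask for the central frame.

    triplet: tensor (3, C, H, W) holding I(t-1), I(t), I(t+1).
    single_models receive only I(t); temporal models receive the whole triplet.
    """
    cur = triplet[1].unsqueeze(0)                    # (1, C, H, W)
    seq = triplet.permute(1, 0, 2, 3).unsqueeze(0)   # (1, C, T=3, H, W) for 3D convs
    probs = [torch.sigmoid(m(cur)) for m in single_models]
    probs += [torch.sigmoid(m(seq)) for m in temporal_models]
    fused = torch.stack(probs).mean(dim=0)           # simple average over the ensemble
    return (fused > thresh).float()                  # binary lumen mask for I(t)
```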
Collapse
Affiliation(s)
- Jorge F Lazo
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy; ICube, UMR 7357, CNRS-Université de Strasbourg, Strasbourg, France.
| | - Aldo Marzullo
- Department of Mathematics and Computer Science, University of Calabria, Rende, CS, Italy
| | - Sara Moccia
- The BioRobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy; Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
| | | | - Benoit Rosa
- ICube, UMR 7357, CNRS-Université de Strasbourg, Strasbourg, France
| | | | - Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| |
Collapse
|
27
|
Marzullo A, Moccia S, Catellani M, Calimeri F, Momi ED. Towards realistic laparoscopic image generation using image-domain translation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021; 200:105834. [PMID: 33229016 DOI: 10.1016/j.cmpb.2020.105834] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/28/2020] [Accepted: 11/05/2020] [Indexed: 06/11/2023]
Abstract
Background and Objectives Over the last decade, Deep Learning (DL) has revolutionized data analysis in many areas, including medical imaging. However, there is a bottleneck in the advancement of DL in the surgery field, namely a shortage of large-scale data, which in turn may be attributed to the lack of a structured and standardized methodology for storing and analyzing surgical images in clinical centres. Furthermore, accurate manual annotations are expensive and time-consuming. The synthesis of artificial images can help here; in this context, in recent years, the use of Generative Adversarial Networks (GANs) has achieved promising results in obtaining photo-realistic images. Methods In this study, a method for Minimally Invasive Surgery (MIS) image synthesis is proposed. To this aim, the generative adversarial network pix2pix is trained to generate paired annotated MIS images by transforming rough segmentations of surgical instruments and tissues into realistic images. An additional regularization term was added to the original optimization problem, in order to enhance the realism of surgical tools with respect to the background. Results Quantitative and qualitative (i.e., human-based) evaluations of the generated images were carried out to assess the effectiveness of the method. Conclusions Experimental results show that the proposed method is able to translate MIS segmentations into realistic MIS images, which can in turn be used to augment existing datasets and help overcome the lack of useful images; this allows physicians and algorithms to benefit from new annotated instances for their training.
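The abstract does not spell out the extra regularization term; one plausible instantiation is a reconstruction penalty restricted to instrument pixels. The sketch below is a hedged PyTorch illustration under that assumption; the weights lambda_l1 and lambda_tool and the masked-L1 form are illustrative choices, not the authors' published loss.

```python
import torch
import torch.nn.functional as F

def generator_loss(disc_fake, fake_img, real_img, tool_mask,
                   lambda_l1=100.0, lambda_tool=50.0):
    """pix2pix-style generator objective with an extra term on tool pixels.

    tool_mask: binary map (1 on surgical-instrument pixels) taken from the
    rough input segmentation; the extra term pushes the generator to render
    instruments more faithfully than the background.
    """
    # standard adversarial term: fool the discriminator on generated images
    adv = F.binary_cross_entropy_with_logits(disc_fake, torch.ones_like(disc_fake))
    l1 = F.l1_loss(fake_img, real_img)               # global reconstruction
    tool_l1 = (tool_mask * (fake_img - real_img).abs()).sum() \
              / tool_mask.sum().clamp(min=1)          # instrument-only reconstruction
    return adv + lambda_l1 * l1 + lambda_tool * tool_l1
```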
Collapse
Affiliation(s)
- Aldo Marzullo
- Department of Mathematics and Computer Science, University of Calabria, Rende, Italy.
| | - Sara Moccia
- Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy; Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
| | - Michele Catellani
- Department of Urology, European Institute of Oncology, IRCCS, Milan, Italy
| | - Francesco Calimeri
- Department of Mathematics and Computer Science, University of Calabria, Rende, Italy
| | - Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| |
Collapse
|
28
|
Casella A, Moccia S, Paladini D, Frontoni E, De Momi E, Mattos LS. A shape-constraint adversarial framework with instance-normalized spatio-temporal features for inter-fetal membrane segmentation. Med Image Anal 2021; 70:102008. [PMID: 33647785 DOI: 10.1016/j.media.2021.102008] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2020] [Revised: 12/17/2020] [Accepted: 02/16/2021] [Indexed: 12/01/2022]
Abstract
BACKGROUND AND OBJECTIVES During Twin-to-Twin Transfusion Syndrome (TTTS), abnormal vascular anastomoses in the monochorionic placenta can produce uneven blood flow between the fetuses. In current practice, this syndrome is surgically treated by closing the abnormal connections using laser ablation. Surgeons commonly use the inter-fetal membrane as a reference. Limited field of view, low fetoscopic image quality and high inter-subject variability make membrane identification a challenging task, and currently available tools are not optimal for automatic membrane segmentation in fetoscopic videos, due to membrane texture homogeneity and high illumination variability. METHODS To tackle these challenges, we present a new deep-learning framework for inter-fetal membrane segmentation on in-vivo fetoscopic videos. The framework enhances existing architectures by (i) encoding a novel (instance-normalized) dense block, invariant to illumination changes, that extracts spatio-temporal features to enforce pixel connectivity in time, and (ii) relying on adversarial training, which constrains macro appearance. RESULTS We performed a comprehensive validation using 20 different videos (2000 frames) from 20 different surgeries, achieving a mean Dice Similarity Coefficient of 0.8780±0.1383. CONCLUSIONS The proposed framework has great potential to positively impact the actual surgical practice for TTTS treatment, allowing the implementation of surgical guidance systems that can enhance context awareness and potentially reduce surgery duration.
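An instance-normalized dense block over space and time can be sketched as follows. This is a minimal PyTorch illustration of the stated ingredients (dense connectivity, 3D convolutions, instance normalization), with the class name, growth rate, and activation chosen by us rather than taken from the paper.

```python
import torch
import torch.nn as nn

class INDenseBlock3D(nn.Module):
    """Dense block over (T, H, W) with instance normalization.

    3D convolutions link neighbouring frames (pixel connectivity in time);
    InstanceNorm3d normalizes each clip on its own, one way to gain
    robustness to the strong illumination changes seen in fetoscopy.
    """
    def __init__(self, in_ch, growth=12, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv3d(ch, growth, kernel_size=3, padding=1),
                nn.InstanceNorm3d(growth),
                nn.LeakyReLU(0.2, inplace=True)))
            ch += growth
        self.out_channels = ch

    def forward(self, x):                 # x: (B, C, T, H, W)
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)
```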
Collapse
Affiliation(s)
- Alessandro Casella
- Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy; Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy.
| | - Sara Moccia
- The BioRobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy; Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
| | - Dario Paladini
- Department of Fetal and Perinatal Medicine, Istituto "Giannina Gaslini", Genoa, Italy
| | - Emanuele Frontoni
- Department of Information Engineering, Universitá Politecnica delle Marche, Ancona, Italy
| | - Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| | - Leonard S Mattos
- Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
| |
Collapse
|
29
|
Shimizu T, Hachiuma R, Kajita H, Takatsume Y, Saito H. Hand Motion-Aware Surgical Tool Localization and Classification from an Egocentric Camera. J Imaging 2021; 7:15. [PMID: 34460614 PMCID: PMC8321273 DOI: 10.3390/jimaging7020015] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2020] [Revised: 01/15/2021] [Accepted: 01/18/2021] [Indexed: 11/16/2022] Open
Abstract
Detecting surgical tools is an essential task for the analysis and evaluation of surgical videos. However, in open surgery such as plastic surgery, detection is difficult because some surgical tools have similar shapes, such as scissors and needle holders. Unlike endoscopic surgery, the tips of the tools are often hidden in the operating field and are not captured clearly due to low camera resolution, whereas the movements of the tools and hands can be captured. Because the different uses of each tool require different hand movements, hand-movement data can be used to distinguish the two types of tools. We combined three modules (localization, selection, and classification) for the detection of the two tools. In the localization module, we employed Faster R-CNN to detect surgical tools and target hands, and in the classification module, we extracted hand-movement information by combining ResNet-18 and LSTM to classify the two tools. We created a dataset in which seven different types of open surgery were recorded, and we provide annotations for surgical tool detection. Our experiments show that our approach successfully detected the two different tools and outperformed the two baseline methods.
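The ResNet-18-plus-LSTM classification stage can be sketched as below. This is a hypothetical PyTorch rendering of that combination; the class name, hidden size, and the use of the final LSTM state for classification are our assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class HandMotionClassifier(nn.Module):
    """Classify a tool (e.g. scissors vs needle holder) from hand-motion clips.

    A ResNet-18 backbone embeds each cropped hand frame; an LSTM aggregates
    the sequence; a linear head outputs the tool class.
    """
    def __init__(self, n_classes=2, hidden=256):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()            # keep the 512-d pooled features
        self.backbone = backbone
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, clips):                  # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1)).view(b, t, 512)
        _, (h, _) = self.lstm(feats)
        return self.head(h[-1])                # logits from the last hidden state
```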
Collapse
Affiliation(s)
- Tomohiro Shimizu
- Faculty of Science and Technology, Keio University, Yokohama, Kanagawa 223-8852, Japan; (R.H.); (H.S.)
| | - Ryo Hachiuma
- Faculty of Science and Technology, Keio University, Yokohama, Kanagawa 223-8852, Japan; (R.H.); (H.S.)
| | - Hiroki Kajita
- Keio University School of Medicine, Shinjuku-ku 160-8582, Tokyo, Japan; (H.K.); (Y.T.)
| | - Yoshifumi Takatsume
- Keio University School of Medicine, Shinjuku-ku 160-8582, Tokyo, Japan; (H.K.); (Y.T.)
| | - Hideo Saito
- Faculty of Science and Technology, Keio University, Yokohama, Kanagawa 223-8852, Japan; (R.H.); (H.S.)
| |
Collapse
|
30
|
Ghatwary N, Zolgharni M, Janan F, Ye X. Learning Spatiotemporal Features for Esophageal Abnormality Detection From Endoscopic Videos. IEEE J Biomed Health Inform 2021; 25:131-142. [PMID: 32750901 DOI: 10.1109/jbhi.2020.2995193] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
Esophageal cancer is a disease with a high mortality rate. Early detection of esophageal abnormalities (i.e. precancerous and early cancerous lesions) can improve patient survival. Deep learning-based methods have recently been proposed for detecting selected types of esophageal abnormality from endoscopic images. However, no method in the literature covers detection from endoscopic videos, detection in challenging frames, or detection of more than one esophageal abnormality type. In this paper, we present an efficient method to automatically detect different types of esophageal abnormalities from endoscopic videos. We propose a novel 3D Sequential DenseConvLstm network that extracts spatiotemporal features from the input video. Our network incorporates a 3D Convolutional Neural Network (3DCNN) and Convolutional LSTM (ConvLstm) to efficiently learn short- and long-term spatiotemporal features. The generated feature map is utilized by a region proposal network and an ROI pooling layer to produce bounding boxes that detect abnormality regions in each frame throughout the video. Finally, we investigate a post-processing method named Frame Search Conditional Random Field (FS-CRF) that improves the overall performance of the model by recovering missing regions in neighborhood frames within the same clip. We extensively validate our model on an endoscopic video dataset that includes a variety of esophageal abnormalities, achieving 93.7% recall, 92.7% precision, and a 93.2% F-measure. Moreover, as no results have been reported in the literature for esophageal abnormality detection from endoscopic videos, to validate the robustness of our model we tested it on a publicly available colonoscopy video dataset, achieving polyp detection performance of 81.18% recall, 96.45% precision, and an 88.16% F-measure, compared to state-of-the-art results of 78.84% recall, 90.51% precision and an 84.27% F-measure on the same dataset. This demonstrates that the proposed method can be adapted to different gastrointestinal endoscopic video applications with promising performance.
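The ConvLSTM building block at the heart of such architectures replaces the matrix products of a standard LSTM with convolutions, so the hidden state stays a spatial feature map. Below is a standard textbook ConvLSTM cell in PyTorch, included as background rather than as the paper's exact module.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Single ConvLSTM cell: LSTM gating computed with convolutions,
    keeping spatial structure in the hidden and cell states."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        # one convolution produces all four gates at once
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g          # cell update
        h = o * c.tanh()           # hidden state stays (B, hid_ch, H, W)
        return h, c

# Unrolling over a clip of feature maps produced by a 3D CNN:
#   h = c = torch.zeros(b, hid_ch, H, W)
#   for t in range(T): h, c = cell(feats[:, t], (h, c))
```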
Collapse
|
31
|
Moccia S, De Momi E. AIM in Medical Robotics. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_64-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
32
|
Fiorentino MC, Moccia S, Capparuccini M, Giamberini S, Frontoni E. A regression framework to head-circumference delineation from US fetal images. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021; 198:105771. [PMID: 33049451 DOI: 10.1016/j.cmpb.2020.105771] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/10/2020] [Accepted: 09/20/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND AND OBJECTIVES Measuring head-circumference (HC) length from ultrasound (US) images is a crucial clinical task to assess fetal growth. To lower intra- and inter-operator variability in HC length measurement, several computer-assisted solutions have been proposed over the years. Recently, a large number of deep-learning approaches have addressed the problem of HC delineation through segmentation of the whole fetal head via convolutional neural networks (CNNs). Since the task is an edge-delineation problem, we propose a different strategy based on regression CNNs. METHODS The proposed framework consists of a region-proposal CNN for head localization and centering, and a regression CNN to accurately delineate the HC. The first CNN is trained exploiting transfer learning, while for the regression CNN we propose a training strategy based on distance fields. RESULTS The framework was tested on the HC18 Challenge dataset, which consists of 999 training and 335 testing images. A mean absolute difference of 1.90 (± 1.76) mm and a Dice similarity coefficient of 97.75 (± 1.32)% were achieved, outperforming approaches in the literature. CONCLUSIONS The experimental results showed the effectiveness of the proposed framework, proving its potential to support clinicians in clinical practice.
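One common way to build a distance-field regression target from an edge annotation is shown below. The exact field used in the paper is not specified here, so the clipping and rescaling are our assumptions; the sketch uses NumPy and SciPy.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def distance_field_target(contour_mask, cap=32.0):
    """Turn a 1-pixel head-circumference contour into a dense regression target.

    contour_mask: 2D array, nonzero on the annotated HC contour.
    Each pixel stores its Euclidean distance to the nearest contour point,
    clipped and rescaled to [0, 1]; unlike a binary edge map, the field gives
    the regression CNN a useful gradient everywhere in the image.
    """
    dist = distance_transform_edt(contour_mask == 0)  # 0 on the contour itself
    return 1.0 - np.clip(dist, 0.0, cap) / cap        # 1 on the edge, fades to 0
```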
Collapse
Affiliation(s)
- Maria Chiara Fiorentino
- Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche, 12, Ancona 60131, Italy
| | - Sara Moccia
- Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche, 12, Ancona 60131, Italy; Department of Advanced Robotics, Istituto Italiano di Tecnologia, Via Morego, 30, Genova 16163, Italy.
| | - Morris Capparuccini
- Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche, 12, Ancona 60131, Italy
| | - Sara Giamberini
- Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche, 12, Ancona 60131, Italy
| | - Emanuele Frontoni
- Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche, 12, Ancona 60131, Italy
| |
Collapse
|
33
|
Marzullo A, Moccia S, Calimeri F, De Momi E. AIM in Endoscopy Procedures. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_164-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/07/2022]
|
34
|
Fu Z, Jin Z, Zhang C, He Z, Zha Z, Hu C, Gan T, Yan Q, Wang P, Ye X. The Future of Endoscopic Navigation: A Review of Advanced Endoscopic Vision Technology. IEEE ACCESS 2021; 9:41144-41167. [DOI: 10.1109/access.2021.3065104] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/04/2025]
|
35
|
Chadebecq F, Vasconcelos F, Mazomenos E, Stoyanov D. Computer Vision in the Surgical Operating Room. Visc Med 2020; 36:456-462. [PMID: 33447601 DOI: 10.1159/000511934] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2020] [Accepted: 09/30/2020] [Indexed: 12/20/2022] Open
Abstract
Background Multiple types of surgical cameras are used in modern surgical practice and provide a rich visual signal that surgeons use to visualize the clinical site and make clinical decisions. This signal can also be used by artificial intelligence (AI) methods to provide support in identifying instruments, structures, or activities, both in real time during procedures and postoperatively for analytics and understanding of surgical processes. Summary In this paper, we provide a succinct perspective on the use of AI, and especially computer vision, to power solutions for the surgical operating room (OR). The synergy between data availability and technical advances in computational power and AI methodology has led to rapid developments in the field and promising advances. Key Messages With the increasing availability of surgical video sources and the convergence of technologies around video storage, processing, and understanding, we believe vision-based clinical solutions and products will become an important component of modern surgical capabilities. However, both technical and clinical challenges remain to be overcome before vision-based approaches can be brought efficiently into the clinic.
Collapse
Affiliation(s)
- François Chadebecq
- Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, United Kingdom
| | - Francisco Vasconcelos
- Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, United Kingdom
| | - Evangelos Mazomenos
- Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, United Kingdom
| | - Danail Stoyanov
- Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, United Kingdom
| |
Collapse
|
36
|
Attanasio A, Scaglioni B, Leonetti M, Frangi AF, Cross W, Biyani CS, Valdastri P. Autonomous Tissue Retraction in Robotic Assisted Minimally Invasive Surgery – A Feasibility Study. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3013914] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
37
|
Yang C, Zhao Z, Hu S. Image-based laparoscopic tool detection and tracking using convolutional neural networks: a review of the literature. Comput Assist Surg (Abingdon) 2020; 25:15-28. [PMID: 32886540 DOI: 10.1080/24699322.2020.1801842] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022] Open
Abstract
Intraoperative detection and tracking of minimally invasive instruments is a prerequisite for computer- and robot-assisted surgery. Since additional hardware, such as tracking systems or the robot encoders, is cumbersome and lacks accuracy, surgical vision is evolving as a promising technique to detect and track the instruments using only endoscopic images. This paper presents a review of the literature on image-based laparoscopic tool detection and tracking using convolutional neural networks (CNNs) and consists of four primary parts: (1) fundamentals of CNNs; (2) public datasets; (3) CNN-based methods for the detection and tracking of laparoscopic instruments; and (4) discussion and conclusion. To help researchers quickly understand the various existing CNN-based algorithms, basic information and quantitative performance estimates are analyzed and compared from the perspective of 'partial CNN approaches' and 'full CNN approaches'. Moreover, we highlight the challenges in research on CNN-based detection algorithms and outline possible future directions.
Collapse
Affiliation(s)
- Congmin Yang
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Zijian Zhao
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Sanyuan Hu
- Department of General Surgery, First Affiliated Hospital of Shandong First Medical University, Jinan, China
| |
Collapse
|
38
|
Zaffino P, Moccia S, De Momi E, Spadea MF. A Review on Advances in Intra-operative Imaging for Surgery and Therapy: Imagining the Operating Room of the Future. Ann Biomed Eng 2020; 48:2171-2191. [PMID: 32601951 DOI: 10.1007/s10439-020-02553-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2020] [Accepted: 06/17/2020] [Indexed: 12/19/2022]
Abstract
With the advent of Minimally Invasive Surgery (MIS), intra-operative imaging has become crucial for surgery and therapy guidance, partially compensating for the lack of information typical of MIS. This paper reviews the advancements in both classical (i.e. ultrasound, X-ray, optical coherence tomography and magnetic resonance imaging) and more recent (i.e. multispectral, photoacoustic and Raman imaging) intra-operative imaging modalities. Each imaging modality was analyzed, focusing on benefits and disadvantages in terms of compatibility with the operating room, costs, acquisition time and image characteristics. Tables are included to summarize this information. New-generation hybrid surgical rooms and algorithms for real-time, in-room image processing were also investigated. Each imaging modality has its own (site- and procedure-specific) peculiarities in terms of spatial and temporal resolution, field of view and contrasted tissues. Besides the benefits that each technique offers for guidance, operator and patient risk, costs, and the extra time required for surgical procedures have to be weighed. The current trend is to equip surgical rooms with multimodal imaging systems, so as to integrate multiple information sources for real-time data extraction and computer-assisted processing. The future of surgery is to enhance the surgeon's eye, to minimize intra- and post-surgery adverse events and to provide surgeons with all possible support to objectify and optimize the care-delivery process.
Collapse
Affiliation(s)
- Paolo Zaffino
- Department of Experimental and Clinical Medicine, Universitá della Magna Graecia, Catanzaro, Italy
| | - Sara Moccia
- Department of Information Engineering (DII), Universitá Politecnica delle Marche, via Brecce Bianche, 12, 60131, Ancona, AN, Italy.
| | - Elena De Momi
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, Piazza Leonardo da Vinci, 32, 20133, Milano, MI, Italy
| | - Maria Francesca Spadea
- Department of Experimental and Clinical Medicine, Universitá della Magna Graecia, Catanzaro, Italy
| |
Collapse
|
39
|
Fall detection for elderly-people monitoring using learned features and recurrent neural networks. EXPERIMENTAL RESULTS 2020. [DOI: 10.1017/exp.2020.3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022] Open
Abstract
Elderly care is becoming a relevant issue as the population ages. Fall injuries, with their impact on social and healthcare costs, represent one of the biggest concerns. Researchers are focusing their attention on several fall-detection algorithms. In this paper, we present a deep-learning solution for automatic fall detection from RGB videos. The proposed approach achieved a mean recall of 0.916, prompting the possibility of translating this approach into actual monitoring practice. Moreover, to enable the scientific community to carry out further research on the topic, the dataset used for our experiments will be released. This could enhance elderly people's safety and quality of life, attenuating risks during activities of daily living, with reduced healthcare costs as a final result.
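The abstract does not detail the architecture, but a common pattern for "learned features plus recurrent network" pipelines is a per-frame CNN embedding followed by a recurrent binary classifier. The sketch below shows that generic pattern in PyTorch; the class name, the GRU choice, and all sizes are hypothetical.

```python
import torch
import torch.nn as nn

class FallDetector(nn.Module):
    """Binary fall / no-fall classifier over per-frame learned features."""
    def __init__(self, feat_dim=512, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, feats):                 # feats: (B, T, feat_dim) from any CNN backbone
        _, h = self.rnn(feats)
        return self.head(h[-1]).squeeze(-1)   # logit; sigmoid > 0.5 -> fall
```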
Collapse
|
40
|
Reference Method for the Development of Domain Action Recognition Classifiers: The Case of Medical Consultations. ENTERPRISE, BUSINESS-PROCESS AND INFORMATION SYSTEMS MODELING 2020. [PMCID: PMC7254546 DOI: 10.1007/978-3-030-49418-6_26] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 10/29/2022]
|
41
|
Moccia S, Romeo L, Migliorelli L, Frontoni E, Zingaretti P. Supervised CNN Strategies for Optical Image Segmentation and Classification in Interventional Medicine. INTELLIGENT SYSTEMS REFERENCE LIBRARY 2020. [DOI: 10.1007/978-3-030-42750-4_8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
|
42
|
Moccia S, Migliorelli L, Carnielli V, Frontoni E. Preterm Infants' Pose Estimation With Spatio-Temporal Features. IEEE Trans Biomed Eng 2019; 67:2370-2380. [PMID: 31870974 DOI: 10.1109/tbme.2019.2961448] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
OBJECTIVE Preterm infants' limb monitoring in neonatal intensive care units (NICUs) is of primary importance for assessing infants' health status and motor/cognitive development. Herein, we propose a new approach to preterm infants' limb-pose estimation that uses spatio-temporal information to detect and track limb joints from depth videos with high reliability. METHODS Limb-pose estimation is performed using a deep-learning framework consisting of a detection and a regression convolutional neural network (CNN) for rough and precise joint localization, respectively. The CNNs are implemented to encode connectivity in the temporal direction through 3D convolution. Assessment of the proposed framework is performed through a comprehensive study with sixteen depth videos acquired in actual clinical practice from sixteen preterm infants (the babyPose dataset). RESULTS When applied to pose estimation, the median root mean square distance, computed among all limbs, between the estimated and the ground-truth pose was 9.06 pixels, outperforming approaches based on spatial features only (11.27 pixels). CONCLUSION Results showed that the spatio-temporal features had a significant influence on the pose-estimation performance, especially in challenging cases (e.g., homogeneous image intensity). SIGNIFICANCE This article significantly advances the state of the art in automatic assessment of preterm infants' health status by introducing the use of spatio-temporal features for limb detection and tracking, and by being the first study to use depth videos acquired in actual clinical practice for limb-pose estimation. The babyPose dataset has been released as the first annotated dataset for infants' pose estimation.
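The root mean square distance reported above is straightforward to compute per limb; the median is then taken across limbs and frames. A minimal NumPy sketch, with the function name and array layout chosen by us:

```python
import numpy as np

def rmsd_per_limb(pred, gt):
    """Root mean square distance (pixels) between predicted and ground-truth
    joints of one limb; pred, gt: arrays (n_joints, 2) of (x, y) coordinates."""
    return float(np.sqrt(np.mean(np.sum((pred - gt) ** 2, axis=1))))
```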
Collapse
|
43
|
Inter-foetus Membrane Segmentation for TTTS Using Adversarial Networks. Ann Biomed Eng 2019; 48:848-859. [DOI: 10.1007/s10439-019-02424-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2019] [Accepted: 11/23/2019] [Indexed: 12/18/2022]
|
44
|
Weng CH, Wang CL, Huang YJ, Yeh YC, Fu CJ, Yeh CY, Tsai TT. Artificial Intelligence for Automatic Measurement of Sagittal Vertical Axis Using ResUNet Framework. J Clin Med 2019; 8:jcm8111826. [PMID: 31683913 PMCID: PMC6912675 DOI: 10.3390/jcm8111826] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2019] [Revised: 10/24/2019] [Accepted: 10/25/2019] [Indexed: 12/27/2022] Open
Abstract
We present an automated method for measuring the sagittal vertical axis (SVA) from lateral radiographs of the whole spine, using a convolutional neural network for keypoint detection (ResUNet) with our improved localization method. The algorithm is robust to various clinical conditions, such as degenerative changes or deformities. The ResUNet was trained and evaluated on 990 standing lateral radiographs taken at Chang Gung Memorial Hospital, Linkou, and performs SVA measurement with a median absolute error of 1.183 ± 0.166 mm. The 5-mm detection rates of the C7 body and the sacrum are 91% and 87%, respectively. The SVA calculation takes approximately 0.2 s per image. The intra-class correlation coefficient of the SVA estimates between the algorithm and physicians with different years of experience ranges from 0.946 to 0.993, indicating excellent consistency. The superior performance of the proposed method and its high consistency with physicians prove its usefulness for automatic measurement of SVA in clinical settings.
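Once the two keypoints are detected, the SVA itself reduces to a horizontal offset. A minimal sketch under the standard definition (C7 plumb line relative to the posterosuperior sacral corner); the sign convention depends on image orientation and is an assumption here:

```python
import numpy as np

def sagittal_vertical_axis(c7_xy, sacrum_xy, mm_per_pixel):
    """SVA (mm): horizontal offset between the C7 plumb line and the
    posterosuperior corner of the sacrum on a lateral standing radiograph.

    c7_xy, sacrum_xy: (x, y) keypoints in pixels. Positive SVA means the C7
    plumb line falls anterior to the sacral reference point, assuming
    anterior is toward increasing x; flip the sign otherwise.
    """
    return (c7_xy[0] - sacrum_xy[0]) * mm_per_pixel
```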
Collapse
Affiliation(s)
| | - Chih-Li Wang
- Department of Orthopaedic Surgery, Spine Division, Bone and Joint Research Center, Chang Gung Memorial Hospital and Chang Gung University College of Medicine, Taoyuan 333, Taiwan.
| | - Yu-Jui Huang
- Department of Orthopaedic Surgery, Spine Division, Bone and Joint Research Center, Chang Gung Memorial Hospital and Chang Gung University College of Medicine, Taoyuan 333, Taiwan.
| | - Yu-Cheng Yeh
- Department of Orthopaedic Surgery, Spine Division, Bone and Joint Research Center, Chang Gung Memorial Hospital and Chang Gung University College of Medicine, Taoyuan 333, Taiwan.
| | - Chen-Ju Fu
- Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital and Chang Gung University College of Medicine, Taoyuan 333, Taiwan.
| | | | - Tsung-Ting Tsai
- Department of Orthopaedic Surgery, Spine Division, Bone and Joint Research Center, Chang Gung Memorial Hospital and Chang Gung University College of Medicine, Taoyuan 333, Taiwan.
| |
Collapse
|