1
Ali S, Espinel Y, Jin Y, Liu P, Güttner B, Zhang X, Zhang L, Dowrick T, Clarkson MJ, Xiao S, Wu Y, Yang Y, Zhu L, Sun D, Li L, Pfeiffer M, Farid S, Maier-Hein L, Buc E, Bartoli A. An objective comparison of methods for augmented reality in laparoscopic liver resection by preoperative-to-intraoperative image fusion from the MICCAI2022 challenge. Med Image Anal 2025; 99:103371. [PMID: 39488186 DOI: 10.1016/j.media.2024.103371]
Abstract
Augmented reality for laparoscopic liver resection is a visualisation mode that allows a surgeon to localise tumours and vessels embedded within the liver by projecting them on top of a laparoscopic image. Preoperative 3D models extracted from Computed Tomography (CT) or Magnetic Resonance (MR) imaging data are registered to the intraoperative laparoscopic images during this process. Regarding 3D-2D fusion, most algorithms use anatomical landmarks to guide registration, such as the liver's inferior ridge, the falciform ligament, and the occluding contours. These are usually marked by hand in both the laparoscopic image and the 3D model, which is time-consuming and prone to error. There is therefore a need to automate this process so that augmented reality can be used effectively in the operating room. We present the Preoperative-to-Intraoperative Laparoscopic Fusion challenge (P2ILF), held during the Medical Image Computing and Computer Assisted Intervention (MICCAI 2022) conference, which investigated the possibilities of detecting these landmarks automatically and using them in registration. The challenge was divided into two tasks: (1) a 2D and 3D landmark segmentation task and (2) a 3D-2D registration task. The teams were provided with training data consisting of 167 laparoscopic images and 9 preoperative 3D models from 9 patients, with the corresponding 2D and 3D landmark annotations. A total of 6 teams from 4 countries participated in the challenge, and their results were assessed for each task independently. All the teams proposed deep learning-based methods for the 2D and 3D landmark segmentation tasks and differentiable rendering-based methods for the registration task. The proposed methods were evaluated on 16 test images and 2 preoperative 3D models from 2 patients. In Task 1, the teams were able to segment most of the 2D landmarks, while the 3D landmarks proved more challenging to segment. In Task 2, only one team obtained acceptable qualitative and quantitative registration results. Based on the experimental outcomes, we propose three key hypotheses that determine current limitations and future directions for research in this domain.
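Registration quality in a 3D-2D fusion task like P2ILF's Task 2 is typically judged by how closely landmarks from the projected 3D model land on their annotated 2D counterparts. As an illustrative sketch only (not the challenge's official evaluation code), the following projects 3D landmarks under a candidate rigid pose with a pinhole camera and computes a mean 2D reprojection error; the function names and the no-distortion camera model are assumptions:

```python
import numpy as np

def project_points(X, R, t, K):
    """Project 3D model points X (N,3) into the image using a rigid
    pose (R, t) and camera intrinsics K (pinhole model, no distortion)."""
    Xc = X @ R.T + t             # world -> camera coordinates
    x = Xc @ K.T                 # apply intrinsics
    return x[:, :2] / x[:, 2:3]  # perspective divide -> pixel coordinates

def reprojection_error(X, x_obs, R, t, K):
    """Mean Euclidean distance (pixels) between projected and observed
    2D landmarks."""
    x_proj = project_points(X, R, t, K)
    return float(np.mean(np.linalg.norm(x_proj - x_obs, axis=1)))
```

A registration method would search for the pose (R, t) minimising this error, possibly with additional terms for the ridge and ligament contours.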
Affiliation(s)
- Sharib Ali
- School of Computer Science, Faculty of Engineering and Physical Sciences, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Yamid Espinel
- Centre Hospitalier Universitaire de Clermont-Ferrand, Clermont-Ferrand, France
- Yueming Jin
- Department of Electrical and Computer Engineering, National University of Singapore (NUS), 119276, Singapore
- Peng Liu
- Department of Translational Surgical Oncology, National Center for Tumor Diseases (NCT/UCC Dresden), Fetscherstraße 74, 01307 Dresden, Germany
- Bianca Güttner
- Department of Translational Surgical Oncology, National Center for Tumor Diseases (NCT/UCC Dresden), Fetscherstraße 74, 01307 Dresden, Germany
- Xukun Zhang
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Lihua Zhang
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Tom Dowrick
- Wellcome EPSRC Centre for Interventional and Surgical Sciences, UCL, London, United Kingdom
- Matthew J Clarkson
- Wellcome EPSRC Centre for Interventional and Surgical Sciences, UCL, London, United Kingdom
- Shiting Xiao
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yifan Wu
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yijun Yang
- Hong Kong University of Science and Technology (Guangzhou), 511455, Guangzhou, China
- Lei Zhu
- Hong Kong University of Science and Technology (Guangzhou), 511455, Guangzhou, China
- Dai Sun
- Suzhou Institute for Advanced Research, Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), University of Science and Technology of China, Suzhou, China
- Lan Li
- Suzhou Institute for Advanced Research, Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), University of Science and Technology of China, Suzhou, China
- Micha Pfeiffer
- Department of Translational Surgical Oncology, National Center for Tumor Diseases (NCT/UCC Dresden), Fetscherstraße 74, 01307 Dresden, Germany
- Shahid Farid
- Department of HPB and Transplant Surgery, St. James's University Hospital, Leeds, United Kingdom
- Emmanuel Buc
- Centre Hospitalier Universitaire de Clermont-Ferrand, Clermont-Ferrand, France; Endoscopy and Computer Vision Group, Université Clermont Auvergne, Clermont-Ferrand, France
- Adrien Bartoli
- Centre Hospitalier Universitaire de Clermont-Ferrand, Clermont-Ferrand, France; Endoscopy and Computer Vision Group, Université Clermont Auvergne, Clermont-Ferrand, France
2
Göbel B, Reiterer A, Möller K. Image-Based 3D Reconstruction in Laparoscopy: A Review Focusing on the Quantitative Evaluation by Applying the Reconstruction Error. J Imaging 2024; 10:180. [PMID: 39194969 DOI: 10.3390/jimaging10080180]
Abstract
Image-based 3D reconstruction enables laparoscopic applications such as image-guided navigation and (autonomous) robot-assisted interventions, which require high accuracy. The purpose of this review is to present the accuracy of different techniques and to identify the most promising ones. A systematic literature search of PubMed and Google Scholar covering 2015 to 2023 was conducted, following the framework of "Review articles: purpose, process, and structure". Articles were considered when they presented a quantitative evaluation (root mean squared error and mean absolute error) of the reconstruction error (Euclidean distance between the real and reconstructed surface). The search yielded 995 articles, which were reduced to 48 after applying the exclusion criteria. From these, a reconstruction-error dataset could be generated for the techniques of stereo vision, Shape-from-Motion, Simultaneous Localization and Mapping, deep learning, and structured light. The reconstruction error varies from below one millimeter to more than ten millimeters, with deep learning and Simultaneous Localization and Mapping delivering the best results under intraoperative conditions. The high variance stems from differing experimental conditions. In conclusion, submillimeter accuracy is challenging, but promising image-based 3D reconstruction techniques could be identified. For future research, we recommend computing the reconstruction error for comparison purposes and using ex vivo/in vivo organs as reference objects for realistic experiments.
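The review's central metric, the reconstruction error, is the Euclidean distance between the real and the reconstructed surface, summarized as RMSE and MAE. A minimal sketch of that computation over point clouds, assuming nearest-neighbour correspondence (the function name and brute-force matching are our own simplifications):

```python
import numpy as np

def reconstruction_error(reconstructed, reference):
    """For each reconstructed point (N,3), take the Euclidean distance to
    its nearest neighbour on the reference surface (M,3), then report RMSE
    and MAE. Brute-force O(N*M); use a KD-tree for large clouds."""
    d = np.linalg.norm(reconstructed[:, None, :] - reference[None, :, :], axis=2)
    err = d.min(axis=1)  # per-point reconstruction error
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    return rmse, mae
```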
Affiliation(s)
- Birthe Göbel
- Department of Sustainable Systems Engineering-INATECH, University of Freiburg, Emmy-Noether-Street 2, 79110 Freiburg im Breisgau, Germany
- KARL STORZ SE & Co. KG, Dr.-Karl-Storz-Street 34, 78532 Tuttlingen, Germany
- Alexander Reiterer
- Department of Sustainable Systems Engineering-INATECH, University of Freiburg, Emmy-Noether-Street 2, 79110 Freiburg im Breisgau, Germany
- Fraunhofer Institute for Physical Measurement Techniques IPM, 79110 Freiburg im Breisgau, Germany
- Knut Möller
- Institute of Technical Medicine-ITeM, Furtwangen University (HFU), 78054 Villingen-Schwenningen, Germany
- Mechanical Engineering, University of Canterbury, Christchurch 8140, New Zealand
3
Espinel Y, Lombion N, Compagnone L, Saroul N, Bartoli A. Preliminary trials of trackerless augmented reality in endoscopic endonasal surgery. Int J Comput Assist Radiol Surg 2024; 19:1385-1389. [PMID: 38775903 DOI: 10.1007/s11548-024-03155-6]
Abstract
PURPOSE We present a novel method for augmented reality in endoscopic endonasal surgery. Our method does not require external tracking devices and can show hidden anatomical structures relevant to the surgical intervention. METHODS Our method registers a preoperative 3D model of the nasal cavity to an intraoperative 3D model by estimating a scaled-rigid transformation. Registration is based on a two-stage ICP approach on the reconstructed nasal cavity. The hidden structures are then transferred from the preoperative 3D model to the intraoperative one using the estimated transformation, and projected and overlaid onto the endoscopic images to obtain the augmented reality. RESULTS We performed qualitative and quantitative validation of our method on 12 clinical cases. Qualitative results were obtained by an ENT surgeon through visual inspection of the hidden structures in the augmented images. Quantitative results were obtained by measuring the target registration error using a novel transillumination-based approach. The results show that the hidden structures of interest are augmented at the expected locations in most cases. CONCLUSION Our method was able to augment the endoscopic images in a sufficiently precise manner when the intraoperative nasal cavity did not deform considerably with respect to its preoperative state. This is a promising step towards trackerless augmented reality in endonasal surgery.
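The scaled-rigid transformation at the core of a two-stage ICP scheme like the one described can be estimated in closed form once correspondences are fixed. A minimal sketch using Umeyama's method (a standard choice; the paper does not specify its exact solver, so this is illustrative):

```python
import numpy as np

def fit_scaled_rigid(P, Q):
    """Closed-form least-squares estimate of scale s, rotation R and
    translation t such that Q ≈ s * R @ P + t, for corresponded (N,3)
    point sets (Umeyama's method). An ICP loop would alternate this step
    with nearest-neighbour correspondence search."""
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - mu_p, Q - mu_q
    U, S, Vt = np.linalg.svd(Qc.T @ Pc)
    D = np.eye(3)
    D[2, 2] = np.sign(np.linalg.det(U @ Vt))  # guard against reflections
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / (Pc ** 2).sum()
    t = mu_q - s * R @ mu_p
    return s, R, t
```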
Affiliation(s)
- Yamid Espinel
- DRCI, DIA2M, CHU de Clermont-Ferrand, 28 Place Henri Dunant, 63000, Clermont-Ferrand, France
- Nalick Lombion
- DRCI, DIA2M, CHU de Clermont-Ferrand, 28 Place Henri Dunant, 63000, Clermont-Ferrand, France
- Luce Compagnone
- DRCI, DIA2M, CHU de Clermont-Ferrand, 28 Place Henri Dunant, 63000, Clermont-Ferrand, France
- Nicolas Saroul
- DRCI, DIA2M, CHU de Clermont-Ferrand, 28 Place Henri Dunant, 63000, Clermont-Ferrand, France
- Adrien Bartoli
- DRCI, DIA2M, CHU de Clermont-Ferrand, 28 Place Henri Dunant, 63000, Clermont-Ferrand, France
4
Schmidt A, Mohareri O, DiMaio S, Yip MC, Salcudean SE. Tracking and mapping in medical computer vision: A review. Med Image Anal 2024; 94:103131. [PMID: 38442528 DOI: 10.1016/j.media.2024.103131]
Abstract
As computer vision algorithms increase in capability, their applications in clinical systems will become more pervasive. These applications include: diagnostics, such as colonoscopy and bronchoscopy; guiding biopsies, minimally invasive interventions, and surgery; automating instrument motion; and providing image guidance using pre-operative scans. Many of these applications depend on the specific visual nature of medical scenes and require designing algorithms to perform in this environment. In this review, we provide an update on the field of camera-based tracking and scene mapping in surgery and diagnostics in medical computer vision. We begin by describing our review process, which resulted in a final list of 515 papers that we cover. We then give a high-level summary of the state of the art and provide relevant background for those who need tracking and mapping for their clinical applications. Next, we review datasets provided in the field and the clinical needs that motivate their design. We then delve into the algorithmic side and summarize recent developments. This summary should be especially useful for algorithm designers and for those looking to understand the capability of off-the-shelf methods. We maintain focus on algorithms for deformable environments while also reviewing the essential building blocks of rigid tracking and mapping, since there is a large amount of crossover in methods. With the field summarized, we discuss the current state of tracking and mapping methods, along with needs for future algorithms, needs for quantification, and the viability of clinical applications. We then provide some research directions and questions. We conclude that new methods need to be designed or combined to support clinical applications in deformable environments, and that more focus needs to be put into collecting datasets for training and evaluation.
Affiliation(s)
- Adam Schmidt
- Department of Electrical and Computer Engineering, University of British Columbia, 2329 West Mall, Vancouver V6T 1Z4, BC, Canada
- Omid Mohareri
- Advanced Research, Intuitive Surgical, 1020 Kifer Rd, Sunnyvale, CA 94086, USA
- Simon DiMaio
- Advanced Research, Intuitive Surgical, 1020 Kifer Rd, Sunnyvale, CA 94086, USA
- Michael C Yip
- Department of Electrical and Computer Engineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
- Septimiu E Salcudean
- Department of Electrical and Computer Engineering, University of British Columbia, 2329 West Mall, Vancouver V6T 1Z4, BC, Canada
5
Nazir A, Wang Z. A Comprehensive Survey of ChatGPT: Advancements, Applications, Prospects, and Challenges. Meta-Radiology 2023; 1:100022. [PMID: 37901715 PMCID: PMC10611551 DOI: 10.1016/j.metrad.2023.100022]
Abstract
Large Language Models (LLMs), especially when combined with Generative Pre-trained Transformers (GPT), represent a groundbreaking advance in natural language processing. In particular, ChatGPT, a state-of-the-art conversational language model with a user-friendly interface, has garnered substantial attention owing to its remarkable capability for generating human-like responses across a variety of conversational scenarios. This survey offers an overview of ChatGPT, delving into its inception, evolution, and key technology. We summarize the fundamental principles that underpin ChatGPT, encompassing its introduction in conjunction with GPT and LLMs. We also highlight the specific characteristics of GPT models, with details of their impressive language understanding and generation capabilities. We then summarize applications of ChatGPT in a few representative domains. Alongside the many advantages that ChatGPT can provide, we discuss its limitations and challenges together with potential mitigation strategies. Despite various controversial arguments and ethical concerns, ChatGPT has drawn significant attention from both industry and academia in a very short period. The survey concludes by envisioning promising avenues for future research in the field of ChatGPT. It is worth noting that understanding and addressing the challenges faced by ChatGPT will pave the way for more reliable and trustworthy conversational agents in the years to come.
Affiliation(s)
- Anam Nazir
- Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 670 W Baltimore St, HSF III, Room 1173, Baltimore, MD 21201, USA
- Ze Wang
- Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 670 W Baltimore St, HSF III, Room 1173, Baltimore, MD 21201, USA
6
Hu W, Jiang H, Wang M. Flexible needle puncture path planning for liver tumors based on deep reinforcement learning. Phys Med Biol 2022; 67. [DOI: 10.1088/1361-6560/ac8fdd]
Abstract
Objective. Minimally invasive surgery has been widely adopted in the treatment of patients with liver tumors. In liver tumor puncture surgery, an image-guided ablation needle first reaches a target tumor along a predetermined path and then ablates the tumor or injects drugs near it; this approach is often used to reduce patient trauma, improve the safety of surgical operations, and avoid possible damage to large blood vessels and key organs. In this paper, a path planning method for a computed tomography (CT) guided ablation needle in liver tumor puncture surgery is proposed. Approach. Given a CT volume containing abdominal organs, we first classify voxels and optimize the number of voxels to reduce volume rendering pressure, then reconstruct a multi-scale 3D model of the liver and hepatic vessels. Secondly, multiple entry points of the surgical path are selected, based on the strong and weak constraints of clinical puncture surgery, through multi-agent reinforcement learning; the optimal needle entry point is then chosen based on the length measurement. Then, through incremental training of a double deep Q-learning network (DDQN), the network parameters are transferred from the small-scale environment to the larger-scale environment, and the optimal surgical path with more optimized details is obtained. Main results. To avoid falling into local optima during network training, improve both the convergence speed and performance of the network, and maximize the cumulative reward, we train the path planning network on 3D reconstructed organ models at different scales and validate our method on tumor samples from public datasets. Scores from human surgeons verified the clinical relevance of the proposed method. Significance. Our method can robustly provide the optimal flexible-needle puncture path for liver tumors, which is expected to provide a reference for surgeons' preoperative planning.
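The double deep Q-learning update mentioned above decouples action selection from action evaluation to reduce value overestimation: the online network picks the next-state action, and the target network evaluates it. A minimal sketch of the target computation (the function name and batch layout are assumptions, not the paper's code):

```python
import numpy as np

def ddqn_targets(rewards, q_next_online, q_next_target, dones, gamma=0.99):
    """Double DQN learning target: the online network selects the argmax
    action at the next state, the target network evaluates it.
    Shapes: rewards (B,), q_next_* (B, A), dones (B,) boolean."""
    best = q_next_online.argmax(axis=1)                 # action selection (online net)
    q_eval = q_next_target[np.arange(len(best)), best]  # action evaluation (target net)
    return rewards + gamma * q_eval * (~dones)          # no bootstrap at terminal states
```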
7
Nazir A, Cheema MN, Sheng B, Li P, Kim J, Lee TY. Living Donor-Recipient Pair Matching for Liver Transplant via Ternary Tree Representation With Cascade Incremental Learning. IEEE Trans Biomed Eng 2021; 68:2540-2551. [PMID: 33417536 DOI: 10.1109/tbme.2021.3050310]
Abstract
Visual understanding of the liver vessel anatomy of a living donor-recipient (LDR) pair can assist surgeons in optimizing transplant planning by avoiding non-targeted arteries, which can cause severe complications. We propose to visually analyze the anatomical variants of the liver vessel anatomy to maximize similarity for finding a suitable LDR pair. Liver vessels are segmented from computed tomography angiography (CTA) volumes by employing a cascade incremental learning (CIL) model. Our CIL architecture is able to find optimal solutions, which we use to update the model with liver vessel CTA images. A novel ternary tree based algorithm is proposed to map all the possible liver vessel variants into their respective tree topologies. The tree topologies of the recipient's and donor's liver vessels are then used for an appropriate matching. The proposed algorithm utilizes a set of defined vessel tree variants, which are updated to maintain the maximum matching options by leveraging the accurate segmentation results of the vessels derived from the incremental learning ability of the CIL. We introduce a novel concept of in-order digital string based comparison to match the geometry of two anatomically varied trees. Experiments through visual illustrations and quantitative analysis demonstrated the effectiveness of our approach compared to state-of-the-art methods.
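The in-order string based comparison can be illustrated on toy ternary trees: each tree is serialised by an in-order traversal with a marker for missing branches, and two trees match when their strings are equal. The traversal order (left subtree, node label, middle, right) and node encoding here are assumptions for illustration, not the paper's exact scheme:

```python
def in_order_string(node):
    """Serialise a ternary tree into an in-order string. A node is a tuple
    (label, left, middle, right); missing children are None and serialise
    to '.', so equal strings imply matching topology and labels."""
    if node is None:
        return "."
    label, left, middle, right = node
    return (in_order_string(left) + label +
            in_order_string(middle) + in_order_string(right))

def trees_match(a, b):
    """Two vessel trees match when their serialisations are identical."""
    return in_order_string(a) == in_order_string(b)
```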
8
Sui C, Wu J, Wang Z, Ma G, Liu YH. A Real-Time 3D Laparoscopic Imaging System: Design, Method, and Validation. IEEE Trans Biomed Eng 2020; 67:2683-2695. [PMID: 31985404 DOI: 10.1109/tbme.2020.2968488]
Abstract
OBJECTIVE This paper proposes a 3D laparoscopic imaging system that can realize dense 3D reconstruction in real time. METHODS Based on the active stereo technique, which yields high-density, accurate, and robust 3D reconstruction by combining structured light and stereo vision, we design a laparoscopic system consisting of two image feedback channels and one pattern projection channel. Remote high-speed image acquisition and pattern generation lay the foundation for real-time dense 3D surface reconstruction and enable the miniaturization of the laparoscopic probe. To enhance reconstruction efficiency and accuracy, we propose a novel active stereo method by which the dense 3D point cloud is obtained using only five patterns, while most existing multiple-shot structured light techniques require [Formula: see text] patterns. In our method, dual-frequency phase-shifting fringes are utilized to uniquely encode the pixels of the measured targets, and a dual-codeword matching scheme is developed to simplify the matching procedure and achieve high-precision reconstruction. RESULTS Compared with existing structured light techniques, the proposed method shows better real-time efficiency and accuracy in both quantitative and qualitative evaluations. Ex vivo experiments demonstrate the robustness of the proposed method to different biological organs and its effectiveness for lesions and deformations of the organs. The feasibility of the proposed system for real-time dense 3D reconstruction is verified in dynamic experiments. According to the experimental results, the system acquires 3D point clouds at 12 frames per second. Each frame contains more than 40,000 points, and the average errors tested on standard objects are less than 0.2 mm. SIGNIFICANCE This paper provides a new real-time dense 3D reconstruction method for 3D laparoscopic imaging. The established prototype system has shown good performance in reconstructing the surface of biological tissues.
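The dual-frequency phase-shifting fringes mentioned above build on the standard N-step phase-shifting formula, which recovers the wrapped phase at each pixel from N sinusoidal patterns. A minimal single-frequency sketch (the paper's five-pattern dual-frequency encoding and dual-codeword matching are not reproduced here; the function name is ours):

```python
import numpy as np

def wrapped_phase(images):
    """Recover the wrapped phase from N >= 3 phase-shifted fringe images
    I_n = A + B*cos(phi + 2*pi*n/N), n = 0..N-1, using the standard
    N-step phase-shifting formula. images: array of shape (N, H, W)."""
    N = images.shape[0]
    shifts = 2 * np.pi * np.arange(N) / N
    num = np.tensordot(np.sin(shifts), images, axes=1)  # sum_n I_n sin(d_n)
    den = np.tensordot(np.cos(shifts), images, axes=1)  # sum_n I_n cos(d_n)
    return -np.arctan2(num, den)  # wrapped phase per pixel, in (-pi, pi]
```

In a full pipeline the wrapped phase would be disambiguated (e.g. with the second fringe frequency) before triangulating against the stereo channel.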
9
Cheema MN, Nazir A, Sheng B, Li P, Qin J, Feng DD. Liver Extraction Using Residual Convolution Neural Networks From Low-Dose CT Images. IEEE Trans Biomed Eng 2019; 66:2641-2650. [PMID: 30668449 DOI: 10.1109/tbme.2019.2894123]
Abstract
An efficient and precise liver extraction from computed tomography (CT) images is a crucial step for computer-aided hepatic disease diagnosis and treatment. Considering the possible risk to the patient's health due to the X-ray radiation of repeated CT examinations, low-dose CT (LDCT) is an effective solution for medical imaging. However, inhomogeneous appearances and indistinct boundaries due to additional noise and streak artifacts in LDCT images often make this a challenging task. This study aims to extract a liver model from LDCT images to facilitate medical experts in surgical planning and post-operative assessment, along with low radiation risk to the patient. Our method carries out liver extraction by employing residual convolutional neural networks (LER-CN), which are further refined by noise removal and structure preservation components. After patch-based training, our LER-CN shows competitive performance relative to state-of-the-art methods on both a clinical dataset and the publicly available MICCAI Sliver07 dataset. We have proposed training and learning algorithms for LER-CN based on back-propagation gradient descent. We have evaluated our method on 150 abdominal CT scans for liver extraction. LER-CN achieves a dice similarity coefficient of up to 96.5[Formula: see text], a decreased volumetric overlap error of up to 4.30[Formula: see text], and an average symmetric surface distance of less than 1.4 [Formula: see text]. These findings show that LER-CN is a favorable method for medical applications, offering high efficiency along with low radiation risk to patients.
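The reported metrics, the Dice similarity coefficient and the volumetric overlap error, can be computed directly from binary segmentation masks. A minimal sketch (the function name is ours; VOE is reported as a percentage, as is conventional in the SLIVER07-style evaluation):

```python
import numpy as np

def dice_and_voe(pred, gt):
    """Dice similarity coefficient and volumetric overlap error (%) for
    binary segmentation masks of equal shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())  # 2|A∩B| / (|A|+|B|)
    voe = 100.0 * (1.0 - inter / union)           # 100 * (1 - |A∩B|/|A∪B|)
    return float(dice), float(voe)
```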