1. Zhang C, Liu X, Fu Z, Ding G, Qin L, Wang P, Zhang H, Ye X. Registration, Path Planning and Shape Reconstruction for Soft Tools in Robot-Assisted Intraluminal Procedures: A Review. Int J Med Robot 2025; 21:e70066. [PMID: 40237632] [DOI: 10.1002/rcs.70066]
Abstract
BACKGROUND Robotic and navigation systems can ease surgeons' difficulties in performing delicate and safe operations in the tortuous lumens encountered in traditional intraluminal procedures (IP). This paper reviews the three key components of these systems: registration, path planning and shape reconstruction, and highlights their limitations and future perspectives. METHODS An electronic search for relevant studies was performed in the Web of Science and Google Scholar databases up to 2024. RESULTS For 2D-3D registration in IP, we focus on analysing feature extraction. For path planning, this paper proposes a new classification method and focuses on the selection of planning space and the establishment of path cost. Regarding shape reconstruction, the pros and cons of existing methods are analysed, with a focus on methods based on fibre-optic sensors and electromagnetic (EM) tracking. CONCLUSION These three technologies in IP have made great progress, but challenges remain that require further research.
Affiliation(s)
- Chongan Zhang: Biosensor National Special Laboratory, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Xiaoyue Liu: Biosensor National Special Laboratory, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Zuoming Fu: Biosensor National Special Laboratory, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Guoqing Ding: Department of Urology, School of Medicine, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, China
- Liping Qin: Zhejiang Institute of Medical Device Supervision and Testing, Hangzhou, China; Key Laboratory of Safety Evaluation of Medical Devices of Zhejiang Province, Hangzhou, China
- Peng Wang: Biosensor National Special Laboratory, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Hong Zhang: Biosensor National Special Laboratory, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Xuesong Ye: Biosensor National Special Laboratory, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China; State Key Laboratory of CAD and CG, Zhejiang University, Hangzhou, China
2. He D, Liu Z, Yin X, Liu H, Gao W, Fu Y. Synthesized colonoscopy dataset from high-fidelity virtual colon with abnormal simulation. Comput Biol Med 2025; 186:109672. [PMID: 39826299] [DOI: 10.1016/j.compbiomed.2025.109672]
Abstract
With the advent of deep learning-based colonoscopy systems, vast amounts of high-quality colonoscopy image data are needed for training. However, the generalization ability of deep learning models is challenged by the limited availability of colonoscopy images due to regulatory restrictions and privacy concerns. In this paper, we propose a method for rendering high-fidelity 3D colon models and synthesizing diversified colonoscopy images with abnormalities such as polyps, bleeding, and ulcers, which can be used to train deep learning models. The geometric model of the colon is derived from CT images. We employed dedicated surface mesh deformation to mimic the shapes of polyps and ulcers and applied texture mapping techniques to generate realistic, lifelike appearances. The generated polyp models were then attached to the inner surface of the colon model, while the ulcers were created directly on the inner surface of the colon model. To realistically model blood behavior, we developed a simulation of the blood diffusion process on the colon's inner surface and colored vertices in the traversed region to reflect blood flow. Ultimately, we generated a comprehensive dataset comprising high-fidelity rendered colonoscopy images with these abnormalities. To validate the effectiveness of the synthesized colonoscopy dataset, we trained state-of-the-art deep learning models on it and on other publicly available datasets and assessed their performance in abnormality classification, detection, and segmentation. Notably, the models trained on the synthesized dataset exhibited enhanced performance in the aforementioned tasks.
Affiliation(s)
- Dongdong He: School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150080, China
- Ziteng Liu: School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150080, China
- Xunhai Yin: Department of Gastroenterology, The First Affiliated Hospital of Harbin Medical University, Harbin, 150001, China
- Hao Liu: State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China
- Wenpeng Gao: School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150080, China; State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, 150080, China
- Yili Fu: State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, 150080, China
3. Chen J, Liu Y, Wei S, Bian Z, Subramanian S, Carass A, Prince JL, Du Y. A survey on deep learning in medical image registration: New technologies, uncertainty, evaluation metrics, and beyond. Med Image Anal 2025; 100:103385. [PMID: 39612808] [PMCID: PMC11730935] [DOI: 10.1016/j.media.2024.103385]
Abstract
Deep learning technologies have dramatically reshaped the field of medical image registration over the past decade. The initial developments, such as regression-based and U-Net-based networks, established the foundation for deep learning in image registration. Subsequent progress has been made in various aspects of deep learning-based registration, including similarity measures, deformation regularizations, network architectures, and uncertainty estimation. These advancements have not only enriched the field of image registration but have also facilitated its application in a wide range of tasks, including atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D registration. In this paper, we present a comprehensive overview of the most recent advancements in deep learning-based image registration. We begin with a concise introduction to the core concepts of deep learning-based image registration. Then, we delve into innovative network architectures, loss functions specific to registration, and methods for estimating registration uncertainty. Additionally, this paper explores appropriate evaluation metrics for assessing the performance of deep learning models in registration tasks. Finally, we highlight the practical applications of these novel techniques in medical imaging and discuss the future prospects of deep learning-based image registration.
Affiliation(s)
- Junyu Chen: Department of Radiology and Radiological Science, Johns Hopkins School of Medicine, MD, USA
- Yihao Liu: Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Shuwen Wei: Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Zhangxing Bian: Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Shalini Subramanian: Department of Radiology and Radiological Science, Johns Hopkins School of Medicine, MD, USA
- Aaron Carass: Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Jerry L Prince: Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Yong Du: Department of Radiology and Radiological Science, Johns Hopkins School of Medicine, MD, USA
4. Chavarrias Solano PE, Bulpitt A, Subramanian V, Ali S. Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy. Med Image Anal 2025; 99:103379. [PMID: 39536401] [DOI: 10.1016/j.media.2024.103379]
Abstract
Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and its 3D reconstruction can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (for example, lighting, large homogeneous texture, and image modality), estimating distance from the camera (i.e., depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While computer vision methods for depth estimation have been proposed and advanced on natural scene datasets, the efficacy of these techniques has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well pronounced, learning representations from an auxiliary task can improve salient feature extraction, allowing estimation of accurate camera depths. In this work, we propose a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator decoder. Our depth estimator incorporates attention mechanisms to enhance global context awareness. We leverage the surface normal prediction to improve geometric feature extraction. Also, we apply a cross-task consistency loss between the two geometrically related tasks, surface normal and camera depth. We demonstrate an improvement of 15.75% on relative error and 10.7% improvement on δ1.25 accuracy over the most accurate baseline state-of-the-art Big-to-Small (BTS) approach. All experiments are conducted on the recently released C3VD dataset, and thus, we provide a first benchmark of state-of-the-art methods on this dataset.
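The relative error and δ1.25 accuracy reported in this abstract are the standard monocular-depth evaluation metrics; as a point of reference, a minimal numpy sketch of how they are conventionally computed (variable names are illustrative, not taken from the paper):

```python
import numpy as np

def depth_metrics(pred, gt, eps=1e-8):
    """Standard monocular depth metrics: mean absolute relative error,
    and the delta < 1.25 threshold accuracy (fraction of pixels whose
    prediction is within a factor of 1.25 of ground truth)."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    valid = gt > eps                          # ignore invalid/zero-depth pixels
    pred, gt = pred[valid], gt[valid]
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)  # symmetric prediction ratio
    delta1 = np.mean(ratio < 1.25)
    return abs_rel, delta1

# toy example: per-pixel depths in millimetres
abs_rel, delta1 = depth_metrics([10.0, 20.0, 40.0], [10.0, 25.0, 39.0])
```

Lower relative error and higher δ1.25 are better; the paper's 15.75% and 10.7% figures are relative improvements in these two quantities.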
Affiliation(s)
- Pedro Esteban Chavarrias Solano: School of Computer Science, Faculty of Engineering and Physical Sciences, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Andrew Bulpitt: School of Computer Science, Faculty of Engineering and Physical Sciences, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Venkataraman Subramanian: Department of Gastroenterology, Leeds Teaching Hospitals NHS Trust, Leeds, UK; Division of Gastroenterology and Surgical Sciences, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
- Sharib Ali: School of Computer Science, Faculty of Engineering and Physical Sciences, University of Leeds, Leeds, LS2 9JT, United Kingdom
5. Han Z, Dou Q. A review on organ deformation modeling approaches for reliable surgical navigation using augmented reality. Comput Assist Surg (Abingdon) 2024; 29:2357164. [PMID: 39253945] [DOI: 10.1080/24699322.2024.2357164]
Abstract
Augmented Reality (AR) holds the potential to revolutionize surgical procedures by allowing surgeons to visualize critical structures within the patient's body. This is achieved by superimposing preoperative organ models onto the actual anatomy. Challenges arise from dynamic deformations of organs during surgery, making preoperative models inadequate for faithfully representing intraoperative anatomy. To enable reliable navigation in augmented surgery, modeling of intraoperative deformation to obtain an accurate alignment of the preoperative organ model with the intraoperative anatomy is indispensable. Despite the existence of various methods proposed to model intraoperative organ deformation, there are still few literature reviews that systematically categorize and summarize these approaches. This review aims to fill this gap by providing a comprehensive, technically oriented overview of methods for modeling intraoperative organ deformation in AR-guided surgery. Through a systematic search and screening process, 112 closely relevant papers were included in this review. By presenting the current status of organ deformation modeling methods and their clinical applications, this review seeks to enhance the understanding of organ deformation modeling in AR-guided surgery and discusses potential topics for future advancements.
Affiliation(s)
- Zheng Han: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Qi Dou: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
6. He Q, Feng G, Bano S, Stoyanov D, Zuo S. MonoLoT: Self-Supervised Monocular Depth Estimation in Low-Texture Scenes for Automatic Robotic Endoscopy. IEEE J Biomed Health Inform 2024; 28:6078-6091. [PMID: 38968011] [DOI: 10.1109/jbhi.2024.3423791]
Abstract
The self-supervised monocular depth estimation framework is well-suited for medical images that lack ground-truth depth, such as those from digestive endoscopes, facilitating navigation and 3D reconstruction in the gastrointestinal tract. However, this framework faces several limitations, including poor performance in low-texture environments, limited generalisation to real-world datasets, and unclear applicability in downstream tasks like visual servoing. To tackle these challenges, we propose MonoLoT, a self-supervised monocular depth estimation framework featuring two key innovations: point matching loss and batch image shuffle. Extensive ablation studies on two publicly available datasets, namely C3VD and SimCol, have shown that methods enabled by MonoLoT achieve substantial improvements, with accuracies of 0.944 on C3VD and 0.959 on SimCol, surpassing both depth-supervised and self-supervised baselines on C3VD. Qualitative evaluations on real-world endoscopic data underscore the generalisation capabilities of our methods, outperforming both depth-supervised and self-supervised baselines. To demonstrate the feasibility of using monocular depth estimation for visual servoing, we have successfully integrated our method into a proof-of-concept robotic platform, enabling real-time automatic intervention and control in digestive endoscopy. In summary, our method represents a significant advancement in monocular depth estimation for digestive endoscopy, overcoming key challenges and opening promising avenues for medical applications.
7. Bai X, Wang H, Qin Y, Han J, Yu N. MatchMorph: A real-time pre- and intra-operative deformable image registration framework for MRI-guided surgery. Comput Biol Med 2024; 180:108948. [PMID: 39121681] [DOI: 10.1016/j.compbiomed.2024.108948]
Abstract
PURPOSE The technological advancements in surgical robots compatible with magnetic resonance imaging (MRI) have created an indispensable demand for real-time deformable image registration (DIR) of pre- and intra-operative MRI, but there is a lack of relevant methods. Challenges arise from dimensionality mismatch, resolution discrepancy, non-rigid deformation and the requirement for real-time registration. METHODS In this paper, we propose a real-time DIR framework called MatchMorph, specifically designed for the registration of low-resolution local intraoperative MRI and high-resolution global preoperative MRI. Firstly, a super-resolution network based on global inference is developed to enhance the resolution of intraoperative MRI to that of preoperative MRI, thus resolving the resolution discrepancy. Secondly, a fast-matching algorithm is designed to identify the optimal position of the intraoperative MRI within the corresponding preoperative MRI to address the dimensionality mismatch. Further, a cross-attention-based dual-stream DIR network is constructed to estimate the deformation between pre- and intra-operative MRI in real time. RESULTS We conducted comprehensive experiments on the publicly available IXI and OASIS datasets to evaluate the performance of the proposed MatchMorph framework. Compared to the state-of-the-art (SOTA) network TransMorph, the designed dual-stream DIR network of MatchMorph achieved superior performance with a 1.306 mm smaller HD and a 0.07 mm smaller ASD score on the IXI dataset. Furthermore, the MatchMorph framework demonstrates an inference time of approximately 280 ms. CONCLUSIONS The qualitative and quantitative registration results obtained from high-resolution global preoperative MRI and simulated low-resolution local intraoperative MRI validated the effectiveness and efficiency of the proposed MatchMorph framework.
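The HD and ASD figures quoted in this abstract are surface-distance measures commonly used to score registration quality. A small numpy sketch of the symmetric Hausdorff distance (HD) and average surface distance (ASD) between two point sets; this is a simplification for illustration (real evaluations typically sample mesh surfaces and often report the 95th-percentile HD):

```python
import numpy as np

def nearest_distances(a, b):
    """Distance from each point in a to its nearest point in b, and vice versa."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise distances
    return d.min(axis=1), d.min(axis=0)

def hd_asd(a, b):
    """Symmetric Hausdorff distance (worst case) and average surface distance."""
    ab, ba = nearest_distances(a, b)
    hd = max(ab.max(), ba.max())
    asd = (ab.mean() + ba.mean()) / 2.0
    return hd, asd

# toy 2D contours (coordinates in mm)
hd, asd = hd_asd([[0.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Smaller values of both indicate that the warped surface sits closer to the target surface, which is the sense in which MatchMorph's 1.306 mm and 0.07 mm improvements are reported.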
Affiliation(s)
- Xinhao Bai: College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
- Hongpeng Wang: College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
- Yanding Qin: College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
- Jianda Han: College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
- Ningbo Yu: College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
8. Gs R, Sharma S, Sp P, Sivaprakasam M. DACVNet: Dual Attention Concatenation Volume Net for Stereo Endoscope 3D Reconstruction. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-4. [PMID: 40039796] [DOI: 10.1109/embc53108.2024.10782720]
Abstract
Depth estimation is a crucial task in endoscopy for three-dimensional reconstruction, surgical navigation, and augmented reality visualization. Stereoscope-based depth estimation, which involves capturing two images from different viewpoints, is a preferred method as it does not require specialized hardware. The depth information is encoded as the disparity between the left and right images. CNN-based methods outperform traditional methods in stereo disparity estimation in terms of accuracy and robustness. ACVNet is a stereo disparity estimation model with high accuracy and low inference time that generates and applies spatial attention weights to improve accuracy. The proposed model, DACVNet, incorporates a self-attention mechanism across the feature dimension in addition to the spatial attention in ACVNet, to further enhance accuracy. The proposed model is compared with other commonly used stereo disparity estimation models on the C3VD dataset. To show that the proposed model can be translated to clinical use, it was also trained in a self-supervised manner on a dataset collected from a gastric phantom using an in-house developed stereo endoscope. The proposed model outperforms ACVNet (the second-best model) by 7.08% in terms of the End Point Error metric. On the gastric phantom dataset, a 3D reconstruction of the scene was obtained and validated qualitatively. This shows that the proposed model combined with a stereo endoscope could be used for depth estimation in clinical settings. The code is available at https://github.com/rahul-gs-16/DACVNet. Clinical relevance: We propose a stereo disparity estimation model that can be used in a stereo endoscope for depth estimation, 3D reconstruction, and optics-based measurement.
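End Point Error (EPE), the metric behind the 7.08% comparison above, is conventionally the mean absolute difference between predicted and ground-truth disparity over valid pixels; a one-function sketch (function and variable names are illustrative):

```python
import numpy as np

def end_point_error(pred_disp, gt_disp, mask=None):
    """Mean absolute disparity error in pixels over valid pixels (lower is better)."""
    err = np.abs(np.asarray(pred_disp, float) - np.asarray(gt_disp, float))
    if mask is not None:                      # restrict to pixels with ground truth
        err = err[np.asarray(mask, bool)]
    return float(err.mean())

# toy 2x2 disparity maps
epe = end_point_error([[2.0, 4.0], [6.0, 8.0]], [[2.5, 4.0], [5.0, 8.0]])
```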
9. Teufel T, Shu H, Soberanis-Mukul RD, Mangulabnan JE, Sahu M, Vedula SS, Ishii M, Hager G, Taylor RH, Unberath M. OneSLAM to map them all: a generalized approach to SLAM for monocular endoscopic imaging based on tracking any point. Int J Comput Assist Radiol Surg 2024; 19:1259-1266. [PMID: 38775904] [DOI: 10.1007/s11548-024-03171-6]
Abstract
PURPOSE Monocular SLAM algorithms are the key enabling technology for image-based surgical navigation systems for endoscopic procedures. Due to the visual feature scarcity and unique lighting conditions encountered in endoscopy, classical SLAM approaches perform inconsistently. Many of the recent approaches to endoscopic SLAM rely on deep learning models. They show promising results when optimized on singular domains such as arthroscopy, sinus endoscopy, colonoscopy or laparoscopy, but are limited by an inability to generalize to different domains without retraining. METHODS To address this generality issue, we propose OneSLAM, a monocular SLAM algorithm for surgical endoscopy that works out of the box for several endoscopic domains, including sinus endoscopy, colonoscopy, arthroscopy and laparoscopy. Our pipeline builds upon robust tracking-any-point (TAP) foundation models to reliably track sparse correspondences across multiple frames and runs local bundle adjustment to jointly optimize camera poses and a sparse 3D reconstruction of the anatomy. RESULTS We compare the performance of our method against three strong baselines previously proposed for monocular SLAM in endoscopy and general scenes. OneSLAM delivers better or comparable performance relative to approaches targeted at specific data in all four tested domains, generalizing across domains without the need for retraining. CONCLUSION OneSLAM benefits from the convincing performance of TAP foundation models and generalizes to endoscopic sequences of different anatomies, all while demonstrating better or comparable performance relative to domain-specific SLAM approaches. Future research on global loop closure will investigate how to reliably detect loops in endoscopic scenes to reduce accumulated drift and enhance long-term navigation capabilities.
Affiliation(s)
- Timo Teufel: Johns Hopkins University, Baltimore, MD, 21211, USA
- Hongchao Shu: Johns Hopkins University, Baltimore, MD, 21211, USA
- Manish Sahu: Johns Hopkins University, Baltimore, MD, 21211, USA
- Masaru Ishii: Johns Hopkins Medical Institutions, Baltimore, MD, 21287, USA
- Russell H Taylor: Johns Hopkins University, Baltimore, MD, 21211, USA; Johns Hopkins Medical Institutions, Baltimore, MD, 21287, USA
- Mathias Unberath: Johns Hopkins University, Baltimore, MD, 21211, USA; Johns Hopkins Medical Institutions, Baltimore, MD, 21287, USA
10. Schmidt A, Mohareri O, DiMaio SP, Salcudean SE. Surgical Tattoos in Infrared: A Dataset for Quantifying Tissue Tracking and Mapping. IEEE Trans Med Imaging 2024; 43:2634-2645. [PMID: 38437151] [DOI: 10.1109/tmi.2024.3372828]
Abstract
Quantifying performance of methods for tracking and mapping tissue in endoscopic environments is essential for enabling image guidance and automation of medical interventions and surgery. Datasets developed so far either use rigid environments, visible markers, or require annotators to label salient points in videos after collection. These are respectively: not general, visible to algorithms, or costly and error-prone. We introduce a novel labeling methodology along with a dataset that uses said methodology, Surgical Tattoos in Infrared (STIR). STIR has labels that are persistent but invisible to visible spectrum algorithms. This is done by labelling tissue points with IR-fluorescent dye, indocyanine green (ICG), and then collecting visible light video clips. STIR comprises hundreds of stereo video clips in both in vivo and ex vivo scenes with start and end points labelled in the IR spectrum. With over 3,000 labelled points, STIR will help to quantify and enable better analysis of tracking and mapping methods. After introducing STIR, we analyze multiple different frame-based tracking methods on STIR using both 3D and 2D endpoint error and accuracy metrics. STIR is available at https://dx.doi.org/10.21227/w8g4-g548.
11. Richter A, Steinmann T, Rosenthal JC, Rupitsch SJ. Advances in Real-Time 3D Reconstruction for Medical Endoscopy. J Imaging 2024; 10:120. [PMID: 38786574] [PMCID: PMC11122342] [DOI: 10.3390/jimaging10050120]
Abstract
This contribution is intended to provide researchers with a comprehensive overview of the current state of the art in real-time 3D reconstruction methods suitable for medical endoscopy. Over the past decade, there have been various technological advancements in computational power and an increased research effort in many computer vision fields such as autonomous driving, robotics, and unmanned aerial vehicles. Some of these advancements can also be adapted to the field of medical endoscopy while coping with challenges such as featureless surfaces, varying lighting conditions, and deformable structures. To provide a comprehensive overview, a logical division into monocular, binocular, trinocular, and multiocular methods is made, and active and passive methods are further distinguished. Within these categories, we consider both flexible and non-flexible endoscopes to cover the state of the art as fully as possible. The relevant error metrics for comparing the publications presented here are discussed, and the choice of when to use a GPU rather than an FPGA for camera-based 3D reconstruction is debated. We elaborate on the good practice of using datasets and provide a direct comparison of the presented work. It is important to note that, in addition to medical publications, publications evaluated on the KITTI and Middlebury datasets are also considered, to include related methods that may be suited for medical 3D reconstruction.
Affiliation(s)
- Alexander Richter: Fraunhofer Institute for High-Speed Dynamics, Ernst-Mach-Institut (EMI), Ernst-Zermelo-Straße 4, 79104 Freiburg, Germany; Electrical Instrumentation and Embedded Systems, Albert-Ludwigs-Universität Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
- Till Steinmann: Electrical Instrumentation and Embedded Systems, Albert-Ludwigs-Universität Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
- Jean-Claude Rosenthal: Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut (HHI), Einsteinufer 37, 10587 Berlin, Germany
- Stefan J. Rupitsch: Electrical Instrumentation and Embedded Systems, Albert-Ludwigs-Universität Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
12. Kim BS, Cho M, Chung GE, Lee J, Kang HY, Yoon D, Cho WS, Lee JC, Bae JH, Kong HJ, Kim S. Density clustering-based automatic anatomical section recognition in colonoscopy video using deep learning. Sci Rep 2024; 14:872. [PMID: 38195632] [PMCID: PMC10776865] [DOI: 10.1038/s41598-023-51056-6]
Abstract
Recognizing anatomical sections during colonoscopy is crucial for diagnosing colonic diseases and generating accurate reports. While recent studies have endeavored to identify anatomical regions of the colon using deep learning, the deformable anatomical characteristics of the colon pose challenges for establishing a reliable localization system. This study presents a system, built on 100 colonoscopy videos, that combines density clustering and deep learning. Cascaded CNN models are employed to sequentially estimate the appendix orifice (AO), the flexures, and "outside of the body." Subsequently, the DBSCAN algorithm is applied to identify anatomical sections. Clustering-based analysis integrates clinical knowledge and context based on the anatomical section within the model. We address challenges posed by colonoscopy images through non-informative-frame removal preprocessing. The image data is labeled by clinicians, and the system deduces section correspondence stochastically. The model categorizes the colon into three sections: right (cecum and ascending colon), middle (transverse colon), and left (descending colon, sigmoid colon, rectum). We estimated the appearance time of anatomical boundaries with an average error of 6.31 s for the AO, 9.79 s for the hepatic flexure (HF), 27.69 s for the splenic flexure (SF), and 3.26 s for outside of the body. The proposed method can facilitate future advancements towards AI-based automatic reporting, offering time-saving efficacy and standardization.
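The pipeline this abstract describes, per-frame CNN detections followed by DBSCAN to turn noisy frame-level hits into coherent section boundaries, can be illustrated with a gap-based clustering of detection timestamps. This is a hedged sketch of the general idea, not the authors' implementation: in sorted 1-D data a gap/min-size rule behaves like DBSCAN, and the `eps`/`min_samples` values here are arbitrary:

```python
def cluster_detections(times, eps=2.0, min_samples=3):
    """Group detection timestamps (seconds) into clusters: consecutive
    detections closer than eps join a cluster; clusters smaller than
    min_samples are discarded as noise (a 1-D, gap-based simplification
    of DBSCAN). Returns each surviving cluster's (start, end) span."""
    clusters, current = [], []
    for t in sorted(times):
        if current and t - current[-1] > eps:   # gap too large: close cluster
            if len(current) >= min_samples:
                clusters.append(current)
            current = []
        current.append(t)
    if len(current) >= min_samples:             # flush the final cluster
        clusters.append(current)
    return [(c[0], c[-1]) for c in clusters]

# frame-level "landmark detected" timestamps with one spurious hit at 40 s;
# a boundary appearance time could then be read off a cluster's start
spans = cluster_detections([12.0, 12.5, 13.1, 13.9, 40.0, 55.0, 55.4, 55.9])
```

The isolated detection at 40 s is rejected as noise, which is the role DBSCAN plays in the paper's stochastic section-correspondence step.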
Grants
- 1711179421, RS-2021-KD000006 the Korea Medical Device Development Fund grant funded by the Korean government (the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health and Welfare, and the Ministry of Food and Drug Safety)
- IITP-2023-2018-0-01833 the Ministry of Science and ICT, Korea under the Information Technology Research Center (ITRC) support program
Affiliation(s)
- Byeong Soo Kim: Interdisciplinary Program in Bioengineering, Graduate School, Seoul National University, Seoul, 08826, Korea
- Minwoo Cho: Innovative Medical Technology Research Institute, Seoul National University Hospital, Seoul, 03080, Korea; Department of Transdisciplinary Medicine, Seoul National University Hospital, Seoul, 03080, Korea; Department of Medicine, Seoul National University College of Medicine, Seoul, 03080, Korea
- Goh Eun Chung: Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, 06236, Korea
- Jooyoung Lee: Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, 06236, Korea
- Hae Yeon Kang: Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, 06236, Korea
- Dan Yoon: Interdisciplinary Program in Bioengineering, Graduate School, Seoul National University, Seoul, 08826, Korea
- Woo Sang Cho: Interdisciplinary Program in Bioengineering, Graduate School, Seoul National University, Seoul, 08826, Korea
- Jung Chan Lee: Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, 03080, Korea; Institute of Bioengineering, Seoul National University, Seoul, 08826, Republic of Korea; Institute of Medical and Biological Engineering, Medical Research Center, Seoul National University, Seoul, 03080, Korea
- Jung Ho Bae: Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, 06236, Korea
- Hyoun-Joong Kong: Innovative Medical Technology Research Institute, Seoul National University Hospital, Seoul, 03080, Korea; Department of Transdisciplinary Medicine, Seoul National University Hospital, Seoul, 03080, Korea; Department of Medicine, Seoul National University College of Medicine, Seoul, 03080, Korea; Medical Big Data Research Center, Seoul National University College of Medicine, Seoul, 03087, Korea
- Sungwan Kim: Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, 03080, Korea; Institute of Bioengineering, Seoul National University, Seoul, 08826, Republic of Korea; Artificial Intelligence Institute, Seoul National University, Research Park Building 942, 2 Fl., Seoul, 08826, Korea
|
13
|
Pinto-Coelho L. How Artificial Intelligence Is Shaping Medical Imaging Technology: A Survey of Innovations and Applications. Bioengineering (Basel) 2023; 10:1435. [PMID: 38136026 PMCID: PMC10740686 DOI: 10.3390/bioengineering10121435] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Received: 11/18/2023] [Revised: 12/12/2023] [Accepted: 12/15/2023] [Indexed: 12/24/2023] Open
Abstract
The integration of artificial intelligence (AI) into medical imaging has ushered in an era of transformation in healthcare. This literature review explores the latest innovations and applications of AI in the field, highlighting its profound impact on medical diagnosis and patient care. The innovation segment covers cutting-edge developments such as deep learning algorithms, convolutional neural networks, and generative adversarial networks, which have significantly improved the accuracy and efficiency of medical image analysis. These innovations have enabled rapid and accurate detection of abnormalities, from identifying tumors during radiological examinations to detecting early signs of eye disease in retinal images. The article also surveys applications of AI in medical imaging across radiology, pathology, cardiology, and other specialties. AI-based diagnostic tools not only speed up the interpretation of complex images but also improve early detection of disease, ultimately delivering better outcomes for patients. Additionally, AI-based image processing facilitates personalized treatment plans, thereby optimizing healthcare delivery. By combining cutting-edge AI techniques with their practical applications, this review underscores the paradigm shift AI has brought to medical imaging and its role in revolutionizing diagnosis and patient care; it is clear that AI will continue shaping the future of healthcare in profound and positive ways.
Affiliation(s)
- Luís Pinto-Coelho: ISEP—School of Engineering, Polytechnic Institute of Porto, 4200-465 Porto, Portugal; INESCTEC, Campus of the Engineering Faculty of the University of Porto, 4200-465 Porto, Portugal
|
14
|
Azagra P, Sostres C, Ferrández Á, Riazuelo L, Tomasini C, Barbed OL, Morlana J, Recasens D, Batlle VM, Gómez-Rodríguez JJ, Elvira R, López J, Oriol C, Civera J, Tardós JD, Murillo AC, Lanas A, Montiel JMM. Endomapper dataset of complete calibrated endoscopy procedures. Sci Data 2023; 10:671. [PMID: 37789003 PMCID: PMC10547713 DOI: 10.1038/s41597-023-02564-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Accepted: 09/14/2023] [Indexed: 10/05/2023] Open
Abstract
Computer-assisted systems are becoming widely used in medicine. In endoscopy, most research focuses on the automatic detection of polyps or other pathologies, while localization and navigation of the endoscope are still performed entirely manually by physicians. To broaden this research and bring spatial Artificial Intelligence to endoscopy, data from complete procedures are needed. This paper introduces the Endomapper dataset, the first collection of complete endoscopy sequences acquired during regular medical practice, making secondary use of medical data. Its main purpose is to facilitate the development and evaluation of Visual Simultaneous Localization and Mapping (VSLAM) methods on real endoscopy data. The dataset contains more than 24 hours of video and is the first endoscopic dataset to include endoscope calibration together with the original calibration videos. Meta-data and annotations associated with the dataset range from anatomical landmarks and procedure labeling to segmentations, reconstructions, simulated sequences with ground truth, and same-patient procedures. The software used in this paper is publicly available.
Affiliation(s)
- Pablo Azagra: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Carlos Sostres: Digestive Disease Service, Hospital Clínico Universitario Lozano Blesa, Zaragoza, Spain; Department of Medicine, Universidad de Zaragoza, Zaragoza, Spain; Instituto de Investigación Sanitaria Aragón (IIS Aragón), Zaragoza, Spain; Centro de Investigación Biomédica en Red, Enfermedades Hepáticas y Digestivas (CIBEREHD), Madrid, Spain
- Ángel Ferrández: Digestive Disease Service, Hospital Clínico Universitario Lozano Blesa, Zaragoza, Spain; Department of Medicine, Universidad de Zaragoza, Zaragoza, Spain; Instituto de Investigación Sanitaria Aragón (IIS Aragón), Zaragoza, Spain; Centro de Investigación Biomédica en Red, Enfermedades Hepáticas y Digestivas (CIBEREHD), Madrid, Spain
- Luis Riazuelo: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Clara Tomasini: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- O León Barbed: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Javier Morlana: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- David Recasens: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Víctor M Batlle: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Juan J Gómez-Rodríguez: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Richard Elvira: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Julia López: Digestive Disease Service, Hospital Clínico Universitario Lozano Blesa, Zaragoza, Spain
- Cristina Oriol: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Javier Civera: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Juan D Tardós: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Ana C Murillo: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
- Angel Lanas: Digestive Disease Service, Hospital Clínico Universitario Lozano Blesa, Zaragoza, Spain; Department of Medicine, Universidad de Zaragoza, Zaragoza, Spain; Instituto de Investigación Sanitaria Aragón (IIS Aragón), Zaragoza, Spain; Centro de Investigación Biomédica en Red, Enfermedades Hepáticas y Digestivas (CIBEREHD), Madrid, Spain
- José M M Montiel: Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
|