1. Yang Z, Dai J, Pan J. 3D reconstruction from endoscopy images: A survey. Comput Biol Med 2024;175:108546. [PMID: 38704902] [DOI: 10.1016/j.compbiomed.2024.108546]
Abstract
Three-dimensional reconstruction of images acquired through endoscopes plays a vital role in a growing number of medical applications. Endoscopes used in the clinic are commonly classified as monocular or binocular. We review depth estimation methods according to the type of endoscope. Fundamentally, depth estimation relies on feature matching between images and on multi-view geometry, but these traditional techniques face many problems in the endoscopic environment. With the rapid development of deep learning, a growing number of learning-based works address challenges such as inconsistent illumination and texture sparsity. We reviewed over 170 papers published in the 10 years from 2013 to 2023. The commonly used public datasets and performance metrics are summarized. We also give a taxonomy of methods and analyze the advantages and drawbacks of each algorithm class. Summary tables and a results atlas are provided to facilitate qualitative and quantitative comparison of methods in each category. In addition, we summarize commonly used scene representation methods in endoscopy and speculate on the prospects of depth estimation research in medical applications. We also compare the robustness, processing time, and scene representation of the methods to help doctors and researchers select appropriate methods for their surgical applications.
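As the survey notes, classical depth estimation builds on feature matching and multi-view geometry. A minimal sketch of the core primitive, two-view linear (DLT) triangulation of a matched point from known projection matrices, is shown below; the matrices and pixel coordinates are illustrative assumptions, not values from any cited paper.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: matched pixel coordinates (u, v) in each view.
    Returns the 3D point in the common world frame.
    """
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector
    # with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Illustrative example: identity camera and a camera translated 5 mm along x.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-5.0], [0], [0]])])
X = np.array([1.0, 2.0, 50.0])               # ground-truth point, 50 mm deep
x1 = P1 @ np.append(X, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X, 1); x2 = x2[:2] / x2[2]
print(triangulate_point(P1, P2, x1, x2))     # ~ [1. 2. 50.]
```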
Affiliation(s)
- Zhuoyue Yang
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, 37 Xueyuan Road, Haidian District, Beijing, 100191, China; Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China
- Ju Dai
- Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China
- Junjun Pan
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, 37 Xueyuan Road, Haidian District, Beijing, 100191, China; Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China.
2. Yang Z, Pan J, Dai J, Sun Z, Xiao Y. Self-Supervised Lightweight Depth Estimation in Endoscopy Combining CNN and Transformer. IEEE Trans Med Imaging 2024;43:1934-1944. [PMID: 38198275] [DOI: 10.1109/tmi.2024.3352390]
Abstract
In recent years, an increasing number of medical engineering tasks, such as surgical navigation, pre-operative registration, and surgical robotics, have come to rely on 3D reconstruction techniques. Self-supervised depth estimation has attracted interest in endoscopic scenarios because it does not require ground truth. Most existing methods depend on increasing the number of parameters to improve performance. Therefore, designing a lightweight self-supervised model that achieves competitive results is a hot topic. We propose a lightweight network with a tight coupling of a convolutional neural network (CNN) and a Transformer for depth estimation. Unlike other methods that use CNN and Transformer branches to extract features separately and then fuse them at the deepest layer, we interleave CNN and Transformer modules to extract features at different scales in the encoder. This hierarchical structure leverages the advantages of CNNs in texture perception and of Transformers in shape extraction. At each scale, the CNN acquires local features while the Transformer encodes global information. Finally, we add multi-head attention modules to the pose network to improve the accuracy of predicted poses. Experiments demonstrate that our approach obtains comparable results on two datasets while effectively compressing the model parameters.
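A minimal sketch of the kind of per-scale CNN-plus-Transformer coupling the abstract describes is given below; the module names, channel sizes, and the way the two branches are combined are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class LocalGlobalBlock(nn.Module):
    """One encoder stage: a convolution for local texture plus
    self-attention for global shape, fused by addition (assumed design)."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.GELU(),
        )
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):
        local = self.conv(x)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C)
        tokens = self.norm(tokens)
        global_, _ = self.attn(tokens, tokens, tokens)
        global_ = global_.transpose(1, 2).reshape(b, c, h, w)
        return local + global_                            # fuse local and global

x = torch.randn(1, 32, 40, 48)        # a feature map at one encoder scale
print(LocalGlobalBlock(32)(x).shape)  # torch.Size([1, 32, 40, 48])
```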
3. Schmidt A, Mohareri O, DiMaio S, Yip MC, Salcudean SE. Tracking and mapping in medical computer vision: A review. Med Image Anal 2024;94:103131. [PMID: 38442528] [DOI: 10.1016/j.media.2024.103131]
Abstract
As computer vision algorithms increase in capability, their applications in clinical systems will become more pervasive. These applications include diagnostics, such as colonoscopy and bronchoscopy; guiding biopsies, minimally invasive interventions, and surgery; automating instrument motion; and providing image guidance using pre-operative scans. Many of these applications depend on the specific visual nature of medical scenes and require algorithms designed to perform in this environment. In this review, we provide an update on the field of camera-based tracking and scene mapping in surgery and diagnostics in medical computer vision. We begin by describing our review process, which yields a final list of 515 covered papers. We then give a high-level summary of the state of the art and provide relevant background for those who need tracking and mapping for their clinical applications. Next, we review datasets provided in the field and the clinical needs that motivate their design. We then delve into the algorithmic side and summarize recent developments; this summary should be especially useful for algorithm designers and for those looking to understand the capability of off-the-shelf methods. We maintain focus on algorithms for deformable environments while also reviewing the essential building blocks of rigid tracking and mapping, since there is a large amount of crossover in methods. With the field summarized, we discuss the current state of tracking and mapping methods, along with needs for future algorithms, needs for quantification, and the viability of clinical applications. We then provide some research directions and questions. We conclude that new methods need to be designed or combined to support clinical applications in deformable environments, and that more focus needs to be put into collecting datasets for training and evaluation.
Affiliation(s)
- Adam Schmidt
- Department of Electrical and Computer Engineering, University of British Columbia, 2329 West Mall, Vancouver V6T 1Z4, BC, Canada.
- Omid Mohareri
- Advanced Research, Intuitive Surgical, 1020 Kifer Rd, Sunnyvale, CA 94086, USA
- Simon DiMaio
- Advanced Research, Intuitive Surgical, 1020 Kifer Rd, Sunnyvale, CA 94086, USA
- Michael C Yip
- Department of Electrical and Computer Engineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
- Septimiu E Salcudean
- Department of Electrical and Computer Engineering, University of British Columbia, 2329 West Mall, Vancouver V6T 1Z4, BC, Canada
4. Zhang Z, Song H, Fan J, Fu T, Li Q, Ai D, Xiao D, Yang J. Dual-correlate optimized coarse-fine strategy for monocular laparoscopic videos feature matching via multilevel sequential coupling feature descriptor. Comput Biol Med 2024;169:107890. [PMID: 38168646] [DOI: 10.1016/j.compbiomed.2023.107890]
Abstract
Feature matching of monocular laparoscopic videos is crucial for visualization enhancement in computer-assisted surgery, and the keys to high-quality matching are accurate homography estimation, accurate relative pose estimation, sufficient matches, and fast computation. However, limited by monocular laparoscopic imaging characteristics such as highlight noise, motion blur, texture interference, and illumination variation, most existing feature matching methods struggle to produce high-quality matches efficiently and in sufficient numbers. To overcome these limitations, this paper presents a novel sequential coupling feature descriptor that extracts and expresses multilevel feature maps efficiently, and a dual-correlate optimized coarse-fine strategy that establishes dense matches at the coarse level and adjusts pixel-wise matches at the fine level. First, a novel sequential coupling Swin Transformer layer is designed in the feature descriptor to learn and extract rich multilevel feature representations without increasing complexity. Then, the dual-correlate optimized coarse-fine strategy matches coarse feature sequences at low resolution, and the correlated fine feature sequences are optimized to refine pixel-wise matches based on coarse matching priors. Finally, the sequential coupling feature descriptor and dual-correlate optimization are merged into the Sequential Coupling Dual-Correlate Network (SeCo DC-Net) to produce high-quality matches. Evaluation is conducted on two public laparoscopic datasets, SCARED and EndoSLAM, and the experimental results show that the proposed network outperforms state-of-the-art methods in homography estimation, relative pose estimation, reprojection error, number of matching pairs, and inference runtime. The source code is publicly available at https://github.com/Iheckzza/FeatureMatching.
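The coarse-to-fine idea itself can be illustrated compactly: match descriptors on a downsampled grid first, then refine each coarse match in a small window at full resolution. The sketch below is a generic mutual-nearest-neighbor version under that assumption, not the SeCo DC-Net pipeline.

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Mutual nearest-neighbor matching between two descriptor sets
    (rows are L2-normalized descriptors)."""
    sim = desc_a @ desc_b.T
    ab = sim.argmax(axis=1)            # best b for each a
    ba = sim.argmax(axis=0)            # best a for each b
    return [(i, j) for i, j in enumerate(ab) if ba[j] == i]

def refine_match(fine_a, fine_b, pa, pb, radius=2):
    """Refine a coarse match (pa -> pb) by searching a small window
    around pb in the fine feature map for the most similar cell."""
    h, w, _ = fine_b.shape
    query = fine_a[pa]
    best, best_sim = pb, -np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = pb[0] + dy, pb[1] + dx
            if 0 <= y < h and 0 <= x < w:
                s = float(query @ fine_b[y, x])
                if s > best_sim:
                    best, best_sim = (y, x), s
    return best

# Illustrative usage with random unit descriptors on an 8x8 coarse grid.
rng = np.random.default_rng(0)
coarse_a = rng.normal(size=(64, 128))
coarse_a /= np.linalg.norm(coarse_a, axis=1, keepdims=True)
coarse_b = coarse_a[rng.permutation(64)]   # same descriptors, shuffled
print(len(mutual_nn_matches(coarse_a, coarse_b)))  # 64 mutual matches
```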
Affiliation(s)
- Ziang Zhang
- The School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China
- Hong Song
- The School of Computer Science & Technology, Beijing Institute of Technology, Beijing, 100081, China.
- Jingfan Fan
- The School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China.
- Tianyu Fu
- The School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China
- Qiang Li
- The School of Computer Science & Technology, Beijing Institute of Technology, Beijing, 100081, China
- Danni Ai
- The School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Deqiang Xiao
- The School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Jian Yang
- The School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China.
5. Liu S, Fan J, Yang Y, Xiao D, Ai D, Song H, Wang Y, Yang J. Monocular endoscopy images depth estimation with multi-scale residual fusion. Comput Biol Med 2024;169:107850. [PMID: 38145602] [DOI: 10.1016/j.compbiomed.2023.107850]
Abstract
BACKGROUND: Monocular depth estimation plays a fundamental role in clinical endoscopic surgery. However, the coherent illumination, smooth surfaces, and texture-less nature of endoscopy images present significant challenges, and traditional depth estimation methods struggle to perceive depth accurately in such settings. METHOD: To overcome these challenges, this paper proposes a novel multi-scale residual fusion method for estimating the depth of monocular endoscopy images. Specifically, we address the issue of coherent illumination by leveraging an image frequency-domain component space transformation, thereby enhancing the stability of the scene's light source. Moreover, we employ an image radiation intensity attenuation model to estimate the initial depth map. Finally, to refine the accuracy of depth estimation, we utilize a multi-scale residual fusion optimization technique. RESULTS: Extensive experiments were conducted on public datasets. The structural similarity measures for continuous frames in three distinct clinical data scenes reached 0.94, 0.82, and 0.84, respectively, demonstrating the effectiveness of our approach in capturing the intricate details of endoscopy images. Furthermore, the depth estimation accuracy reached 89.3% and 91.2% on the two models' data, respectively, underscoring the robustness of our method. CONCLUSIONS: The promising results obtained on public datasets highlight the significant potential of our method for clinical applications, facilitating reliable depth estimation and enhancing the quality of endoscopic surgical procedures.
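The initial-depth idea rests on the fact that, with the light source co-located with the endoscope, surface irradiance falls off roughly with the inverse square of distance, so depth can be roughed out as proportional to I^(-1/2). The sketch below implements only that generic attenuation model; the constants and normalization are assumptions, not the paper's calibrated model.

```python
import numpy as np

def initial_depth_from_intensity(image, k=1.0, eps=1e-6):
    """Rough depth prior from an inverse-square light falloff model:
    I ~ k / z**2  =>  z ~ sqrt(k / I).  `image` is a grayscale float
    array in [0, 1]; `k` is an unknown scale (assumed, uncalibrated)."""
    intensity = np.clip(image, eps, 1.0)
    depth = np.sqrt(k / intensity)
    # Normalize to [0, 1], since the absolute scale is unobservable
    # from a single monocular frame.
    depth -= depth.min()
    return depth / max(depth.max(), eps)

# Illustrative usage: a radial brightness falloff yields increasing depth.
y, x = np.mgrid[-1:1:128j, -1:1:128j]
fake_frame = np.exp(-(x**2 + y**2))         # bright center, dark border
d = initial_depth_from_intensity(fake_frame)
print(d[64, 64], d[0, 0])                    # center ~0 (near), corner ~1 (far)
```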
Affiliation(s)
- Shiyuan Liu
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China; China Center for Information Industry Development, Beijing, 100081, China
- Jingfan Fan
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China.
- Yun Yang
- Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, National Clinical Research Center for Digestive Diseases, Beijing 100050, China
- Deqiang Xiao
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Danni Ai
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Hong Song
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
- Yongtian Wang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China.
- Jian Yang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
6. Luo X, Xie L, Zeng HQ, Wang X, Li S. Monocular endoscope 6-DoF tracking with constrained evolutionary stochastic filtering. Med Image Anal 2023;89:102928. [PMID: 37603943] [DOI: 10.1016/j.media.2023.102928]
Abstract
Monocular endoscopic 6-DoF camera tracking plays a vital role in surgical navigation that involves multimodal images to build augmented- or virtual-reality surgery. Such 6-DoF camera tracking can generally be formulated as a nonlinear optimization problem. To solve this nonlinear problem, this work proposes a new pipeline of constrained evolutionary stochastic filtering that introduces spatial constraints and evolutionary stochastic diffusion to deal with particle degeneracy and impoverishment in current stochastic filtering methods. Applied to endoscope 6-DoF tracking and validated on clinical data comprising more than 59,000 endoscopic video frames acquired from various surgical procedures, the new pipeline works much better than state-of-the-art tracking methods. In particular, it significantly improves the accuracy of current monocular endoscope tracking approaches from (4.83 mm, 10.2°) to (2.78 mm, 7.44°).
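At its core the method builds on stochastic (particle) filtering of the camera pose. Below is a minimal generic particle-filter skeleton for a 6-DoF state; the diffusion noise, the toy likelihood, and the state layout (x, y, z plus three Euler angles) are illustrative assumptions and do not reproduce the paper's spatial constraints or evolutionary diffusion.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500                                    # number of particles
state = rng.normal(0, 1.0, size=(N, 6))    # [x, y, z, roll, pitch, yaw]
weights = np.full(N, 1.0 / N)

def likelihood(particles, observation):
    """Toy observation model: Gaussian around an observed 6-DoF pose."""
    err = np.linalg.norm(particles - observation, axis=1)
    return np.exp(-0.5 * (err / 0.5) ** 2)

for obs in [np.zeros(6), np.full(6, 0.1)]:        # fake observation stream
    # 1) Predict: diffuse particles with process noise.
    state += rng.normal(0, 0.05, size=state.shape)
    # 2) Update: reweight by the observation likelihood.
    weights *= likelihood(state, obs)
    weights /= weights.sum()
    # 3) Resample when the effective sample size collapses
    #    (the degeneracy problem the paper targets).
    if 1.0 / np.sum(weights ** 2) < N / 2:
        idx = rng.choice(N, size=N, p=weights)
        state, weights = state[idx], np.full(N, 1.0 / N)

print("pose estimate:", weights @ state)   # weighted-mean 6-DoF estimate
```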
Affiliation(s)
- Xiongbiao Luo
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361102, China; Department of Computer Science and Technology, Xiamen University, Xiamen 361005, China; Discipline of Intelligent Instrument and Equipment, Xiamen University, Xiamen 361102, China; Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University, Xiamen 361005, China.
- Lixin Xie
- College of Pulmonary and Critical Care Medicine, Chinese PLA General Hospital, Beijing 100853, China
- Hui-Qing Zeng
- Department of Pulmonary and Critical Care Medicine, Zhongshan Hospital, Xiamen University, Xiamen 361004, China.
- Xiaoying Wang
- Department of Liver Surgery, Zhongshan Hospital, Fudan University, Shanghai 200032, China.
- Shiyue Li
- The First Affiliated Hospital of Guangzhou Medical University, Guangzhou 510120, China
7. Hirohata Y, Sogabe M, Miyazaki T, Kawase T, Kawashima K. Confidence-aware self-supervised learning for dense monocular depth estimation in dynamic laparoscopic scene. Sci Rep 2023;13:15380. [PMID: 37717055] [PMCID: PMC10505201] [DOI: 10.1038/s41598-023-42713-x]
Abstract
This paper tackles the challenge of accurate depth estimation from monocular laparoscopic images in dynamic surgical environments. The lack of reliable ground truth due to inconsistencies within these images makes this a complex task, and noise elements such as bleeding and smoke further complicate the learning process. We propose a model learning framework that uses a generic laparoscopic surgery video dataset for training, aimed at achieving precise monocular depth estimation in dynamic surgical settings. The architecture employs binocular disparity confidence information as a self-supervisory signal, along with the disparity information from a stereo laparoscope. Our method ensures robust learning amidst outliers caused by tissue deformation, smoke, and surgical instruments by utilizing a loss function that adjusts the selection and weighting of depth data for learning according to their confidence. We trained the model on the Hamlyn Dataset and verified it with Hamlyn test data and a static dataset. The results show exceptional generalization performance and efficacy across various scene dynamics, laparoscope types, and surgical sites.
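A confidence-weighted depth loss of the general kind described can be written in a few lines; the gating threshold and weighting scheme below are assumptions for illustration, not the paper's exact formulation.

```python
import torch

def confidence_weighted_depth_loss(pred, target, confidence, tau=0.3):
    """L1 depth loss in which each pixel is weighted by a confidence
    map in [0, 1]; pixels below a threshold `tau` are excluded so that
    unreliable stereo disparities (smoke, instruments, deformation)
    do not act as supervision. Threshold and weighting are assumed."""
    mask = (confidence > tau).float()
    weight = confidence * mask
    per_pixel = (pred - target).abs()
    return (weight * per_pixel).sum() / weight.sum().clamp(min=1.0)

# Illustrative usage with random tensors standing in for network output,
# stereo-derived pseudo ground truth, and its confidence map.
pred = torch.rand(1, 1, 64, 80)
target = torch.rand(1, 1, 64, 80)
conf = torch.rand(1, 1, 64, 80)
print(confidence_weighted_depth_loss(pred, target, conf))
```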
Affiliation(s)
- Yasuhide Hirohata
- The Department of Information Physics and Computing, The University of Tokyo, Tokyo, 113-8656, Japan
- Maina Sogabe
- The Department of Information Physics and Computing, The University of Tokyo, Tokyo, 113-8656, Japan.
- Tetsuro Miyazaki
- The Department of Information Physics and Computing, The University of Tokyo, Tokyo, 113-8656, Japan
- Toshihiro Kawase
- The School of Engineering, Department of Information and Communication Engineering, Tokyo Denki University, Tokyo, 120-8551, Japan
- Kenji Kawashima
- The Department of Information Physics and Computing, The University of Tokyo, Tokyo, 113-8656, Japan
8. Zhang X, Ji X, Wang J, Fan Y, Tao C. Renal surface reconstruction and segmentation for image-guided surgical navigation of laparoscopic partial nephrectomy. Biomed Eng Lett 2023;13:165-174. [PMID: 37124114] [PMCID: PMC10130295] [DOI: 10.1007/s13534-023-00263-1]
Abstract
An unpredictable, dynamic surgical environment makes it necessary to measure morphological information of the target tissue in real time for laparoscopic image-guided navigation. Among intraoperative tissue 3D reconstruction approaches, stereo vision has the most potential for clinical development, benefiting from high reconstruction accuracy and laparoscopy compatibility. However, existing stereo vision methods have difficulty achieving high reconstruction accuracy in real time, and intraoperative reconstruction results often contain complex background and instrument information that hinders clinical development of image-guided systems. Taking laparoscopic partial nephrectomy (LPN) as the research object, this paper realizes real-time dense reconstruction and extraction of the kidney tissue surface. A center-symmetric census-based semi-global block matching algorithm is proposed to generate a dense disparity map, and a GPU-based pixel-by-pixel connectivity segmentation mechanism is designed to segment the renal tissue area. Experiments on an in-vitro porcine heart, an in-vivo porcine kidney, and offline clinical LPN data were performed to evaluate the accuracy and effectiveness of our approach. The algorithm achieved a reconstruction accuracy of ±2 mm at a real-time update rate of 21 fps for an HD image size of 960 × 540, and 91.0% target tissue segmentation accuracy even with surgical instrument occlusions. The experimental results demonstrate that the proposed method can accurately reconstruct and extract the renal surface in real time in LPN, and the measurement results can be used directly by image-guided systems. Our method provides a new way to measure geometric information of target tissue intraoperatively in laparoscopic surgery.
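The matching cost at the heart of such a pipeline is easy to show in isolation: a census transform encodes each pixel's neighborhood as a bit string, and matching costs are Hamming distances between the strings. The sketch below is a plain (non-center-symmetric, non-GPU) census transform for illustration.

```python
import numpy as np

def census_transform(img, window=5):
    """Encode each pixel's neighborhood as a bit string: one bit per
    neighbor, set when the neighbor is darker than the center pixel.
    `img` is a 2D grayscale array; image borders wrap via np.roll."""
    r = window // 2
    h, w = img.shape
    codes = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            codes = (codes << np.uint64(1)) | (shifted < img).astype(np.uint64)
    return codes

def hamming_cost(code_a, code_b):
    """Matching cost between two census codes (lower = more similar)."""
    x = np.bitwise_xor(code_a, code_b)
    return bin(int(x)).count("1")

img = np.random.default_rng(2).random((32, 32))
c = census_transform(img)
print(hamming_cost(c[16, 16], c[16, 16]))  # 0: identical neighborhoods
print(hamming_cost(c[16, 16], c[16, 17]))  # nonzero cost for a neighbor
```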
Affiliation(s)
- Xiaohui Zhang
- School of Engineering Medicine, Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100083 China
- Xuquan Ji
- School of Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100083 China
- Junchen Wang
- School of Mechanical Engineering and Automation, Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100083 China
- Yubo Fan
- School of Engineering Medicine, Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100083 China
- School of Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100083 China
- Chunjing Tao
- School of Engineering Medicine, Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100083 China
9. Emaduddin M, Halic T, Demirel D, Bayrak C, Arikatla VS, De S. Specular Reflection Removal for 3D Reconstruction of Tissues using Endoscopy Videos. Proc IEEE SoutheastCon 2023;2023:246-252. [PMID: 37900192] [PMCID: PMC10603791] [DOI: 10.1109/southeastcon51012.2023.10115137]
Abstract
Endoscopy is widely employed for diagnostic examination of the interior of organs and body cavities and for numerous surgical interventions. Still, the inability to correlate individual 2D images with 3D organ morphology limits its applications, especially in intra-operative planning and navigation, disease physiology, and cancer surveillance. As a result, most endoscopy videos, which carry enormous data potential, are used only for real-time guidance and are discarded after collection. We present a complete method for 3D reconstruction of inner organs, comprising image extraction techniques for endoscopic videos and a novel image pre-processing technique, that reconstructs and visualizes a 3D organ model from an endoscopic video. We use advanced computer vision methods and do not require any modifications to clinical-grade endoscopy hardware. We have also formalized an image acquisition protocol through experimentation with a calibrated test bed, and we validate the accuracy and robustness of our reconstruction on a test bed with known ground truth. Our method can significantly contribute to endoscopy-based diagnostic and surgical procedures through comprehensive 3D visualization of tissue and tumors.
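A common baseline for the pre-processing step the title names is to mask saturated, low-saturation pixels as specular highlights and inpaint them; the thresholds below are illustrative assumptions, and this generic OpenCV baseline is not the authors' specific technique.

```python
import cv2
import numpy as np

def remove_specular_highlights(bgr, v_thresh=230, s_thresh=40):
    """Mask pixels that are very bright and nearly colorless (typical of
    specular reflections on wet tissue), then inpaint them from the
    surrounding texture. Thresholds are illustrative assumptions."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    _, s, v = cv2.split(hsv)
    mask = ((v > v_thresh) & (s < s_thresh)).astype(np.uint8) * 255
    # Dilate so the inpainting also covers the bloomed highlight borders.
    mask = cv2.dilate(mask, np.ones((5, 5), np.uint8), iterations=1)
    restored = cv2.inpaint(bgr, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)
    return restored, mask

# Illustrative usage on a synthetic frame with one saturated blob.
frame = np.full((120, 160, 3), (40, 60, 150), dtype=np.uint8)   # reddish tissue
cv2.circle(frame, (80, 60), 10, (255, 255, 255), -1)            # fake highlight
clean, m = remove_specular_highlights(frame)
print(int(m.sum() / 255), "pixels inpainted")
```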
Affiliation(s)
- Muhammad Emaduddin
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas
- Doga Demirel
- Department of Computer Science, Florida Polytechnic University, Lakeland, Florida
- Coskun Bayrak
- Department of Computer Science, Youngstown State University, Youngstown, OH
- Suvranu De
- College of Engineering, Florida A&M University - Florida State University, Tallahassee, Florida
10. Wang Y, Zhao L, Gong L, Chen X, Zuo S. A monocular SLAM system based on SIFT features for gastroscope tracking. Med Biol Eng Comput 2023;61:511-523. [PMID: 36534372] [DOI: 10.1007/s11517-022-02739-1]
Abstract
During flexible gastroscopy, physicians have extreme difficulty self-localizing. Camera tracking methods such as simultaneous localization and mapping (SLAM), which allow tracking of the endoscope, have become a research hotspot in recent years. However, most existing solutions focus on tasks in which sufficient texture information is available, such as laparoscope tracking, and cannot be applied to gastroscope tracking, since gastroscopic images have fewer textures than laparoscopic images. This paper proposes a new monocular SLAM framework based on the scale-invariant feature transform (SIFT) and narrow-band imaging (NBI), which extracts SIFT features from gastroscopic NBI images instead of ORB features (oriented FAST, i.e., features from accelerated segment test, and rotated BRIEF, i.e., binary robust independent elementary features), and performs feature retention with a response-sorting strategy to achieve more matches. Experimental results show that the root mean squared error of the proposed algorithm can reach a minimum of 2.074 mm, and that pose accuracy can be improved by up to 25.73% compared with ORB-SLAM. SIFT features and the response-sorting strategy achieve more accurate matching in gastroscopic NBI images than other features and the homogenization strategy, and the proposed algorithm also runs successfully on real clinical gastroscopic data. The proposed algorithm has potential clinical value for assisting physicians in locating the gastroscope during gastroscopy.
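Response-based retention is straightforward to demonstrate with OpenCV: detect SIFT keypoints, sort them by detector response, and keep the strongest ones rather than spatially homogenizing them. This is a generic sketch of that strategy, not the paper's full SLAM front end.

```python
import cv2
import numpy as np

def sift_top_responses(gray, keep=500):
    """Detect SIFT keypoints and retain the `keep` strongest by detector
    response (contrast), mimicking a response-sorting retention strategy."""
    sift = cv2.SIFT_create()
    keypoints = sift.detect(gray, None)
    keypoints = sorted(keypoints, key=lambda kp: kp.response, reverse=True)[:keep]
    # Compute descriptors only for the retained keypoints.
    keypoints, descriptors = sift.compute(gray, keypoints)
    return keypoints, descriptors

# Illustrative usage on a synthetic textured image.
gray = (np.random.default_rng(3).random((240, 320)) * 255).astype(np.uint8)
kps, desc = sift_top_responses(gray, keep=200)
print(len(kps), None if desc is None else desc.shape)  # up to 200 x 128
```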
Affiliation(s)
- Yifan Wang
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
- Liang Zhao
- Faculty of Engineering and Information Technology, Robotics Institute, University of Technology Sydney, Sydney, Australia
- Lun Gong
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
- Xin Chen
- Tianjin Medical University General Hospital, Tianjin, China
- Siyang Zuo
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China.
12. Tukra S, Lidströmer N, Ashrafian H, Giannarou S. AI in Surgical Robotics. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_323]
13. Zhang Z, Wang L, Zheng W, Yin L, Hu R, Yang B. Endoscope image mosaic based on pyramid ORB. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103261]
14. Edwards PJE, Psychogyios D, Speidel S, Maier-Hein L, Stoyanov D. SERV-CT: A disparity dataset from cone-beam CT for validation of endoscopic 3D reconstruction. Med Image Anal 2021;76:102302. [PMID: 34906918] [PMCID: PMC8961000] [DOI: 10.1016/j.media.2021.102302]
Abstract
Highlights:
- Full torso porcine CT model for stereo-endoscopic reconstruction validation.
- CT of endoscope and anatomy with constrained manual alignment provides a reference.
- Accuracy analysis of repeated alignments and performance of existing algorithms presented.
- Open-sourced dataset for stereo reconstruction validation.
In computer vision, reference datasets from simulation and real outdoor scenes have been highly successful in promoting algorithmic development in stereo reconstruction. Endoscopic stereo reconstruction for surgical scenes gives rise to specific problems, including the lack of clear corner features, highly specular surface properties, and the presence of blood and smoke. These issues present difficulties both for stereo reconstruction itself and for standardised dataset production. Previous datasets have been produced using computed tomography (CT) or structured-light reconstruction on phantom or ex vivo models. We present a stereo-endoscopic reconstruction validation dataset based on cone-beam CT (SERV-CT). Two ex vivo small porcine full-torso cadavers were placed within the view of the endoscope, with both the endoscope and target anatomy visible in the CT scan. The orientation of the endoscope was then manually aligned to match the stereoscopic view, and benchmark disparities, depths, and occlusions were calculated. The requirement of a CT scan limited the number of stereo pairs to 8 from each ex vivo sample. For the second sample, an RGB surface was acquired to aid alignment of smooth, featureless surfaces. Repeated manual alignments showed an RMS disparity accuracy of around 2 pixels and a depth accuracy of about 2 mm. A simplified reference dataset is provided, consisting of endoscope image pairs with corresponding calibration, disparities, depths, and occlusions covering the majority of the endoscopic image and a range of tissue types, including smooth specular surfaces, as well as significant variation in depth. We assessed the performance of various stereo algorithms from online repositories; there is significant variation between algorithms, highlighting some of the challenges of surgical endoscopic images. The SERV-CT dataset provides an easy-to-use stereoscopic validation resource for surgical applications, with smooth reference disparities and depths covering the majority of the endoscopic image. This complements existing resources well, and we hope it will aid the development of surgical endoscopic anatomical reconstruction algorithms.
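The quantities the dataset benchmarks are linked by standard rectified stereo geometry: depth Z = f·B/d for focal length f, baseline B, and disparity d. A minimal conversion helper follows; the numbers are illustrative assumptions, not SERV-CT calibration values.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_mm, min_disp=1e-3):
    """Convert a rectified-stereo disparity map (pixels) to depth (mm)
    via Z = f * B / d; zero/negative disparities map to invalid (NaN)."""
    d = np.asarray(disparity, dtype=np.float64)
    depth = np.full_like(d, np.nan)
    valid = d > min_disp
    depth[valid] = focal_px * baseline_mm / d[valid]
    return depth

# Illustrative values: 1000 px focal length, 4.5 mm stereo baseline.
disp = np.array([[90.0, 45.0], [0.0, 30.0]])
print(disparity_to_depth(disp, focal_px=1000.0, baseline_mm=4.5))
# [[ 50. 100.]
#  [ nan 150.]]
```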
Affiliation(s)
- P J Eddie Edwards
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK.
- Dimitris Psychogyios
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
- Stefanie Speidel
- Division of Translational Surgical Oncology, National Center for Tumor Diseases (NCT) Dresden, Dresden, 01307, Germany
- Lena Maier-Hein
- Division of Medical and Biological Informatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
15. Recasens D, Lamarca J, Facil JM, Montiel JMM, Civera J. Endo-Depth-and-Motion: Reconstruction and Tracking in Endoscopic Videos Using Depth Networks and Photometric Constraints. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3095528]
16. Fu Z, Jin Z, Zhang C, Dai Y, Gao X, Wang Z, Li L, Ding G, Hu H, Wang P, Ye X. Visual-electromagnetic system: A novel fusion-based monocular localization, reconstruction, and measurement for flexible ureteroscopy. Int J Med Robot 2021;17:e2274. [PMID: 33960604] [DOI: 10.1002/rcs.2274]
Abstract
BACKGROUND: During flexible ureteroscopy (FURS), surgeons may lose orientation due to intrarenal structural similarities and the complex shape of the pyelocaliceal cavity. Decision-making required after initially misjudging stone size also increases operative time and the risk of severe complications. METHODS: An intraoperative navigation system based on electromagnetic tracking (EMT) and simultaneous localization and mapping (SLAM) is proposed to track the tip of the ureteroscope and reconstruct a dense intrarenal three-dimensional (3D) map. Furthermore, the contours of stones are segmented to measure their size. RESULTS: Our system was evaluated on a kidney phantom, achieving an absolute trajectory root mean square error (RMSE) of 0.6 mm. The median errors of the longitudinal and transversal measurements were 0.061 and 0.074 mm, respectively. An in vivo experiment also demonstrated the system's effectiveness. CONCLUSION: The proposed system works effectively for tracking and measurement, and can be extended to other surgical applications involving cavities and branches, as well as intelligent robotic surgery.
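The headline accuracy figure is an absolute trajectory error (ATE) RMSE, which can be computed in a few lines once the estimated and ground-truth trajectories are expressed in a common frame (alignment, e.g. via the Umeyama method, is assumed done here).

```python
import numpy as np

def ate_rmse(estimated, ground_truth):
    """Absolute trajectory error RMSE between two N x 3 position arrays
    that are already time-associated and expressed in the same frame."""
    diffs = np.asarray(estimated) - np.asarray(ground_truth)
    return float(np.sqrt(np.mean(np.sum(diffs ** 2, axis=1))))

# Illustrative usage: a noisy copy of a short 3D trajectory.
rng = np.random.default_rng(4)
gt = np.cumsum(rng.normal(0, 1.0, size=(100, 3)), axis=0)   # fake path (mm)
est = gt + rng.normal(0, 0.5, size=gt.shape)                 # tracking noise
print(f"ATE RMSE: {ate_rmse(est, gt):.3f} mm")
```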
Affiliation(s)
- Zuoming Fu
- Biosensor National Special Laboratory, Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Ziyi Jin
- Biosensor National Special Laboratory, Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Chongan Zhang
- Biosensor National Special Laboratory, Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Yu Dai
- College of Artificial Intelligence, Nankai University, Tianjin, China
- Xiaofeng Gao
- Department of Urology, Changhai Hospital, Shanghai, China
- Zeyu Wang
- Department of Urology, Changhai Hospital, Shanghai, China
- Ling Li
- Department of Urology, Changhai Hospital, Shanghai, China
- Guoqing Ding
- Department of Urology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Haiyi Hu
- Department of Urology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Peng Wang
- Biosensor National Special Laboratory, Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Xuesong Ye
- Biosensor National Special Laboratory, Key Laboratory of Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
17. Ma R, Wang R, Zhang Y, Pizer S, McGill SK, Rosenman J, Frahm JM. RNNSLAM: Reconstructing the 3D colon to visualize missing regions during a colonoscopy. Med Image Anal 2021;72:102100. [PMID: 34102478] [DOI: 10.1016/j.media.2021.102100]
Abstract
Colonoscopy is the gold standard for pre-cancerous polyp screening and treatment. The polyp detection rate is highly tied to the percentage of surveyed colonic surface. However, current colonoscopy technique cannot guarantee that the entire colonic surface is well examined, because of incomplete camera orientations and occlusions, and the missing regions can hardly be noticed from a continuous first-person perspective. A useful contribution would therefore be an automatic system that computes missing regions from an endoscopic video in real time and alerts the endoscopist when a large missing region is detected. We present a novel method that reconstructs dense chunks of a 3D colon in real time, leaving the unsurveyed part unreconstructed. The method combines a standard SLAM system with a depth and pose prediction network to achieve much more robust tracking and less drift, addressing the difficulties that existing simultaneous localization and mapping (SLAM) systems and end-to-end deep learning methods face on colonoscopic images.
Affiliation(s)
- Ruibin Ma
- University of North Carolina at Chapel Hill, Chapel Hill, NC 27705, USA.
- Rui Wang
- University of North Carolina at Chapel Hill, Chapel Hill, NC 27705, USA
- Yubo Zhang
- University of North Carolina at Chapel Hill, Chapel Hill, NC 27705, USA
- Stephen Pizer
- University of North Carolina at Chapel Hill, Chapel Hill, NC 27705, USA
- Sarah K McGill
- University of North Carolina at Chapel Hill, Chapel Hill, NC 27705, USA
- Julian Rosenman
- University of North Carolina at Chapel Hill, Chapel Hill, NC 27705, USA
- Jan-Michael Frahm
- University of North Carolina at Chapel Hill, Chapel Hill, NC 27705, USA
18. Ozyoruk KB, Gokceler GI, Bobrow TL, Coskun G, Incetan K, Almalioglu Y, Mahmood F, Curto E, Perdigoto L, Oliveira M, Sahin H, Araujo H, Alexandrino H, Durr NJ, Gilbert HB, Turan M. EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos. Med Image Anal 2021;71:102058. [PMID: 33930829] [DOI: 10.1016/j.media.2021.102058]
Abstract
Deep learning techniques hold promise for developing dense topography reconstruction and pose estimation methods for endoscopic videos. However, currently available datasets do not support effective quantitative benchmarking. In this paper, we introduce a comprehensive endoscopic SLAM dataset consisting of 3D point cloud data for six porcine organs, capsule and standard endoscopy recordings, synthetically generated data, and a recording of a phantom colon made with a conventional endoscope in clinical use, with computed tomography (CT) scan ground truth. A Panda robotic arm, two commercially available capsule endoscopes, three conventional endoscopes with different camera properties, two high-precision 3D scanners, and a CT scanner were employed to collect data from eight ex-vivo porcine gastrointestinal (GI)-tract organs and a silicone colon phantom model. In total, 35 sub-datasets are provided with 6D pose ground truth for the ex-vivo part: 18 sub-datasets for the colon, 12 for the stomach, and 5 for the small intestine, four of which contain polyp-mimicking elevations created by an expert gastroenterologist. To verify the applicability of this data to real clinical systems, we recorded a video sequence with a state-of-the-art colonoscope from a full-representation silicone colon phantom. Synthetic capsule endoscopy frames from the stomach, colon, and small intestine with both depth and pose annotations are included to facilitate the study of simulation-to-real transfer learning algorithms. Additionally, we propose Endo-SfMLearner, an unsupervised monocular depth and pose estimation method that combines residual networks with a spatial attention module to direct the network's focus toward distinguishable and highly textured tissue regions. The proposed approach uses a brightness-aware photometric loss to improve robustness under the fast frame-to-frame illumination changes commonly seen in endoscopic videos. To exemplify the use of the EndoSLAM dataset, the performance of Endo-SfMLearner is extensively compared with the state of the art: SC-SfMLearner, Monodepth2, and SfMLearner. The code and a link to the dataset are publicly available at https://github.com/CapsuleEndoscope/EndoSLAM. A video demonstrating the experimental setup and procedure is accessible as Supplementary Video 1.
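The brightness-aware idea can be sketched as a photometric loss that compensates a global affine brightness change between the target frame and the warped source frame before comparing them; the affine, least-squares form below is an illustrative assumption, not the exact Endo-SfMLearner loss.

```python
import torch

def brightness_aware_photometric_loss(target, warped):
    """L1 photometric loss after removing a per-image affine brightness
    change (warped ~ a * target + b), so that fast global illumination
    shifts between frames are not penalized. Affine model is assumed."""
    t = target.flatten(1)                    # (B, N)
    w = warped.flatten(1)
    t_mean, w_mean = t.mean(1, keepdim=True), w.mean(1, keepdim=True)
    var_t = ((t - t_mean) ** 2).mean(1, keepdim=True).clamp(min=1e-6)
    a = ((t - t_mean) * (w - w_mean)).mean(1, keepdim=True) / var_t
    b = w_mean - a * t_mean
    aligned = a * t + b                      # brightness-aligned target
    return (aligned - w).abs().mean()

# Illustrative usage: the loss ignores a pure gain/offset change.
target = torch.rand(2, 1, 64, 80)
warped = 1.3 * target + 0.05                 # same content, brighter
print(brightness_aware_photometric_loss(target, warped))  # ~0
```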
Affiliation(s)
- Taylor L Bobrow
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Gulfize Coskun
- Institute of Biomedical Engineering, Bogazici University, Turkey
- Kagan Incetan
- Institute of Biomedical Engineering, Bogazici University, Turkey
- Faisal Mahmood
- Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Data Science, Dana Farber Cancer Institute, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Eva Curto
- Institute for Systems and Robotics, University of Coimbra, Portugal
- Luis Perdigoto
- Institute for Systems and Robotics, University of Coimbra, Portugal
- Marina Oliveira
- Institute for Systems and Robotics, University of Coimbra, Portugal
- Hasan Sahin
- Institute of Biomedical Engineering, Bogazici University, Turkey
- Helder Araujo
- Institute for Systems and Robotics, University of Coimbra, Portugal
- Henrique Alexandrino
- Faculty of Medicine, Clinical Academic Center of Coimbra, University of Coimbra, Coimbra, Portugal
- Nicholas J Durr
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Hunter B Gilbert
- Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA, USA
- Mehmet Turan
- Institute of Biomedical Engineering, Bogazici University, Turkey.
19. Widya AR, Monno Y, Okutomi M, Suzuki S, Gotoda T, Miki K. Stomach 3D Reconstruction Using Virtual Chromoendoscopic Images. IEEE J Transl Eng Health Med 2021;9:1700211. [PMID: 33796417] [PMCID: PMC8009143] [DOI: 10.1109/jtehm.2021.3062226]
Abstract
Gastric endoscopy is a gold standard in the clinical process that enables medical practitioners to diagnose various lesions inside a patient's stomach. When a lesion is found, successfully identifying its location relative to a global view of the stomach leads to better decision-making for subsequent clinical treatment. Our previous research showed that lesion localization could be achieved by reconstructing the whole stomach shape from chromoendoscopic indigo carmine (IC) dye-sprayed images using a structure-from-motion (SfM) pipeline. However, spraying the IC dye over the whole stomach requires additional time, which is undesirable for both patients and practitioners. Our objective is to propose an alternative way to achieve whole-stomach 3D reconstruction without the IC dye. We generate virtual IC-sprayed (VIC) images using image-to-image style translation trained on unpaired real no-IC and IC-sprayed images, and we investigate the effect of input and output color-channel selection for generating the VIC images. We validate our reconstruction results by comparing them with those obtained from real IC-sprayed images and confirm that the resulting stomach 3D structures are comparable. We also propose a local reconstruction technique for obtaining a more detailed surface and texture around a region of interest. The proposed method achieves whole-stomach SfM reconstruction without real IC dye, and we found that translating no-IC green-channel images to IC-sprayed red-channel images gives the best SfM reconstruction result. Clinical impact: We offer a method for frame localization and local 3D reconstruction of a found gastric lesion using standard endoscopy images, leading to better clinical decision-making.
Affiliation(s)
- Aji Resindra Widya
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Yusuke Monno
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Masatoshi Okutomi
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Sho Suzuki
- Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Tokyo 101-8309, Japan
- Takuji Gotoda
- Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Tokyo 101-8309, Japan
- Kenji Miki
- Department of Internal Medicine, Tsujinaka Hospital Kashiwanoha, Kashiwa 277-0871, Japan
20. Lamarca J, Parashar S, Bartoli A, Montiel JMM. DefSLAM: Tracking and Mapping of Deforming Scenes From Monocular Sequences. IEEE Trans Robot 2021. [DOI: 10.1109/tro.2020.3020739]
21. Tukra S, Lidströmer N, Ashrafian H, Giannarou S. AI in Surgical Robotics. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_323-1]
22. Zhou Y, Eimen RL, Seibel EJ, Bowden AK. Cost-Efficient Video Synthesis and Evaluation for Development of Virtual 3D Endoscopy. IEEE J Transl Eng Health Med 2021;9:1800711. [PMID: 34950539] [PMCID: PMC8673697] [DOI: 10.1109/jtehm.2021.3132193]
Affiliation(s)
- Yaxuan Zhou
- Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195, USA
- Human Photonics Laboratory, Department of Mechanical Engineering, University of Washington, Seattle, WA 98195, USA
- Rachel L Eimen
- Department of Biomedical Engineering, Vanderbilt University, Nashville, TN 37232, USA
- Eric J Seibel
- Human Photonics Laboratory, Department of Mechanical Engineering, University of Washington, Seattle, WA 98195, USA
- Audrey K Bowden
- Department of Biomedical Engineering, Vanderbilt University, Nashville, TN 37232, USA
- Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37232, USA
23. Xie T, Wang K, Li R, Tang X. Visual Robot Relocalization Based on Multi-Task CNN and Image-Similarity Strategy. Sensors 2020;20:6943. [PMID: 33291774] [PMCID: PMC7730972] [DOI: 10.3390/s20236943]
Abstract
A traditional CNN for 6D robot relocalization outputs pose estimates without indicating whether the model is making sensible predictions or just guessing at random. We found that convnet representations trained on classification problems generalize well to other tasks. Thus, we propose a multi-task CNN for robot relocalization that simultaneously performs pose regression and scene recognition. Scene recognition determines whether the input image belongs to the scene in which the robot is currently located, not only reducing relocalization error but also telling us with what confidence we can trust the prediction. Meanwhile, we found that when there is a large visual difference between testing and training images, pose precision degrades. Based on this, we present the dual-level image-similarity strategy (DLISS), which consists of an initial level and an iteration level. The initial level performs feature-vector clustering on the training set and feature-vector acquisition for testing images. The iteration level, namely a PSO-based image-block selection algorithm, selects the testing images most similar to the training images based on the initial level, enabling higher pose accuracy on the testing set. Our method considers both the accuracy and the robustness of relocalization, and it operates indoors and outdoors in real time, taking at most 27 ms per frame. Finally, we evaluated our method on the Microsoft 7-Scenes dataset and the Cambridge Landmarks dataset, obtaining approximately 0.33 m and 7.51° accuracy on 7-Scenes and approximately 1.44 m and 4.83° accuracy on Cambridge Landmarks. Compared with PoseNet, our CNN reduced the average positional error by 25% and the average angular error by 27.79% on 7-Scenes, and reduced the average positional error by 40% and the average angular error by 28.55% on Cambridge Landmarks. We show that our multi-task CNN localizes from high-level features and is robust to images that are not in the current scene. Furthermore, our multi-task CNN achieves higher relocalization accuracy when using testing images selected by DLISS.
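The positional and angular errors quoted in such evaluations are typically the Euclidean distance between camera centers and the geodesic angle between rotations; a minimal implementation of both metrics is sketched below.

```python
import numpy as np

def positional_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth positions."""
    return float(np.linalg.norm(np.asarray(t_est) - np.asarray(t_gt)))

def angular_error_deg(R_est, R_gt):
    """Geodesic angle (degrees) between two 3x3 rotation matrices:
    theta = arccos((trace(R_est^T R_gt) - 1) / 2)."""
    cos_theta = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Illustrative check: a 10-degree rotation about z and a 0.3 m offset.
a = np.radians(10.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0],
               [np.sin(a),  np.cos(a), 0],
               [0, 0, 1]])
print(positional_error([0.3, 0, 0], [0, 0, 0]))   # 0.3
print(angular_error_deg(Rz, np.eye(3)))           # ~10.0
```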
Affiliation(s)
- Tao Xie
- State Key Laboratory of Robotics and System, Harbin Institute of Technology, 92 Xidazhi Street, Harbin 150006, China; (X.T.); (K.W.)
- Ke Wang
- State Key Laboratory of Robotics and System, Harbin Institute of Technology, 92 Xidazhi Street, Harbin 150006, China
- Ruifeng Li
- State Key Laboratory of Robotics and System, Harbin Institute of Technology, 92 Xidazhi Street, Harbin 150006, China
- Xinyue Tang
- MFIN, Faculty of Business and Economics, The University of Hong Kong, Pokfulam Road, Hong Kong 999077, China
24. Hartwig R, Ostler D, Feußner H, Berlet M, Yu K, Rosenthal JC, Wilhelm D. COMPASS: localization in laparoscopic visceral surgery. Current Directions in Biomedical Engineering 2020. [DOI: 10.1515/cdbme-2020-0013]
Abstract
Tracking of surgical instruments is an essential step toward modernizing the surgical workflow with a comprehensive surgical landscape guidance system (COMPASS). Real-time tracking of the laparoscopic camera used in minimally invasive surgery is required for applications in surgical workflow documentation, machine learning, image localization, and intra-operative visualization. In our approach, an inertial measurement unit (IMU) assists tool tracking in situations where no line of sight is available for infrared (IR)-based tracking of the laparoscopic camera. The novelty of this approach lies in a localization method adapted to laparoscopic visceral surgery, particularly when the line of sight is lost. It is based on IMU tracking and on the position of the trocar entry point, which acts as a remote center of motion (RCM) and thereby reduces the degrees of freedom. We developed a method to tackle localization together with a real-time tool for position and orientation estimation. The main error sources are identified and evaluated in a test scenario, which reveals that for small changes in penetration length (e.g., pivoting), the IMU's accuracy determines the error.
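Under an RCM constraint the camera pose collapses to an orientation plus an insertion depth through a fixed trocar point, which is what makes IMU-only bridging feasible. The sketch below computes a laparoscope tip position from those quantities; the frame conventions and values are illustrative assumptions.

```python
import numpy as np

def tip_from_rcm(trocar_point, R_body, insertion_mm):
    """Laparoscope tip position under a remote-center-of-motion model:
    the shaft passes through a fixed trocar point, is oriented by the
    IMU-derived rotation `R_body` (shaft axis assumed to be body z),
    and extends `insertion_mm` beyond the trocar."""
    shaft_axis = R_body @ np.array([0.0, 0.0, 1.0])
    return np.asarray(trocar_point) + insertion_mm * shaft_axis

# Illustrative usage: trocar at the abdominal wall, 15-degree pivot about x.
a = np.radians(15.0)
Rx = np.array([[1, 0, 0],
               [0, np.cos(a), -np.sin(a)],
               [0, np.sin(a),  np.cos(a)]])
trocar = np.array([100.0, 50.0, 0.0])          # mm, in the tracker frame
print(tip_from_rcm(trocar, Rx, insertion_mm=120.0))
```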
Affiliation(s)
- Regine Hartwig
- Research Group MITI, Technical University of Munich, Munich, Germany
- Daniel Ostler
- Research Group MITI, Technical University of Munich, Munich, Germany
- Hubertus Feußner
- Research Group MITI, Technical University of Munich, Munich, Germany
- Maximilian Berlet
- Research Group MITI, Technical University of Munich, Munich, Germany
- Kevin Yu
- Research Group MITI, Technical University of Munich, Munich, Germany
- Dirk Wilhelm
- Research Group MITI, Technical University of Munich, Munich, Germany
25. Chu Y, Yang X, Li H, Ai D, Ding Y, Fan J, Song H, Yang J. Multi-level feature aggregation network for instrument identification of endoscopic images. Phys Med Biol 2020;65:165004. [PMID: 32344381] [DOI: 10.1088/1361-6560/ab8dda]
Abstract
Identification of surgical instruments is crucial for understanding surgical scenarios and providing assistance in endoscopic image-guided surgery. This study proposes a novel multilevel feature-aggregated deep convolutional neural network (MLFA-Net) for identifying surgical instruments in endoscopic images. First, a global feature augmentation layer is created on the top layer of the backbone to improve the localization ability of object identification by boosting high-level semantic information in the feature flow network. Second, a modified interaction path for cross-channel features is proposed to increase the nonlinear combination of features at the same level and improve the efficiency of information propagation. Third, a multiview feature-fusion branch is built to aggregate location-sensitive information of the same level across different views, increase the information diversity of features, and enhance object localization. By utilizing this latent information, the proposed multilevel feature aggregation network accomplishes multitask instrument identification with a single network. Three tasks are handled: object detection, which classifies the instrument type and locates its border; mask segmentation, which detects the instrument shape; and pose estimation, which detects the keypoints of instrument parts. Experiments were performed on laparoscopic images from the MICCAI 2017 Endoscopic Vision Challenge, with mean average precision (AP) and average recall (AR) used to quantify segmentation and pose estimation results. For bounding-box regression, the AP and AR are 79.1% and 63.2%, respectively; for mask segmentation, 78.1% and 62.1%; and for pose estimation, 67.1% and 55.7%. The experiments demonstrate that our method efficiently improves instrument recognition accuracy in endoscopic images and outperforms other state-of-the-art methods.
Affiliation(s)
- Yakui Chu
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, People's Republic of China. The authors contributed equally to this article.
26. Application of artificial intelligence in surgery. Front Med 2020;14:417-430. [DOI: 10.1007/s11684-020-0770-0]
|
27
|
Chu Y, Li H, Li X, Ding Y, Yang X, Ai D, Chen X, Wang Y, Yang J. Endoscopic image feature matching via motion consensus and global bilateral regression. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2020; 190:105370. [PMID: 32036206 DOI: 10.1016/j.cmpb.2020.105370] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/24/2019] [Revised: 12/17/2019] [Accepted: 01/26/2020] [Indexed: 06/10/2023]
Abstract
BACKGROUND AND OBJECTIVE Feature matching of endoscopic images is of crucial importance in many clinical applications, such as object tracking and surface reconstruction. However, in the presence of low texture, specular reflections and deformations, feature matching methods designed for natural scenes face great challenges in minimally invasive surgery (MIS) scenarios. We propose a novel motion consensus-based method for endoscopic image feature matching to address these problems. METHODS Our method starts by correcting the radial distortion with a spherical projection model and removing specular reflection regions with an adaptive detection method, which helps to eliminate image distortion and to reduce the number of outliers. We solve the matching problem with a two-stage strategy that progressively estimates a consensus of inliers; the result is a precisely smoothed motion field. First, we construct a spatial motion field from candidate feature matches and estimate its maximum posterior with an expectation-maximization algorithm, which is computationally efficient and obtains a smoothed motion field quickly. Second, we extend the smoothed motion field to the affine domain and refine it with bilateral regression to preserve locally subtle motions. True matches can be identified by checking the deviation of each feature's motion from the estimated field. RESULTS Evaluations are performed on two simulated deformation datasets (218 images) and four different types of endoscopic datasets (1032 images). Our method is compared with three other state-of-the-art methods and achieves the best performance on affine transformation and nonrigid deformation simulations, with inlier ratios of 86.7% and 94.3%, sensitivities of 90.0% and 96.2%, precisions of 88.2% and 93.9%, and F1-scores of 89.1% and 95.0%, respectively. In evaluations on clinical datasets, the proposed method achieves an average reprojection error of 3.7 pixels and consistent performance in multi-image correspondence across sequential images. Furthermore, we also present a surface reconstruction result from rhinoscopic images to validate the reliability of our method, which shows high-quality feature matching results. CONCLUSIONS The proposed motion consensus-based feature matching method proves effective and robust for endoscopic image correspondence. This demonstrates its capability to generate reliable feature matches for surface reconstruction and other meaningful applications in MIS scenarios.
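The consensus test at the heart of such methods, keeping only matches whose motion agrees with a field estimated from their neighbours, can be sketched as below. A single Gaussian-weighted smoothing pass stands in for the paper's full EM and bilateral-regression pipeline; sigma and tau are illustrative.

```python
# Hedged sketch of motion-consensus inlier selection (not the paper's
# two-stage EM / bilateral regression method).
import numpy as np

def consensus_inliers(pts1, pts2, sigma=40.0, tau=5.0):
    """pts1, pts2: (N, 2) arrays of matched keypoint coordinates (pixels)."""
    motion = pts2 - pts1                                # per-match motion vector
    d2 = ((pts1[:, None, :] - pts1[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))                  # spatial affinity weights
    np.fill_diagonal(w, 0.0)                            # exclude self-support
    field = w @ motion / np.maximum(w.sum(1, keepdims=True), 1e-9)
    residual = np.linalg.norm(motion - field, axis=1)   # deviation from field
    return residual < tau                               # boolean inlier mask

rng = np.random.default_rng(0)
pts1 = rng.uniform(0, 640, (200, 2))
pts2 = pts1 + np.array([3.0, 1.0]) + rng.normal(0, 0.5, (200, 2))
pts2[:20] += rng.uniform(-60, 60, (20, 2))              # inject gross outliers
print(consensus_inliers(pts1, pts2).sum(), "inliers of", len(pts1))
```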
Collapse
Affiliation(s)
- Yakui Chu
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China
| | - Heng Li
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China.
| | - Xu Li
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China
| | - Yuan Ding
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China
| | - Xilin Yang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China
| | - Danni Ai
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China
| | - Xiaohong Chen
- Department of Otolaryngology, Head and Neck Surgery, Beijing Tongren Hospital, Beijing 100730, China
| | - Yongtian Wang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China
| | - Jian Yang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China.
| |
Collapse
|
28
|
Widya AR, Monno Y, Imahori K, Okutomi M, Suzuki S, Gotoda T, Miki K. 3D Reconstruction of Whole Stomach from Endoscope Video Using Structure-from-Motion. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2020; 2019:3900-3904. [PMID: 31946725 DOI: 10.1109/embc.2019.8857964] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Gastric endoscopy is a common clinical practice that enables medical doctors to diagnose conditions inside the stomach. In order to identify the location of a gastric lesion, such as early gastric cancer, within the stomach, this work addresses the reconstruction of the 3D shape of a whole stomach, with color texture information, from a standard monocular endoscope video. Previous works have tried to reconstruct the 3D structures of various organs from endoscope images; however, they have mainly focused on partial surfaces. In this work, we investigated how to enable structure-from-motion (SfM) to reconstruct the whole shape of a stomach from a standard endoscope video. We specifically investigated the combined effect of chromo-endoscopy and color channel selection on SfM. Our study found that 3D reconstruction of the whole stomach can be achieved by using red-channel images captured under chromo-endoscopy, with indigo carmine (IC) dye spread on the stomach surface.
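The channel-selection step is easy to probe in isolation: extract the red channel of a chromo-endoscopy frame and compare feature counts against grayscale, as a quick proxy for SfM feature density. A minimal OpenCV sketch; the file name is a placeholder.

```python
# Hedged sketch: red-channel feature counting on a chromo-endoscopy frame.
import cv2

frame = cv2.imread("chromo_frame.png")                  # BGR image (placeholder)
red = frame[:, :, 2]                                    # OpenCV is BGR: index 2 = red
sift = cv2.SIFT_create()
for name, img in [("gray", cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)),
                  ("red", red)]:
    kps, _ = sift.detectAndCompute(img, None)
    print(f"{name}: {len(kps)} keypoints")
```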
Collapse
|
29
|
Liu X, Sinha A, Ishii M, Hager GD, Reiter A, Taylor RH, Unberath M. Dense Depth Estimation in Monocular Endoscopy With Self-Supervised Learning Methods. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:1438-1447. [PMID: 31689184 PMCID: PMC7289272 DOI: 10.1109/tmi.2019.2950936] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/28/2023]
Abstract
We present a self-supervised approach to training convolutional neural networks for dense depth estimation from monocular endoscopy data without a priori modeling of anatomy or shading. Our method only requires monocular endoscopic videos and a multi-view stereo method, e.g., structure from motion, to supervise learning in a sparse manner. Consequently, our method requires neither manual labeling nor patient computed tomography (CT) scans in the training and application phases. In a cross-patient experiment using CT scans as ground truth, the proposed method achieved submillimeter mean residual error. In a comparison with recent self-supervised depth estimation methods designed for natural video, on in vivo sinus endoscopy data, we demonstrate that the proposed approach outperforms the previous methods by a large margin. The source code for this work is publicly available online at https://github.com/lppllppl920/EndoscopyDepthEstimation-Pytorch.
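Sparse supervision of this kind penalizes predicted depth only where SfM produced a 3D point, and only up to an unknown global scale. A hedged PyTorch sketch (not the authors' exact loss):

```python
# Hedged sketch of sparse, scale-invariant depth supervision from SfM points.
import torch

def sparse_depth_loss(pred, sfm_depth, mask, eps=1e-6):
    """pred, sfm_depth: (B, 1, H, W); mask: 1 where an SfM point projects."""
    p = pred[mask > 0].clamp_min(eps)
    s = sfm_depth[mask > 0].clamp_min(eps)
    scale = (s / p).median()                   # align the unknown global scale
    return (torch.log(p * scale) - torch.log(s)).abs().mean()

pred = torch.rand(2, 1, 8, 8) + 0.1
sfm = 2.5 * pred.detach()                      # same structure, different scale
mask = torch.zeros(2, 1, 8, 8)
mask[..., ::2, ::2] = 1.0                      # sparse projected SfM points
print(sparse_depth_loss(pred, sfm, mask))      # ~0: the loss ignores scale
```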
Collapse
|
30
|
Qiu L, Ren H. Endoscope navigation with SLAM-based registration to computed tomography for transoral surgery. INTERNATIONAL JOURNAL OF INTELLIGENT ROBOTICS AND APPLICATIONS 2020. [DOI: 10.1007/s41315-020-00127-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
|
31
|
Zhou H, Jagadeesan J. Real-Time Dense Reconstruction of Tissue Surface From Stereo Optical Video. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:400-412. [PMID: 31283478 PMCID: PMC6946894 DOI: 10.1109/tmi.2019.2927436] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
We propose an approach to reconstruct a dense three-dimensional (3D) model of a tissue surface from stereo optical videos in real time. The basic idea is to first extract 3D information from video frames by using stereo matching, and then to mosaic the reconstructed 3D models. To handle the common low-texture regions on tissue surfaces, we propose effective post-processing steps for the local stereo matching method to enlarge the radius of constraint, including outlier removal, hole filling, and smoothing. Since the tissue models obtained by stereo matching are limited to the field of view of the imaging modality, we propose a model mosaicking method that uses a novel feature-based simultaneous localization and mapping (SLAM) method to align the models. Low-texture regions and varying illumination conditions may lead to a large percentage of feature matching outliers. To solve this problem, we propose several algorithms to improve the robustness of the SLAM, which mainly include 1) a histogram voting-based method to roughly select possible inliers from the feature matching results; 2) a novel 1-point RANSAC-based PnP algorithm, DynamicR1PPnP, to track the camera motion; and 3) a GPU-based iterative closest points (ICP) and bundle adjustment (BA) method to refine the camera motion estimation results. Experimental results on ex vivo and in vivo data showed that the reconstructed 3D models have high-resolution texture with an accuracy error of less than 2 mm. Most algorithms are highly parallelized for GPU computation, and the average runtime for processing one key frame is 76.3 ms on stereo images with 960×540 resolution.
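The patch score underpinning this kind of stereo matching is typically zero-mean normalized cross-correlation (ZNCC), which is invariant to local gain and offset and therefore robust to the illumination changes typical of endoscopy. A minimal sketch:

```python
# Hedged sketch of the ZNCC patch score; 1.0 means a perfect match.
import numpy as np

def zncc(a, b, eps=1e-9):
    """ZNCC between two equally sized patches."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

patch = np.random.rand(11, 11)
print(zncc(patch, patch))              # 1.0
print(zncc(patch, 0.5 * patch + 0.2))  # ~1.0: invariant to gain and offset
```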
Collapse
|
32
|
Furukawa R, Nagamatsu G, Oka S, Kotachi T, Okamoto Y, Tanaka S, Kawasaki H. Simultaneous shape and camera-projector parameter estimation for 3D endoscopic system using CNN-based grid-oneshot scan. Healthc Technol Lett 2019; 6:249-254. [PMID: 32038866 PMCID: PMC6943237 DOI: 10.1049/htl.2019.0070] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2019] [Accepted: 10/02/2019] [Indexed: 11/20/2022] Open
Abstract
For effective in situ endoscopic diagnosis and treatment, measurement of polyp sizes is important. For this purpose, 3D endoscopic systems have been researched. Among such systems, an active stereo technique, which projects a special pattern wherein each feature is coded, is a promising approach because of its simplicity and high precision. However, previous works based on this approach have two problems. First, the quality of 3D reconstruction depended on the stability of feature extraction from the images captured by the endoscope camera. Second, due to the limited pattern projection area, the reconstructed region was relatively small. In this Letter, the authors propose a learning-based technique using convolutional neural networks to solve the first problem, and an extended bundle adjustment technique, which integrates multiple shapes into a consistent single shape, to address the second. The effectiveness of the proposed techniques compared to previous techniques was evaluated experimentally.
Collapse
Affiliation(s)
- Ryo Furukawa
- Graduate School of Information Sciences, Hiroshima City University, Hiroshima, Japan
| | - Genki Nagamatsu
- Graduate School and Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
| | - Shiro Oka
- Department of Gastroenterology & Metabolism, Hiroshima University Hospital, Hiroshima, Japan
| | - Takahiro Kotachi
- Department of Endoscopy, Hiroshima University Hospital, Hiroshima, Japan
| | - Yuki Okamoto
- Department of Endoscopy, Hiroshima University Hospital, Hiroshima, Japan
| | - Shinji Tanaka
- Department of Endoscopy, Hiroshima University Hospital, Hiroshima, Japan
| | - Hiroshi Kawasaki
- Graduate School and Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
| |
Collapse
|
33
|
Widya AR, Monno Y, Okutomi M, Suzuki S, Gotoda T, Miki K. Whole Stomach 3D Reconstruction and Frame Localization From Monocular Endoscope Video. IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE 2019; 7:3300310. [PMID: 32309059 PMCID: PMC6830857 DOI: 10.1109/jtehm.2019.2946802] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/23/2019] [Revised: 09/03/2019] [Accepted: 09/25/2019] [Indexed: 12/22/2022]
Abstract
Gastric endoscopy is a common clinical practice that enables medical doctors to diagnose various lesions inside a stomach. In order to identify the location of a gastric lesion, such as early cancer or a peptic ulcer, within the stomach, this work addresses reconstructing the color-textured 3D model of a whole stomach from a standard monocular endoscope video and localizing any selected video frame with respect to the 3D model. We examine how to enable structure-from-motion (SfM) to reconstruct the whole shape of a stomach from endoscope images, which is a challenging task due to the texture-less nature of the stomach surface. We specifically investigate the combined effect of chromo-endoscopy and color channel selection on SfM to increase the number of feature points. We also design a plane fitting-based algorithm for 3D point outlier removal to improve the 3D model quality. We show that whole-stomach 3D reconstruction can be achieved (more than 90% of the frames can be reconstructed) by using red-channel images captured under chromo-endoscopy, with indigo carmine (IC) dye spread on the stomach surface. In experimental results, we demonstrate the reconstructed 3D models for seven subjects and the application to lesion localization and reconstruction. The methodology and results presented in this paper could offer a valuable reference for other researchers and an excellent tool for gastric surgeons in various computer-aided diagnosis applications.
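A plane-fitting outlier filter of the kind mentioned can be sketched with a plain RANSAC plane fit that drops points far from the dominant plane. The thresholds below are illustrative; the paper's actual algorithm is more elaborate.

```python
# Hedged sketch of RANSAC plane fitting for 3D point outlier removal.
import numpy as np

def ransac_plane_inliers(points, iters=500, thresh=2.0, seed=0):
    """points: (N, 3). Returns a boolean mask of points near the best plane."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue                          # degenerate (collinear) sample
        dist = np.abs((points - sample[0]) @ (n / norm))
        mask = dist < thresh                  # point-to-plane distance test
        if mask.sum() > best.sum():
            best = mask
    return best

rng = np.random.default_rng(1)
surface = np.c_[rng.uniform(0, 100, (300, 2)), rng.normal(0, 0.5, 300)]
outliers = rng.uniform(0, 100, (40, 3))
pts = np.vstack([surface, outliers])
print(ransac_plane_inliers(pts).sum(), "of", len(pts), "points kept")
```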
Collapse
Affiliation(s)
- Aji Resindra Widya
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
| | - Yusuke Monno
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
| | - Masatoshi Okutomi
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
| | - Sho Suzuki
- Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Tokyo 101-8309, Japan
| | - Takuji Gotoda
- Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Tokyo 101-8309, Japan
| | - Kenji Miki
- Department of Internal Medicine, Tsujinaka Hospital Kashiwanoha, Kashiwa 277-0871, Japan
| |
Collapse
|
34
|
Mahmoud N, Collins T, Hostettler A, Soler L, Doignon C, Montiel JMM. Live Tracking and Dense Reconstruction for Handheld Monocular Endoscopy. IEEE TRANSACTIONS ON MEDICAL IMAGING 2019; 38:79-89. [PMID: 30010552 DOI: 10.1109/tmi.2018.2856109] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Contemporary endoscopic simultaneous localization and mapping (SLAM) methods accurately compute endoscope poses; however, they only provide a sparse 3-D reconstruction that poorly describes the surgical scene. We propose a novel dense SLAM method whose qualities are: 1) monocular, requiring only RGB images of a handheld monocular endoscope; 2) fast, providing endoscope positional tracking and 3-D scene reconstruction, running in parallel threads; 3) dense, yielding an accurate dense reconstruction; 4) robust, to the severe illumination changes, poor texture and small deformations that are typical in endoscopy; and 5) self-contained, without needing fiducials or external tracking devices, and therefore it can be smoothly integrated into the surgical workflow. It works as follows. The system segments the video into clusters of frames according to parallax criteria, and accurate cluster frame poses are estimated using the sparse SLAM feature matches. Next, dense matches between cluster frames are computed in parallel by a variational approach that combines zero-mean normalized cross-correlation and a gradient Huber-norm regularizer. This combination copes with challenging lighting and textures at an affordable time budget on a modern GPU. It can outperform pure stereo reconstructions, because the frame clusters can provide larger parallax from the endoscope's motion. We provide an extensive experimental validation on real sequences of the porcine abdominal cavity, both in vivo and ex vivo. We also show a qualitative evaluation on human liver. In addition, we show a comparison with other dense SLAM methods, showing the performance gain in terms of accuracy, density, and computation time.
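A variational energy of the kind described, combining a ZNCC data term with a Huber-regularized depth gradient, might be written as follows. The notation is ours, assuming a reference image I_r, a cluster image I_k, a per-pixel depth map d, and a warp π; it is a sketch of the general form, not the authors' exact formulation.

```latex
% Hedged sketch: ZNCC data term plus Huber-regularized depth gradient.
E(d) = \sum_{\mathbf{x}} \Big( 1 - \mathrm{ZNCC}\big(I_r(\mathbf{x}),\,
       I_k(\pi(\mathbf{x}, d(\mathbf{x})))\big) \Big)
     + \lambda \sum_{\mathbf{x}} \lVert \nabla d(\mathbf{x}) \rVert_{\epsilon},
\qquad
\lVert z \rVert_{\epsilon} =
\begin{cases}
  \lVert z \rVert^{2} / (2\epsilon) & \lVert z \rVert \le \epsilon \\
  \lVert z \rVert - \epsilon/2      & \text{otherwise}
\end{cases}
```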
Collapse
|
35
|
Mahmood F, Chen R, Durr NJ. Unsupervised Reverse Domain Adaptation for Synthetic Medical Images via Adversarial Training. IEEE TRANSACTIONS ON MEDICAL IMAGING 2018; 37:2572-2581. [PMID: 29993538 DOI: 10.1109/tmi.2018.2842767] [Citation(s) in RCA: 97] [Impact Index Per Article: 16.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]
Abstract
To realize the full potential of deep learning for medical imaging, large annotated datasets are required for training. Such datasets are difficult to acquire due to privacy issues, lack of experts available for annotation, underrepresentation of rare conditions, and poor standardization. The lack of annotated data has been addressed in conventional vision applications using synthetic images refined via unsupervised adversarial training to look like real images. However, this approach is difficult to extend to general medical imaging because of the complex and diverse set of features found in real human tissues. We propose a novel framework that uses a reverse flow, where adversarial training is used to make real medical images more like synthetic images, and clinically-relevant features are preserved via self-regularization. These domain-adapted synthetic-like images can then be accurately interpreted by networks trained on large datasets of synthetic medical images. We implement this approach on the notoriously difficult task of depth-estimation from monocular endoscopy which has a variety of applications in colonoscopy, robotic surgery, and invasive endoscopic procedures. We train a depth estimator on a large data set of synthetic images generated using an accurate forward model of an endoscope and an anatomically-realistic colon. Our analysis demonstrates that the structural similarity of endoscopy depth estimation in a real pig colon predicted from a network trained solely on synthetic data improved by 78.7% by using reverse domain adaptation.
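The reverse-flow idea, mapping real frames toward the synthetic domain with an adversarial term while an L1 self-regularization preserves image content, can be sketched as a generator loss. A hedged PyTorch sketch; lambda_reg is illustrative and D, G are any discriminator/generator pair.

```python
# Hedged sketch of adversarial domain adaptation with self-regularization.
import torch
import torch.nn.functional as F

def generator_loss(D, G, real, lambda_reg=10.0):
    """Push real frames toward the synthetic domain while an L1 term
    keeps the (clinically relevant) image content close to the input."""
    fake_synth = G(real)                       # real -> synthetic-like image
    logits = D(fake_synth)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    reg = F.l1_loss(fake_synth, real)          # self-regularization term
    return adv + lambda_reg * reg
```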
Collapse
|
36
|
Speers AD, Ma B, Jarnagin WR, Himidan S, Simpson AL, Wildes RP. Fast and accurate vision-based stereo reconstruction and motion estimation for image-guided liver surgery. Healthc Technol Lett 2018; 5:208-214. [PMID: 30464852 PMCID: PMC6222177 DOI: 10.1049/htl.2018.5071] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2018] [Accepted: 08/20/2018] [Indexed: 11/25/2022] Open
Abstract
Image-guided liver surgery aims to enhance the precision of resection and ablation by providing fast localisation of tumours and adjacent complex vasculature to improve oncologic outcome. This Letter presents a novel end-to-end solution for fast stereo reconstruction and motion estimation that demonstrates high accuracy with phantom and clinical data. The authors’ computationally efficient coarse-to-fine (CTF) stereo approach facilitates liver imaging by accounting for low texture regions, enabling precise three-dimensional (3D) boundary recovery through the use of adaptive windows and utilising a robust 3D motion estimator to reject spurious data. To the best of their knowledge, theirs is the only adaptive CTF matching approach to reconstruction and motion estimation that registers time series of reconstructions to a single key frame for registration to a volumetric computed tomography scan. The system is evaluated empirically in controlled laboratory experiments with a liver phantom and motorised stages for precise quantitative evaluation. Additional evaluation is provided through testing with patient data during liver resection.
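Coarse-to-fine stereo of this flavour seeds matching at a downsampled resolution and propagates disparities upward, doubling them at each level. A minimal sketch with OpenCV's block matcher; the paper's adaptive windows and per-level refinement are omitted.

```python
# Hedged sketch of coarse-to-fine disparity seeding (no per-level refinement).
import cv2
import numpy as np

def ctf_disparity(left_gray, right_gray, levels=3):
    """left_gray, right_gray: uint8 grayscale rectified stereo pair."""
    pyr_l, pyr_r = [left_gray], [right_gray]
    for _ in range(levels - 1):
        pyr_l.append(cv2.pyrDown(pyr_l[-1]))
        pyr_r.append(cv2.pyrDown(pyr_r[-1]))
    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disp = matcher.compute(pyr_l[-1], pyr_r[-1]).astype(np.float32) / 16.0
    for lvl in range(levels - 2, -1, -1):       # walk back to full resolution
        h, w = pyr_l[lvl].shape[:2]
        disp = 2.0 * cv2.resize(disp, (w, h))   # upsample; disparity doubles
    return disp
```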
Collapse
Affiliation(s)
- Andrew D Speers
- Department of Electrical Engineering and Computer Science, York University, Toronto, ON, Canada
| | - Burton Ma
- Department of Electrical Engineering and Computer Science, York University, Toronto, ON, Canada
| | - William R Jarnagin
- Hepatopancreatobiliary Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Sharifa Himidan
- Department of Surgery, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Surgery, University of Toronto, Toronto, ON, Canada
| | - Amber L Simpson
- Hepatopancreatobiliary Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Richard P Wildes
- Department of Electrical Engineering and Computer Science, York University, Toronto, ON, Canada
| |
Collapse
|
37
|
Wide-Area Shape Reconstruction by 3D Endoscopic System Based on CNN Decoding, Shape Registration and Fusion. ACTA ACUST UNITED AC 2018. [DOI: 10.1007/978-3-030-01201-4_16] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]
|
38
|
Chen L, Tang W, John NW, Wan TR, Zhang JJ. SLAM-based dense surface reconstruction in monocular Minimally Invasive Surgery and its application to Augmented Reality. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2018; 158:135-146. [PMID: 29544779 DOI: 10.1016/j.cmpb.2018.02.006] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/14/2017] [Revised: 01/03/2018] [Accepted: 02/02/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND OBJECTIVE While Minimally Invasive Surgery (MIS) offers considerable benefits to patients, it also imposes significant challenges on a surgeon's performance due to well-known issues and restrictions associated with the field of view (FOV), hand-eye misalignment and disorientation, as well as the lack of stereoscopic depth perception in monocular endoscopy. Augmented Reality (AR) technology can help to overcome these limitations by augmenting the real scene with annotations, labels, tumour measurements or even a 3D reconstruction of anatomical structures at the target surgical locations. However, previous attempts to use AR technology in monocular MIS surgical scenes have mainly focused on information overlay without addressing correct spatial calibration, which can lead to incorrect localization of annotations and labels, and inaccurate depth cues and tumour measurements. In this paper, we present a novel intra-operative dense surface reconstruction framework that is capable of providing geometry information from only monocular MIS videos for geometry-aware AR applications such as site measurements and depth cues. We address a number of compelling issues in augmenting a scene for a monocular MIS environment, such as drifting and inaccurate planar mapping. METHODS A state-of-the-art Simultaneous Localization And Mapping (SLAM) algorithm used in robotics has been extended to deal with monocular MIS surgical scenes for reliable endoscopic camera tracking and salient point mapping. A robust global 3D surface reconstruction framework has been developed for building a dense surface using only the unorganized sparse point clouds extracted from the SLAM. The 3D surface reconstruction framework employs the Moving Least Squares (MLS) smoothing algorithm and the Poisson surface reconstruction framework for real-time processing of the point cloud data set. Finally, the 3D geometric information of the surgical scene allows better understanding of the scene and accurate placement of AR augmentations based on a robust 3D calibration. RESULTS We demonstrate the clinical relevance of our proposed system through two examples: (a) measurement of the surface; (b) depth cues in monocular endoscopy. The performance and accuracy evaluations of the proposed framework consist of two steps. First, we created a computer-generated endoscopy simulation video to quantify the accuracy of the camera tracking by comparing the results of the video camera tracking with the recorded ground-truth camera trajectories. The accuracy of the surface reconstruction is assessed by evaluating the Root Mean Square Distance (RMSD) of the surface vertices of the reconstructed mesh against the ground-truth 3D models. An error of 1.24 mm for the camera trajectories has been obtained, and the RMSD for surface reconstruction is 2.54 mm, which compares favourably with previous approaches. Second, in vivo laparoscopic videos are used to examine the quality of accurate AR-based annotation and measurement, and the creation of depth cues. These results show the potential promise of our geometry-aware AR technology for use in MIS surgical scenes. CONCLUSIONS The results show that the new framework is robust and accurate in dealing with challenging situations such as rapid endoscope camera movements in monocular MIS scenes. Both camera tracking and surface reconstruction based on a sparse point cloud are effective and operate in real time. This demonstrates the potential of our algorithm for accurate AR localization and depth augmentation with geometric cues and correct surface measurements in MIS with monocular endoscopes.
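The sparse-points-to-dense-surface step (normal estimation followed by Poisson reconstruction, with smoothing playing the denoising role that MLS plays in the paper) can be sketched with Open3D as a stand-in. File names and parameters are placeholders.

```python
# Hedged sketch: sparse SLAM point cloud -> dense surface mesh via Poisson.
import open3d as o3d

pcd = o3d.io.read_point_cloud("slam_points.ply")       # placeholder input
pcd = pcd.voxel_down_sample(voxel_size=1.0)            # thin noisy clusters
pcd.estimate_normals(
    o3d.geometry.KDTreeSearchParamHybrid(radius=5.0, max_nn=30))
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=8)                                      # densities discarded
o3d.io.write_triangle_mesh("surface.ply", mesh)
```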
Collapse
|
39
|
A non-rigid map fusion-based direct SLAM method for endoscopic capsule robots. INTERNATIONAL JOURNAL OF INTELLIGENT ROBOTICS AND APPLICATIONS 2017; 1:399-409. [PMID: 29250588 PMCID: PMC5727175 DOI: 10.1007/s41315-017-0036-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2017] [Accepted: 11/06/2017] [Indexed: 02/07/2023]
Abstract
Since the development of capsule endoscopy technology, medical device companies and research groups have made significant progress in turning passive capsule endoscopes into robotic active capsule endoscopes. However, the use of robotic capsules in endoscopy still faces some challenges. One such challenge is the precise localization of the actively controlled robot in real time. In this paper, we propose a non-rigid map fusion-based direct simultaneous localization and mapping method for endoscopic capsule robots. The proposed method achieves high accuracy in extensive evaluations of pose estimation and map reconstruction performed on a non-rigid, realistic surgical EsophagoGastroDuodenoscopy simulator, and outperforms state-of-the-art methods.
Collapse
|
40
|
Marmol A, Peynot T, Eriksson A, Jaiprakash A, Roberts J, Crawford R. Evaluation of Keypoint Detectors and Descriptors in Arthroscopic Images for Feature-Based Matching Applications. IEEE Robot Autom Lett 2017. [DOI: 10.1109/lra.2017.2714150] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
41
|
Chen L, Tang W, John NW. Real-time geometry-aware augmented reality in minimally invasive surgery. Healthc Technol Lett 2017; 4:163-167. [PMID: 29184658 PMCID: PMC5683199 DOI: 10.1049/htl.2017.0068] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2017] [Accepted: 07/31/2017] [Indexed: 11/25/2022] Open
Abstract
The potential of augmented reality (AR) technology to assist minimally invasive surgery (MIS) lies in its computational performance and accuracy in dealing with challenging MIS scenes. Even with the latest hardware and software technologies, achieving both real-time and accurate augmented information overlay in MIS is still a formidable task. In this Letter, the authors present a novel real-time AR framework for MIS that achieves interactive geometry-aware AR in endoscopic surgery with stereo views. The authors' framework tracks the movement of the endoscopic camera and simultaneously reconstructs a dense geometric mesh of the MIS scene. The movement of the camera is predicted by minimising the re-projection error to achieve fast tracking performance, while the three-dimensional mesh is incrementally built by a dense zero-mean normalised cross-correlation stereo-matching method to improve the accuracy of the surface reconstruction. The proposed system does not require any prior template or pre-operative scan and can infer the geometric information intra-operatively in real time. With the geometric information available, the proposed AR framework is able to interactively add annotations, tumour and vessel localisation, and measurement labels with greater precision and accuracy than state-of-the-art approaches.
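Pose tracking by re-projection error minimisation is commonly implemented with a PnP solver over map-point correspondences. A hedged OpenCV sketch; K and the correspondences are assumed to come from the tracking front end, and the thresholds are illustrative.

```python
# Hedged sketch of camera tracking by minimising re-projection error.
import cv2
import numpy as np

def track_camera(map_pts3d, obs_pts2d, K):
    """map_pts3d: (N, 3) float32; obs_pts2d: (N, 2) float32; K: 3x3 float64."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        map_pts3d, obs_pts2d, K, None,                 # None: no distortion
        reprojectionError=3.0, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("pose tracking failed")
    proj, _ = cv2.projectPoints(map_pts3d, rvec, tvec, K, None)
    err = np.linalg.norm(proj.reshape(-1, 2) - obs_pts2d, axis=1).mean()
    return rvec, tvec, err                             # pose and mean error
```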
Collapse
Affiliation(s)
- Long Chen
- Department of Creative Technology, Bournemouth University, Poole, UK
| | - Wen Tang
- Department of Creative Technology, Bournemouth University, Poole, UK
| | - Nigel W. John
- Department of Computer Science, University of Chester, Chester, UK
| |
Collapse
|
42
|
Deray J, Sola J, Andrade-Cetto J. Word Ordering and Document Adjacency for Large Loop Closure Detection in 2-D Laser Maps. IEEE Robot Autom Lett 2017. [DOI: 10.1109/lra.2017.2657796] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
43
|
Furukawa R, Sanomura Y, Tanaka S, Yoshida S, Sagawa R, Visentini-Scarzanella M, Kawasaki H. 3D endoscope system using DOE projector. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2017; 2016:2091-2094. [PMID: 28268743 DOI: 10.1109/embc.2016.7591140] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
For effective in situ endoscopic diagnosis and treatment, size measurement and shape characterization of lesions, such as tumors, are important. For this purpose, in the past we have developed a range of 3D endoscopic systems based on active stereo to measure the shape and size of living tissues. In those works, the main shortcoming was that the target area could only be reconstructed at a specific distance from the scope because of off-focus blurring effects and aberrations in the periphery of the field of view. These issues were compounded by reconstruction instability due to the strong subsurface scattering common in internal tissue. In this paper, we tackle these shortcomings by developing a new micro pattern laser projector to be inserted into the scope's tool channel. The new projector uses a Diffractive Optical Element (DOE) instead of a single lens, which solves the off-focus blur. We also propose a new line-based grid pattern with gap coding to counter the subsurface scattering effect. In our experiments on ex vivo human tumor samples, we show that the tissue shapes were successfully reconstructed regardless of depth variance and strong subsurface scattering effects.
Collapse
|
44
|
The status of augmented reality in laparoscopic surgery as of 2016. Med Image Anal 2017; 37:66-90. [DOI: 10.1016/j.media.2017.01.007] [Citation(s) in RCA: 183] [Impact Index Per Article: 26.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2016] [Revised: 01/16/2017] [Accepted: 01/23/2017] [Indexed: 12/27/2022]
|
45
|
Lurie KL, Angst R, Zlatev DV, Liao JC, Ellerbee Bowden AK. 3D reconstruction of cystoscopy videos for comprehensive bladder records. BIOMEDICAL OPTICS EXPRESS 2017; 8:2106-2123. [PMID: 28736658 PMCID: PMC5516821 DOI: 10.1364/boe.8.002106] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/16/2016] [Revised: 02/04/2017] [Accepted: 02/04/2017] [Indexed: 05/06/2023]
Abstract
White light endoscopy is widely used for diagnostic imaging of the interior of organs and body cavities, but the inability to correlate individual 2D images with 3D organ morphology limits its utility for quantitative or longitudinal studies of disease physiology or cancer surveillance. As a result, most endoscopy videos, which carry enormous data potential, are used only for real-time guidance and are discarded after collection. We present a computational method to reconstruct and visualize a 3D model of organs from an endoscopic video that captures the shape and surface appearance of the organ. A key aspect of our strategy is the use of advanced computer vision techniques and unmodified, clinical-grade endoscopy hardware with few constraints on the image acquisition protocol, which presents a low barrier to clinical translation. We validate the accuracy and robustness of our reconstruction and co-registration method using cystoscopy videos from tissue-mimicking bladder phantoms and show clinical utility during cystoscopy in the operating room for bladder cancer evaluation. As our method can powerfully augment the visual medical record of the appearance of internal organs, it is broadly applicable to endoscopy and represents a significant advance in cancer surveillance opportunities for big-data cancer research.
Collapse
Affiliation(s)
- Kristen L. Lurie
- Dept. of Electrical Engineering, Stanford University, Stanford, CA, USA
- Dept. of Urology, Stanford University, Stanford, CA, USA
| | | | | | - Joseph C. Liao
- Dept. of Urology, Stanford University, Stanford, CA, USA
- Corresponding author:
| | | |
Collapse
|
46
|
ORBSLAM-Based Endoscope Tracking and 3D Reconstruction. COMPUTER-ASSISTED AND ROBOTIC ENDOSCOPY 2017. [DOI: 10.1007/978-3-319-54057-3_7] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
|
47
|
Cadena C, Carlone L, Carrillo H, Latif Y, Scaramuzza D, Neira J, Reid I, Leonard JJ. Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age. IEEE T ROBOT 2016. [DOI: 10.1109/tro.2016.2624754] [Citation(s) in RCA: 1565] [Impact Index Per Article: 195.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
48
|
Furukawa R, Aoyama M, Hiura S, Aoki H, Kominami Y, Sanomura Y, Yoshida S, Tanaka S, Sagawa R, Kawasaki H. Calibration of a 3D endoscopic system based on active stereo method for shape measurement of biological tissues and specimen. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2016; 2014:4991-4. [PMID: 25571113 DOI: 10.1109/embc.2014.6944745] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
For endoscopic medical treatment, measuring the size and shape of a lesion, such as a tumor, is important for improving diagnostic accuracy. We are developing a system to measure the shapes and sizes of living tissue by an active stereo method, using a normal endoscope to which a micro pattern projector is attached. In order to perform 3D reconstruction, the intrinsic and extrinsic parameters of the endoscopic camera and the pattern projector must be estimated; calibration of the pattern projector is particularly difficult. In this paper, we propose a method for simultaneous estimation of both the intrinsic and extrinsic parameters of the pattern projector. This simplifies the calibration procedure required in practical scenes. Furthermore, we have developed an efficient user interface to intuitively operate the calibration and reconstruction procedures. Using the developed system, we measured the shape of internal tissue of a human soft palate and of a biological specimen.
Collapse
|
49
|
On-patient see-through augmented reality based on visual SLAM. Int J Comput Assist Radiol Surg 2016; 12:1-11. [DOI: 10.1007/s11548-016-1444-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2016] [Accepted: 06/07/2016] [Indexed: 11/26/2022]
|
50
|
Shape Acquisition and Registration for 3D Endoscope Based on Grid Pattern Projection. COMPUTER VISION – ECCV 2016 2016. [DOI: 10.1007/978-3-319-46466-4_24] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
|