1
Yang Z, Dai J, Pan J. 3D reconstruction from endoscopy images: A survey. Comput Biol Med 2024; 175:108546. [PMID: 38704902 DOI: 10.1016/j.compbiomed.2024.108546] [Received: 11/15/2023] [Revised: 01/05/2024] [Accepted: 04/28/2024] [Indexed: 05/07/2024]
Abstract
Three-dimensional reconstruction of images acquired through endoscopes plays a vital role in a growing number of medical applications. Endoscopes used in the clinic are commonly classified as monocular or binocular. We review depth estimation methods according to the type of endoscope. Fundamentally, depth estimation relies on image feature matching and multi-view geometry, but these traditional techniques face many problems in the endoscopic environment. With the continuing development of deep learning, a growing number of learning-based methods address challenges such as inconsistent illumination and texture sparsity. We have reviewed over 170 papers published in the 10 years from 2013 to 2023. Commonly used public datasets and performance metrics are summarized. We also give a taxonomy of methods and analyze the advantages and drawbacks of the algorithms. Summary tables and a results atlas are provided to facilitate comparison of the qualitative and quantitative performance of different methods in each category. In addition, we summarize commonly used scene representation methods in endoscopy and speculate on the prospects of depth estimation research in medical applications. We also compare the robustness, processing time, and scene representation of the methods to help doctors and researchers select appropriate methods for their surgical applications.
Affiliation(s)
- Zhuoyue Yang
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, 37 Xueyuan Road, Haidian District, Beijing, 100191, China; Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China
- Ju Dai
- Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China
- Junjun Pan
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, 37 Xueyuan Road, Haidian District, Beijing, 100191, China; Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China.
2
Ruano J, Gómez M, Romero E, Manzanera A. Leveraging a realistic synthetic database to learn Shape-from-Shading for estimating the colon depth in colonoscopy images. Comput Med Imaging Graph 2024; 115:102390. [PMID: 38714018 DOI: 10.1016/j.compmedimag.2024.102390] [Received: 08/14/2023] [Revised: 03/30/2024] [Accepted: 04/25/2024] [Indexed: 05/09/2024]
Abstract
Colonoscopy is the procedure of choice to diagnose, screen for, and treat cancer of the colon and rectum, from early detection of small precancerous lesions (polyps) to confirmation of malignant masses. However, the high variability of the organ's appearance and the complex shapes of both the colon wall and the structures of interest make this exploration difficult. Learned visuospatial and perceptual abilities mitigate technical limitations in clinical practice through proper estimation of intestinal depth. This work introduces a novel methodology to estimate colon depth maps for single frames of monocular colonoscopy videos. The generated depth map is inferred from the shading variation of the colon wall with respect to the light source, as learned from a realistic synthetic database. Briefly, a classic convolutional neural network architecture is trained from scratch to estimate the depth map, improving sharp depth estimation in haustral folds and polyps with a custom loss function that minimizes the estimation error at edges and curvatures. The network was trained on a custom synthetic colonoscopy database constructed and released with this work, composed of 248,400 frames (47 videos) with pixel-level depth annotations. This collection comprises 5 subsets of videos with progressively higher levels of visual complexity. Evaluation of the depth estimation on the synthetic database reached a threshold accuracy of 95.65% and a mean RMSE of 0.451 cm, while a qualitative assessment on a real database showed consistent depth estimations, visually evaluated by the expert gastroenterologist coauthoring this paper. Finally, the method achieved competitive performance with respect to another state-of-the-art method on a public synthetic database, and comparable results to five other state-of-the-art methods on a set of images. Additionally, three-dimensional reconstructions demonstrated useful approximations of the gastrointestinal tract geometry.
Code for reproducing the reported results and the dataset are available at https://github.com/Cimalab-unal/ColonDepthEstimation.
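The threshold accuracy and RMSE quoted above are standard monocular-depth metrics. As a sketch of how they are commonly computed (function and variable names here are illustrative, not taken from the paper's released code):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard depth-estimation metrics over valid ground-truth pixels:
    threshold accuracy (fraction with max(pred/gt, gt/pred) < 1.25) and
    root-mean-square error in the depth units used (e.g. cm)."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    valid = gt > 0                      # ignore pixels without ground truth
    pred, gt = pred[valid], gt[valid]
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = float(np.mean(ratio < 1.25))
    rmse = float(np.sqrt(np.mean((pred - gt) ** 2)))
    return delta1, rmse

gt = np.array([[1.0, 2.0], [3.0, 0.0]])   # 0 marks an invalid pixel
pred = np.array([[1.1, 2.0], [6.0, 9.9]])
d1, rmse = depth_metrics(pred, gt)
```

Papers differ on the exact threshold (1.25 vs 1.25² vs 1.25³) and on whether RMSE is averaged per frame or over all pixels, so reported numbers are only comparable when these conventions match.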
Affiliation(s)
- Josué Ruano
- Computer Imaging and Medical Applications Laboratory (CIM@LAB), Universidad Nacional de Colombia, 111321, Bogotá, Colombia
- Martín Gómez
- Unidad de Gastroenterología, Hospital Universitario Nacional, 111321, Bogotá, Colombia
- Eduardo Romero
- Computer Imaging and Medical Applications Laboratory (CIM@LAB), Universidad Nacional de Colombia, 111321, Bogotá, Colombia.
- Antoine Manzanera
- Unité d'Informatique et d'Ingénierie des Systèmes (U2IS), ENSTA Paris, Institut Polytechnique de Paris, Palaiseau, 91762, Île-de-France, France
3
Schmidt A, Mohareri O, DiMaio S, Yip MC, Salcudean SE. Tracking and mapping in medical computer vision: A review. Med Image Anal 2024; 94:103131. [PMID: 38442528 DOI: 10.1016/j.media.2024.103131] [Received: 10/16/2023] [Revised: 02/08/2024] [Accepted: 02/29/2024] [Indexed: 03/07/2024]
Abstract
As computer vision algorithms increase in capability, their applications in clinical systems will become more pervasive. These applications include diagnostics, such as colonoscopy and bronchoscopy; guiding biopsies, minimally invasive interventions, and surgery; automating instrument motion; and providing image guidance using pre-operative scans. Many of these applications depend on the specific visual nature of medical scenes and require algorithms designed to perform in this environment. In this review, we provide an update on the field of camera-based tracking and scene mapping in surgery and diagnostics in medical computer vision. We begin by describing our review process, which results in a final list of 515 papers. We then give a high-level summary of the state of the art and provide relevant background for those who need tracking and mapping for their clinical applications. Next, we review datasets provided in the field and the clinical needs that motivate their design. We then delve into the algorithmic side and summarize recent developments; this summary should be especially useful for algorithm designers and those looking to understand the capability of off-the-shelf methods. We focus on algorithms for deformable environments while also reviewing the essential building blocks of rigid tracking and mapping, since there is a large amount of crossover in methods. With the field summarized, we discuss the current state of tracking and mapping methods along with the needs for future algorithms, the need for quantification, and the viability of clinical applications. We then provide some research directions and questions. We conclude that new methods need to be designed or combined to support clinical applications in deformable environments, and that more focus needs to be put on collecting datasets for training and evaluation.
Affiliation(s)
- Adam Schmidt
- Department of Electrical and Computer Engineering, University of British Columbia, 2329 West Mall, Vancouver V6T 1Z4, BC, Canada.
- Omid Mohareri
- Advanced Research, Intuitive Surgical, 1020 Kifer Rd, Sunnyvale, CA 94086, USA
- Simon DiMaio
- Advanced Research, Intuitive Surgical, 1020 Kifer Rd, Sunnyvale, CA 94086, USA
- Michael C Yip
- Department of Electrical and Computer Engineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
- Septimiu E Salcudean
- Department of Electrical and Computer Engineering, University of British Columbia, 2329 West Mall, Vancouver V6T 1Z4, BC, Canada
4
Tiwary P, Bhattacharyya K, A P P. Cycle consistent twin energy-based models for image-to-image translation. Med Image Anal 2024; 91:103031. [PMID: 37988920 DOI: 10.1016/j.media.2023.103031] [Received: 12/05/2022] [Revised: 09/10/2023] [Accepted: 11/13/2023] [Indexed: 11/23/2023]
Abstract
Domain shift refers to a change in distributional characteristics between the training (source) and testing (target) datasets of a learning task, leading to a performance drop. For tasks involving medical images, domain shift may be caused by several factors, such as changes in the underlying imaging modalities, measuring devices, and staining mechanisms. Recent approaches address this issue via generative models based on the principles of adversarial learning, although these suffer from issues such as difficulty in training and lack of diversity. Motivated by these observations, we adapt an alternative class of deep generative models, Energy-Based Models (EBMs), for the task of unpaired image-to-image translation of medical images. Specifically, we propose a novel method called Cycle Consistent Twin EBMs (CCT-EBM), which employs a pair of EBMs in the latent space of an auto-encoder trained on the source data. One EBM translates the source to the target domain while the other does the reverse, together with a novel consistency loss, ensuring translation symmetry and coupling between the domains. We theoretically analyze the proposed method and show that our design leads to better translation between the domains with fewer Langevin mixing steps. We demonstrate the efficacy of our method through detailed quantitative and qualitative experiments on image segmentation tasks on three different datasets vis-à-vis state-of-the-art methods.
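The "Langevin mixing steps" mentioned above refer to the standard sampler for EBMs: unadjusted Langevin dynamics, which descends the energy gradient while injecting Gaussian noise. A generic, minimal sketch (not the paper's implementation; the toy energy is chosen only for illustration):

```python
import numpy as np

def langevin_sample(grad_energy, x0, steps=500, step_size=0.05, seed=0):
    """Unadjusted Langevin dynamics: x <- x - eta * grad E(x) + sqrt(2*eta) * noise.
    Iterates approximately sample from p(x) proportional to exp(-E(x))."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, float).copy()
    for _ in range(steps):
        noise = rng.standard_normal(x.shape)
        x = x - step_size * grad_energy(x) + np.sqrt(2.0 * step_size) * noise
    return x

# For E(x) = x^2 / 2 (a standard Gaussian), chains started far away drift
# toward mean 0 with roughly unit variance once mixed.
chains = langevin_sample(lambda x: x, np.full(2000, 5.0))
```

The paper's claim of fewer mixing steps matters precisely because each step above costs one gradient evaluation of the energy network.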
Affiliation(s)
- Piyush Tiwary
- Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, Karnataka 560012, India.
- Kinjawl Bhattacharyya
- Department of Electrical Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
- Prathosh A P
- Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, Karnataka 560012, India
5
Liu S, Fan J, Zang L, Yang Y, Fu T, Song H, Wang Y, Yang J. Pose estimation via structure-depth information from monocular endoscopy images sequence. Biomed Opt Express 2024; 15:460-478. [PMID: 38223180 PMCID: PMC10783895 DOI: 10.1364/boe.498262] [Received: 06/16/2023] [Revised: 12/08/2023] [Accepted: 12/14/2023] [Indexed: 01/16/2024]
Abstract
Image-based endoscopy pose estimation has been shown to significantly improve the visualization and accuracy of minimally invasive surgery (MIS). This paper proposes a method for pose estimation based on structure-depth information from a monocular endoscopy image sequence. Firstly, the initial frame location is constrained using the image structure difference (ISD) network. Secondly, endoscopy image depth information is used to estimate the pose of sequence frames. Finally, adaptive boundary constraints are used to optimize continuous frame endoscopy pose estimation, resulting in more accurate intraoperative endoscopy pose estimation. Evaluations were conducted on publicly available datasets, with the pose estimation error in bronchoscopy and colonoscopy datasets reaching 1.43 mm and 3.64 mm, respectively. These results meet the real-time requirements of various scenarios, demonstrating the capability of this method to generate reliable pose estimation results for endoscopy images and its meaningful applications in clinical practice. This method enables accurate localization of endoscopy images during surgery, assisting physicians in performing safer and more effective procedures.
Affiliation(s)
- Shiyuan Liu
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- China Center for Information Industry Development, Beijing 100081, China
- Jingfan Fan
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Liugeng Zang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Yun Yang
- Department of General Surgery, Beijing Friendship Hospital, Capital Medical University; National Clinical Research Center for Digestive Diseases, Beijing 100050, China
- Tianyu Fu
- Institute of Engineering Medicine, Beijing Institute of Technology, Beijing 100081, China
- Hong Song
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Yongtian Wang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Jian Yang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
6
Bobrow TL, Golhar M, Vijayan R, Akshintala VS, Garcia JR, Durr NJ. Colonoscopy 3D video dataset with paired depth from 2D-3D registration. Med Image Anal 2023; 90:102956. [PMID: 37713764 PMCID: PMC10591895 DOI: 10.1016/j.media.2023.102956] [Received: 11/16/2022] [Revised: 06/29/2023] [Accepted: 09/04/2023] [Indexed: 09/17/2023]
Abstract
Screening colonoscopy is an important clinical application for several 3D computer vision techniques, including depth estimation, surface reconstruction, and missing region detection. However, the development, evaluation, and comparison of these techniques in real colonoscopy videos remain largely qualitative due to the difficulty of acquiring ground truth data. In this work, we present a Colonoscopy 3D Video Dataset (C3VD) acquired with a high definition clinical colonoscope and high-fidelity colon models for benchmarking computer vision methods in colonoscopy. We introduce a novel multimodal 2D-3D registration technique to register optical video sequences with ground truth rendered views of a known 3D model. The different modalities are registered by transforming optical images to depth maps with a Generative Adversarial Network and aligning edge features with an evolutionary optimizer. This registration method achieves an average translation error of 0.321 millimeters and an average rotation error of 0.159 degrees in simulation experiments where error-free ground truth is available. The method also leverages video information, improving registration accuracy by 55.6% for translation and 60.4% for rotation compared to single frame registration. 22 short video sequences were registered to generate 10,015 total frames with paired ground truth depth, surface normals, optical flow, occlusion, six degree-of-freedom pose, coverage maps, and 3D models. The dataset also includes screening videos acquired by a gastroenterologist with paired ground truth pose and 3D surface models. The dataset and registration source code are available at https://durr.jhu.edu/C3VD.
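The millimetre and degree figures above are the usual pose-registration error measures: Euclidean distance between translations and the geodesic angle of the relative rotation. A hedged sketch of how such errors are commonly computed (names illustrative, not from the C3VD code):

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Registration error between an estimated and a ground-truth pose:
    translation error = Euclidean distance between translation vectors;
    rotation error = angle of the relative rotation R_est^T R_gt, in degrees."""
    t_err = float(np.linalg.norm(np.asarray(t_est, float) - np.asarray(t_gt, float)))
    R_rel = np.asarray(R_est).T @ np.asarray(R_gt)
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return t_err, float(np.degrees(np.arccos(cos_angle)))

theta = np.radians(10.0)                       # ground truth rotated 10 deg about z
R_gt = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                 [np.sin(theta),  np.cos(theta), 0.0],
                 [0.0, 0.0, 1.0]])
t_err, r_err = pose_errors(np.eye(3), [0, 0, 0], R_gt, [0.3, 0.4, 0.0])
```

The clip guards against trace values drifting marginally outside [-1, 3] from floating-point error, which would otherwise make arccos return NaN.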
Affiliation(s)
- Taylor L Bobrow
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Mayank Golhar
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Rohan Vijayan
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Venkata S Akshintala
- Division of Gastroenterology and Hepatology, Johns Hopkins Medicine, Baltimore, MD 21287, USA
- Juan R Garcia
- Department of Art as Applied to Medicine, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Nicholas J Durr
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA.
7
Shi H, Wang Z, Zhou Y, Li D, Yang X, Li Q. Bidirectional Semi-Supervised Dual-Branch CNN for Robust 3D Reconstruction of Stereo Endoscopic Images via Adaptive Cross and Parallel Supervisions. IEEE Trans Med Imaging 2023; 42:3269-3282. [PMID: 37227904 DOI: 10.1109/tmi.2023.3279899] [Indexed: 05/27/2023]
Abstract
Semi-supervised learning via a teacher-student network can train a model effectively on a few labeled samples. It enables a student model to distill knowledge from the teacher's predictions on extra unlabeled data. However, such knowledge flow is typically unidirectional, leaving the accuracy vulnerable to the quality of the teacher model. In this paper, we seek robust 3D reconstruction of stereo endoscopic images by proposing a novel form of bidirectional learning between two learners, each of which can play the roles of teacher and student concurrently. Specifically, we introduce two self-supervisions, i.e., Adaptive Cross Supervision (ACS) and Adaptive Parallel Supervision (APS), to learn a dual-branch convolutional neural network. The two branches predict two different disparity probability distributions for the same position and output their expectations as disparity values. The learned knowledge flows across branches along two directions: a cross direction (disparity guides distribution in ACS) and a parallel direction (disparity guides disparity in APS). Moreover, each branch also learns confidences to dynamically refine the supervisions it provides. In ACS, the predicted disparity is softened into a unimodal distribution, and the lower the confidence, the smoother the distribution. In APS, incorrect predictions are suppressed by lowering the weights of those with low confidence. With this adaptive bidirectional learning, the two branches enjoy well-tuned mutual supervisions and eventually converge on a consistent and more accurate disparity estimation. Experimental results on four public datasets demonstrate our superior accuracy over other state-of-the-art methods, with a relative decrease in average disparity error of at least 9.76%.
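Outputting the expectation of a predicted disparity distribution, as both branches do here, is the common "soft-argmax" readout for stereo networks. A minimal sketch (the candidate disparity range and names are illustrative):

```python
import numpy as np

def expected_disparity(logits):
    """Turn per-pixel logits over D candidate disparities into a sub-pixel
    disparity: softmax to a probability distribution, then take its expectation."""
    logits = np.asarray(logits, float)
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))  # numerically stable softmax
    probs = z / z.sum(axis=-1, keepdims=True)
    candidates = np.arange(logits.shape[-1], dtype=float)    # disparities 0..D-1
    return (probs * candidates).sum(axis=-1)

uniform = expected_disparity(np.zeros(5))                          # uniform -> midpoint
peaked = expected_disparity(np.array([0.0, 0.0, 6.0, 6.0, 0.0]))   # mass between 2 and 3
```

Because the expectation is differentiable in the logits, the disparity value itself can supervise the distribution (and vice versa), which is exactly the cross/parallel coupling the abstract describes.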
8
Lazo JF, Rosa B, Catellani M, Fontana M, Mistretta FA, Musi G, de Cobelli O, de Mathelin M, De Momi E. Semi-Supervised Bladder Tissue Classification in Multi-Domain Endoscopic Images. IEEE Trans Biomed Eng 2023; 70:2822-2833. [PMID: 37037233 DOI: 10.1109/tbme.2023.3265679] [Indexed: 04/12/2023]
Abstract
OBJECTIVE Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides diverse visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available only in one domain, in our case WLI, and the endoscopic images correspond to an unpaired dataset, i.e., there is no exact equivalent for every image in both the NBI and WLI domains. METHOD We propose a semi-supervised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data; a cycle-consistency GAN to perform unpaired image-to-image translation; and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN, we perform a detailed quantitative and qualitative analysis with the help of specialists. CONCLUSION The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89, respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94, respectively. The quality of the generated images is reliable enough to deceive specialists. SIGNIFICANCE This study shows the potential of semi-supervised GAN-based bladder tissue classification when annotations are limited in multi-domain data.
9
Wang X, Nie Y, Ren W, Wei M, Zhang J. Multi-scale, multi-dimensional binocular endoscopic image depth estimation network. Comput Biol Med 2023; 164:107305. [PMID: 37597409 DOI: 10.1016/j.compbiomed.2023.107305] [Received: 02/26/2023] [Revised: 07/07/2023] [Accepted: 07/28/2023] [Indexed: 08/21/2023]
Abstract
During invasive surgery, the use of deep learning techniques to acquire depth information from lesion sites in real time is hindered by the lack of endoscopic environment datasets. This work aims to develop a high-accuracy three-dimensional (3D) simulation model for generating image datasets and acquiring depth information in real time. Here, we propose an end-to-end multi-scale supervisory depth estimation network (MMDENet) model for the depth estimation of pairs of binocular images. The proposed MMDENet highlights a multi-scale feature extraction module incorporating contextual information to enhance the correspondence precision of poorly exposed regions. A multi-dimensional information-guidance refinement module is also proposed to refine the initial coarse disparity map. Statistical experimentation demonstrated a 3.14% reduction in endpoint error compared to state-of-the-art methods, with a processing rate of approximately 30 fps that satisfies the requirements of real-time applications. To validate the performance of the trained MMDENet on actual endoscopic images, we conducted both qualitative and quantitative analyses, achieving a precision of 93.38%, which holds great promise for applications in surgical navigation.
Affiliation(s)
- Xiongzhi Wang
- School of Future Technology, University of Chinese Academy of Sciences, Beijing 100039, China; School of Aerospace Science and Technology, Xidian University, Xi'an 710071, China.
- Yunfeng Nie
- Brussel Photonics, Department of Applied Physics and Photonics, Vrije Universiteit Brussel and Flanders Make, 1050 Brussels, Belgium
- Wenqi Ren
- State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
- Min Wei
- Department of Orthopedics, the Fourth Medical Center, Chinese PLA General Hospital, Beijing 100853, China
- Jingang Zhang
- School of Future Technology, University of Chinese Academy of Sciences, Beijing 100039, China; School of Aerospace Science and Technology, Xidian University, Xi'an 710071, China.
10
Mathew A, Magerand L, Trucco E, Manfredi L. Self-supervised monocular depth estimation for high field of view colonoscopy cameras. Front Robot AI 2023; 10:1212525. [PMID: 37559569 PMCID: PMC10407791 DOI: 10.3389/frobt.2023.1212525] [Received: 04/26/2023] [Accepted: 06/26/2023] [Indexed: 08/11/2023]
Abstract
Optical colonoscopy is the gold standard procedure to detect colorectal cancer, the fourth most common cancer in the United Kingdom. Up to 22%-28% of polyps can be missed during the procedure, and missed polyps are associated with interval cancer. A vision-based autonomous soft endorobot for colonoscopy can drastically improve the accuracy of the procedure by inspecting the colon more systematically with reduced discomfort. A three-dimensional understanding of the environment is essential for robot navigation and can also improve the adenoma detection rate. Monocular depth estimation with deep learning methods has progressed substantially, but collecting ground-truth depth maps remains a challenge, as no 3D camera can be fitted to a standard colonoscope. This work addresses this issue with a self-supervised monocular depth estimation model that learns depth directly from video sequences via view synthesis. In addition, our model accommodates the wide field-of-view cameras typically used in colonoscopy and specific challenges such as deformable surfaces, specular lighting, non-Lambertian surfaces, and high occlusion. We performed a qualitative analysis on a synthetic dataset, a quantitative examination of the trained colonoscopy model, and an evaluation on real colonoscopy videos in near real time.
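The self-supervised signal in such view-synthesis models is a photometric loss: warp a neighbouring frame toward the target using the predicted geometry and penalise the reconstruction error. A deliberately simplified 1-D stand-in (real pipelines warp with per-pixel depth, a 6-DoF pose, camera intrinsics, and bilinear sampling):

```python
import numpy as np

def warp_rows(img, shift):
    """Shift every row of an image by an integer number of columns, clamping
    at the borders (a toy stand-in for depth/pose-based geometric warping)."""
    w = img.shape[1]
    cols = np.clip(np.arange(w) + shift, 0, w - 1)
    return img[:, cols]

def photometric_loss(target, source, shift):
    """L1 photometric loss between the target frame and the source frame
    warped by a candidate shift; minimised when the shift matches the
    true motion between the frames."""
    return float(np.mean(np.abs(warp_rows(source, shift) - target)))

target = np.tile(np.arange(8.0), (2, 1))
source = warp_rows(target, -2)                 # simulate a 2-pixel camera motion
good = photometric_loss(target, source, 2)     # correct motion -> low loss
bad = photometric_loss(target, source, 0)      # wrong motion -> higher loss
```

Gradients of this loss with respect to the predicted geometry are what train the depth and pose networks jointly, with no ground-truth depth required.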
Affiliation(s)
- Alwyn Mathew
- Division of Imaging Science and Technology, School of Medicine, University of Dundee, Dundee, United Kingdom
- Ludovic Magerand
- Discipline of Computing, School of Science and Engineering, University of Dundee, Dundee, United Kingdom
- Emanuele Trucco
- Discipline of Computing, School of Science and Engineering, University of Dundee, Dundee, United Kingdom
- Luigi Manfredi
- Division of Imaging Science and Technology, School of Medicine, University of Dundee, Dundee, United Kingdom
11
Liu Y, Zuo S. Self-supervised monocular depth estimation for gastrointestinal endoscopy. Comput Methods Programs Biomed 2023; 238:107619. [PMID: 37235969 DOI: 10.1016/j.cmpb.2023.107619] [Received: 12/07/2022] [Revised: 04/26/2023] [Accepted: 05/18/2023] [Indexed: 05/28/2023]
Abstract
BACKGROUND AND OBJECTIVE Gastrointestinal (GI) endoscopy represents a promising tool for GI cancer screening. However, the limited field of view and the uneven skills of endoscopists make it difficult to accurately identify polyps and follow up on precancerous lesions under endoscopy. Estimating depth from GI endoscopic sequences is essential for a series of AI-assisted surgical techniques. Nonetheless, depth estimation for GI endoscopy is a challenging task due to the particularity of the environment and the limitations of datasets. In this paper, we propose a self-supervised monocular depth estimation method for GI endoscopy. METHODS A depth estimation network and a camera ego-motion estimation network are first constructed to obtain the depth and pose information of the sequence, respectively; the model then performs self-supervised training by computing the multi-scale structural similarity with L1 norm (MS-SSIM+L1) loss between the target frame and the reconstructed image as part of the training loss. The MS-SSIM+L1 loss is good at preserving high-frequency information and maintains invariance to brightness and color. Our model consists of a U-shaped convolutional network with a dual-attention mechanism, which is beneficial for capturing multi-scale contextual information and greatly improves the accuracy of depth estimation. We evaluated our method qualitatively and quantitatively against different state-of-the-art methods. RESULTS AND CONCLUSIONS The experimental results show that our method has superior generality, achieving lower error metrics and higher accuracy metrics on both the UCL dataset and the EndoSLAM dataset. The proposed method has also been validated on clinical GI endoscopy, demonstrating its potential clinical value.
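The MS-SSIM+L1 objective blends a structural-similarity term with an L1 term. A simplified single-scale sketch using global image statistics (real MS-SSIM uses Gaussian-windowed local statistics over several scales, and the blend weight alpha here is an illustrative choice, not the paper's value):

```python
import numpy as np

def ssim_l1_loss(x, y, alpha=0.85, c1=0.01 ** 2, c2=0.03 ** 2):
    """Blend (1 - SSIM)/2 with an L1 term, the usual shape of SSIM+L1
    photometric losses. SSIM here uses whole-image statistics for brevity."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    ssim = ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
    return float(alpha * (1.0 - ssim) / 2.0 + (1.0 - alpha) * np.abs(x - y).mean())

img = np.linspace(0.0, 1.0, 64).reshape(8, 8)
same = ssim_l1_loss(img, img)            # identical images -> (near-)zero loss
diff = ssim_l1_loss(img, img * 0.5)      # darker copy -> positive loss
```

The structural term penalises luminance, contrast, and correlation mismatches, while the L1 term keeps gradients informative where SSIM saturates; this combination is what the abstract credits with preserving high-frequency detail.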
Affiliation(s)
- Yuying Liu
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
- Siyang Zuo
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China.
12
Makki K, Chandelon K, Bartoli A. Elliptical specularity detection in endoscopy with application to normal reconstruction. Int J Comput Assist Radiol Surg 2023:10.1007/s11548-023-02904-3. [PMID: 37142809 DOI: 10.1007/s11548-023-02904-3] [Received: 03/14/2023] [Accepted: 03/31/2023] [Indexed: 05/06/2023]
Abstract
PURPOSE To detect specularities as elliptical blobs in endoscopy. The rationale is that in the endoscopic setting specularities are generally small, and that knowing the ellipse coefficients allows one to reconstruct the surface normal. In contrast, previous works detect specular masks as free-form shapes and treat the specular pixels as a nuisance. METHODS A pipeline combining deep learning with handcrafted steps for specularity detection. This pipeline is general and accurate in the context of endoscopic applications involving multiple organs and moist tissues. A fully convolutional network produces an initial mask that specifically finds specular pixels, mainly composed of sparsely distributed blobs. Standard ellipse fitting follows for local segmentation refinement, keeping only the blobs that fulfil the conditions for successful normal reconstruction. RESULTS Convincing detection and reconstruction results on synthetic and real images, showing that the elliptical shape prior improves the detection itself in both colonoscopy and kidney laparoscopy. The pipeline achieved a mean Dice of 84% and 87%, respectively, on test data for these two use cases, and allows one to exploit the specularities as useful information for inferring sparse surface geometry. The reconstructed normals are in good quantitative agreement with external learning-based depth reconstruction methods, as shown by an average angular discrepancy of [Formula: see text] in colonoscopy. CONCLUSION This is the first fully automatic method to exploit specularities in endoscopic 3D reconstruction. Because the design of current reconstruction methods can vary considerably across applications, our elliptical specularity detection could be of interest in clinical practice thanks to its simplicity and generalisability. In particular, the obtained results are promising for future integration with learning-based depth inference and SfM methods.
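Fitting an ellipse to each detected specular blob can be done in several ways; a lightweight moments-based sketch is shown below (a common alternative to the algebraic least-squares fitting the paper may use; names and the 2*sqrt(eigenvalue) axis convention are illustrative):

```python
import numpy as np

def blob_ellipse(ys, xs):
    """Fit an ellipse to a blob's pixel coordinates via second-order moments:
    centre = coordinate mean; axis lengths and orientation come from the
    eigen-decomposition of the coordinate covariance. For a filled ellipse,
    2*sqrt(eigenvalue) approximates each semi-axis length."""
    pts = np.stack([np.asarray(xs, float), np.asarray(ys, float)], axis=1)
    centre = pts.mean(axis=0)
    cov = np.cov((pts - centre).T)
    evals, evecs = np.linalg.eigh(cov)                   # eigenvalues ascending
    semi_minor, semi_major = 2.0 * np.sqrt(evals)
    angle = float(np.arctan2(evecs[1, 1], evecs[0, 1]))  # major-axis direction
    return centre, semi_major, semi_minor, angle

# Pixels of a filled, axis-aligned ellipse (semi-axes ~4 and ~2) centred at (x=20, y=10).
yy, xx = np.mgrid[-5:6, -5:6]
mask = (xx / 4.0) ** 2 + (yy / 2.0) ** 2 <= 1.0
centre, a, b, angle = blob_ellipse(yy[mask] + 10, xx[mask] + 20)
```

Blobs whose fitted axes or residuals violate the ellipse assumptions can then be discarded, which is the refinement role ellipse fitting plays in the pipeline above.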
Affiliation(s)
- Karim Makki: EnCoV, Institut Pascal, UMR6602 CNRS/UCA, Clermont-Ferrand, France
- Kilian Chandelon: EnCoV, Institut Pascal, UMR6602 CNRS/UCA, Clermont-Ferrand, France
- Adrien Bartoli: EnCoV, Institut Pascal, UMR6602 CNRS/UCA, Clermont-Ferrand, France; Direction de la Recherche Clinique et de l'Innovation, CHU de Clermont-Ferrand, France
13
Osuala R, Kushibar K, Garrucho L, Linardos A, Szafranowska Z, Klein S, Glocker B, Diaz O, Lekadir K. Data synthesis and adversarial networks: A review and meta-analysis in cancer imaging. Med Image Anal 2023; 84:102704. [PMID: 36473414] [DOI: 10.1016/j.media.2022.102704]
Abstract
Despite technological and medical advances, the detection, interpretation, and treatment of cancer based on imaging data continue to pose significant challenges. These include inter-observer variability, class imbalance, dataset shifts, inter- and intra-tumour heterogeneity, malignancy determination, and treatment effect uncertainty. Given the recent advancements in image synthesis, Generative Adversarial Networks (GANs), and adversarial training, we assess the potential of these technologies to address a number of key challenges of cancer imaging. We categorise these challenges into (a) data scarcity and imbalance, (b) data access and privacy, (c) data annotation and segmentation, (d) cancer detection and diagnosis, and (e) tumour profiling, treatment planning and monitoring. Based on our analysis of 164 publications that apply adversarial training techniques in the context of cancer imaging, we highlight multiple underexplored solutions with research potential. We further contribute the Synthesis Study Trustworthiness Test (SynTRUST), a meta-analysis framework for assessing the validation rigour of medical image synthesis studies. SynTRUST is based on 26 concrete measures of thoroughness, reproducibility, usefulness, scalability, and tenability. Based on SynTRUST, we analyse 16 of the most promising cancer imaging challenge solutions and observe a high validation rigour in general, but also several desirable improvements. With this work, we strive to bridge the gap between the needs of the clinical cancer imaging community and the current and prospective research on data synthesis and adversarial networks in the artificial intelligence community.
Affiliation(s)
- Richard Osuala: Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
- Kaisar Kushibar: Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
- Lidia Garrucho: Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
- Akis Linardos: Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
- Zuzanna Szafranowska: Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
- Stefan Klein: Biomedical Imaging Group Rotterdam, Department of Radiology & Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Ben Glocker: Biomedical Image Analysis Group, Department of Computing, Imperial College London, UK
- Oliver Diaz: Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
- Karim Lekadir: Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
14
Mascagni P, Alapatt D, Sestini L, Altieri MS, Madani A, Watanabe Y, Alseidi A, Redan JA, Alfieri S, Costamagna G, Boškoski I, Padoy N, Hashimoto DA. Computer vision in surgery: from potential to clinical value. NPJ Digit Med 2022; 5:163. [PMID: 36307544] [PMCID: PMC9616906] [DOI: 10.1038/s41746-022-00707-5]
Abstract
Hundreds of millions of operations are performed worldwide each year, and the rising uptake in minimally invasive surgery has enabled fiber optic cameras and robots to become both important tools to conduct surgery and sensors from which to capture information about surgery. Computer vision (CV), the application of algorithms to analyze and interpret visual data, has become a critical technology through which to study the intraoperative phase of care with the goals of augmenting surgeons' decision-making processes, supporting safer surgery, and expanding access to surgical care. While much work has been performed on potential use cases, there are currently no CV tools widely used for diagnostic or therapeutic applications in surgery. Using laparoscopic cholecystectomy as an example, we reviewed current CV techniques that have been applied to minimally invasive surgery and their clinical applications. Finally, we discuss the challenges and obstacles that remain to be overcome for broader implementation and adoption of CV in surgery.
Affiliation(s)
- Pietro Mascagni: Gemelli Hospital, Catholic University of the Sacred Heart, Rome, Italy; IHU-Strasbourg, Institute of Image-Guided Surgery, Strasbourg, France; Global Surgical Artificial Intelligence Collaborative, Toronto, ON, Canada
- Deepak Alapatt: ICube, University of Strasbourg, CNRS, IHU, Strasbourg, France
- Luca Sestini: ICube, University of Strasbourg, CNRS, IHU, Strasbourg, France; Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy
- Maria S Altieri: Global Surgical Artificial Intelligence Collaborative, Toronto, ON, Canada; Department of Surgery, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Amin Madani: Global Surgical Artificial Intelligence Collaborative, Toronto, ON, Canada; Department of Surgery, University Health Network, Toronto, ON, Canada
- Yusuke Watanabe: Global Surgical Artificial Intelligence Collaborative, Toronto, ON, Canada; Department of Surgery, University of Hokkaido, Hokkaido, Japan
- Adnan Alseidi: Global Surgical Artificial Intelligence Collaborative, Toronto, ON, Canada; Department of Surgery, University of California San Francisco, San Francisco, CA, USA
- Jay A Redan: Department of Surgery, AdventHealth-Celebration Health, Celebration, FL, USA
- Sergio Alfieri: Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Guido Costamagna: Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Ivo Boškoski: Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Nicolas Padoy: IHU-Strasbourg, Institute of Image-Guided Surgery, Strasbourg, France; ICube, University of Strasbourg, CNRS, IHU, Strasbourg, France
- Daniel A Hashimoto: Global Surgical Artificial Intelligence Collaborative, Toronto, ON, Canada; Department of Surgery, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
15
González-Bueno Puyal J, Brandao P, Ahmad OF, Bhatia KK, Toth D, Kader R, Lovat L, Mountney P, Stoyanov D. Polyp detection on video colonoscopy using a hybrid 2D/3D CNN. Med Image Anal 2022. [DOI: 10.1016/j.media.2022.102625]
16
Adjei PE, Lonseko ZM, Du W, Zhang H, Rao N. Examining the effect of synthetic data augmentation in polyp detection and segmentation. Int J Comput Assist Radiol Surg 2022; 17:1289-1302. [PMID: 35678960] [DOI: 10.1007/s11548-022-02651-x]
Abstract
PURPOSE As with several medical image analysis tasks based on deep learning, gastrointestinal image analysis is plagued by data scarcity, privacy concerns and an insufficient number of pathology samples. This study examines the generation and utility of synthetic colonoscopy images with polyps for data augmentation. METHODS We modify and train a pix2pix model to generate synthetic colonoscopy samples with polyps to augment the original dataset. Subsequently, we create a variety of datasets by varying the quantities of synthetic and traditional augmentation samples, to train a U-Net network and a Faster R-CNN model for segmentation and detection of polyps, respectively. We compare the performance of the models trained with the resulting datasets in terms of F1 score, intersection over union, precision and recall. Further, we compare the performances of the models on unseen polyp datasets to assess their generalization ability. RESULTS The average F1 coefficient and intersection over union of U-Net improve with an increasing number of synthetic samples over all test datasets. The performance of the Faster R-CNN model also improves in terms of polyp detection, while the false-negative rate decreases. Further, the experimental results for polyp detection outperform similar studies in the literature on the ETIS-LaribPolypDB dataset. CONCLUSION By varying the quantities of synthetic and traditional augmentation samples, there is the potential to control the sensitivity of deep learning models in polyp segmentation and detection. Further, GAN-based augmentation is a viable option for improving the performance of models for polyp segmentation and detection.
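The study's evaluation loop, training on mixtures of real and synthetic samples and scoring with F1 and intersection over union, can be sketched generically as follows (hypothetical helper names, not the authors' code):

```python
import numpy as np

def f1_and_iou(pred, gt):
    """Pixel-wise F1 and intersection over union for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return f1, iou

def mix_dataset(real_samples, synthetic_samples, n_synth):
    """Training set: all real samples plus the first n_synth synthetic ones."""
    return real_samples + synthetic_samples[:n_synth]

# Toy masks: the prediction hits 3 of 4 ground-truth pixels, with 1 false positive.
gt = np.zeros((4, 4), int); gt[1:3, 1:3] = 1
pred = np.zeros((4, 4), int); pred[1:3, 1:2] = 1; pred[1, 2] = 1; pred[0, 0] = 1
f1, iou = f1_and_iou(pred, gt)
```

Sweeping `n_synth` in `mix_dataset` and re-training at each setting is what lets the authors relate synthetic-sample quantity to segmentation and detection sensitivity.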
Affiliation(s)
- Prince Ebenezer Adjei: Key Laboratory for Neuroinformation of Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Department of Computer Engineering, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
- Zenebe Markos Lonseko: Key Laboratory for Neuroinformation of Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Wenju Du: Key Laboratory for Neuroinformation of Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Han Zhang: Key Laboratory for Neuroinformation of Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Nini Rao: Key Laboratory for Neuroinformation of Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
17
Villaseñor-Aguilar MJ, Peralta-López JE, Lázaro-Mata D, García-Alcalá CE, Padilla-Medina JA, Perez-Pinal FJ, Vázquez-López JA, Barranco-Gutiérrez AI. Fuzzy Fusion of Stereo Vision, Odometer, and GPS for Tracking Land Vehicles. Mathematics 2022; 10:2052. [DOI: 10.3390/math10122052]
Abstract
The incorporation of high-precision vehicle positioning systems has been demanded by the autonomous electric vehicle (AEV) industry. For this reason, research on visual odometry (VO) and Artificial Intelligence (AI) to reduce positioning errors automatically has become essential in this field. In this work, a new method to reduce the error in the absolute location of an AEV using fuzzy logic (FL) is presented. Cooperative data fusion of GPS, odometer, and stereo camera signals is then performed to improve the estimation of AEV localization. Although the most important challenge of this work is the reduction of the odometry error of the vehicle, the difficulties of synchronizing and fusing information from sources of a different nature are also solved. This research comprises three phases: data acquisition, data fusion, and statistical evaluation. The first is data acquisition using an odometer, a GPS, and a ZED camera over the AEV's trajectories. The second is the data analysis and fuzzy fusion design using the MATLAB 2019® fuzzy logic toolbox. The last is the statistical evaluation of the positioning error of the different sensors. According to the obtained results, the proposed model with the lowest error is the one that uses all sensors as input (stereo camera, odometer, and GPS). Notably, the best proposed model reduces the positioning mean absolute error (MAE) by up to 25% with respect to the state of the art.
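As a loose, much-simplified illustration of the idea (a crisp weighted average driven by a triangular membership function, standing in for a full MATLAB fuzzy inference system; all parameter values below are assumptions):

```python
def tri_membership(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuse(estimates, qualities):
    """Confidence-weighted fusion of 1-D position estimates.

    Each raw sensor-quality reading is mapped to a [0, 1] weight through a
    'good quality' membership function; the fused position is the weighted
    average of the individual estimates.
    """
    weights = [tri_membership(q, 0.2, 1.0, 1.8) for q in qualities]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all sensors rejected")
    return sum(w * e for w, e in zip(weights, estimates)) / total

# GPS, odometer and stereo-vision estimates of one coordinate (metres),
# with quality readings near 1.0 meaning 'trustworthy'.
position = fuse([10.0, 10.4, 10.2], [1.0, 0.6, 0.8])
```

The fused value is pulled towards the estimates whose membership weight is highest, which is the behaviour the paper exploits to down-weight drifting odometry.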
18
Oda M, Itoh H, Tanaka K, Takabatake H, Mori M, Natori H, Mori K. Depth estimation from single-shot monocular endoscope image using image domain adaptation and edge-aware depth estimation. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 2022. [DOI: 10.1080/21681163.2021.2012835]
Affiliation(s)
- Masahiro Oda: Information and Communications, Nagoya University, Nagoya, Japan; Graduate School of Informatics, Nagoya University, Nagoya, Japan
- Hayato Itoh: Graduate School of Informatics, Nagoya University, Nagoya, Japan
- Kiyohito Tanaka: Department of Gastroenterology, Kyoto Second Red Cross Hospital, Kyoto, Japan
- Hirotsugu Takabatake: Department of Respiratory Medicine, Sapporo-Minami-Sanjo Hospital, Sapporo, Japan
- Masaki Mori: Department of Respiratory Medicine, Sapporo-Kosei General Hospital, Sapporo, Japan
- Hiroshi Natori: Department of Respiratory Medicine, Keiwakai Nishioka Hospital, Sapporo, Japan
- Kensaku Mori: Information and Communications, Nagoya University, Nagoya, Japan; Graduate School of Informatics, Nagoya University, Nagoya, Japan; Research Center for Medical Bigdata, National Institute of Informatics, Tokyo, Japan
19
Chen J, Zhang Z, Xie X, Li Y, Xu T, Ma K, Zheng Y. Beyond Mutual Information: Generative Adversarial Network for Domain Adaptation Using Information Bottleneck Constraint. IEEE Trans Med Imaging 2022; 41:595-607. [PMID: 34606453] [DOI: 10.1109/tmi.2021.3117996]
Abstract
Medical images from multiple centres often suffer from the domain shift problem, which makes deep learning models trained on one domain fail to generalize well to another. One potential solution to the problem is the generative adversarial network (GAN), which has the capacity to translate images between different domains. Nevertheless, existing GAN-based approaches are prone to failing at preserving image-objects in image-to-image (I2I) translation, which reduces their practicality for domain adaptation tasks. In this regard, a novel GAN (namely IB-GAN) is proposed to preserve image-objects during cross-domain I2I adaptation. Specifically, we integrate an information bottleneck constraint into the typical cycle-consistency-based GAN to discard superfluous information (e.g., domain information) and maintain the consistency of disentangled content features for image-object preservation. The proposed IB-GAN is evaluated on three tasks: polyp segmentation in colonoscopic images, segmentation of the optic disc and cup in fundus images, and whole-heart segmentation from multi-modal volumes. We show that the proposed IB-GAN can generate realistic translated images and remarkably boost the generalization of widely used segmentation networks (e.g., U-Net).
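IB-GAN builds on the cycle-consistency loss of CycleGAN-style translators. A toy sketch of that backbone term, with deliberately trivial "translators" (the information-bottleneck term itself is omitted, and all names are illustrative):

```python
import numpy as np

def cycle_consistency_loss(x, g_ab, g_ba):
    """L1 cycle loss |G_BA(G_AB(x)) - x|: an image translated to the other
    domain and back should match the original."""
    return float(np.mean(np.abs(g_ba(g_ab(x)) - x)))

# Deliberately trivial 'translators': an invertible intensity shift.
g_ab = lambda img: img + 0.3      # domain A -> domain B
g_ba = lambda img: img - 0.3      # domain B -> domain A
x = np.random.default_rng(0).random((8, 8))
good = cycle_consistency_loss(x, g_ab, g_ba)                   # near zero
bad = cycle_consistency_loss(x, g_ab, lambda img: img * 0.5)   # non-inverse pair
```

In the real model the translators are networks trained adversarially, and the bottleneck constraint decides which information they are allowed to carry through the cycle.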
20
Tukra S, Lidströmer N, Ashrafian H, Giannarou S. AI in Surgical Robotics. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_323]
21
Luo H, Wang C, Duan X, Liu H, Wang P, Hu Q, Jia F. Unsupervised learning of depth estimation from imperfect rectified stereo laparoscopic images. Comput Biol Med 2022; 140:105109. [PMID: 34891097] [DOI: 10.1016/j.compbiomed.2021.105109]
Abstract
BACKGROUND Learning-based methods have achieved remarkable performances on depth estimation. However, the premise of most self-learning and unsupervised learning methods is built on rigorous, geometrically-aligned stereo rectification. The performances of these methods degrade when the rectification is not accurate. Therefore, we explore an approach for unsupervised depth estimation from stereo images that can handle imperfect camera parameters. METHODS We propose an unsupervised deep convolutional network that takes rectified stereo image pairs as input and outputs corresponding dense disparity maps. First, a new vertical correction module is designed for predicting a correction map to compensate for the imperfect geometry alignment. Second, the left and right images, which are reconstructed based on the input image pair and corresponding disparities as well as the vertical correction maps, are regarded as the outputs of the generative term of the generative adversarial network (GAN). Then, the discriminator term of the GAN is used to distinguish the reconstructed images from the original inputs to force the generator to output increasingly realistic images. In addition, a residual mask is introduced to exclude pixels that conflict with the appearance of the original image in the loss calculation. RESULTS The proposed model is validated on the publicly available Stereo Correspondence and Reconstruction of Endoscopic Data (SCARED) dataset and the average MAE is 3.054 mm. CONCLUSION Our model can effectively handle imperfect rectified stereo images for depth estimation.
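The supervision signal described here, reconstructing the left view by sampling the right image through the predicted disparity plus a learned vertical correction and penalizing the photometric difference, can be illustrated with a toy numpy sketch (nearest-neighbour sampling; all names and values are hypothetical):

```python
import numpy as np

def warp_with_disparity(right, disparity, v_correction):
    """Reconstruct the left view by sampling the right image at
    (x - d(x, y), y + dv(x, y)), nearest-neighbour; `v_correction`
    absorbs residual vertical misalignment from imperfect rectification."""
    h, w = right.shape
    left_rec = np.zeros_like(right)
    for y in range(h):
        for x in range(w):
            xs = int(round(x - disparity[y, x]))
            ys = int(round(y + v_correction[y, x]))
            if 0 <= xs < w and 0 <= ys < h:
                left_rec[y, x] = right[ys, xs]
    return left_rec

def photometric_l1(a, b):
    return float(np.mean(np.abs(a - b)))

# Toy pair: the right view is the left shifted 2 px horizontally and 1 px
# vertically (i.e. rectification left a vertical offset behind).
left = np.add.outer(10.0 * np.arange(16), np.arange(16.0))
right = np.roll(np.roll(left, -2, axis=1), 1, axis=0)
disp = np.full_like(left, 2.0)
vcorr = np.full_like(left, 1.0)
crop = (slice(2, -2), slice(2, -2))            # ignore border wrap-around
rec = warp_with_disparity(right, disp, vcorr)
loss = photometric_l1(rec[crop], left[crop])
loss_no_vcorr = photometric_l1(
    warp_with_disparity(right, disp, np.zeros_like(left))[crop], left[crop])
```

With the correct vertical correction the reconstruction matches the left view and the photometric loss vanishes; ignoring the vertical offset leaves a large residual, which is exactly the failure mode the paper's correction module targets.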
Affiliation(s)
- Huoling Luo: Research Lab for Medical Imaging and Digital Surgery, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, China
- Congcong Wang: School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China; Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
- Xingguang Duan: Advanced Innovation Centre for Intelligent Robots & Systems, Beijing Institute of Technology, Beijing, China
- Hao Liu: State Key Lab for Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China
- Ping Wang: Department of Hepatobiliary Surgery, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Qingmao Hu: Research Lab for Medical Imaging and Digital Surgery, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, China
- Fucang Jia: Research Lab for Medical Imaging and Digital Surgery, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, China; Pazhou Lab, Guangzhou, China
22
Marzullo A, Moccia S, Calimeri F, De Momi E. AIM in Endoscopy Procedures. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_164]
23
Livovsky DM, Veikherman D, Golany T, Aides A, Dashinsky V, Rabani N, Ben Shimol D, Blau Y, Katzir L, Shimshoni I, Liu Y, Segol O, Goldin E, Corrado G, Lachter J, Matias Y, Rivlin E, Freedman D. Detection of elusive polyps using a large-scale artificial intelligence system (with videos). Gastrointest Endosc 2021; 94:1099-1109.e10. [PMID: 34216598] [DOI: 10.1016/j.gie.2021.06.021]
Abstract
BACKGROUND AND AIMS Colorectal cancer is a leading cause of death. Colonoscopy is the criterion standard for detection and removal of precancerous lesions and has been shown to reduce mortality. The polyp miss rate during colonoscopies is 22% to 28%. DEEP DEtection of Elusive Polyps (DEEP2) is a new polyp detection system based on deep learning that alerts the operator in real time to the presence and location of polyps. The primary outcome was the performance of DEEP2 on the detection of elusive polyps. METHODS The DEEP2 system was trained on 3611 hours of colonoscopy videos derived from 2 sources and was validated on a set comprising 1393 hours from a third unrelated source. Ground truth labeling was provided by offline gastroenterologist annotators who were able to watch the video in slow motion and pause and rewind as required. To assess applicability, stability, and user experience and to obtain some preliminary data on performance in a real-life scenario, a preliminary prospective clinical validation study comprising 100 procedures was performed. RESULTS DEEP2 achieved a sensitivity of 97.1% at 4.6 false alarms per video for all polyps and of 88.5% and 84.9% for polyps in the field of view for less than 5 and 2 seconds, respectively. DEEP2 detected polyps not seen by live real-time endoscopists or offline annotators at an average of 0.22 polyps per sequence. In the clinical validation study, the system detected an average of 0.89 additional polyps per procedure. No adverse events occurred. CONCLUSIONS DEEP2 has a high sensitivity for polyp detection and was effective in increasing the detection of polyps both in colonoscopy videos and in real procedures, with a low number of false alarms. (Clinical trial registration number: NCT04693078.)
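The reported operating points combine per-polyp sensitivity with false alarms per video. A minimal sketch of such a tally (the event log and helper name are hypothetical, not the study's data pipeline):

```python
def detection_metrics(events, n_videos):
    """Per-polyp sensitivity and false alarms per video, from an event log of
    ('polyp', detected) and ('false_alarm', None) tuples."""
    polyp_hits = [hit for kind, hit in events if kind == "polyp"]
    false_alarms = sum(1 for kind, _ in events if kind == "false_alarm")
    sensitivity = sum(polyp_hits) / len(polyp_hits)
    return sensitivity, false_alarms / n_videos

# Hypothetical tally: 34 of 35 polyps detected, 9 false alarms over 2 videos.
events = ([("polyp", True)] * 34 + [("polyp", False)]
          + [("false_alarm", None)] * 9)
sensitivity, fa_per_video = detection_metrics(events, n_videos=2)
```

Sweeping the detector's confidence threshold and recomputing this pair of numbers traces out the sensitivity-versus-false-alarm curve on which operating points like "97.1% at 4.6 false alarms per video" sit.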
24
Edwards PJE, Psychogyios D, Speidel S, Maier-Hein L, Stoyanov D. SERV-CT: A disparity dataset from cone-beam CT for validation of endoscopic 3D reconstruction. Med Image Anal 2021; 76:102302. [PMID: 34906918] [PMCID: PMC8961000] [DOI: 10.1016/j.media.2021.102302]
Abstract
Highlights
- Full torso porcine CT model for stereo-endoscopic reconstruction validation
- CT of endoscope and anatomy with constrained manual alignment provides a reference
- Accuracy analysis of repeated alignments and performance of existing algorithms presented
- Open-sourced dataset for stereo reconstruction validation
In computer vision, reference datasets from simulation and real outdoor scenes have been highly successful in promoting algorithmic development in stereo reconstruction. Endoscopic stereo reconstruction for surgical scenes gives rise to specific problems, including the lack of clear corner features, highly specular surface properties and the presence of blood and smoke. These issues present difficulties for both stereo reconstruction itself and also for standardised dataset production. Previous datasets have been produced using computed tomography (CT) or structured light reconstruction on phantom or ex vivo models. We present a stereo-endoscopic reconstruction validation dataset based on cone-beam CT (SERV-CT). Two ex vivo small porcine full torso cadavers were placed within the view of the endoscope with both the endoscope and target anatomy visible in the CT scan. Subsequent orientation of the endoscope was manually aligned to match the stereoscopic view and benchmark disparities, depths and occlusions are calculated. The requirement of a CT scan limited the number of stereo pairs to 8 from each ex vivo sample. For the second sample an RGB surface was acquired to aid alignment of smooth, featureless surfaces. Repeated manual alignments showed an RMS disparity accuracy of around 2 pixels and a depth accuracy of about 2 mm. A simplified reference dataset is provided consisting of endoscope image pairs with corresponding calibration, disparities, depths and occlusions covering the majority of the endoscopic image and a range of tissue types, including smooth specular surfaces, as well as significant variation of depth. We assessed the performance of various stereo algorithms from online available repositories. There is a significant variation between algorithms, highlighting some of the challenges of surgical endoscopic images. 
The SERV-CT dataset provides an easy-to-use stereoscopic validation for surgical applications, with smooth reference disparities and depths covering the majority of the endoscopic image. This complements existing resources well, and we hope it will aid the development of surgical endoscopic anatomical reconstruction algorithms.
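Benchmark disparities in stereo datasets such as this relate to depth through the standard relation Z = fB/d. A short sketch of the conversion (the focal length and baseline below are purely hypothetical, not SERV-CT calibration values):

```python
import numpy as np

def disparity_to_depth(disparity_px, focal_px, baseline_mm):
    """Z = f * B / d; non-positive disparities (occluded or invalid
    pixels) are mapped to infinity."""
    d = np.asarray(disparity_px, dtype=float)
    safe = np.maximum(d, 1e-12)             # guard against division by zero
    return np.where(d > 0, focal_px * baseline_mm / safe, np.inf)

# Hypothetical endoscope calibration: 570 px focal length, 4 mm baseline.
depth_mm = disparity_to_depth([38.0, 19.0, 0.0], focal_px=570.0, baseline_mm=4.0)
```

The relation also explains why a roughly 2-pixel disparity error translates to a depth error of a few millimetres at typical endoscopic working distances: depth error grows with the square of depth for a fixed disparity error.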
Affiliation(s)
- P J Eddie Edwards: Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
- Dimitris Psychogyios: Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
- Stefanie Speidel: Division of Translational Surgical Oncology, National Center for Tumor Diseases (NCT) Dresden, Dresden, 01307, Germany
- Lena Maier-Hein: Division of Medical and Biological Informatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Danail Stoyanov: Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
25
Mikamo M, Furukawa R, Oka S, Kotachi T, Okamoto Y, Tanaka S, Sagawa R, Kawasaki H. Active Stereo Method for 3D Endoscopes using Deep-layer GCN and Graph Representation with Proximity Information. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:7551-7555. [PMID: 34892838] [DOI: 10.1109/embc46164.2021.9629696]
Abstract
Techniques for 3D endoscopic systems have been widely studied for various reasons. Among them, active stereo based systems, in which structured-light patterns are projected onto surfaces and endoscopic images of the pattern are analyzed to produce 3D depth images, are promising because of their robustness and simple system configuration. For those systems, finding correspondences between a projected pattern and the original pattern is an open problem. Recently, correspondence estimation by graph convolutional networks (GCNs) using graph-based representations of the patterns was proposed for 3D endoscopic systems. One severe problem of this approach is that the graph matching by the GCN is largely affected by the stability of the graph construction process applied to the detected patterns of a captured image. If the detected pattern is fragmented into small pieces, graph matching may fail and 3D shapes cannot be retrieved. In this paper, we propose a solution to those problems by applying a deep-layered GCN and extended graph representations of the patterns, in which proximity information is added. Experiments show that the proposed method outperformed the previous method in correspondence-matching accuracy for 3D reconstruction.
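A generic sketch of the kind of proximity augmentation described, linking detected pattern nodes whose centroids lie within a radius so that fragmented detections remain connected (illustrative only; the paper's graph construction is more involved):

```python
import numpy as np

def proximity_graph(points, radius):
    """Symmetric adjacency matrix linking pattern nodes closer than `radius`,
    so fragments of a broken pattern stay connected to their neighbours."""
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return (dist < radius) & (dist > 0)   # exclude self-loops

# Two fragments of a detected pattern: nodes 0-1 and nodes 2-3.
nodes = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
adj = proximity_graph(nodes, radius=2.0)
```

Feeding such proximity edges to the GCN alongside the detected pattern edges is one way to make graph matching less sensitive to fragmentation of the captured pattern.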
26
Widya AR, Monno Y, Okutomi M, Suzuki S, Gotoda T, Miki K. Learning-Based Depth and Pose Estimation for Monocular Endoscope with Loss Generalization. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:3547-3552. [PMID: 34892005] [DOI: 10.1109/embc46164.2021.9630156]
Abstract
Gastroendoscopy has been a clinical standard for diagnosing and treating conditions that affect part of a patient's digestive system, such as the stomach. Despite the fact that gastroendoscopy has many advantages for patients, there exist some challenges for practitioners, such as the lack of 3D perception, including depth and endoscope pose information. Such challenges make navigating the endoscope and localizing any found lesion in the digestive tract difficult. To tackle these problems, deep learning-based approaches have been proposed to provide monocular gastroendoscopy with additional yet important depth and pose information. In this paper, we propose a novel supervised approach to train depth and pose estimation networks using consecutive endoscopy images to assist endoscope navigation in the stomach. We first generate real depth and pose training data using our previously proposed whole-stomach 3D reconstruction pipeline, to avoid the poor generalization between computer-generated (CG) models and real data for the stomach. In addition, we propose a novel generalized photometric loss function to avoid the complicated process of finding proper weights for balancing the depth and pose loss terms, which is required by existing direct depth and pose supervision approaches. We then experimentally show that our proposed generalized loss performs better than existing direct supervision losses.
27
Recasens D, Lamarca J, Facil JM, Montiel JMM, Civera J. Endo-Depth-and-Motion: Reconstruction and Tracking in Endoscopic Videos Using Depth Networks and Photometric Constraints. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3095528]
28
Abstract
Haustral folds are colon wall protrusions implicated in the high polyp miss rate during optical colonoscopy procedures. If segmented accurately, haustral folds can allow for better estimation of the missed surface and can also serve as valuable landmarks for registering pre-treatment virtual (CT) and optical colonoscopies, to guide navigation towards the anomalies found in pre-treatment scans. We present a novel generative adversarial network, FoldIt, for feature-consistent image translation of optical colonoscopy videos to virtual colonoscopy renderings with haustral fold overlays. A new transitive loss is introduced in order to leverage ground truth information between haustral fold annotations and virtual colonoscopy renderings. We demonstrate the effectiveness of our model on challenging real optical colonoscopy videos as well as on textured virtual colonoscopy videos with clinician-verified haustral fold annotations. All code and scripts to reproduce the experiments of this paper will be made available via our Computational Endoscopy Platform at https://github.com/nadeemlab/CEP.
Affiliation(s)
- Shawn Mathew
- Department of Computer Science, Stony Brook University
- Saad Nadeem
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center
- Arie Kaufman
- Department of Computer Science, Stony Brook University
29
Itoh H, Oda M, Mori Y, Misawa M, Kudo SE, Imai K, Ito S, Hotta K, Takabatake H, Mori M, Natori H, Mori K. Unsupervised colonoscopic depth estimation by domain translations with a Lambertian-reflection keeping auxiliary task. Int J Comput Assist Radiol Surg 2021; 16:989-1001. [PMID: 34002340] [DOI: 10.1007/s11548-021-02398-x]
Abstract
PURPOSE A technique for extracting three-dimensional (3D) structure from a two-dimensional image is essential for the development of a computer-aided diagnosis (CAD) system for colonoscopy. However, a straightforward application of existing depth-estimation methods to colonoscopic images is impossible or inappropriate due to several limitations of colonoscopes. In particular, the absence of ground-truth depth for colonoscopic images hinders the application of supervised machine learning methods. To circumvent these difficulties, we developed an unsupervised and accurate depth-estimation method. METHOD We propose a novel unsupervised depth-estimation method that introduces a Lambertian-reflection model as an auxiliary task for domain translation between real and virtual colonoscopic images. This auxiliary task contributes to accurate depth estimation by maintaining the Lambertian-reflection assumption. In our experiments, we qualitatively evaluate the proposed method by comparing it with state-of-the-art unsupervised methods. Furthermore, we present two quantitative evaluations of the proposed method using a measuring device, as well as a new 3D reconstruction technique and measured polyp sizes. RESULTS Our proposed method achieved accurate depth estimation, with an average estimation error of less than 1 mm for regions close to the colonoscope in both quantitative evaluations. Qualitative evaluation showed that the introduced auxiliary task reduces the effects of specular reflections and colon-wall textures on depth estimation, and our method achieved smooth depth estimation without noise, validating the proposed approach. CONCLUSIONS We developed an accurate depth-estimation method based on a new type of unsupervised domain translation with an auxiliary task. This method is useful for the analysis of colonoscopic images and for the development of a CAD system, since it can extract accurate 3D information.
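The auxiliary task above relies on the Lambertian-reflection assumption, under which image intensity is proportional to the cosine between the surface normal and the light direction. A toy sketch of this shading model, with normals approximated from a depth map by finite differences (an illustrative simplification, not the authors' exact formulation):

```python
import numpy as np

def normals_from_depth(depth):
    """Approximate surface normals from a depth map via finite differences."""
    dzdx = np.gradient(depth, axis=1)
    dzdy = np.gradient(depth, axis=0)
    n = np.stack([-dzdx, -dzdy, np.ones_like(depth)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def lambertian_shading(normals, light_dir, albedo=1.0):
    """Lambertian reflection: intensity = albedo * max(0, n . l)."""
    ndotl = np.sum(normals * light_dir, axis=-1)
    return albedo * np.clip(ndotl, 0.0, None)
```

Keeping a rendered image consistent with this model during domain translation discourages the network from explaining specular highlights or wall textures as depth variation.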
30
Tong HS, Ng YL, Liu Z, Ho JDL, Chan PL, Chan JYK, Kwok KW. Real-to-virtual domain transfer-based depth estimation for real-time 3D annotation in transnasal surgery: a study of annotation accuracy and stability. Int J Comput Assist Radiol Surg 2021; 16:731-739. [PMID: 33786777] [PMCID: PMC8134290] [DOI: 10.1007/s11548-021-02346-9]
Abstract
PURPOSE Surgical annotation promotes effective communication between medical personnel during surgical procedures. However, existing approaches to 2D annotations are mostly static with respect to a display. In this work, we propose a method to achieve 3D annotations that anchor rigidly and stably to target structures upon camera movement in a transnasal endoscopic surgery setting. METHODS This is accomplished through intra-operative endoscope tracking and monocular depth estimation. A virtual endoscopic environment is utilized to train a supervised depth estimation network. An adversarial network transfers the style from the real endoscopic view to a synthetic-like view for input into the depth estimation network, wherein framewise depth can be obtained in real time. RESULTS (1) Accuracy: framewise depth was predicted from images captured from within a nasal airway phantom and compared with ground truth, achieving an SSIM value of 0.8310 ± 0.0655. (2) Stability: the mean absolute error (MAE) between reference and predicted depth of a target point was 1.1330 ± 0.9957 mm. CONCLUSION Both the accuracy and stability evaluations demonstrated the feasibility and practicality of our proposed method for achieving 3D annotations.
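The two reported metrics are standard: SSIM for framewise depth accuracy and mean absolute error for point-wise stability. A simplified single-window SSIM (whole-image means and variances rather than the usual sliding Gaussian window, so values will differ from library implementations) and MAE might look like:

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    """Simplified single-window SSIM: means/variances over the whole image."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def mae(pred, ref):
    """Mean absolute error between predicted and reference depth."""
    return np.abs(pred - ref).mean()
```

SSIM of an image with itself is 1.0 and MAE is 0.0; degraded predictions move both metrics away from these ideals.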
Affiliation(s)
- Hon-Sing Tong
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong
- Yui-Lun Ng
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong
- Zhiyu Liu
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong
- Justin D L Ho
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong
- Po-Ling Chan
- Department of Otorhinolaryngology, Head and Neck Surgery, The Chinese University of Hong Kong, Sha Tin, Hong Kong SAR
- Jason Y K Chan
- Department of Otorhinolaryngology, Head and Neck Surgery, The Chinese University of Hong Kong, Sha Tin, Hong Kong SAR
- Ka-Wai Kwok
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong
31
Sahu M, Mukhopadhyay A, Zachow S. Simulation-to-real domain adaptation with teacher-student learning for endoscopic instrument segmentation. Int J Comput Assist Radiol Surg 2021; 16:849-859. [PMID: 33982232] [PMCID: PMC8134307] [DOI: 10.1007/s11548-021-02383-4]
Abstract
PURPOSE Segmentation of surgical instruments in endoscopic video streams is essential for automated surgical scene understanding and process modeling. However, relying on fully supervised deep learning for this task is challenging because manual annotation occupies valuable time of the clinical experts. METHODS We introduce a teacher-student learning approach that learns jointly from annotated simulation data and unlabeled real data to tackle the challenges of simulation-to-real unsupervised domain adaptation for endoscopic image segmentation. RESULTS Empirical results on three datasets highlight the effectiveness of the proposed framework over current approaches for the endoscopic instrument segmentation task. Additionally, we provide an analysis of the major factors affecting performance on all datasets to highlight the strengths and failure modes of our approach. CONCLUSIONS We show that our proposed approach can successfully exploit the unlabeled real endoscopic video frames and improve generalization performance over pure simulation-based training and the previous state of the art. This takes us one step closer to effective segmentation of surgical instruments in the annotation-scarce setting.
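A common shape for such teacher-student adaptation can be sketched under generic assumptions (an EMA-updated teacher and confidence-thresholded pseudo-labels on the unlabeled real frames) that are illustrative rather than taken from this paper:

```python
import numpy as np

def ema_update(teacher, student, decay=0.99):
    """Teacher weights track the student as an exponential moving average."""
    for k in teacher:
        teacher[k] = decay * teacher[k] + (1 - decay) * student[k]

def pseudo_label_loss(student_probs, teacher_probs, threshold=0.8):
    """Train the student on real frames where the teacher is confident:
    cross-entropy against the teacher's hard labels, masked by confidence."""
    conf = teacher_probs.max(axis=-1)
    hard = teacher_probs.argmax(axis=-1)
    mask = conf > threshold
    if not mask.any():
        return 0.0
    picked = student_probs[np.arange(len(hard)), hard]
    return float(-np.log(np.clip(picked, 1e-8, 1.0))[mask].mean())
```

The student is additionally trained on the labeled simulation data with an ordinary supervised loss; the EMA teacher smooths out noisy pseudo-labels over training.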
Affiliation(s)
- Manish Sahu
- Zuse Institute Berlin (ZIB), Berlin, Germany
32
Ortega-Morán JF, Azpeitia Á, Sánchez-Peralta LF, Bote-Curiel L, Pagador B, Cabezón V, Saratxaga CL, Sánchez-Margallo FM. Medical needs related to the endoscopic technology and colonoscopy for colorectal cancer diagnosis. BMC Cancer 2021; 21:467. [PMID: 33902503] [PMCID: PMC8077886] [DOI: 10.1186/s12885-021-08190-z]
Abstract
Background The high incidence and mortality rate of colorectal cancer require new technologies to improve its early diagnosis. This study aims at extracting the medical needs related to the endoscopic technology and the colonoscopy procedure currently used for colorectal cancer diagnosis, essential for designing these demanded technologies. Methods Semi-structured interviews and an online survey were used. Results Six endoscopists were interviewed and 103 were surveyed, obtaining the demanded needs, which can be divided into: a) clinical needs, for better polyp detection and classification (especially flat polyps), location, size, margins and penetration depth; b) computer-aided diagnosis (CAD) system needs, for additional visual information supporting polyp characterization and diagnosis; and c) operational/physical needs, related to limitations of image quality, colon lighting, flexibility of the endoscope tip, and even poor bowel preparation. Conclusions This study presents some initiatives undertaken to meet the detected medical needs and the challenges still to be solved. The great potential of advanced optical technologies suggests their use for better polyp detection and classification, since they provide additional functional and structural information beyond the currently used image enhancement technologies. The inspection of remaining tissue of diminutive polyps (< 5 mm) should be addressed to reduce recurrence rates. Little progress has been made in estimating infiltration depth. Detection and classification methods should be combined into one CAD system, providing visual aids over polyps for detection and displaying a Kudo-based diagnosis suggestion to assist the endoscopist in real-time decision making. Estimated size and location of polyps should also be provided. Endoscopes with 360° vision are still a challenge not met by the mechanical and optical systems developed to improve colon inspection. Patients and healthcare providers should be trained to improve the patient's bowel preparation. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-021-08190-z.
Affiliation(s)
- Águeda Azpeitia
- Biobanco Vasco, Fundación Vasca de Investigaciones e Innovación Sanitaria (BIOEF), Ronda de Azkue, 1, 48902, Barakaldo, Spain
- Luis Bote-Curiel
- Jesús Usón Minimally Invasive Surgery Centre, Ctra. N-521, Km 41.8, 10071, Cáceres, Spain
- Blas Pagador
- Jesús Usón Minimally Invasive Surgery Centre, Ctra. N-521, Km 41.8, 10071, Cáceres, Spain
- Virginia Cabezón
- Biobanco Vasco, Fundación Vasca de Investigaciones e Innovación Sanitaria (BIOEF), Ronda de Azkue, 1, 48902, Barakaldo, Spain
- Cristina L Saratxaga
- TECNALIA, Basque Research and Technology Alliance (BRTA), Parque Tecnológico de Bizkaia, C/Geldo. Edificio 700, E-48160, Derio, Bizkaia, Spain
33
Hwang SJ, Park SJ, Kim GM, Baek JH. Unsupervised Monocular Depth Estimation for Colonoscope System Using Feedback Network. Sensors (Basel) 2021; 21:2691. [PMID: 33920357] [PMCID: PMC8069522] [DOI: 10.3390/s21082691]
Abstract
A colonoscopy is a medical examination used to check for disease or abnormalities in the large intestine. If necessary, polyps or adenomas are removed through the scope during the procedure, which can prevent colorectal cancer. However, the polyp detection rate differs depending on the condition and skill level of the endoscopist; some endoscopists have up to a 90% chance of missing an adenoma. Artificial intelligence and robot technologies for colonoscopy are being studied to compensate for these problems. In this study, we propose self-supervised monocular depth estimation using spatiotemporal consistency in the colon environment. Our contributions are a loss function for reconstruction errors between adjacent predicted depths and a depth feedback network that uses the predicted depth of the previous frame to predict the depth of the next frame. We performed quantitative and qualitative evaluations of our approach, and the proposed FBNet (depth FeedBack Network) outperformed state-of-the-art results for unsupervised depth estimation on the UCL datasets.
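The depth-feedback idea, in which the previous frame's predicted depth is fed back as an extra network input, can be sketched as a simple loop. `predict_depth` is a placeholder for the trained network, not an API from the paper:

```python
import numpy as np

def run_feedback(frames, predict_depth):
    """Depth feedback loop: each prediction is fed back as an extra input
    when estimating the next frame's depth (FBNet idea, sketched)."""
    depths = []
    prev = np.zeros_like(frames[0], dtype=float)  # no feedback for the first frame
    for f in frames:
        d = predict_depth(f, prev)  # network consumes image + previous depth
        depths.append(d)
        prev = d
    return depths
```

A temporal-consistency loss can then penalize disagreement between adjacent predicted depths, which is the reconstruction-error term the abstract describes.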
34
Marzullo A, Moccia S, Catellani M, Calimeri F, Momi ED. Towards realistic laparoscopic image generation using image-domain translation. Comput Methods Programs Biomed 2021; 200:105834. [PMID: 33229016] [DOI: 10.1016/j.cmpb.2020.105834]
Abstract
Background and Objectives: Over the last decade, Deep Learning (DL) has revolutionized data analysis in many areas, including medical imaging. However, the advancement of DL in the surgery field is bottlenecked by a shortage of large-scale data, which in turn may be attributed to the lack of a structured and standardized methodology for storing and analyzing surgical images in clinical centres. Furthermore, accurate manual annotations are expensive and time consuming. The synthesis of artificial images can be a great help; in this context, in recent years, Generative Adversarial Networks (GANs) have achieved promising results in producing photo-realistic images. Methods: In this study, a method for Minimally Invasive Surgery (MIS) image synthesis is proposed. To this aim, the generative adversarial network pix2pix is trained to generate paired annotated MIS images by transforming rough segmentations of surgical instruments and tissues into realistic images. An additional regularization term was added to the original optimization problem in order to enhance the realism of surgical tools with respect to the background. Results: Quantitative and qualitative (i.e., human-based) evaluations of generated images were carried out to assess the effectiveness of the method. Conclusions: Experimental results show that the proposed method is able to translate MIS segmentations into realistic MIS images, which can in turn be used to augment existing data sets and help overcome the lack of useful images; this allows physicians and algorithms to benefit from new annotated instances for their training.
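A pix2pix-style generator objective with an extra mask-weighted term, as the abstract describes, could be sketched as follows. The instrument-mask term and its weight `mu` are illustrative assumptions, not the authors' exact regularizer:

```python
import numpy as np

def generator_loss(fake_logit, fake_img, real_img, tool_mask, lam=100.0, mu=10.0):
    """pix2pix-style generator objective: adversarial term + L1 reconstruction,
    plus a hypothetical extra L1 term restricted to the instrument mask."""
    # Non-saturating BCE pushing the discriminator logit on fakes toward 'real'
    adv = np.log1p(np.exp(-fake_logit)).mean()
    l1 = np.abs(fake_img - real_img).mean()
    # Extra penalty only over instrument pixels, to sharpen tool realism
    tool_l1 = (np.abs(fake_img - real_img) * tool_mask).sum() / max(tool_mask.sum(), 1)
    return adv + lam * l1 + mu * tool_l1
```

The large `lam` on the global L1 term follows the original pix2pix recipe; the mask term simply up-weights reconstruction error on the foreground region of interest.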
Affiliation(s)
- Aldo Marzullo
- Department of Mathematics and Computer Science, University of Calabria, Rende, Italy.
- Sara Moccia
- Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy; Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
- Michele Catellani
- Department of Urology, European Institute of Oncology, IRCCS, Milan, Italy
- Francesco Calimeri
- Department of Mathematics and Computer Science, University of Calabria, Rende, Italy
- Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
35
Funama Y, Oda S, Kidoh M, Nagayama Y, Goto M, Sakabe D, Nakaura T. Conditional generative adversarial networks to generate pseudo low monoenergetic CT image from a single-tube voltage CT scanner. Phys Med 2021; 83:46-51. [DOI: 10.1016/j.ejmp.2021.02.015]
36
Widya AR, Monno Y, Okutomi M, Suzuki S, Gotoda T, Miki K. Stomach 3D Reconstruction Using Virtual Chromoendoscopic Images. IEEE J Transl Eng Health Med 2021; 9:1700211. [PMID: 33796417] [PMCID: PMC8009143] [DOI: 10.1109/jtehm.2021.3062226]
Abstract
Gastric endoscopy is a gold-standard clinical procedure that enables medical practitioners to diagnose various lesions inside a patient's stomach. If a lesion is found, successfully identifying its location relative to the global view of the stomach leads to better decision making for the next clinical treatment. Our previous research showed that lesion localization can be achieved by reconstructing the whole stomach shape from chromoendoscopic indigo carmine (IC) dye-sprayed images using a structure-from-motion (SfM) pipeline. However, spraying the IC dye over the whole stomach requires additional time, which is not desirable for either patients or practitioners. Our objective is to propose an alternative way to achieve whole-stomach 3D reconstruction without the need for the IC dye. We generate virtual IC-sprayed (VIC) images based on image-to-image style translation trained on unpaired real no-IC and IC-sprayed images, and we investigate the effect of input and output color-channel selection for generating the VIC images. We validate our reconstruction results by comparing them with the results using real IC-sprayed images and confirm that the obtained stomach 3D structures are comparable to each other. We also propose a local reconstruction technique to obtain a more detailed surface and texture around an interesting region. The proposed method achieves whole-stomach reconstruction using SfM without the need for real IC dye. We found that translating no-IC green-channel images to IC-sprayed red-channel images gives the best SfM reconstruction result. Clinical impact: We offer a method for frame localization and local 3D reconstruction of a found gastric lesion using standard endoscopy images, leading to better clinical decisions.
Affiliation(s)
- Aji Resindra Widya
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Yusuke Monno
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Masatoshi Okutomi
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Sho Suzuki
- Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Tokyo 101-8309, Japan
- Takuji Gotoda
- Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Tokyo 101-8309, Japan
- Kenji Miki
- Department of Internal Medicine, Tsujinaka Hospital Kashiwanoha, Kashiwa 277-0871, Japan
37
Tukra S, Lidströmer N, Ashrafian H, Giannarou S. AI in Surgical Robotics. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_323-1]
38
Marzullo A, Moccia S, Calimeri F, De Momi E. AIM in Endoscopy Procedures. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_164-1]
39
Ahmad OF, Stassen P, Webster GJ. Artificial intelligence in biliopancreatic endoscopy: Is there any role? Best Pract Res Clin Gastroenterol 2020; 52-53:101724. [PMID: 34172251] [DOI: 10.1016/j.bpg.2020.101724]
Abstract
Artificial intelligence (AI) research in endoscopy is being translated at a rapid pace, with a number of approved devices now available for use in luminal endoscopy. However, the published literature for AI in biliopancreatic endoscopy is predominantly limited to early pre-clinical studies, including applications for diagnostic EUS and patient risk stratification. Potential future use cases are highlighted in this manuscript, including optical characterisation of strictures during cholangioscopy, prediction of post-ERCP acute pancreatitis and selective biliary duct cannulation difficulty, automated report generation, and novel AI-based quality key performance metrics. To realise the full potential of AI and accelerate innovation, it is crucial that robust inter-disciplinary collaborations are formed between biliopancreatic endoscopists and AI researchers.
Affiliation(s)
- Omer F Ahmad
- Department of Gastroenterology, University College London Hospitals NHS Foundation Trust, 250 Euston Road, London, NW1 2BU, United Kingdom; Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, Charles Bell House, 43-45 Foley Street, London, W1W 7TS, United Kingdom.
- Pauline Stassen
- Erasmus MC University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, Netherlands
- George J Webster
- Department of Gastroenterology, University College London Hospitals NHS Foundation Trust, 250 Euston Road, London, NW1 2BU, United Kingdom
40
Freedman D, Blau Y, Katzir L, Aides A, Shimshoni I, Veikherman D, Golany T, Gordon A, Corrado G, Matias Y, Rivlin E. Detecting Deficient Coverage in Colonoscopies. IEEE Trans Med Imaging 2020; 39:3451-3462. [PMID: 32746092] [DOI: 10.1109/tmi.2020.2994221]
Abstract
Colonoscopy is the tool of choice for preventing colorectal cancer, by detecting and removing polyps before they become cancerous. However, colonoscopy is hampered by the fact that endoscopists routinely miss 22-28% of polyps. While some of these missed polyps appear in the endoscopist's field of view, others are missed simply because of substandard coverage of the procedure, i.e. not all of the colon is seen. This paper attempts to rectify the problem of substandard coverage in colonoscopy through the introduction of the C2D2 (Colonoscopy Coverage Deficiency via Depth) algorithm, which detects deficient coverage and can thereby alert the endoscopist to revisit a given area. More specifically, C2D2 consists of two separate algorithms: the first performs depth estimation of the colon given an ordinary RGB video stream, while the second computes coverage given these depth estimates. Rather than compute coverage for the entire colon, our algorithm computes coverage locally, on a segment-by-segment basis; C2D2 can then indicate in real time whether a particular area of the colon has suffered from deficient coverage, and if so the endoscopist can return to that area. Our coverage algorithm is the first such algorithm to be evaluated in a large-scale way, while our depth estimation technique is the first calibration-free unsupervised method applied to colonoscopies. The C2D2 algorithm achieves state-of-the-art results in the detection of deficient coverage. On synthetic sequences with ground truth, it is 2.4 times more accurate than human experts, while on real sequences, C2D2 achieves 93.0% agreement with experts.
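The segment-by-segment coverage idea can be illustrated with a toy score: after back-projecting depth estimates to 3D, count how many angular sectors around the local colon axis contain reconstructed surface points. This is a hypothetical simplification for intuition, not C2D2's actual coverage computation:

```python
import numpy as np

def sector_coverage(points_xy, n_sectors=36):
    """Toy local-coverage score: fraction of angular sectors around the colon
    centerline (z-axis) that contain at least one reconstructed surface point."""
    ang = np.arctan2(points_xy[:, 1], points_xy[:, 0])      # angle of each point
    bins = ((ang + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    return np.unique(bins).size / n_sectors                 # fraction of sectors seen
```

A segment whose score falls below a threshold would be the kind of region flagged for the endoscopist to revisit.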
41
Widya AR, Monno Y, Okutomi M, Suzuki S, Gotoda T, Miki K. Stomach 3D Reconstruction Based on Virtual Chromoendoscopic Image Generation. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2020:1848-1852. [PMID: 33018360] [DOI: 10.1109/embc44109.2020.9176016]
Abstract
Gastric endoscopy is a standard clinical process that enables medical practitioners to diagnose various lesions inside a patient's stomach. If any lesion is found, it is very important to perceive the location of the lesion relative to the global view of the stomach. Our previous research showed that this could be addressed by reconstructing the whole stomach shape from chromoendoscopic images using a structure-from-motion (SfM) pipeline, in which indigo carmine (IC) blue dye-sprayed images were used to increase feature matches for SfM by enhancing stomach surface's textures. However, spraying the IC dye to the whole stomach requires additional time, labor, and cost, which is not desirable for patients and practitioners. In this paper, we propose an alternative way to achieve whole stomach 3D reconstruction without the need of the IC dye by generating virtual IC-sprayed (VIC) images based on image-to-image style translation trained on unpaired real no-IC and IC-sprayed images. We have specifically investigated the effect of input and output color channel selection for generating the VIC images and found that translating no-IC green-channel images to IC-sprayed red-channel images gives the best SfM reconstruction result.
42
Chadebecq F, Vasconcelos F, Mazomenos E, Stoyanov D. Computer Vision in the Surgical Operating Room. Visc Med 2020; 36:456-462. [PMID: 33447601] [DOI: 10.1159/000511934]
Abstract
Background: Multiple types of surgical cameras are used in modern surgical practice and provide a rich visual signal that is used by surgeons to visualize the clinical site and make clinical decisions. This signal can also be used by artificial intelligence (AI) methods to provide support in identifying instruments, structures, or activities both in real time during procedures and postoperatively for analytics and understanding of surgical processes. Summary: In this paper, we provide a succinct perspective on the use of AI and especially computer vision to power solutions for the surgical operating room (OR). The synergy between data availability and technical advances in computational power and AI methodology has led to rapid developments in the field and promising advances. Key Messages: With the increasing availability of surgical video sources and the convergence of technologies around video storage, processing, and understanding, we believe clinical solutions and products leveraging vision will become an important component of modern surgical capabilities. However, both technical and clinical challenges remain to be overcome to efficiently integrate vision-based approaches into the clinic.
Affiliation(s)
- François Chadebecq
- Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, United Kingdom
- Francisco Vasconcelos
- Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, United Kingdom
- Evangelos Mazomenos
- Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, United Kingdom
- Danail Stoyanov
- Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, United Kingdom
43
Attanasio A, Scaglioni B, Leonetti M, Frangi AF, Cross W, Biyani CS, Valdastri P. Autonomous Tissue Retraction in Robotic Assisted Minimally Invasive Surgery – A Feasibility Study. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3013914]
44
Mathew S, Nadeem S, Kumari S, Kaufman A. Augmenting Colonoscopy using Extended and Directional CycleGAN for Lossy Image Translation. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2020; 2020:4695-4704. [PMID: 33456298] [PMCID: PMC7811175] [DOI: 10.1109/cvpr42600.2020.00475]
Abstract
Colorectal cancer screening modalities, such as optical colonoscopy (OC) and virtual colonoscopy (VC), are critical for diagnosing and ultimately removing polyps (precursors of colon cancer). The non-invasive VC is normally used to inspect a 3D reconstructed colon (from CT scans) for polyps and, if found, the OC procedure is performed to physically traverse the colon via endoscope and remove these polyps. In this paper, we present a deep learning framework, Extended and Directional CycleGAN, for lossy unpaired image-to-image translation between OC and VC to augment OC video sequences with scale-consistent depth information from VC, and augment VC with patient-specific textures, color, and specular highlights from OC (e.g., for realistic polyp synthesis). Both OC and VC contain structural information, but it is obscured in OC by additional patient-specific texture and specular highlights, hence making the translation from OC to VC lossy. The existing CycleGAN approaches do not handle lossy transformations. To address this shortcoming, we introduce an extended cycle consistency loss, which compares the geometric structures from OC in the VC domain. This loss removes the need for the CycleGAN to embed OC information in the VC domain. To handle a stronger removal of the textures and lighting, a Directional Discriminator is introduced to differentiate the direction of translation (by creating paired information for the discriminator), as opposed to the standard CycleGAN, which is direction-agnostic. Combining the extended cycle consistency loss and the Directional Discriminator, we show state-of-the-art results on scale-consistent depth inference for phantom, textured VC, and for real polyp and normal colon video sequences. We also present results for realistic pedunculated and flat polyp synthesis from bumps introduced in 3D VC models.
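The contrast between the standard cycle-consistency loss and the extended variant can be sketched abstractly: the standard loss compares the round trip in the lossy OC domain (forcing texture to be hidden in the VC image), while the extended loss maps the reconstruction back to VC and compares structure there. A schematic with placeholder generator functions (`g_oc2vc`, `g_vc2oc` stand in for the two trained generators):

```python
import numpy as np

def cycle_losses(x_oc, g_oc2vc, g_vc2oc):
    """Standard cycle loss vs. an 'extended' variant that compares the cycle
    in the lossless (VC) domain, so OC texture need not be embedded in VC."""
    vc = g_oc2vc(x_oc)                               # OC -> VC (lossy direction)
    oc_rec = g_vc2oc(vc)                             # VC -> OC reconstruction
    standard = np.abs(oc_rec - x_oc).mean()          # round trip in OC domain
    extended = np.abs(g_oc2vc(oc_rec) - vc).mean()   # compare structure in VC domain
    return standard, extended
```

Under the extended loss, the reconstruction only needs to carry the geometry forward, which matches the paper's goal of not forcing the VC image to encode OC textures.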
45
Ciuti G, Skonieczna-Żydecka K, Marlicz W, Iacovacci V, Liu H, Stoyanov D, Arezzo A, Chiurazzi M, Toth E, Thorlacius H, Dario P, Koulaouzidis A. Frontiers of Robotic Colonoscopy: A Comprehensive Review of Robotic Colonoscopes and Technologies. J Clin Med 2020; 9:E1648. [PMID: 32486374] [PMCID: PMC7356873] [DOI: 10.3390/jcm9061648]
Abstract
Flexible colonoscopy remains the primary means of screening for colorectal cancer (CRC) and the gold standard of all population-based screening pathways around the world. Almost 60% of CRC deaths could be prevented with screening. However, colonoscopy attendance rates are affected by discomfort, fear of pain and embarrassment, or loss of control during the procedure. Moreover, the emergence and global threat of new communicable diseases might seriously affect the functioning of contemporary centres performing gastrointestinal endoscopy. Innovative solutions are needed: artificial intelligence (AI) and physical robotics will contribute substantially to the future of healthcare services. The translation of robotic technologies from traditional surgery to minimally invasive endoscopic interventions is an emerging field, challenged mainly by the tough requirements for miniaturization. Pioneering approaches to robotic colonoscopy were reported in the nineties, with the appearance of inchworm-like devices. Since then, robotic colonoscopes with assistive functionalities have become commercially available. Research prototypes promise enhanced accessibility and flexibility for future therapeutic interventions, even via autonomous or robotic-assisted agents, such as robotic capsules. Furthermore, the pairing of such endoscopic systems with AI-enabled image analysis and recognition methods promises enhanced diagnostic yield. By assembling a multidisciplinary team of engineers and endoscopists, this paper aims to provide a contemporary, highly pictorial critical review of robotic colonoscopes, providing clinicians and researchers with a glimpse of the major changes and challenges that lie ahead.
Affiliation(s)
- Gastone Ciuti
- The BioRobotics Institute, Scuola Superiore Sant’Anna, 56025 Pisa, Italy
- Department of Excellence in Robotics & AI, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
- Karolina Skonieczna-Żydecka
- Department of Human Nutrition and Metabolomics, Pomeranian Medical University in Szczecin, 71-460 Szczecin, Poland
- Wojciech Marlicz
- Department of Gastroenterology, Pomeranian Medical University in Szczecin, 71-252 Szczecin, Poland
- Endoklinika sp. z o.o., 70-535 Szczecin, Poland
- Veronica Iacovacci
- The BioRobotics Institute, Scuola Superiore Sant’Anna, 56025 Pisa, Italy
- Department of Excellence in Robotics & AI, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
- Hongbin Liu
- School of Biomedical Engineering & Imaging Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, UK
- Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London W1W 7TY, UK
- Alberto Arezzo
- Department of Surgical Sciences, University of Torino, 10126 Torino, Italy
- Marcello Chiurazzi
- The BioRobotics Institute, Scuola Superiore Sant’Anna, 56025 Pisa, Italy
- Department of Excellence in Robotics & AI, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
- Ervin Toth
- Department of Gastroenterology, Skåne University Hospital, Lund University, 20502 Malmö, Sweden
- Henrik Thorlacius
- Department of Clinical Sciences, Section of Surgery, Lund University, 20502 Malmö, Sweden
- Paolo Dario
- The BioRobotics Institute, Scuola Superiore Sant’Anna, 56025 Pisa, Italy
- Department of Excellence in Robotics & AI, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
46
Xu L, Li J, Hao Y, Zhang P, Ciuti G, Dario P, Huang Q. Depth Estimation for Local Colon Structure in Monocular Capsule Endoscopy Based on Brightness and Camera Motion. Robotica 2021; 39:334-345. [DOI: 10.1017/s0263574720000399]
Abstract
We present a 3D reconstruction method using brightness and camera motion estimation for registering local colon structure in colonoscopy. The proposed method is based on reverse projection from 2D fold contours to 3D space, motion estimation from 3D reconstructed points between neighboring frames, and model registration to reconstruct the fold structure. On a synthetic colon, the average reconstructed depth error and circumference error are about 14.2% and 15.2%, respectively. This accuracy is sufficient for the navigation and control of a capsule robot. This work demonstrates that the proposed method is superior to methods based on single-frame brightness intensity.
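The single-frame brightness baseline that the authors compare against can be sketched as follows. This is an illustrative assumption-laden cue, not the paper's multi-frame method: it assumes a point light co-located with the camera, constant albedo, and near-frontal surfaces, so that intensity falls off with the inverse square of depth.

```python
import numpy as np

def depth_from_brightness(intensity, eps=1e-6):
    """Relative depth from a single frame's brightness.

    Under a point light co-located with the camera, constant albedo and
    near-frontal surfaces, I ~ 1 / d**2, hence d ~ 1 / sqrt(I).  The
    output is normalised to [0, 1]; only relative depth is meaningful.
    """
    d = 1.0 / np.sqrt(np.clip(intensity, eps, None))
    d -= d.min()
    return d / (d.max() + eps)

# Synthetic frame fading from bright (near) to dark (far).
frame = np.linspace(1.0, 0.04, 64).reshape(8, 8)
rel_depth = depth_from_brightness(frame)  # darker pixels map to larger depth
```

Because every assumption above breaks somewhere in a real colon (specularities, varying albedo, oblique folds), combining brightness with camera-motion cues, as the paper does, is the more robust route.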
47
Choi J, Shin K, Jung J, Bae HJ, Kim DH, Byeon JS, Kim N. Convolutional Neural Network Technology in Endoscopic Imaging: Artificial Intelligence for Endoscopy. Clin Endosc 2020; 53:117-126. [PMID: 32252504] [PMCID: PMC7137563] [DOI: 10.5946/ce.2020.054]
Abstract
Recently, significant improvements have been made in artificial intelligence. The artificial neural network was introduced in the 1950s. However, because of the low computing power and insufficient datasets available at the time, artificial neural networks suffered from overfitting and vanishing-gradient problems when training deep networks. The concept has become more promising owing to enhanced big-data processing capability, improvements in computing power with parallel processing units, and new algorithms for deep neural networks, which are becoming increasingly successful and attracting interest in many domains, including computer vision, speech recognition, and natural language processing. Recent studies in this technology augur well for medical and healthcare applications, especially in endoscopic imaging. This paper provides perspectives on the history, development, applications, and challenges of deep-learning technology.
Affiliation(s)
- Joonmyeong Choi
- Department of Convergence Medicine, University of Ulsan College of Medicine, Seoul, Korea
- Keewon Shin
- Department of Convergence Medicine, University of Ulsan College of Medicine, Seoul, Korea
- Do Hoon Kim
- Department of Gastroenterology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Jeong-Sik Byeon
- Department of Gastroenterology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Namkug Kim
- Department of Convergence Medicine, University of Ulsan College of Medicine, Seoul, Korea
- Department of Radiology, Asan Medical Center, Seoul, Korea
48
Armin MA, Barnes N, Grimpen F, Salvado O. Learning colon centreline from optical colonoscopy, a new way to generate a map of the internal colon surface. Healthc Technol Lett 2020; 6:187-190. [PMID: 32038855] [PMCID: PMC6952246] [DOI: 10.1049/htl.2019.0073]
Abstract
Optical colonoscopy is the gold-standard screening method for detecting and removing cancerous polyps. During this procedure, some polyps may go undetected because of their position, because they are not covered by the camera, or because they are missed by the surgeon. In this Letter, the authors introduce a novel convolutional neural network (ConvNet) algorithm to map the internal colon surface to a 2D map (visibility map), which can be used to increase clinicians' awareness of areas they might miss. This was achieved by leveraging a colonoscopy simulator to generate a dataset consisting of colonoscopy video frames and their corresponding colon centreline (CCL) points in 3D camera coordinates. A pair of video frames was used as input to a ConvNet, whereas the output was a point on the CCL and its direction vector. By knowing the CCL for each frame and roughly modelling the colon as a cylinder, frames could be unrolled to build a visibility map. The authors validated their results using both simulated and real colonoscopy frames, showing that a CCL learned from consecutive simulated frames generalises to real colonoscopy video frames for generating a visibility map.
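The cylinder-unrolling step can be sketched as follows. This is an illustrative reconstruction of the idea, not the authors' code: it additionally assumes per-pixel depth and camera intrinsics `K` are available alongside the predicted CCL point and direction, and simply marks each observed pixel on an unrolled (angle, axial-position) grid.

```python
import numpy as np

def unroll_to_visibility_map(depth, K, ccl_point, ccl_dir,
                             n_theta=64, n_z=32, z_max=5.0):
    """Mark the colon surface cells seen in one frame on an unrolled map.

    The colon is roughly modelled as a cylinder around the centreline:
    pixels are back-projected with `depth` and intrinsics `K`, expressed
    relative to the CCL point and axis, and binned by angle around the
    axis and position along it.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    rays = np.stack([(u - K[0, 2]) / K[0, 0],
                     (v - K[1, 2]) / K[1, 1],
                     np.ones_like(depth)], axis=-1)
    pts = (rays * depth[..., None]).reshape(-1, 3)   # back-projected 3D points
    rel = pts - ccl_point                            # relative to the centreline
    a = ccl_dir / np.linalg.norm(ccl_dir)
    z = rel @ a                                      # position along the axis
    radial = rel - np.outer(z, a)                    # component around the axis
    up = np.array([0.0, 0.0, 1.0])
    if abs(a @ up) > 0.9:                            # axis ~ parallel to z: use x
        up = np.array([1.0, 0.0, 0.0])
    b1 = np.cross(a, up)
    b1 /= np.linalg.norm(b1)                         # in-plane basis for the angle
    b2 = np.cross(a, b1)
    theta = np.arctan2(radial @ b2, radial @ b1)
    vis = np.zeros((n_theta, n_z), dtype=bool)
    ti = ((theta + np.pi) / (2 * np.pi) * n_theta).astype(int) % n_theta
    zi = np.clip((z / z_max * n_z).astype(int), 0, n_z - 1)
    vis[ti, zi] = True
    return vis

# Flat synthetic frame 2 units in front of the camera, centreline along z.
K = np.array([[100.0, 0.0, 4.0], [0.0, 100.0, 4.0], [0.0, 0.0, 1.0]])
vis = unroll_to_visibility_map(np.full((8, 8), 2.0), K,
                               np.array([0.0, 0.0, -1.0]),
                               np.array([0.0, 0.0, 1.0]))
```

Accumulating such per-frame maps with logical OR over a sequence yields the coverage map; cells that remain False flag colon surface the camera never saw.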
Affiliation(s)
- Nick Barnes
- CSIRO (Data61) 3D Computer Vision, Canberra, Australia
- College of Engineering and Computer Science (ANU), Canberra, Australia
- Florian Grimpen
- Department of Gastroenterology and Hepatology, Royal Brisbane and Women's Hospital, Brisbane, Australia
49
Widya AR, Monno Y, Okutomi M, Suzuki S, Gotoda T, Miki K. Whole Stomach 3D Reconstruction and Frame Localization From Monocular Endoscope Video. IEEE J Transl Eng Health Med 2019; 7:3300310. [PMID: 32309059] [PMCID: PMC6830857] [DOI: 10.1109/jtehm.2019.2946802]
Abstract
Gastric endoscopy is a common clinical practice that enables medical doctors to diagnose various lesions inside the stomach. To identify the location of a gastric lesion such as an early cancer or a peptic ulcer within the stomach, this work reconstructs a color-textured 3D model of the whole stomach from a standard monocular endoscope video and localizes any selected video frame on the 3D model. We examine how to enable structure-from-motion (SfM) to reconstruct the whole shape of the stomach from endoscope images, a challenging task due to the texture-less nature of the stomach surface. We specifically investigate the combined effect of chromo-endoscopy and color-channel selection on SfM to increase the number of feature points. We also design a plane-fitting-based algorithm for removing 3D point outliers to improve the quality of the 3D model. We show that whole-stomach 3D reconstruction can be achieved (more than 90% of the frames can be reconstructed) by using red-channel images captured under chromo-endoscopy with indigo carmine (IC) dye spread on the stomach surface. In experimental results, we demonstrate the reconstructed 3D models for seven subjects and the application to lesion localization and reconstruction. The methodology and results presented in this paper could serve as a valuable reference for other researchers and as a practical tool for gastric surgeons in various computer-aided diagnosis applications.
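The plane-fitting-based outlier removal is not detailed in the abstract; a minimal stand-in, assuming a single dominant local plane, fits a least-squares plane via SVD and discards points beyond a point-to-plane distance threshold.

```python
import numpy as np

def plane_fit_outlier_filter(points, thresh=0.05):
    """Keep only points close to the least-squares plane of the cloud.

    The plane normal is the right singular vector associated with the
    smallest singular value of the centred point cloud; points whose
    point-to-plane distance exceeds `thresh` are discarded.  This is a
    sketch of the idea, not the paper's exact procedure.
    """
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                                  # direction of least variance
    dist = np.abs((points - centroid) @ normal)      # point-to-plane distances
    return points[dist <= thresh]

# 50 points near the z = 0 plane plus two gross outliers.
rng = np.random.default_rng(1)
inliers = np.column_stack([rng.random((50, 2)), rng.normal(0.0, 0.01, 50)])
outliers = np.array([[0.5, 0.5, 0.5], [0.2, 0.8, -0.5]])
filtered = plane_fit_outlier_filter(np.vstack([inliers, outliers]))
```

In practice such a filter would be applied per local neighbourhood of the SfM point cloud, since the stomach wall is only locally planar.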
Affiliation(s)
- Aji Resindra Widya
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Yusuke Monno
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Masatoshi Okutomi
- Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
- Sho Suzuki
- Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Tokyo 101-8309, Japan
- Takuji Gotoda
- Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Tokyo 101-8309, Japan
- Kenji Miki
- Department of Internal Medicine, Tsujinaka Hospital Kashiwanoha, Kashiwa 277-0871, Japan