1
Liu S, Fan J, Yang Y, Xiao D, Ai D, Song H, Wang Y, Yang J. Monocular endoscopy images depth estimation with multi-scale residual fusion. Comput Biol Med 2024; 169:107850. [PMID: 38145602] [DOI: 10.1016/j.compbiomed.2023.107850]
Abstract
BACKGROUND Monocular depth estimation plays a fundamental role in clinical endoscopic surgery. However, the coherent illumination, smooth surfaces, and texture-less nature of endoscopy images pose significant challenges, and traditional depth estimation methods struggle to perceive depth accurately in such settings. METHOD To overcome these challenges, this paper proposes a novel multi-scale residual fusion method for estimating the depth of monocular endoscopy images. Specifically, we address coherent illumination with an image frequency-domain component space transformation that enhances the stability of the scene's light source. We then employ an image radiation intensity attenuation model to estimate an initial depth map. Finally, we refine the depth estimate with a multi-scale residual fusion optimization technique. RESULTS Extensive experiments were conducted on public datasets. The structural similarity between continuous frames in three distinct clinical scenes reached 0.94, 0.82, and 0.84, respectively, demonstrating that the approach captures the fine details of endoscopy images. Depth estimation accuracy reached 89.3% and 91.2% on the two model datasets, respectively, underscoring the robustness of the method. CONCLUSIONS The promising results on public datasets highlight the potential of the method for clinical applications, facilitating reliable depth estimation and enhancing the quality of endoscopic surgical procedures.
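The radiation-intensity attenuation step can be illustrated with a toy sketch: under an inverse-square falloff, brighter pixels are assumed to be closer to the light source. This is a simplified illustration of the idea, not the paper's implementation; the constant k and the inverse-square model are assumptions.

```python
import numpy as np

def depth_from_attenuation(intensity, k=1.0, eps=1e-6):
    """Toy initial depth map from an inverse-square attenuation model:
    observed intensity I ~ k / d**2, hence d ~ sqrt(k / I)."""
    i = np.clip(np.asarray(intensity, dtype=np.float64), eps, None)
    return np.sqrt(k / i)
```

In practice such a map is only a coarse initialisation and, as the abstract notes, needs refinement (here, by multi-scale residual fusion optimisation).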
Affiliation(s)
- Shiyuan Liu
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China; China Center for Information Industry Development, Beijing 100081, China
- Jingfan Fan
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Yun Yang
- Department of General Surgery, Beijing Friendship Hospital, Capital Medical University, National Clinical Research Center for Digestive Diseases, Beijing 100050, China
- Deqiang Xiao
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Danni Ai
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Hong Song
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Yongtian Wang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Jian Yang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
2
Liu S, Fan J, Zang L, Yang Y, Fu T, Song H, Wang Y, Yang J. Pose estimation via structure-depth information from monocular endoscopy images sequence. Biomed Opt Express 2024; 15:460-478. [PMID: 38223180] [PMCID: PMC10783895] [DOI: 10.1364/boe.498262]
Abstract
Image-based endoscopy pose estimation has been shown to significantly improve the visualization and accuracy of minimally invasive surgery (MIS). This paper proposes a pose estimation method based on structure-depth information from a monocular endoscopy image sequence. Firstly, the initial frame location is constrained using an image structure difference (ISD) network. Secondly, endoscopy image depth information is used to estimate the pose of sequence frames. Finally, adaptive boundary constraints optimize the pose estimation of continuous frames, yielding more accurate intraoperative endoscopy poses. Evaluations on publicly available datasets gave pose estimation errors of 1.43 mm and 3.64 mm on bronchoscopy and colonoscopy datasets, respectively, while meeting the real-time requirements of various scenarios. This demonstrates that the method generates reliable pose estimates for endoscopy images and can accurately localize the endoscope during surgery, assisting physicians in performing safer and more effective procedures.
Affiliation(s)
- Shiyuan Liu
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- China Center for Information Industry Development, Beijing 100081, China
- Jingfan Fan
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Liugeng Zang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Yun Yang
- Department of General Surgery, Beijing Friendship Hospital, Capital Medical University; National Clinical Research Center for Digestive Diseases, Beijing 100050, China
- Tianyu Fu
- Institute of Engineering Medicine, Beijing Institute of Technology, Beijing 100081, China
- Hong Song
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Yongtian Wang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Jian Yang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
3
Bobrow TL, Golhar M, Vijayan R, Akshintala VS, Garcia JR, Durr NJ. Colonoscopy 3D video dataset with paired depth from 2D-3D registration. Med Image Anal 2023; 90:102956. [PMID: 37713764] [PMCID: PMC10591895] [DOI: 10.1016/j.media.2023.102956]
Abstract
Screening colonoscopy is an important clinical application for several 3D computer vision techniques, including depth estimation, surface reconstruction, and missing region detection. However, the development, evaluation, and comparison of these techniques in real colonoscopy videos remain largely qualitative due to the difficulty of acquiring ground truth data. In this work, we present a Colonoscopy 3D Video Dataset (C3VD) acquired with a high-definition clinical colonoscope and high-fidelity colon models for benchmarking computer vision methods in colonoscopy. We introduce a novel multimodal 2D-3D registration technique to register optical video sequences with ground truth rendered views of a known 3D model. The different modalities are registered by transforming optical images to depth maps with a Generative Adversarial Network and aligning edge features with an evolutionary optimizer. This registration method achieves an average translation error of 0.321 millimeters and an average rotation error of 0.159 degrees in simulation experiments where error-free ground truth is available. The method also leverages video information, improving registration accuracy by 55.6% for translation and 60.4% for rotation compared to single-frame registration. Twenty-two short video sequences were registered to generate 10,015 total frames with paired ground truth depth, surface normals, optical flow, occlusion, six-degree-of-freedom pose, coverage maps, and 3D models. The dataset also includes screening videos acquired by a gastroenterologist with paired ground truth pose and 3D surface models. The dataset and registration source code are available at https://durr.jhu.edu/C3VD.
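The rotation error reported above is conventionally measured as the geodesic angle between an estimated and a ground-truth rotation matrix. A minimal sketch of that standard metric (the paper's exact evaluation code may differ):

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Geodesic distance between two 3x3 rotation matrices, in degrees:
    theta = arccos((trace(R_est^T R_gt) - 1) / 2)."""
    cos_theta = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
```

The clip guards against floating-point values just outside [-1, 1] that would make arccos return NaN.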
Affiliation(s)
- Taylor L Bobrow
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Mayank Golhar
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Rohan Vijayan
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Venkata S Akshintala
- Division of Gastroenterology and Hepatology, Johns Hopkins Medicine, Baltimore, MD 21287, USA
- Juan R Garcia
- Department of Art as Applied to Medicine, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Nicholas J Durr
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
4
Shi H, Wang Z, Zhou Y, Li D, Yang X, Li Q. Bidirectional Semi-Supervised Dual-Branch CNN for Robust 3D Reconstruction of Stereo Endoscopic Images via Adaptive Cross and Parallel Supervisions. IEEE Trans Med Imaging 2023; 42:3269-3282. [PMID: 37227904] [DOI: 10.1109/tmi.2023.3279899]
Abstract
Semi-supervised learning via a teacher-student network can train a model effectively on a few labeled samples, enabling a student model to distill knowledge from the teacher's predictions on extra unlabeled data. However, such knowledge flow is typically unidirectional, leaving the accuracy vulnerable to the quality of the teacher model. In this paper, we pursue robust 3D reconstruction of stereo endoscopic images by proposing a novel form of bidirectional learning between two learners, each of which plays the roles of teacher and student concurrently. Specifically, we introduce two self-supervisions, i.e., Adaptive Cross Supervision (ACS) and Adaptive Parallel Supervision (APS), to learn a dual-branch convolutional neural network. The two branches predict two different disparity probability distributions for the same position and output their expectations as disparity values. The learned knowledge flows across branches along two directions: a cross direction (disparity guides distribution in ACS) and a parallel direction (disparity guides disparity in APS). Moreover, each branch also learns confidences to dynamically refine the supervisions it provides. In ACS, the predicted disparity is softened into a unimodal distribution; the lower the confidence, the smoother the distribution. In APS, incorrect predictions are suppressed by lowering the weights of those with low confidence. With this adaptive bidirectional learning, the two branches enjoy well-tuned mutual supervisions and eventually converge on a consistent and more accurate disparity estimate. Experimental results on four public datasets demonstrate superior accuracy over other state-of-the-art methods, with a relative decrease in average disparity error of at least 9.76%.
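The "expectation of a disparity probability distribution" used by both branches is the familiar soft-argmax over disparity candidates. A minimal numpy sketch (candidate disparities 0..D-1 along the last axis are an assumption of this illustration):

```python
import numpy as np

def soft_argmax_disparity(logits):
    """Disparity as the expectation E[d] of a per-position softmax
    distribution over D candidate disparities (soft-argmax)."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    p = e / e.sum(axis=-1, keepdims=True)                    # probability over candidates
    d = np.arange(logits.shape[-1], dtype=np.float64)        # candidate disparity values
    return (p * d).sum(axis=-1)                              # expectation E[d]
```

Unlike a hard argmax, the expectation is differentiable and yields sub-pixel disparity values, which is why stereo networks commonly regress it.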
5
Wang X, Nie Y, Ren W, Wei M, Zhang J. Multi-scale, multi-dimensional binocular endoscopic image depth estimation network. Comput Biol Med 2023; 164:107305. [PMID: 37597409] [DOI: 10.1016/j.compbiomed.2023.107305]
Abstract
During invasive surgery, the use of deep learning techniques to acquire depth information from lesion sites in real time is hindered by the lack of endoscopic environment datasets. This work aims to develop a high-accuracy three-dimensional (3D) simulation model for generating image datasets and acquiring depth information in real time. Here, we propose an end-to-end multi-scale supervisory depth estimation network (MMDENet) for the depth estimation of binocular image pairs. MMDENet highlights a multi-scale feature extraction module that incorporates contextual information to enhance correspondence precision in poorly exposed regions, and a multi-dimensional information-guidance refinement module that refines the initial coarse disparity map. Experiments demonstrated a 3.14% reduction in endpoint error compared to state-of-the-art methods, with a processing speed of approximately 30 fps that satisfies real-time operation requirements. To validate the trained MMDENet on actual endoscopic images, we conducted both qualitative and quantitative analyses, achieving 93.38% precision, which holds great promise for applications in surgical navigation.
Affiliation(s)
- Xiongzhi Wang
- School of Future Technology, University of Chinese Academy of Sciences, Beijing 100039, China; School of Aerospace Science and Technology, Xidian University, Xi'an 710071, China
- Yunfeng Nie
- Brussel Photonics, Department of Applied Physics and Photonics, Vrije Universiteit Brussel and Flanders Make, 1050 Brussels, Belgium
- Wenqi Ren
- State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
- Min Wei
- Department of Orthopedics, the Fourth Medical Center, Chinese PLA General Hospital, Beijing 100853, China
- Jingang Zhang
- School of Future Technology, University of Chinese Academy of Sciences, Beijing 100039, China; School of Aerospace Science and Technology, Xidian University, Xi'an 710071, China
6
van Bokhorst QNE, Houwen BBSL, Hazewinkel Y, Fockens P, Dekker E. Advances in artificial intelligence and computer science for computer-aided diagnosis of colorectal polyps: current status. Endosc Int Open 2023; 11:E752-E767. [PMID: 37593158] [PMCID: PMC10431975] [DOI: 10.1055/a-2098-1999]
Affiliation(s)
- Querijn N E van Bokhorst
- Department of Gastroenterology and Hepatology, Amsterdam University Medical Centers, location Academic Medical Center, Amsterdam, the Netherlands
- Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam, the Netherlands
- Britt B S L Houwen
- Department of Gastroenterology and Hepatology, Amsterdam University Medical Centers, location Academic Medical Center, Amsterdam, the Netherlands
- Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam, the Netherlands
- Yark Hazewinkel
- Department of Gastroenterology and Hepatology, Tergooi Medical Center, Hilversum, the Netherlands
- Paul Fockens
- Department of Gastroenterology and Hepatology, Amsterdam University Medical Centers, location Academic Medical Center, Amsterdam, the Netherlands
- Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam, the Netherlands
- Evelien Dekker
- Department of Gastroenterology and Hepatology, Amsterdam University Medical Centers, location Academic Medical Center, Amsterdam, the Netherlands
- Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam, the Netherlands
7
Mathew A, Magerand L, Trucco E, Manfredi L. Self-supervised monocular depth estimation for high field of view colonoscopy cameras. Front Robot AI 2023; 10:1212525. [PMID: 37559569] [PMCID: PMC10407791] [DOI: 10.3389/frobt.2023.1212525]
Abstract
Optical colonoscopy is the gold standard procedure to detect colorectal cancer, the fourth most common cancer in the United Kingdom. Up to 22%-28% of polyps can be missed during the procedure, and missed polyps are associated with interval cancer. A vision-based autonomous soft endorobot for colonoscopy can drastically improve the accuracy of the procedure by inspecting the colon more systematically with reduced discomfort. A three-dimensional understanding of the environment is essential for robot navigation and can also improve the adenoma detection rate. Monocular depth estimation with deep learning methods has progressed substantially, but collecting ground-truth depth maps remains a challenge because no 3D camera can be fitted to a standard colonoscope. This work addresses the issue with a self-supervised monocular depth estimation model that learns depth directly from video sequences via view synthesis. In addition, our model accommodates the wide field-of-view cameras typically used in colonoscopy and specific challenges such as deformable surfaces, specular lighting, non-Lambertian surfaces, and high occlusion. We performed qualitative analysis on a synthetic dataset, quantitative evaluation of the trained model, and near-real-time testing on real colonoscopy videos.
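The view-synthesis supervision described above hinges on reprojecting target-frame pixels into a source frame using the predicted depth and relative pose. A pinhole-camera sketch of that geometric step (the wide-FoV cameras the paper targets would need a different projection model; K, T, and the dense-grid formulation are illustrative assumptions):

```python
import numpy as np

def reproject(depth, K, T):
    """Back-project target pixels with predicted depth, transform by the
    relative pose T (4x4), and project into the source camera (3x3 K).
    Returns the (2, H, W) source-frame pixel coordinates used to warp
    the source image and form the photometric self-supervision loss."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], 0).reshape(3, -1).astype(float)
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)      # 3-D points, target frame
    src = T @ np.vstack([cam, np.ones((1, cam.shape[1]))])   # into source frame
    uv = K @ src[:3]
    uv = uv[:2] / np.clip(uv[2:3], 1e-8, None)               # perspective divide
    return uv.reshape(2, h, w)
```

With identity pose and intrinsics the reprojection is the identity map, which is a handy sanity check when wiring up such a loss.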
Affiliation(s)
- Alwyn Mathew
- Division of Imaging Science and Technology, School of Medicine, University of Dundee, Dundee, United Kingdom
- Ludovic Magerand
- Discipline of Computing, School of Science and Engineering, University of Dundee, Dundee, United Kingdom
- Emanuele Trucco
- Discipline of Computing, School of Science and Engineering, University of Dundee, Dundee, United Kingdom
- Luigi Manfredi
- Division of Imaging Science and Technology, School of Medicine, University of Dundee, Dundee, United Kingdom
8
Liu Y, Zuo S. Self-supervised monocular depth estimation for gastrointestinal endoscopy. Comput Methods Programs Biomed 2023; 238:107619. [PMID: 37235969] [DOI: 10.1016/j.cmpb.2023.107619]
Abstract
BACKGROUND AND OBJECTIVE Gastrointestinal (GI) endoscopy is a promising tool for GI cancer screening. However, the limited field of view and the uneven skill of endoscopists make it difficult to accurately identify polyps and follow up precancerous lesions under endoscopy. Estimating depth from GI endoscopic sequences is essential for a range of AI-assisted surgical techniques, yet it remains challenging because of the particularity of the environment and the limitations of available datasets. In this paper, we propose a self-supervised monocular depth estimation method for GI endoscopy. METHODS A depth estimation network and a camera ego-motion estimation network first obtain the depth and pose information of the sequence, respectively; the model is then trained in a self-supervised manner by computing the multi-scale structural similarity with L1 norm (MS-SSIM+L1) loss between the target frame and the reconstructed image as part of the training loss. The MS-SSIM+L1 loss preserves high-frequency information and maintains invariance to brightness and colour. Our model consists of a U-shaped convolutional network with a dual-attention mechanism, which captures multi-scale contextual information and greatly improves the accuracy of depth estimation. We evaluated our method qualitatively and quantitatively against different state-of-the-art methods. RESULTS AND CONCLUSIONS The experimental results show that our method generalises well, achieving lower error metrics and higher accuracy metrics on both the UCL and EndoSLAM datasets. The method has also been validated on clinical GI endoscopy, demonstrating its potential clinical value.
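The core of the MS-SSIM+L1 idea is weighting a structural-similarity term against a plain L1 term. A sketch using a single-window SSIM (the paper uses the windowed, multi-scale variant; the weight alpha=0.85 and inputs in [0, 1] are assumptions of this illustration):

```python
import numpy as np

def ssim(x, y, c1=0.01**2, c2=0.03**2):
    """Global (single-window) SSIM between two images in [0, 1] --
    a simplification of the windowed, multi-scale version."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def ssim_l1_loss(x, y, alpha=0.85):
    """Weighted sum of an SSIM dissimilarity term and an L1 term,
    the common shape of SSIM+L1 reconstruction losses."""
    return alpha * (1.0 - ssim(x, y)) / 2.0 + (1.0 - alpha) * np.abs(x - y).mean()
```

The SSIM term compares local structure rather than raw intensities, which is what gives the loss its tolerance to brightness and colour shifts between frames.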
Affiliation(s)
- Yuying Liu
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
- Siyang Zuo
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
9
Horovistiz A, Oliveira M, Araújo H. Computer vision-based solutions to overcome the limitations of wireless capsule endoscopy. J Med Eng Technol 2023; 47:242-261. [PMID: 38231042] [DOI: 10.1080/03091902.2024.2302025]
Abstract
Endoscopic investigation plays a critical role in the diagnosis of gastrointestinal (GI) diseases. Since 2001, Wireless Capsule Endoscopy (WCE) has been available for small bowel exploration and is in continuous development. Over the last decade, WCE has achieved impressive improvements in areas such as miniaturisation, image quality and battery life. As a result, WCE is currently a very useful alternative to wired enteroscopy in the investigation of various small bowel abnormalities and has the potential to become the leading screening technique for the entire gastrointestinal tract. However, commercial solutions still have several limitations, namely incomplete examination and limited diagnostic capacity. These deficiencies are related to technical issues, such as image quality, motion estimation and power consumption management. Computational methods, based on image processing and analysis, can help to overcome these challenges and reduce both the time required by reviewers and human interpretation errors. Research groups have proposed a series of methods, including algorithms for locating the capsule or lesion, assessing intestinal motility and improving image quality. In this work, we provide a critical review of computational vision-based methods for WCE image analysis aimed at overcoming the technological challenges of capsules. This article also reviews several representative public datasets used to evaluate the performance of WCE techniques and methods. Finally, some promising computational methods based on the analysis of multiple-camera endoscopic images are presented.
Affiliation(s)
- Ana Horovistiz
- Institute of Systems and Robotics, University of Coimbra, Coimbra, Portugal
- Marina Oliveira
- Institute of Systems and Robotics, University of Coimbra, Coimbra, Portugal
- Department of Electrical and Computer Engineering (DEEC), Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
- Helder Araújo
- Institute of Systems and Robotics, University of Coimbra, Coimbra, Portugal
- Department of Electrical and Computer Engineering (DEEC), Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
10
Gao C, Killeen BD, Hu Y, Grupp RB, Taylor RH, Armand M, Unberath M. Synthetic data accelerates the development of generalizable learning-based algorithms for X-ray image analysis. Nat Mach Intell 2023; 5:294-308. [PMID: 38523605] [PMCID: PMC10959504] [DOI: 10.1038/s42256-023-00629-1]
Abstract
Artificial intelligence (AI) now enables automated interpretation of medical images. However, AI's potential use for interventional image analysis remains largely untapped. This is because the post hoc analysis of data collected during live procedures has fundamental and practical limitations, including ethical considerations, expense, scalability, data integrity and a lack of ground truth. Here we demonstrate that creating realistic simulated images from human models is a viable alternative and complement to large-scale in situ data collection. We show that training AI image analysis models on realistically synthesized data, combined with contemporary domain generalization techniques, results in machine learning models that on real data perform comparably to models trained on a precisely matched real data training set. We find that our model transfer paradigm for X-ray image analysis, which we refer to as SyntheX, can even outperform real-data-trained models due to the effectiveness of training on a larger dataset. SyntheX provides an opportunity to markedly accelerate the conception, design and evaluation of X-ray-based intelligent systems. In addition, SyntheX provides the opportunity to test novel instrumentation, design complementary surgical approaches, and envision novel techniques that improve outcomes, save time or mitigate human error, free from the ethical and practical considerations of live human data collection.
Affiliation(s)
- Cong Gao
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Benjamin D. Killeen
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Yicheng Hu
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Robert B. Grupp
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Russell H. Taylor
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Mehran Armand
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Orthopaedic Surgery, Johns Hopkins Applied Physics Laboratory, Baltimore, MD, USA
- Mathias Unberath
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
11
Somers P, Holdenried-Krafft S, Zahn J, Schüle J, Veil C, Harland N, Walz S, Stenzl A, Sawodny O, Tarín C, Lensch HPA. Cystoscopic depth estimation using gated adversarial domain adaptation. Biomed Eng Lett 2023; 13:141-151. [PMID: 37124116] [PMCID: PMC10130294] [DOI: 10.1007/s13534-023-00261-3]
Abstract
Monocular depth estimation from camera images is very important for evaluating the surrounding scene in many technical fields, from automotive to medicine. However, traditional triangulation methods using stereo cameras or multiple views under a rigid-environment assumption are not applicable in endoscopic domains. In cystoscopy in particular, it is not possible to produce ground-truth depth information for directly training machine learning algorithms to predict depth from a monocular image. This work first creates a synthetic cystoscopic environment to encode depth information from synthetically rendered images. The task of predicting pixel-wise depth values for real images is then cast as a domain adaptation between the synthetic and real image domains. This adaptation is performed through added gated residual blocks, which simplify the network task and maintain training stability during adversarial training. Training is done on an internally collected cystoscopy dataset from human patients. The trained model predicts reasonable depth estimates from actual cystoscopic videos, and the added stability from the gated residual blocks is shown to prevent mode collapse during adversarial training.
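The gated residual idea can be sketched as y = x + g * f(x), where a sigmoid gate g scales how much the residual branch contributes and can shut it off entirely. A toy numpy version (the layer shapes, tanh branch, and dense weights are assumptions; the paper's blocks are convolutional):

```python
import numpy as np

def gated_residual(x, w_res, w_gate):
    """y = x + g * f(x): a sigmoid gate g in (0, 1) scales the residual
    branch f, letting the network fall back to the identity mapping,
    which helps stabilise adversarial training."""
    f = np.tanh(x @ w_res)                     # residual branch
    g = 1.0 / (1.0 + np.exp(-(x @ w_gate)))    # sigmoid gate
    return x + g * f
```

With zero residual weights the branch outputs zero and the block reduces to the identity, the safe starting point that makes such blocks easy to insert into a pretrained network.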
12
Yin TK, Huang KL, Chiu SR, Yang YQ, Chang BR. Endoscopy Artefact Detection by Deep Transfer Learning of Baseline Models. J Digit Imaging 2022; 35:1101-1110. [PMID: 35478060] [PMCID: PMC9582060] [DOI: 10.1007/s10278-022-00627-6]
Abstract
In endoscopy, a long, thin tube with a light source and a camera at its tip is inserted into the body to obtain video frames of organs and visualise tumours on a screen. However, multiple artefacts in these video frames cause difficulty during the diagnosis of cancers. In this research, deep learning was applied to detect eight kinds of artefacts: specularity, bubbles, saturation, contrast, blood, instrument, blur, and imaging artefacts. Based on transfer learning with pre-trained parameters and fine-tuning, two state-of-the-art methods were applied for detection: faster region-based convolutional neural networks (Faster R-CNN) and EfficientDet. Experiments were implemented on the grand challenge dataset Endoscopy Artefact Detection and Segmentation (EAD2020). To validate our approach, we used phase I (2,200 frames) of the original training dataset with ground-truth annotations for training and phase II (331 frames) for testing. Among the tested methods, EfficientDet-D2 achieves a score of 0.2008 (mAPd × 0.6 + mIoUd × 0.4) on the dataset, better than three other baselines (Faster R-CNN, YOLOv3, and RetinaNet) and competitive with the best non-baseline result of 0.25123 on the leaderboard, although our testing was on the 331 frames of phase II instead of the original 200 testing frames. Without improvement techniques beyond basic neural networks, such as test-time augmentation, we showed that a simple baseline can achieve state-of-the-art performance in detecting artefacts in endoscopy. In conclusion, we propose combining EfficientDet-D2 with suitable data augmentation and pre-trained parameters during fine-tuning to detect artefacts in endoscopy.
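The mIoUd component of the challenge score builds on per-box intersection over union. A minimal sketch for axis-aligned boxes (the challenge's exact matching and averaging rules are not reproduced here):

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0
```

IoU is also the overlap criterion that decides whether a detection counts as a true positive when computing the mAPd part of the score.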
Affiliation(s)
- Tang-Kai Yin
- Department of Computer Science and Information Engineering, National University of Kaohsiung, No. 700, Kaohsiung University Rd., Nan-Tzu Dist., 811, Kaohsiung, Taiwan.
- Kai-Lun Huang
- Department of Computer Science and Information Engineering, National University of Kaohsiung, No. 700, Kaohsiung University Rd., Nan-Tzu Dist., 811, Kaohsiung, Taiwan
- Si-Rong Chiu
- Department of Computer Science and Information Engineering, National University of Kaohsiung, No. 700, Kaohsiung University Rd., Nan-Tzu Dist., 811, Kaohsiung, Taiwan
- Yu-Qi Yang
- Department of Computer Science and Information Engineering, National University of Kaohsiung, No. 700, Kaohsiung University Rd., Nan-Tzu Dist., 811, Kaohsiung, Taiwan
- Bao-Rong Chang
- Department of Computer Science and Information Engineering, National University of Kaohsiung, No. 700, Kaohsiung University Rd., Nan-Tzu Dist., 811, Kaohsiung, Taiwan
13
Xu Q, Wang Z, Plewczynski D. A Data-Driven Model for Automated Chinese Word Segmentation and POS Tagging. Computational Intelligence and Neuroscience 2022; 2022:1-10. [PMID: 36156940] [PMCID: PMC9507729] [DOI: 10.1155/2022/7622392]
Abstract
Chinese natural language processing tasks often require solving Chinese word segmentation and POS tagging problems. Traditional Chinese word segmentation and POS tagging methods mainly use simple matching algorithms based on lexicons and rules. Such simple matching or statistical analysis requires manual word segmentation followed by POS tagging, and cannot meet practical requirements for label prediction accuracy. With the continuous development of deep learning technology, data-driven machine learning models provide new opportunities for automated Chinese word segmentation and POS tagging. Therefore, a data-driven automated Chinese word segmentation and POS tagging model is proposed to address these problems. Firstly, the main idea and overall framework of the proposed model are outlined, and the tagging strategy and neural network language model used are described. Secondly, two main optimisations are made on the input side of the model: (1) the use of word2vec to represent text features as distributed word vectors; and (2) the use of an improved AlexNet for efficient encoding of long-range words, with an attention mechanism added to the model. Finally, on the output side, an additional auxiliary loss function is designed to optimise the Chinese text based on its frequency. The experimental results show that the proposed model can significantly improve the accuracy and operational efficiency of Chinese word segmentation and POS tagging compared with other existing models, verifying its effectiveness and advancement.
14
Yang B, Xu S, Chen H, Zheng W, Liu C. Reconstruct Dynamic Soft-Tissue With Stereo Endoscope Based on a Single-Layer Network. IEEE Trans Image Process 2022; 31:5828-5840. [PMID: 36054398] [DOI: 10.1109/tip.2022.3202367]
Abstract
In dynamic minimally invasive surgery environments, 3D reconstruction of deformable soft-tissue surfaces with stereo endoscopic images is very challenging. A simple self-supervised stereo reconstruction framework is proposed to address this issue, which bridges the traditional geometric deformable models and the newly revived neural networks. The equivalence between the classical thin plate spline (TPS) model and a single-layer fully-connected or convolutional network is studied. By alternating training of two TPS equivalent networks within the self-supervised framework, disparity priors are learnt from the past stereo frames of target tissues to form an optimized disparity basis, on which disparity maps of subsequent frames can be estimated more accurately without sacrificing computational efficiency and robustness. The proposed method was verified on stereo-endoscopic videos recorded by the da Vinci® surgical robots.
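The classical thin plate spline (TPS) model that the paper relates to a single-layer network interpolates scattered values (e.g. disparities at feature points) by solving one linear system. A generic 2D TPS sketch, not the authors' network code; all names and values are ours:

```python
import numpy as np

def tps_fit(ctrl: np.ndarray, vals: np.ndarray):
    """Fit an interpolating 2D thin plate spline through (ctrl, vals).

    ctrl: (n, 2) control-point coordinates; vals: (n,) values (e.g. disparities).
    Returns radial weights w and affine coefficients a = [a0, ax, ay].
    """
    n = len(ctrl)
    r = np.linalg.norm(ctrl[:, None, :] - ctrl[None, :, :], axis=-1)
    # TPS kernel U(r) = r^2 log r; the r^2 factor makes U(0) = 0.
    K = r**2 * np.log(np.maximum(r, 1e-300))
    P = np.hstack([np.ones((n, 1)), ctrl])            # affine part [1, x, y]
    A = np.block([[K, P], [P.T, np.zeros((3, 3))]])   # standard TPS system
    b = np.concatenate([vals, np.zeros(3)])
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n:]

def tps_eval(pts, ctrl, w, a):
    """Evaluate the fitted spline at query points pts (m, 2)."""
    r = np.linalg.norm(pts[:, None, :] - ctrl[None, :, :], axis=-1)
    U = r**2 * np.log(np.maximum(r, 1e-300))
    return U @ w + a[0] + pts @ a[1:]

ctrl = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
vals = np.array([1.0, 2.0, 3.0, 4.0, 2.5])
w, a = tps_fit(ctrl, vals)
# An interpolating TPS reproduces the control values exactly:
print(np.allclose(tps_eval(ctrl, ctrl, w, a), vals))  # True
```

Since evaluation is a weighted sum of fixed basis functions plus an affine term, it maps directly onto a single fully-connected layer, which is the equivalence the paper exploits.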
15
Oda M, Itoh H, Tanaka K, Takabatake H, Mori M, Natori H, Mori K. Depth estimation from single-shot monocular endoscope image using image domain adaptation and edge-aware depth estimation. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 2022. [DOI: 10.1080/21681163.2021.2012835]
Affiliation(s)
- Masahiro Oda
- Information and Communications, Nagoya University, Nagoya, Japan
- Graduate School of Informatics, Nagoya University, Nagoya, Japan
- Hayato Itoh
- Graduate School of Informatics, Nagoya University, Nagoya, Japan
- Kiyohito Tanaka
- Department of Gastroenterology, Kyoto Second Red Cross Hospital, Kyoto, Japan
- Hirotsugu Takabatake
- Department of Respiratory Medicine, Sapporo-Minami-Sanjo Hospital, Sapporo, Japan
- Masaki Mori
- Department of Respiratory Medicine, Sapporo-Kosei General Hospital, Sapporo, Japan
- Hiroshi Natori
- Department of Respiratory Medicine, Keiwakai Nishioka Hospital, Sapporo, Japan
- Kensaku Mori
- Information and Communications, Nagoya University, Nagoya, Japan
- Graduate School of Informatics, Nagoya University, Nagoya, Japan
- Research Center for Medical Bigdata, National Institute of Informatics, Tokyo, Japan
16
Liu S, Fan J, Song D, Fu T, Lin Y, Xiao D, Song H, Wang Y, Yang J. Joint estimation of depth and motion from a monocular endoscopy image sequence using a multi-loss rebalancing network. Biomed Opt Express 2022; 13:2707-2727. [PMID: 35774318] [PMCID: PMC9203100] [DOI: 10.1364/boe.457475]
Abstract
Building an in vivo three-dimensional (3D) surface model from monocular endoscopy is an effective technology for improving the intuitiveness and precision of clinical laparoscopic surgery. This paper proposes a multi-loss rebalancing-based method for joint estimation of depth and motion from a monocular endoscopy image sequence. Feature descriptors are used to provide supervision signals for the depth estimation network and the motion estimation network. The epipolar constraints of sequential frames are incorporated into the neighbourhood spatial information by the depth estimation network to enhance the accuracy of depth estimation. The reprojection information from depth estimation is used by the motion estimation network, with a multi-view relative pose fusion mechanism, to reconstruct the camera motion. Relative response loss, feature consistency loss, and epipolar consistency loss functions are defined to improve the robustness and accuracy of the proposed unsupervised learning-based method. Evaluations are implemented on public datasets. The error of motion estimation in three scenes decreased by 42.1%, 53.6%, and 50.2%, respectively, and the average error of 3D reconstruction is 6.456 ± 1.798 mm. This demonstrates the method's capability to generate reliable depth estimation and trajectory reconstruction results for endoscopy images, with meaningful applications in clinical practice.
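An epipolar consistency loss of the kind mentioned above penalizes matched points that violate the epipolar constraint x2ᵀ F x1 = 0. A toy sketch with a known fundamental matrix (rectified pair, pure horizontal translation, so epipolar lines are image rows); the helper names and point values are ours:

```python
import numpy as np

def epipolar_residuals(F, pts1, pts2):
    """Algebraic epipolar residuals x2^T F x1 for matched pixel pairs (n, 2)."""
    h1 = np.hstack([pts1, np.ones((len(pts1), 1))])  # homogeneous coordinates
    h2 = np.hstack([pts2, np.ones((len(pts2), 1))])
    return np.einsum('ni,ij,nj->n', h2, F, h1)

# Fundamental matrix of a rectified stereo pair (camera translated along x):
# residual reduces to v1 - v2, i.e. matches must share the same image row.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

pts1 = np.array([[10.0, 5.0], [40.0, 22.0]])
good = pts1 - [3.0, 0.0]   # shifted along the row: satisfies the constraint
bad = pts1 - [3.0, 2.0]    # shifted off the row: violates it by 2 pixels

print(epipolar_residuals(F, pts1, good))         # [0. 0.]
print(np.abs(epipolar_residuals(F, pts1, bad)))  # [2. 2.]
```

In a training loop such residuals (or a robust function of them) would be averaged over all matches to form the loss term.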
Affiliation(s)
- Shiyuan Liu
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Jingfan Fan
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Dengpan Song
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Tianyu Fu
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Yucong Lin
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Deqiang Xiao
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Hong Song
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
- Yongtian Wang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Jian Yang
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
17
Abstract
During the past decades, many automated image analysis methods have been developed for colonoscopy. Real-time implementation of the most promising methods during colonoscopy has been tested in clinical trials, including several recent multi-center studies. All trials have shown results that may contribute to prevention of colorectal cancer. We summarize the past and present development of colonoscopy video analysis methods, focusing on two categories of artificial intelligence (AI) technologies used in clinical trials. These are (1) analysis and feedback for improving colonoscopy quality and (2) detection of abnormalities. Our survey includes methods that use traditional machine learning algorithms on carefully designed hand-crafted features as well as recent deep-learning methods. Lastly, we present the gap between current state-of-the-art technology and desirable clinical features and conclude with future directions of endoscopic AI technology development that will bridge the current gap.
18
Pan H, Cai M, Liao Q, Jiang Y, Liu Y, Zhuang X, Yu Y. Artificial Intelligence-Aid Colonoscopy Vs. Conventional Colonoscopy for Polyp and Adenoma Detection: A Systematic Review of 7 Discordant Meta-Analyses. Front Med (Lausanne) 2022; 8:775604. [PMID: 35096870] [PMCID: PMC8792899] [DOI: 10.3389/fmed.2021.775604]
Abstract
Objectives: Multiple meta-analyses investigating the comparative efficacy and safety of artificial intelligence (AI)-aided colonoscopy (AIC) vs. conventional colonoscopy (CC) in the detection of polyps and adenomas have been published, but a definitive conclusion has not yet been reached. This systematic review selected among discordant meta-analyses to draw a definitive conclusion about whether AIC is better than CC for the detection of polyps and adenomas. Methods: We comprehensively searched potentially eligible literature in the PubMed, Embase, Cochrane Library, and China National Knowledge Infrastructure (CNKI) databases from their inceptions until April 2021. The Assessment of Multiple Systematic Reviews (AMSTAR) instrument was used to assess methodological quality, and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist to assess reporting quality. Two investigators independently used the Jadad decision algorithm to select high-quality meta-analyses summarizing the best available evidence. Results: Seven meta-analyses met our selection criteria. AMSTAR scores ranged from 8 to 10, and PRISMA scores from 23 to 26. According to the Jadad decision algorithm, two high-quality meta-analyses were selected. These two meta-analyses suggested that AIC was superior to CC for colonoscopy outcomes, especially polyp detection rate (PDR) and adenoma detection rate (ADR). Conclusion: Based on the best available evidence, we conclude that AIC should be preferentially selected for routine screening of colorectal lesions because it has the potential to increase polyp and adenoma detection. However, continued improvement of AIC in differentiating the shape and pathology of colorectal lesions is needed.
Affiliation(s)
- Hui Pan
- Department of Endoscopy, Shanghai Jiangong Hospital, Shanghai, China
- Mingyan Cai
- Endoscopy Center, Zhongshan Hospital, Fudan University, Shanghai, China
- Qi Liao
- Department of Gastroenterology, Shanghai Jiangong Hospital, Shanghai, China
- Yong Jiang
- Department of Surgery, Shanghai Jiangong Hospital, Shanghai, China
- Yige Liu
- Department of Endoscopy, Shanghai Jiangong Hospital, Shanghai, China
- Xiaolong Zhuang
- Department of Endoscopy, Shanghai Jiangong Hospital, Shanghai, China
- Ying Yu
- Department of Endoscopy, Shanghai Jiangong Hospital, Shanghai, China
19
Marzullo A, Moccia S, Calimeri F, De Momi E. AIM in Endoscopy Procedures. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_164]
20
Shao S, Pei Z, Chen W, Zhu W, Wu X, Sun D, Zhang B. Self-supervised monocular depth and ego-motion estimation in endoscopy: Appearance flow to the rescue. Med Image Anal 2021; 77:102338. [PMID: 35016079] [DOI: 10.1016/j.media.2021.102338]
Abstract
Recently, self-supervised learning technology has been applied to calculate depth and ego-motion from monocular videos, achieving remarkable performance in autonomous driving scenarios. One widely adopted assumption of depth and ego-motion self-supervised learning is that the image brightness remains constant within nearby frames. Unfortunately, the endoscopic scene does not meet this assumption because there are severe brightness fluctuations induced by illumination variations, non-Lambertian reflections and interreflections during data collection, and these brightness fluctuations inevitably deteriorate the depth and ego-motion estimation accuracy. In this work, we introduce a novel concept referred to as appearance flow to address the brightness inconsistency problem. The appearance flow takes into consideration any variations in the brightness pattern and enables us to develop a generalized dynamic image constraint. Furthermore, we build a unified self-supervised framework to estimate monocular depth and ego-motion simultaneously in endoscopic scenes, which comprises a structure module, a motion module, an appearance module and a correspondence module, to accurately reconstruct the appearance and calibrate the image brightness. Extensive experiments are conducted on the SCARED dataset and EndoSLAM dataset, and the proposed unified framework exceeds other self-supervised approaches by a large margin. To validate our framework's generalization ability on different patients and cameras, we train our model on SCARED but test it on the SERV-CT and Hamlyn datasets without any fine-tuning, and the superior results reveal its strong generalization ability. Code is available at: https://github.com/ShuweiShao/AF-SfMLearner.
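The brightness-constancy failure the authors address can be seen in a toy photometric loss: if the target frame is a brightness-shifted copy of the warped source, the plain loss is large, while subtracting a predicted per-pixel brightness change (the role the appearance flow plays) restores it. A schematic numpy sketch under that idealized setup, not the authors' implementation:

```python
import numpy as np

def photometric_l1(target, warped, appearance_flow=None):
    """Mean L1 photometric loss; appearance_flow models per-pixel brightness change."""
    if appearance_flow is not None:
        warped = warped + appearance_flow   # generalized dynamic image constraint
    return np.abs(target - warped).mean()

rng = np.random.default_rng(0)
warped_source = rng.uniform(0.2, 0.8, size=(8, 8))   # stand-in for a warped frame
target = warped_source + 0.15                        # global brightness fluctuation

plain = photometric_l1(target, warped_source)
flow = np.full((8, 8), 0.15)          # the ideal appearance flow for this toy case
corrected = photometric_l1(target, warped_source, flow)

print(round(plain, 3))      # 0.15  -- brightness change mistaken for geometry error
print(round(corrected, 3))  # 0.0   -- residual left for the structure/motion modules
```

In the real framework the appearance flow is predicted by a network per pixel rather than known, but the effect on the supervision signal is the same.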
Affiliation(s)
- Shuwei Shao
- School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
- Zhongcai Pei
- School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; Hangzhou Innovation Institute, Beihang University, Hangzhou, China
- Weihai Chen
- School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; Hangzhou Innovation Institute, Beihang University, Hangzhou, China
- Xingming Wu
- School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
- Dianmin Sun
- Shandong Cancer Hospital Affiliated to Shandong University, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Baochang Zhang
- Institute of Artificial Intelligence, Beihang University, Beijing, China
21
Li Y, Konuthula N, Humphreys IM, Moe K, Hannaford B, Bly R. Real-time virtual intraoperative CT in endoscopic sinus surgery. Int J Comput Assist Radiol Surg 2021; 17:249-260. [PMID: 34888754] [DOI: 10.1007/s11548-021-02536-5]
Abstract
PURPOSE Endoscopic sinus surgery (ESS) is typically guided by preoperative computed tomography (CT), which increasingly diverges from the actual patient anatomy as the surgery progresses. Studies have reported that the revision surgery rate in ESS ranges between 28 and 47%. This paper presents a method that can update the preoperative CT in real time to improve surgical completeness in ESS. APPROACH The work presents and compares three novel methods that use instrument motion data and anatomical structures to predict surgical modifications in real time. The methods use learning techniques, such as nonparametric filtering and Gaussian process regression, to correlate surgical modifications with instrument tip positions, tip trajectories, and instrument shapes. Preoperative CT image sets are updated with modification predictions to serve as a virtual intraoperative CT. RESULTS The three methods were compared in eight ESS cadaver cases, which were performed by five surgeons and included the following representative ESS operations: maxillary antrostomy, uncinectomy, anterior and posterior ethmoidectomy, and sphenoidotomy. Experimental results showed clinically acceptable accuracy, with Dice similarity coefficients > 86%, F-scores > 92%, and precision > 89.91% in surgical completeness evaluation. Among the three methods, the tip trajectory-based estimator had the highest precision of 96.87%. CONCLUSIONS This work demonstrated that virtually modified intraoperative CT scans improved the consistency between the actual surgical scene and the reference model, and could lead to improved surgical completeness in ESS. Compared to actual intraoperative CT scans, the proposed method has no impact on existing surgical protocols, does not require extra hardware, does not expose the patient to radiation, and does not lengthen time under anesthesia.
Affiliation(s)
- Yangming Li
- RoCALab, Rochester Institute of Technology, Rochester, 14623, USA.
- Neeraja Konuthula
- Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, 98195, USA
- Ian M Humphreys
- Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, 98195, USA
- Kris Moe
- Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, 98195, USA
- Blake Hannaford
- BioRobotics Lab, University of Washington, Seattle, 98195, USA
- Randall Bly
- Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, 98195, USA; Seattle Children's Hospital, Seattle, 98105, USA
22
Edwards PJE, Psychogyios D, Speidel S, Maier-Hein L, Stoyanov D. SERV-CT: A disparity dataset from cone-beam CT for validation of endoscopic 3D reconstruction. Med Image Anal 2021; 76:102302. [PMID: 34906918] [PMCID: PMC8961000] [DOI: 10.1016/j.media.2021.102302]
Abstract
Highlights:
- Full torso porcine CT model for stereo-endoscopic reconstruction validation
- CT of endoscope and anatomy with constrained manual alignment provides a reference
- Accuracy analysis of repeated alignments and performance of existing algorithms presented
- Open-sourced dataset for stereo reconstruction validation
In computer vision, reference datasets from simulation and real outdoor scenes have been highly successful in promoting algorithmic development in stereo reconstruction. Endoscopic stereo reconstruction for surgical scenes gives rise to specific problems, including the lack of clear corner features, highly specular surface properties and the presence of blood and smoke. These issues present difficulties for both stereo reconstruction itself and also for standardised dataset production. Previous datasets have been produced using computed tomography (CT) or structured light reconstruction on phantom or ex vivo models. We present a stereo-endoscopic reconstruction validation dataset based on cone-beam CT (SERV-CT). Two ex vivo small porcine full torso cadavers were placed within the view of the endoscope with both the endoscope and target anatomy visible in the CT scan. Subsequent orientation of the endoscope was manually aligned to match the stereoscopic view and benchmark disparities, depths and occlusions are calculated. The requirement of a CT scan limited the number of stereo pairs to 8 from each ex vivo sample. For the second sample an RGB surface was acquired to aid alignment of smooth, featureless surfaces. Repeated manual alignments showed an RMS disparity accuracy of around 2 pixels and a depth accuracy of about 2 mm. A simplified reference dataset is provided consisting of endoscope image pairs with corresponding calibration, disparities, depths and occlusions covering the majority of the endoscopic image and a range of tissue types, including smooth specular surfaces, as well as significant variation of depth. We assessed the performance of various stereo algorithms from online available repositories. There is a significant variation between algorithms, highlighting some of the challenges of surgical endoscopic images. 
The SERV-CT dataset provides an easy to use stereoscopic validation for surgical applications with smooth reference disparities and depths covering the majority of the endoscopic image. This complements existing resources well and we hope will aid the development of surgical endoscopic anatomical reconstruction algorithms.
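For a rectified stereo endoscope like the one characterised above, reference disparities convert to depths via the standard relation Z = f·B/d (focal length f in pixels, baseline B). A small sketch; the camera values below are illustrative assumptions, not SERV-CT's calibration:

```python
import numpy as np

def disparity_to_depth(disp, focal_px, baseline_mm):
    """Depth in mm from disparity in pixels: Z = f * B / d (rectified stereo).

    Zero disparity marks occluded / unmatched pixels and maps to infinity.
    """
    disp = np.asarray(disp, dtype=float)
    depth = np.full_like(disp, np.inf)
    valid = disp > 0
    depth[valid] = focal_px * baseline_mm / disp[valid]
    return depth

f_px, baseline = 580.0, 4.0        # illustrative endoscope intrinsics (assumed)
d = np.array([58.0, 29.0, 0.0])    # disparities in pixels (0 = occluded/no match)
depths = disparity_to_depth(d, f_px, baseline)
print(depths)                      # 40 mm, 80 mm, inf (occluded)
```

Because Z is proportional to 1/d, a fixed disparity error such as the ~2-pixel RMS alignment accuracy quoted above corresponds to a depth error that grows roughly quadratically with depth.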
Affiliation(s)
- P J Eddie Edwards
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK.
- Dimitris Psychogyios
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
- Stefanie Speidel
- Division of Translational Surgical Oncology, National Center for Tumor Diseases (NCT) Dresden, Dresden, 01307, Germany
- Lena Maier-Hein
- Division of Medical and Biological Informatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
23
Mikamo M, Furukawa R, Oka S, Kotachi T, Okamoto Y, Tanaka S, Sagawa R, Kawasaki H. Active Stereo Method for 3D Endoscopes using Deep-layer GCN and Graph Representation with Proximity Information. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:7551-7555. [PMID: 34892838] [DOI: 10.1109/embc46164.2021.9629696]
Abstract
Techniques for 3D endoscopic systems have been widely studied for various reasons. Among them, active stereo systems, in which structured-light patterns are projected onto surfaces and endoscopic images of the pattern are analyzed to produce 3D depth images, are promising because of their robustness and simple system configuration. For those systems, finding correspondences between a projected pattern and the original pattern is an open problem. Recently, correspondence estimation by graph convolutional networks (GCNs) using graph-based representations of the patterns was proposed for 3D endoscopic systems. One severe problem of this approach is that graph matching by the GCN is strongly affected by the stability of the graph construction process applied to the patterns detected in a captured image: if the detected pattern is fragmented into small pieces, graph matching may fail and 3D shapes cannot be retrieved. In this paper, we propose a solution to these problems by applying a deep-layered GCN and extended graph representations of the patterns, to which proximity information is added. Experiments show that the proposed method outperformed the previous method in correspondence-matching accuracy for 3D reconstruction.
24
Widya AR, Monno Y, Okutomi M, Suzuki S, Gotoda T, Miki K. Learning-Based Depth and Pose Estimation for Monocular Endoscope with Loss Generalization. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:3547-3552. [PMID: 34892005] [DOI: 10.1109/embc46164.2021.9630156]
Abstract
Gastroendoscopy has been a clinical standard for diagnosing and treating conditions that affect part of a patient's digestive system, such as the stomach. Although gastroendoscopy has many advantages for patients, practitioners face challenges such as the lack of 3D perception, including depth and endoscope pose information. Such challenges make navigating the endoscope and localizing any found lesion in the digestive tract difficult. To tackle these problems, deep learning-based approaches have been proposed to provide monocular gastroendoscopy with additional yet important depth and pose information. In this paper, we propose a novel supervised approach to train depth and pose estimation networks using consecutive endoscopy images to assist endoscope navigation in the stomach. We first generate real depth and pose training data using our previously proposed whole-stomach 3D reconstruction pipeline, avoiding the poor generalization between computer-generated (CG) models and real stomach data. In addition, we propose a novel generalized photometric loss function that avoids the complicated process of finding proper weights for balancing the depth and pose loss terms, which is required by existing direct depth and pose supervision approaches. We then experimentally show that our proposed generalized loss performs better than existing direct supervision losses.
25
Deliwala SS, Hamid K, Barbarawi M, Lakshman H, Zayed Y, Kandel P, Malladi S, Singh A, Bachuwa G, Gurvits GE, Chawla S. Artificial intelligence (AI) real-time detection vs. routine colonoscopy for colorectal neoplasia: a meta-analysis and trial sequential analysis. Int J Colorectal Dis 2021; 36:2291-2303. [PMID: 33934173] [DOI: 10.1007/s00384-021-03929-3]
Abstract
GOALS AND BACKGROUND Studies analyzing artificial intelligence (AI) in colonoscopies have reported improvements in detecting colorectal cancer (CRC) lesions; however, its utility in the real world remains limited. In this systematic review and meta-analysis, we evaluate the efficacy of AI-assisted colonoscopies against routine colonoscopy (RC). STUDY We performed an extensive search of major databases (through January 2021) for randomized controlled trials (RCTs) reporting adenoma and polyp detection rates. Odds ratios (OR) and standardized mean differences (SMD) with 95% confidence intervals (CIs) were reported. Additionally, trial sequential analysis (TSA) was performed to guard against errors. RESULTS Six RCTs were included (4996 participants). The mean age (SD) was 51.99 (4.43) years, and 49% were female. Detection rates favored AI over RC for adenomas (OR 1.77; 95% CI: 1.57-2.08) and polyps (OR 1.91; 95% CI: 1.68-2.16). Secondary outcomes, including the mean number of adenomas (SMD 0.23; 95% CI: 0.18-0.29) and polyps (SMD 0.23; 95% CI: 0.17-0.29) detected per procedure, also favored AI. However, RC outperformed AI in detecting pedunculated polyps. Withdrawal times (WTs) favored AI when biopsies were included, while WTs without biopsies, cecal intubation times, and bowel preparation adequacy were similar. CONCLUSIONS Colonoscopies equipped with AI detection algorithms could significantly detect previously missed adenomas and polyps while retaining the ability to self-assess and improve periodically. More effective clearance of diminutive adenomas may allow lengthening of surveillance intervals, reducing the burden of surveillance colonoscopies and increasing accessibility for those at higher risk. TSA ruled out the risk of false-positive results and confirmed a sufficient sample size to detect the observed effect. Currently, these findings suggest that AI-assisted colonoscopy can serve as a useful proxy to address critical gaps in CRC identification.
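The pooled odds ratios and 95% CIs reported above come from standard 2×2-table arithmetic applied per study before inverse-variance pooling. A per-study sketch; the counts are made up for illustration and the function name is ours:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and 95% CI from a 2x2 table.

    a/b = events/non-events in the AI arm, c/d = the same in the control arm.
    The CI is built on the log scale: SE(log OR) = sqrt(1/a + 1/b + 1/c + 1/d).
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo, hi = (math.exp(math.log(or_) + s * z * se) for s in (-1, 1))
    return or_, lo, hi

# Hypothetical study: adenomas found in 120 of 500 AI procedures (380 without),
# vs 80 of 500 control procedures (420 without).
or_, lo, hi = odds_ratio_ci(120, 380, 80, 420)
print(round(or_, 2))  # 1.66
```

A meta-analysis like the one above would combine the per-study log-ORs weighted by the inverse of their variances; TSA then adjusts the significance boundaries for accumulating information.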
Affiliation(s)
- Smit S Deliwala
- Department of Internal Medicine, Michigan State University at Hurley Medical Center, Two Hurley Plaza, Ste 212, Flint, MI, 48503, USA
- Kewan Hamid
- Department of Internal Medicine/Pediatrics, Michigan State University at Hurley Medical Center, Flint, MI, USA
- Mahmoud Barbarawi
- Department of Internal Medicine, Michigan State University at Hurley Medical Center, Two Hurley Plaza, Ste 212, Flint, MI, 48503, USA
- Harini Lakshman
- Department of Internal Medicine, Michigan State University at Hurley Medical Center, Two Hurley Plaza, Ste 212, Flint, MI, 48503, USA
- Yazan Zayed
- Department of Internal Medicine, Michigan State University at Hurley Medical Center, Two Hurley Plaza, Ste 212, Flint, MI, 48503, USA
- Pujan Kandel
- Department of Internal Medicine, Michigan State University at Hurley Medical Center, Two Hurley Plaza, Ste 212, Flint, MI, 48503, USA
- Srikanth Malladi
- Department of Internal Medicine/Pediatrics, Michigan State University at Hurley Medical Center, Flint, MI, USA
- Adiraj Singh
- Department of Internal Medicine/Pediatrics, Michigan State University at Hurley Medical Center, Flint, MI, USA
- Ghassan Bachuwa
- Department of Internal Medicine, Michigan State University at Hurley Medical Center, Two Hurley Plaza, Ste 212, Flint, MI, 48503, USA
- Grigoriy E Gurvits
- Department of Internal Medicine - Division of Gastroenterology, New York University/Langone Medical Center, New York, NY, USA
- Saurabh Chawla
- Department of Internal Medicine - Division of Gastroenterology, Emory University, Atlanta, GA, USA
26
Yue Z, Ding S, Li X, Yang S, Zhang Y. Automatic Acetowhite Lesion Segmentation via Specular Reflection Removal and Deep Attention Network. IEEE J Biomed Health Inform 2021; 25:3529-3540. [PMID: 33684051 DOI: 10.1109/jbhi.2021.3064366]
Abstract
Automatic acetowhite lesion segmentation in colposcopy images (cervigrams) is essential in assisting gynecologists with the diagnosis of cervical intraepithelial neoplasia grades and cervical cancer. It can also help gynecologists determine the correct lesion areas for further pathological examination. Existing computer-aided diagnosis algorithms show poor segmentation performance because of specular reflections, insufficient training data, and the inability to focus on semantically meaningful lesion parts. In this paper, a novel computer-aided diagnosis algorithm is proposed to segment acetowhite lesions in cervigrams automatically. To reduce the interference of specularities on segmentation performance, a specular reflection removal mechanism is presented to detect and inpaint these areas with precision. Moreover, we design a cervigram image classification network to classify pathology results and generate lesion attention maps, which are subsequently leveraged to guide a more accurate lesion segmentation task by the proposed lesion-aware convolutional neural network. We conducted comprehensive experiments to evaluate the proposed approaches on 3045 clinical cervigrams. Our results show that our method outperforms state-of-the-art approaches and achieves better Dice similarity coefficient and Hausdorff distance values in acetowhite lesion segmentation.
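The Dice similarity coefficient used here to score segmentation overlap can be computed on binary masks; a minimal sketch with toy masks (not data from the paper):

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice similarity coefficient between two binary masks:
    2 * |P ∩ G| / (|P| + |G|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

# Toy 4x4 masks: predicted region of 4 px, ground truth of 6 px, overlap 4 px
pred = np.zeros((4, 4), dtype=bool); pred[1:3, 1:3] = True
gt   = np.zeros((4, 4), dtype=bool); gt[1:3, 1:4] = True
# dice = 2*4 / (4+6) = 0.8
```

Dice rewards overlap; the Hausdorff distance, also reported in this entry, instead measures the worst-case boundary disagreement, so the two metrics are complementary.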
27
Tian L, Hunt B, Bell MAL, Yi J, Smith JT, Ochoa M, Intes X, Durr NJ. Deep Learning in Biomedical Optics. Lasers Surg Med 2021; 53:748-775. [PMID: 34015146 PMCID: PMC8273152 DOI: 10.1002/lsm.23414]
Abstract
This article reviews deep learning applications in biomedical optics with a particular emphasis on image formation. The review is organized by imaging domains within biomedical optics and includes microscopy, fluorescence lifetime imaging, in vivo microscopy, widefield endoscopy, optical coherence tomography, photoacoustic imaging, diffuse tomography, and functional optical brain imaging. For each of these domains, we summarize how deep learning has been applied and highlight methods by which deep learning can enable new capabilities for optics in medicine. Challenges and opportunities to improve translation and adoption of deep learning in biomedical optics are also summarized.
Affiliation(s)
- L. Tian
- Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA
| | - B. Hunt
- Thayer School of Engineering, Dartmouth College, Hanover, NH, USA
| | - M. A. L. Bell
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - J. Yi
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Ophthalmology, Johns Hopkins University, Baltimore, MD, USA
| | - J. T. Smith
- Center for Modeling, Simulation, and Imaging in Medicine, Rensselaer Polytechnic Institute, Troy, New York NY 12180
| | - M. Ochoa
- Center for Modeling, Simulation, and Imaging in Medicine, Rensselaer Polytechnic Institute, Troy, New York NY 12180
| | - X. Intes
- Center for Modeling, Simulation, and Imaging in Medicine, Rensselaer Polytechnic Institute, Troy, New York NY 12180
| | - N. J. Durr
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
28
Yang Z, Simon R, Li Y, Linte CA. Dense Depth Estimation from Stereo Endoscopy Videos Using Unsupervised Optical Flow Methods. Med Image Underst Anal 2021; 12722:337-349. [PMID: 35610998 PMCID: PMC9125693 DOI: 10.1007/978-3-030-80432-9_26]
Abstract
In the context of Minimally Invasive Surgery, estimating depth from stereo endoscopy plays a crucial role in three-dimensional (3D) reconstruction, surgical navigation, and augmented reality (AR) visualization. However, the challenges associated with this task are three-fold: 1) feature-less surface representations, often polluted by artifacts, pose difficulty in identifying correspondence; 2) ground truth depth is difficult to estimate; and 3) endoscopy image acquisitions accompanied by accurately calibrated camera parameters are rare, as the camera is often adjusted during an intervention. To address these difficulties, we propose an unsupervised depth estimation framework (END-flow) based on an unsupervised optical flow network trained on un-rectified binocular videos without calibrated camera parameters. The proposed END-flow architecture is compared with traditional stereo matching, self-supervised depth estimation, unsupervised optical flow, and supervised methods implemented on the Stereo Correspondence and Reconstruction of Endoscopic Data (SCARED) Challenge dataset. Experimental results show that our method outperforms several state-of-the-art techniques and achieves performance close to that of supervised methods.
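For context, with a calibrated and rectified stereo rig, the textbook disparity-to-depth relation is Z = f * B / d; this entry's point is precisely that such calibration is often unavailable in endoscopy, but the baseline relation is a useful reference. The rig parameters below are hypothetical.

```python
import numpy as np

def disparity_to_depth(disp, f_px, baseline_m):
    """Depth (m) from disparity (px) for a rectified stereo pair: Z = f * B / d.
    Pixels with zero (or negative) disparity are mapped to infinity."""
    disp = np.asarray(disp, dtype=float)
    depth = np.full_like(disp, np.inf)
    valid = disp > 0
    depth[valid] = f_px * baseline_m / disp[valid]
    return depth

# Hypothetical endoscope rig: f = 800 px, baseline = 4 mm
depth = disparity_to_depth([[16.0, 8.0, 0.0]], f_px=800, baseline_m=0.004)
# 16 px -> 0.2 m, 8 px -> 0.4 m, 0 px -> inf
```

Optical-flow-based methods such as the one in this entry estimate the correspondence field without assuming rectification, which is why they can operate on un-rectified videos.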
Affiliation(s)
- Zixin Yang
- Center for Imaging Science, Rochester Institute of Technology, Rochester, NY 14623, USA
| | - Richard Simon
- Biomedical Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA
| | - Yangming Li
- Electrical Computer and Telecommunications Engineering Technology, Rochester Institute of Technology, Rochester, NY 14623, USA
| | - Cristian A Linte
- Center for Imaging Science, Rochester Institute of Technology, Rochester, NY 14623, USA
- Biomedical Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA
29
Abstract
Colorectal cancer is one of the main causes of incident cancer cases and cancer deaths worldwide. Undetected colon polyps, whether benign or malignant, lead to late diagnosis of colorectal cancer. Computer-aided devices have helped to decrease the polyp miss rate. The application of deep learning algorithms and techniques has escalated during the last decade. Many scientific studies have been published on detecting, localizing, and classifying colon polyps. We present here a brief review of the latest published studies. We compare the accuracy of these studies with our results, obtained from training and testing three independent datasets using a convolutional neural network and autoencoder model. A train, validation, and test split (75%, 15%, and 15%, respectively) was performed for each dataset. An accuracy of 0.937 was achieved for CVC-ColonDB, 0.951 for CVC-ClinicDB, and 0.967 for ETIS-LaribPolypDB. Our results suggest slight improvements compared to the algorithms used to date.
30
Sengupta A, Bartoli A. Colonoscopic 3D reconstruction by tubular non-rigid structure-from-motion. Int J Comput Assist Radiol Surg 2021; 16:1237-1241. [PMID: 34031817 DOI: 10.1007/s11548-021-02409-x]
Abstract
PURPOSE The visual examination of colonoscopic images fails to extract precise geometric information of the colonic surface. Reconstructing the 3D surface of the colon from colonoscopic image sequences may thus add valuable clinical information. We address this problem of extracting precise spatio-temporal 3D structure information from colonoscopic images. METHODS Using just the intrinsically calibrated monocular image stream, we develop a technique to compute the depth of certain feature points that have been tracked across images. Our method uses prior knowledge of an approximate geometry of the colon, called the tubular topology prior (TTP). It works by fitting a deformable cylindrical model to points reconstructed independently by non-rigid structure-from-motion (NRSfM), balancing the data term against a novel tubular smoothing prior. Our method is the first ever to exploit a very weak topological prior to improve NRSfM. As such, it lies in-between standard NRSfM, which does not use a topological prior beyond the mere plane, and shape-from-template (SfT), which uses a very strong prior in the form of a full deformable 3D object model. RESULTS We validate our method on both synthetic images of tubular structures and real colonoscopic data. Our method improves the results obtained by existing NRSfM methods by 71.74% on average on synthetic data and succeeds in obtaining a 3D reconstruction from a real colonoscopic sequence where the existing methods fail. CONCLUSION Colonoscopic 3D reconstruction is a difficult problem that remains unresolved by existing computer vision methods. Our proposed dedicated NRSfM method and experiments show that visual motion might be the right visual cue to use in colonoscopy.
31
Itoh H, Oda M, Mori Y, Misawa M, Kudo SE, Imai K, Ito S, Hotta K, Takabatake H, Mori M, Natori H, Mori K. Unsupervised colonoscopic depth estimation by domain translations with a Lambertian-reflection keeping auxiliary task. Int J Comput Assist Radiol Surg 2021; 16:989-1001. [PMID: 34002340 DOI: 10.1007/s11548-021-02398-x]
Abstract
PURPOSE A three-dimensional (3D) structure extraction technique viewed from a two-dimensional image is essential for the development of a computer-aided diagnosis (CAD) system for colonoscopy. However, a straightforward application of existing depth-estimation methods to colonoscopic images is impossible or inappropriate due to several limitations of colonoscopes. In particular, the absence of ground-truth depth for colonoscopic images hinders the application of supervised machine learning methods. To circumvent these difficulties, we developed an unsupervised and accurate depth-estimation method. METHOD We propose a novel unsupervised depth-estimation method that introduces a Lambertian-reflection model as an auxiliary task to domain translation between real and virtual colonoscopic images. This auxiliary task contributes to accurate depth estimation by maintaining the Lambertian-reflection assumption. In our experiments, we qualitatively evaluate the proposed method by comparing it with state-of-the-art unsupervised methods. Furthermore, we present two quantitative evaluations of the proposed method, using a measuring device as well as a new 3D reconstruction technique and measured polyp sizes. RESULTS Our proposed method achieved accurate depth estimation, with an average estimation error of less than 1 mm for regions close to the colonoscope in both types of quantitative evaluation. Qualitative evaluation showed that the introduced auxiliary task reduces the effects of specular reflections and colon wall textures on depth estimation, and our proposed method achieved smooth depth estimation without noise, validating the approach. CONCLUSIONS We developed an accurate depth-estimation method based on a new type of unsupervised domain translation with an auxiliary task. This method is useful for the analysis of colonoscopic images and for the development of a CAD system, since it can extract accurate 3D information.
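The Lambertian-reflection assumption exploited in this entry relates image intensity to albedo, surface orientation, and depth. A minimal sketch under the common endoscopy approximation of a point light co-located with the camera (all values below are illustrative, not the paper's model):

```python
import numpy as np

def lambertian_intensity(albedo, cos_theta, depth):
    """I = albedo * max(cos_theta, 0) / depth^2 for a point light at the camera:
    intensity falls off with the inverse square of depth."""
    return albedo * np.clip(cos_theta, 0.0, None) / depth**2

def depth_from_intensity(intensity, albedo, cos_theta):
    """Invert the model for depth; assumes albedo and surface angle are known."""
    return np.sqrt(albedo * np.clip(cos_theta, 0.0, None) / intensity)

# Round trip: render intensity at depth d, then recover d from intensity
d = 2.0
I = lambertian_intensity(albedo=0.9, cos_theta=1.0, depth=d)
```

In practice albedo and surface angle are unknown per pixel, which is why the paper learns depth via domain translation while keeping this model as an auxiliary constraint.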
32
Hwang SJ, Park SJ, Kim GM, Baek JH. Unsupervised Monocular Depth Estimation for Colonoscope System Using Feedback Network. Sensors (Basel) 2021; 21:2691. [PMID: 33920357 PMCID: PMC8069522 DOI: 10.3390/s21082691]
Abstract
A colonoscopy is a medical examination used to check for disease or abnormalities in the large intestine. If necessary, polyps or adenomas are removed through the scope during a colonoscopy, which can prevent colorectal cancer. However, the polyp detection rate differs depending on the condition and skill level of the endoscopist; some endoscopists even have a 90% chance of missing an adenoma. Artificial intelligence and robot technologies for colonoscopy are being studied to compensate for these problems. In this study, we propose self-supervised monocular depth estimation using spatiotemporal consistency in the colon environment. Our contributions are a loss function for reconstruction errors between adjacent predicted depths and a depth feedback network that uses the predicted depth of the previous frame to predict the depth of the next frame. We performed quantitative and qualitative evaluations of our approach, and the proposed FBNet (depth FeedBack Network) outperformed state-of-the-art results for unsupervised depth estimation on the UCL datasets.
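The temporal-consistency idea in this entry can be illustrated with a bare L1 penalty between adjacent depth predictions. This is a stand-in sketch only, not the paper's exact loss, which also accounts for camera motion between frames:

```python
import numpy as np

def temporal_depth_loss(depth_t, depth_t1, weight=1.0):
    """L1 consistency penalty between depth maps of adjacent frames.
    A real implementation would first warp depth_t into frame t+1
    using the estimated camera motion before comparing."""
    return weight * np.mean(np.abs(depth_t - depth_t1))

# Toy 2x2 depth maps: uniform 1.0 m vs uniform 1.5 m
a = np.ones((2, 2))
b = np.full((2, 2), 1.5)
# mean |1.0 - 1.5| = 0.5
```

Such a term is added to the usual photometric self-supervision loss, so the network is rewarded for depth predictions that are stable across consecutive frames.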
33
Widya AR, Monno Y, Okutomi M, Suzuki S, Gotoda T, Miki K. Stomach 3D Reconstruction Using Virtual Chromoendoscopic Images. IEEE J Transl Eng Health Med 2021; 9:1700211. [PMID: 33796417 PMCID: PMC8009143 DOI: 10.1109/jtehm.2021.3062226]
Abstract
Gastric endoscopy is the gold standard in the clinical process that enables medical practitioners to diagnose various lesions inside a patient's stomach. If a lesion is found, successfully identifying its location relative to the global view of the stomach leads to better decision making for the next clinical treatment. Our previous research showed that lesion localization can be achieved by reconstructing the whole stomach shape from chromoendoscopic indigo carmine (IC) dye-sprayed images using a structure-from-motion (SfM) pipeline. However, spraying the IC dye over the whole stomach requires additional time, which is not desirable for either patients or practitioners. Our objective is to propose an alternative way to achieve whole-stomach 3D reconstruction without the need for the IC dye. We generate virtual IC-sprayed (VIC) images based on image-to-image style translation trained on unpaired real no-IC and IC-sprayed images, where we investigate the effect of input and output color-channel selection for generating the VIC images. We validate our reconstruction results by comparing them with the results using real IC-sprayed images and confirm that the obtained stomach 3D structures are comparable to each other. We also propose a local reconstruction technique to obtain a more detailed surface and texture around a region of interest. The proposed method achieves whole-stomach reconstruction using SfM without the need for real IC dye. We found that translating no-IC green-channel images to IC-sprayed red-channel images gives the best SfM reconstruction result. Clinical impact: We offer a method for frame localization and local 3D reconstruction of a found gastric lesion using standard endoscopy images, leading to better clinical decisions.
Affiliation(s)
- Aji Resindra Widya
- Department of Systems and Control EngineeringSchool of EngineeringTokyo Institute of TechnologyTokyo152-8550Japan
| | - Yusuke Monno
- Department of Systems and Control EngineeringSchool of EngineeringTokyo Institute of TechnologyTokyo152-8550Japan
| | - Masatoshi Okutomi
- Department of Systems and Control EngineeringSchool of EngineeringTokyo Institute of TechnologyTokyo152-8550Japan
| | - Sho Suzuki
- Division of Gastroenterology and HepatologyDepartment of MedicineNihon University School of MedicineTokyo101-8309Japan
| | - Takuji Gotoda
- Division of Gastroenterology and HepatologyDepartment of MedicineNihon University School of MedicineTokyo101-8309Japan
| | - Kenji Miki
- Department of Internal MedicineTsujinaka Hospital KashiwanohaKashiwa277-0871Japan
34
İncetan K, Celik IO, Obeid A, Gokceler GI, Ozyoruk KB, Almalioglu Y, Chen RJ, Mahmood F, Gilbert H, Durr NJ, Turan M. VR-Caps: A Virtual Environment for Capsule Endoscopy. Med Image Anal 2021; 70:101990. [PMID: 33609920 DOI: 10.1016/j.media.2021.101990]
Abstract
Current capsule endoscopes and next-generation robotic capsules for diagnosis and treatment of gastrointestinal diseases are complex cyber-physical platforms that must orchestrate complex software and hardware functions. The desired tasks for these systems include visual localization, depth estimation, 3D mapping, disease detection and segmentation, automated navigation, active control, path realization, and optional therapeutic modules such as targeted drug delivery and biopsy sampling. Data-driven algorithms promise to enable many advanced functionalities for capsule endoscopes, but real-world data is challenging to obtain. Physically realistic simulations providing synthetic data have emerged as a solution for the development of data-driven algorithms. In this work, we present a comprehensive simulation platform for capsule endoscopy operations and introduce VR-Caps, a virtual active capsule environment that simulates a range of normal and abnormal tissue conditions (e.g., inflated, dry, wet) and varied organ types, capsule endoscope designs (e.g., mono, stereo, dual, and 360° camera), and the type, number, strength, and placement of internal and external magnetic sources that enable active locomotion. VR-Caps makes it possible to develop, optimize, and test medical imaging and analysis software for current and next-generation endoscopic capsule systems, either independently or jointly. To validate this approach, we train state-of-the-art deep neural networks to accomplish various medical image analysis tasks using simulated data from VR-Caps and evaluate the performance of these models on real medical data. Results demonstrate the usefulness and effectiveness of the proposed virtual platform in developing algorithms that quantify fractional coverage, camera trajectory, 3D map reconstruction, and disease classification.
All of the code, pre-trained weights, and created 3D organ models of the virtual environment, with detailed instructions on how to set up and use the environment, are made publicly available at https://github.com/CapsuleEndoscope/VirtualCapsuleEndoscopy, and a video demonstration can be seen in the supplementary videos (Video-I).
Affiliation(s)
- Kağan İncetan
- Institute of Biomedical Engineering, Bogazici University, Istanbul, Turkey
| | - Ibrahim Omer Celik
- Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Abdulhamid Obeid
- Institute of Biomedical Engineering, Bogazici University, Istanbul, Turkey
| | | | | | | | - Richard J Chen
- Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Faisal Mahmood
- Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Data Science, Dana Farber Cancer Institute, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Hunter Gilbert
- Deparment of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA USA
| | - Nicholas J Durr
- Department of Biomedical Engineering, Johns Hopkins University (JHU), Baltimore, MD, USA
| | - Mehmet Turan
- Institute of Biomedical Engineering, Bogazici University, Istanbul, Turkey.
35
Wang S, Cong Y, Zhu H, Chen X, Qu L, Fan H, Zhang Q, Liu M. Multi-Scale Context-Guided Deep Network for Automated Lesion Segmentation With Endoscopy Images of Gastrointestinal Tract. IEEE J Biomed Health Inform 2021; 25:514-525. [PMID: 32750912 DOI: 10.1109/jbhi.2020.2997760]
Abstract
Accurate lesion segmentation based on endoscopy images is a fundamental task for the automated diagnosis of gastrointestinal tract (GI Tract) diseases. Previous studies usually use hand-crafted features for representing endoscopy images, while feature definition and lesion segmentation are treated as two standalone tasks. Due to the possible heterogeneity between features and segmentation models, these methods often result in sub-optimal performance. Several fully convolutional networks have been recently developed to jointly perform feature learning and model training for GI Tract disease diagnosis. However, they generally ignore local spatial details of endoscopy images, as down-sampling operations (e.g., pooling and convolutional striding) may result in irreversible loss of image spatial information. To this end, we propose a multi-scale context-guided deep network (MCNet) for end-to-end lesion segmentation of endoscopy images in GI Tract, where both global and local contexts are captured as guidance for model training. Specifically, one global subnetwork is designed to extract the global structure and high-level semantic context of each input image. Then we further design two cascaded local subnetworks based on output feature maps of the global subnetwork, aiming to capture both local appearance information and relatively high-level semantic information in a multi-scale manner. Those feature maps learned by three subnetworks are further fused for the subsequent task of lesion segmentation. We have evaluated the proposed MCNet on 1,310 endoscopy images from the public EndoVis-Ab and CVC-ClinicDB datasets for abnormal segmentation and polyp segmentation, respectively. Experimental results demonstrate that MCNet achieves [Formula: see text] and [Formula: see text] mean intersection over union (mIoU) on two datasets, respectively, outperforming several state-of-the-art approaches in automated lesion segmentation with endoscopy images of GI Tract.
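The mean intersection-over-union (mIoU) metric reported in this entry can be sketched on integer label maps; the toy maps below are illustrative and not data from the paper:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union across classes for integer label maps.
    Classes absent from both prediction and ground truth are skipped."""
    ious = []
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))

# Toy 2x2 label maps with classes {0, 1}
pred = np.array([[0, 0], [1, 1]])
gt   = np.array([[0, 1], [1, 1]])
# class 0: inter 1, union 2 -> 0.5; class 1: inter 2, union 3 -> 2/3
```

Averaging per-class IoU rather than per-pixel accuracy keeps small lesion classes from being swamped by the background class.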
36
Chen MT, Papadakis M, Durr NJ. Speckle illumination SFDI for projector-free optical property mapping. Opt Lett 2021; 46:673-676. [PMID: 33528438 PMCID: PMC8285059 DOI: 10.1364/ol.411187]
Abstract
Spatial frequency domain imaging can map tissue scattering and absorption properties over a wide field of view, making it useful for clinical applications such as wound assessment and surgical guidance. This technique has previously required the projection of fully characterized illumination patterns. Here, we show that random and unknown speckle illumination can be used to sample the modulation transfer function of tissues at known spatial frequencies, allowing the quantitative mapping of optical properties with simple laser diode illumination. We compute low- and high-spatial frequency response parameters from the local power spectral density for each pixel and use a lookup table to accurately estimate absorption and scattering coefficients in tissue phantoms, in vivo human hand, and ex vivo swine esophagus. Because speckle patterns can be generated over a large depth of field and field of view with simple coherent illumination, this approach may enable optical property mapping in new form-factors and applications, including endoscopy.
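The power-spectral-density measurement underlying this approach can be illustrated in 1D: a modulated illumination pattern at a known spatial frequency produces a PSD peak at the corresponding frequency bin, which is the quantity sampled per pixel neighborhood. The signal below is a simple sinusoid standing in for a speckle pattern:

```python
import numpy as np

# A sinusoidal illumination at k cycles per field of view produces a PSD peak
# at bin k of the discrete spectrum; SFDI-style methods read out the sample's
# response at such known spatial frequencies.
n, k = 128, 8
x = np.arange(n)
signal = 1.0 + 0.5 * np.cos(2 * np.pi * k * x / n)

psd = np.abs(np.fft.rfft(signal))**2 / n
peak_bin = int(np.argmax(psd[1:])) + 1   # skip the DC term at bin 0
# peak_bin == k == 8
```

The low-frequency (DC) and high-frequency responses extracted this way feed the lookup table that maps them to absorption and scattering coefficients.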
Affiliation(s)
- Mason T. Chen
- Department of Biomedical Engineering, Johns Hopkins University, 3400 N. Charles Street, Baltimore, Maryland 21218, USA
| | - Melina Papadakis
- Department of Biomedical Engineering, Johns Hopkins University, 3400 N. Charles Street, Baltimore, Maryland 21218, USA
| | - Nicholas J. Durr
- Department of Biomedical Engineering, Johns Hopkins University, 3400 N. Charles Street, Baltimore, Maryland 21218, USA
37
Marzullo A, Moccia S, Calimeri F, De Momi E. AIM in Endoscopy Procedures. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_164-1]
38
Almalioglu Y, Bengisu Ozyoruk K, Gokce A, Incetan K, Irem Gokceler G, Ali Simsek M, Ararat K, Chen RJ, Durr NJ, Mahmood F, Turan M. EndoL2H: Deep Super-Resolution for Capsule Endoscopy. IEEE Trans Med Imaging 2020; 39:4297-4309. [PMID: 32795966 DOI: 10.1109/tmi.2020.3016744]
Abstract
Although wireless capsule endoscopy is the preferred modality for diagnosis and assessment of small bowel diseases, the poor camera resolution is a substantial limitation for both subjective and automated diagnostics. Enhanced-resolution endoscopy has been shown to improve the adenoma detection rate for conventional endoscopy and is likely to do the same for capsule endoscopy. In this work, we propose and quantitatively validate a novel framework to learn a mapping from low- to high-resolution endoscopic images. We combine conditional adversarial networks with a spatial attention block to improve the resolution by factors of up to 8×, 10×, and 12×. Quantitative and qualitative studies demonstrate the superiority of EndoL2H over the state-of-the-art deep super-resolution methods Deep Back-Projection Networks (DBPN), Deep Residual Channel Attention Networks (RCAN), and Super-Resolution Generative Adversarial Network (SRGAN). Mean Opinion Score (MOS) tests were performed by 30 gastroenterologists to qualitatively assess and confirm the clinical relevance of the approach. EndoL2H is generally applicable to any endoscopic capsule system and has the potential to improve diagnosis and better harness computational approaches for polyp detection and characterization. Our code and trained models are available at https://github.com/CapsuleEndoscope/EndoL2H.
39
Chen MT, Durr NJ. Rapid tissue oxygenation mapping from snapshot structured-light images with adversarial deep learning. J Biomed Opt 2020; 25:112907. [PMID: 33251783 PMCID: PMC7701163 DOI: 10.1117/1.jbo.25.11.112907]
Abstract
SIGNIFICANCE Spatial frequency-domain imaging (SFDI) is a powerful technique for mapping tissue oxygen saturation over a wide field of view. However, current SFDI methods either require a sequence of several images with different illumination patterns or, in the case of single-snapshot optical properties (SSOP), introduce artifacts and sacrifice accuracy. AIM We introduce OxyGAN, a data-driven, content-aware method to estimate tissue oxygenation directly from single structured-light images. APPROACH OxyGAN is an end-to-end approach that uses supervised generative adversarial networks. Conventional SFDI is used to obtain ground truth tissue oxygenation maps for ex vivo human esophagi, in vivo hands and feet, and an in vivo pig colon sample under 659- and 851-nm sinusoidal illumination. We benchmark OxyGAN by comparing it with SSOP and a two-step hybrid technique that uses a previously developed deep learning model to predict optical properties followed by a physical model to calculate tissue oxygenation. RESULTS When tested on human feet, cross-validated OxyGAN maps tissue oxygenation with an accuracy of 96.5%. When applied to sample types not included in the training set, such as human hands and pig colon, OxyGAN achieves a 93% accuracy, demonstrating robustness to various tissue types. On average, OxyGAN outperforms SSOP and a hybrid model in estimating tissue oxygenation by 24.9% and 24.7%, respectively. Finally, we optimize OxyGAN inference so that oxygenation maps are computed ∼10 times faster than previous work, enabling video-rate, 25-Hz imaging. CONCLUSIONS Due to its rapid acquisition and processing speed, OxyGAN has the potential to enable real-time, high-fidelity tissue oxygenation mapping that may be useful for many clinical applications.
Affiliation(s)
- Mason T. Chen
- Johns Hopkins University, Department of Biomedical Engineering, Baltimore, Maryland, United States
| | - Nicholas J. Durr
- Johns Hopkins University, Department of Biomedical Engineering, Baltimore, Maryland, United States
- Address all correspondence to Nicholas J. Durr,
40
Mahmood F, Borders D, Chen RJ, Mckay GN, Salimian KJ, Baras A, Durr NJ. Deep Adversarial Training for Multi-Organ Nuclei Segmentation in Histopathology Images. IEEE Trans Med Imaging 2020; 39:3257-3267. [PMID: 31283474 PMCID: PMC8588951 DOI: 10.1109/tmi.2019.2927182]
Abstract
Nuclei segmentation is a fundamental task for various computational pathology applications, including nuclei morphology analysis, cell type classification, and cancer grading. Deep learning has emerged as a powerful approach to segmenting nuclei, but the accuracy of convolutional neural networks (CNNs) depends on the volume and quality of labeled histopathology data for training. In particular, conventional CNN-based approaches lack structured prediction capabilities, which are required to distinguish overlapping and clumped nuclei. Here, we present an approach to nuclei segmentation that overcomes these challenges by utilizing a conditional generative adversarial network (cGAN) trained with synthetic and real data. We generate a large dataset of H&E training images with perfect nuclei segmentation labels using an unpaired GAN framework. This synthetic data, along with real histopathology data from six different organs, is used to train a conditional GAN with spectral normalization and gradient penalty for nuclei segmentation. This adversarial regression framework enforces higher-order spatial consistency compared to conventional CNN models. We demonstrate that this nuclei segmentation approach generalizes across different organs, sites, patients, and disease states, and outperforms conventional approaches, especially in isolating individual and overlapping nuclei.
41
Abstract
INTRODUCTION There has been a rapid development of deep learning (DL) models for medical imaging. However, DL requires large labeled datasets for training. Obtaining large-scale labeled data remains a challenge, and multi-center datasets suffer from heterogeneity due to patient diversity and varying imaging protocols. Domain adaptation (DA) has been developed to transfer knowledge from a labeled data domain to a related but unlabeled domain, in either image space or feature space. DA is a type of transfer learning (TL) that can improve the performance of models applied to multiple different datasets. OBJECTIVE In this survey, we review the state-of-the-art DL-based DA methods for medical imaging. We aim to summarize recent advances, highlighting the motivation, challenges, and opportunities, and to discuss promising directions for future work in DA for medical imaging. METHODS We surveyed peer-reviewed publications from leading biomedical journals and conferences between 2017 and 2020 that reported the use of DA in medical imaging applications, grouping them by methodology, image modality, and learning scenario. RESULTS We mainly focused on pathology and radiology as application areas. Among various DA approaches, we discussed domain transformation (DT) and latent feature-space transformation (LFST). We highlighted the role of unsupervised DA in image segmentation and described opportunities for future development. CONCLUSION DA has emerged as a promising solution to the lack of annotated training data. Using adversarial techniques, unsupervised DA has achieved good performance, especially for segmentation tasks. Opportunities include domain transferability, multi-modal DA, and applications that benefit from synthetic data.
Affiliation(s)
- Anirudh Choudhary
- Department of Computational Science and Engineering, Georgia Institute of Technology, GA, USA
- Li Tong
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, GA, USA
- Yuanda Zhu
- School of Electrical and Computer Engineering, Georgia Institute of Technology, GA, USA
- May D. Wang
- Department of Computational Science and Engineering, Georgia Institute of Technology, GA, USA
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, GA, USA
- School of Electrical and Computer Engineering, Georgia Institute of Technology, GA, USA
42
Zhou X, Guo Y, Shen M, Yang G. Application of artificial intelligence in surgery. Front Med 2020; 14:417-430. [DOI: 10.1007/s11684-020-0770-0]
43
Ciuti G, Skonieczna-Żydecka K, Marlicz W, Iacovacci V, Liu H, Stoyanov D, Arezzo A, Chiurazzi M, Toth E, Thorlacius H, Dario P, Koulaouzidis A. Frontiers of Robotic Colonoscopy: A Comprehensive Review of Robotic Colonoscopes and Technologies. J Clin Med 2020; 9:E1648. [PMID: 32486374] [PMCID: PMC7356873] [DOI: 10.3390/jcm9061648]
Abstract
Flexible colonoscopy remains the primary means of screening for colorectal cancer (CRC) and the gold standard of all population-based screening pathways around the world. Almost 60% of CRC deaths could be prevented with screening. However, colonoscopy attendance rates are affected by discomfort, fear of pain, and embarrassment or loss of control during the procedure. Moreover, the emergence and global threat of new communicable diseases might seriously affect the functioning of contemporary centres performing gastrointestinal endoscopy. Innovative solutions are needed: artificial intelligence (AI) and physical robotics will contribute decisively to the future of healthcare services. The translation of robotic technologies from traditional surgery to minimally invasive endoscopic interventions is an emerging field, challenged mainly by the tough requirements for miniaturization. Pioneering approaches to robotic colonoscopy were reported in the nineties, with the appearance of inchworm-like devices. Since then, robotic colonoscopes with assistive functionalities have become commercially available. Research prototypes promise enhanced accessibility and flexibility for future therapeutic interventions, even via autonomous or robotic-assisted agents, such as robotic capsules. Furthermore, pairing such endoscopic systems with AI-enabled image analysis and recognition methods promises enhanced diagnostic yield. By assembling a multidisciplinary team of engineers and endoscopists, this paper aims to provide a contemporary, highly pictorial critical review of robotic colonoscopes, giving clinicians and researchers a glimpse of the major changes and challenges that lie ahead.
Affiliation(s)
- Gastone Ciuti
- The BioRobotics Institute, Scuola Superiore Sant’Anna, 56025 Pisa, Italy
- Department of Excellence in Robotics & AI, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
- Karolina Skonieczna-Żydecka
- Department of Human Nutrition and Metabolomics, Pomeranian Medical University in Szczecin, 71-460 Szczecin, Poland
- Wojciech Marlicz
- Department of Gastroenterology, Pomeranian Medical University in Szczecin, 71-252 Szczecin, Poland
- Endoklinika sp. z o.o., 70-535 Szczecin, Poland
- Veronica Iacovacci
- The BioRobotics Institute, Scuola Superiore Sant’Anna, 56025 Pisa, Italy
- Department of Excellence in Robotics & AI, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
- Hongbin Liu
- School of Biomedical Engineering & Imaging Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, UK
- Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London W1W 7TY, UK
- Alberto Arezzo
- Department of Surgical Sciences, University of Torino, 10126 Torino, Italy
- Marcello Chiurazzi
- The BioRobotics Institute, Scuola Superiore Sant’Anna, 56025 Pisa, Italy
- Department of Excellence in Robotics & AI, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
- Ervin Toth
- Department of Gastroenterology, Skåne University Hospital, Lund University, 20502 Malmö, Sweden
- Henrik Thorlacius
- Department of Clinical Sciences, Section of Surgery, Lund University, 20502 Malmö, Sweden
- Paolo Dario
- The BioRobotics Institute, Scuola Superiore Sant’Anna, 56025 Pisa, Italy
- Department of Excellence in Robotics & AI, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
44
Xu L, Li J, Hao Y, Zhang P, Ciuti G, Dario P, Huang Q. Depth Estimation for Local Colon Structure in Monocular Capsule Endoscopy Based on Brightness and Camera Motion. Robotica 2021; 39:334-345. [DOI: 10.1017/s0263574720000399]
Abstract
We present a 3D reconstruction method that uses brightness and camera-motion estimation to register local colon structure in colonoscopy. The proposed method is based on reverse projection from 2D fold contours to 3D space, motion estimation from 3D reconstructed points between neighboring frames, and model registration to reconstruct the fold structure. On a synthetic colon, the average reconstructed depth error and circumference error are about 14.2% and 15.2%, respectively. This accuracy is sufficient for navigation and control of a capsule robot. This work demonstrates that the proposed method is superior to methods using single-frame brightness intensity.
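As a rough illustration of the single-frame brightness baseline this work compares against (not its reverse-projection method), depth can be initialized from pixel intensity under an inverse-square illumination model. The function name, the constant k, and the co-located-point-light assumption below are ours, not the paper's:

```python
import numpy as np

def depth_from_brightness(intensity, k=1.0):
    """Single-frame brightness-to-depth baseline (simplified sketch).

    For a point light co-located with the camera, observed intensity
    falls off roughly as I ~ k / d**2, so d ~ sqrt(k / I).
    """
    eps = 1e-6  # guard against division by zero in dark pixels
    i = np.clip(np.asarray(intensity, dtype=float), eps, None)
    return np.sqrt(k / i)
```

Because the model ignores surface orientation and albedo, such single-frame estimates are only relative, which is why multi-frame motion cues improve on them.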
45
Liu X, Sinha A, Ishii M, Hager GD, Reiter A, Taylor RH, Unberath M. Dense Depth Estimation in Monocular Endoscopy With Self-Supervised Learning Methods. IEEE Trans Med Imaging 2020; 39:1438-1447. [PMID: 31689184] [PMCID: PMC7289272] [DOI: 10.1109/tmi.2019.2950936]
Abstract
We present a self-supervised approach to training convolutional neural networks for dense depth estimation from monocular endoscopy data without a priori modeling of anatomy or shading. Our method only requires monocular endoscopic videos and a multi-view stereo method, e.g., structure from motion, to supervise learning in a sparse manner. Consequently, our method requires neither manual labeling nor patient computed tomography (CT) scans in the training and application phases. In a cross-patient experiment using CT scans as ground truth, the proposed method achieved submillimeter mean residual error. In a comparison study on in vivo sinus endoscopy data against recent self-supervised depth estimation methods designed for natural video, we demonstrate that the proposed approach outperforms the previous methods by a large margin. The source code for this work is publicly available online at https://github.com/lppllppl920/EndoscopyDepthEstimation-Pytorch.
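The sparse-supervision idea described above can be sketched as a loss between a dense depth prediction and scattered structure-from-motion points; because monocular SfM recovers depth only up to a global scale, the two must be scale-aligned before comparison. This is a minimal illustration, not the paper's actual loss; the median-based alignment and the function name are our assumptions:

```python
import numpy as np

def sparse_depth_loss(pred_depth, sfm_points):
    """Scale-aligned L1 loss between a dense depth map and sparse SfM points.

    pred_depth : (H, W) dense depth prediction from the network.
    sfm_points : iterable of (row, col, depth) triples from SfM,
                 defined only up to a global scale factor.
    """
    pts = np.asarray(sfm_points, dtype=float)
    rows = pts[:, 0].astype(int)
    cols = pts[:, 1].astype(int)
    sfm_d = pts[:, 2]
    pred_d = pred_depth[rows, cols]  # sample prediction at SfM locations
    # Align the arbitrary SfM scale to the prediction via medians.
    scale = np.median(sfm_d) / np.median(pred_d)
    return float(np.mean(np.abs(pred_d * scale - sfm_d)))
```

In training, such a term would be minimized over many frames so the network learns dense depth consistent with the sparse reconstructions.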
46
Armin MA, Barnes N, Grimpen F, Salvado O. Learning colon centreline from optical colonoscopy, a new way to generate a map of the internal colon surface. Healthc Technol Lett 2020; 6:187-190. [PMID: 32038855] [PMCID: PMC6952246] [DOI: 10.1049/htl.2019.0073]
Abstract
Optical colonoscopy is the gold-standard screening method for detecting and removing cancerous polyps. During this procedure, some polyps may go undetected because of their position, because the camera does not cover them, or because the surgeon misses them. In this Letter, the authors introduce a novel convolutional neural network (ConvNet) algorithm that maps the internal colon surface to a 2D map (visibility map), which can be used to increase clinicians' awareness of areas they might miss. This was achieved by leveraging a colonoscopy simulator to generate a dataset consisting of colonoscopy video frames and their corresponding colon centreline (CCL) points in 3D camera coordinates. A pair of video frames was used as input to a ConvNet, and the output was a point on the CCL and its direction vector. By knowing the CCL for each frame and roughly modelling the colon as a cylinder, frames could be unrolled to build a visibility map. The authors validated their results using both simulated and real colonoscopy frames, showing that a CCL learned from consecutive simulated frames generalises to real colonoscopy video for generating a visibility map.
Affiliation(s)
- Nick Barnes
- CSIRO (Data61) 3D Computer Vision, Canberra, Australia
- College of Engineering and Computer Science (ANU), Canberra, Australia
- Florian Grimpen
- Department of Gastroenterology and Hepatology, Royal Brisbane and Women's Hospital, Brisbane, Australia
47
Abstract
OBJECTIVE This paper proposes a 3D laparoscopic imaging system that realizes dense 3D reconstruction in real time. METHODS Based on the active stereo technique, which yields high-density, accurate, and robust 3D reconstruction by combining structured light and stereo vision, we design a laparoscopic system consisting of two image feedback channels and one pattern projection channel. Remote high-speed image acquisition and pattern generation lay the foundation for real-time dense 3D surface reconstruction and enable miniaturization of the laparoscopic probe. To enhance reconstruction efficiency and accuracy, we propose a novel active stereo method by which the dense 3D point cloud is obtained using only five patterns, whereas most existing multiple-shot structured light techniques require [Formula: see text] patterns. In our method, dual-frequency phase-shifting fringes are used to uniquely encode the pixels of the measured targets, and a dual-codeword matching scheme is developed to simplify the matching procedure and achieve high-precision reconstruction. RESULTS Compared with existing structured light techniques, the proposed method shows better real-time efficiency and accuracy, both quantitatively and qualitatively. Ex-vivo experiments demonstrate the robustness of the proposed method across different biological organs and its effectiveness on lesions and deformations of the organs. Feasibility of the proposed system for real-time dense 3D reconstruction is verified in dynamic experiments. According to the experimental results, the system acquires 3D point clouds at 12 frames per second. Each frame contains more than 40,000 points, and the average errors tested on standard objects are less than 0.2 mm. SIGNIFICANCE This paper provides a new real-time dense 3D reconstruction method for 3D laparoscopic imaging. The established prototype system shows good performance in reconstructing the surface of biological tissues.
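The dual-frequency phase-shifting fringes mentioned above build on standard N-step phase shifting, in which the wrapped phase at each pixel is recovered from N equally shifted sinusoidal patterns. The sketch below shows only this generic wrapped-phase step, not the authors' five-pattern dual-codeword scheme:

```python
import numpy as np

def wrapped_phase(images):
    """Wrapped phase from N equally phase-shifted fringe images.

    images : array of shape (N, H, W) with intensities
             I_k = A + B * cos(phi - 2*pi*k/N).
    Returns the wrapped phase phi in (-pi, pi] per pixel.
    """
    imgs = np.asarray(images, dtype=float)
    n = imgs.shape[0]
    k = np.arange(n).reshape(-1, 1, 1)
    # Standard N-step phase-shifting estimator.
    num = np.sum(imgs * np.sin(2 * np.pi * k / n), axis=0)
    den = np.sum(imgs * np.cos(2 * np.pi * k / n), axis=0)
    return np.arctan2(num, den)
```

A dual-frequency scheme then combines wrapped phases from two fringe frequencies to resolve the 2*pi ambiguity before triangulating depth.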
48
Moccia S, Romeo L, Migliorelli L, Frontoni E, Zingaretti P. Supervised CNN Strategies for Optical Image Segmentation and Classification in Interventional Medicine. Intelligent Systems Reference Library 2020. [DOI: 10.1007/978-3-030-42750-4_8]
49
Azer SA. Deep learning with convolutional neural networks for identification of liver masses and hepatocellular carcinoma: A systematic review. World J Gastrointest Oncol 2019; 11:1218-1230. [PMID: 31908726] [PMCID: PMC6937442] [DOI: 10.4251/wjgo.v11.i12.1218]
Abstract
BACKGROUND Artificial intelligence methods such as convolutional neural networks (CNNs) have been used in the interpretation of images and the diagnosis of hepatocellular cancer (HCC) and liver masses. CNNs, a class of deep-learning algorithms, have demonstrated the capability to recognise specific features that can detect pathological lesions. AIM To assess the use of CNNs in examining HCC and liver mass images for the diagnosis of cancer, and to evaluate the accuracy and performance of CNNs. METHODS The databases PubMed, EMBASE, and Web of Science and research books were systematically searched using related keywords. Studies analysing pathological anatomy, cellular, and radiological images of HCC or liver masses using CNNs were identified according to the study protocol, whether to detect cancer, differentiate cancer from other lesions, or stage the lesion. The data were extracted according to a predefined extraction protocol. The accuracy and performance of the CNNs in detecting cancer or early stages of cancer were analysed. The primary outcomes of the study were analysing the type of cancer or liver mass and identifying the types of images that showed optimum accuracy in cancer detection. RESULTS A total of 11 studies that met the selection criteria and were consistent with the aims of the study were identified. The studies demonstrated the ability to differentiate liver masses or differentiate HCC from other lesions (n = 6), HCC from cirrhosis or development of new tumours (n = 3), and HCC nuclei grading or segmentation (n = 2). The CNNs showed satisfactory levels of accuracy. The studies aimed at detecting lesions (n = 4), classification (n = 5), and segmentation (n = 2). Several methods were used to assess the accuracy of the CNN models. CONCLUSION These studies demonstrate the role of CNNs in analysing images and as tools for early detection of HCC or liver masses. While a few limitations were identified in these studies, overall the CNNs used for segmentation and classification of liver cancer images achieved an optimal level of accuracy.
Affiliation(s)
- Samy A Azer
- Department of Medical Education, King Saud University College of Medicine, Riyadh 11461, Saudi Arabia
50
Wang S, Xing Y, Zhang L, Gao H, Zhang H. Deep Convolutional Neural Network for Ulcer Recognition in Wireless Capsule Endoscopy: Experimental Feasibility and Optimization. Comput Math Methods Med 2019; 2019:7546215. [PMID: 31641370] [PMCID: PMC6766681] [DOI: 10.1155/2019/7546215]
Abstract
Wireless capsule endoscopy (WCE) has developed rapidly over the last several years and now enables physicians to examine the gastrointestinal tract without surgical operation. However, a large number of images must be analyzed to obtain a diagnosis. Deep convolutional neural networks (CNNs) have demonstrated impressive performance in different computer vision tasks. Thus, in this work, we aim to explore the feasibility of deep learning for ulcer recognition and optimize a CNN-based ulcer recognition architecture for WCE images. By analyzing the ulcer recognition task and the characteristics of classic deep learning networks, we propose a HAnet architecture that uses ResNet-34 as the base network and fuses hyper features from the shallow layers with deep features in deeper layers to provide the final diagnostic decision. 1,416 independent WCE videos were collected for this study. The overall test accuracy of our HAnet is 92.05%, and its sensitivity and specificity are 91.64% and 92.42%, respectively. According to our comparisons of F1, F2, and ROC-AUC, the proposed method performs better than several off-the-shelf CNN models, including VGG, DenseNet, and Inception-ResNet-v2, and than classical machine learning methods with handcrafted features for WCE image classification. Overall, this study demonstrates that recognizing ulcers in WCE images with a deep CNN is feasible and could help reduce the tedious image-reading workload of physicians. Moreover, our HAnet architecture, tailored for this problem, offers a sound choice for network-structure design.
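The hyper-feature fusion described above combines a shallow feature map with a deeper feature vector before the final decision. A minimal sketch of one common way to do this (global-average-pool then concatenate); the exact fusion used in HAnet may differ, and the function name is ours:

```python
import numpy as np

def fuse_hyper_features(shallow_map, deep_vec):
    """Fuse a shallow (C, H, W) feature map with a deep feature vector.

    Globally average-pools the shallow map to a C-dimensional vector and
    concatenates it with the deep features; a classifier head would then
    operate on the fused vector.
    """
    pooled = np.asarray(shallow_map, dtype=float).mean(axis=(1, 2))
    return np.concatenate([pooled, np.asarray(deep_vec, dtype=float)])
```

The intuition is that shallow layers retain fine texture cues (useful for small ulcer patterns) that the deepest, most abstract features may have discarded.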