1. Hu L, Feng S, Wang B. Weakly Supervised Pose Estimation of Surgical Instrument from a Single Endoscopic Image. Sensors (Basel). 2024;24:3355. PMID: 38894146; PMCID: PMC11174500; DOI: 10.3390/s24113355. Received 2024-04-28; revised 2024-05-15; accepted 2024-05-21.
Abstract
Instrument pose estimation is a key requirement in computer-aided surgery, and its main challenges lie in two aspects: first, the difficulty of obtaining stable corresponding image feature points, owing to the instruments' high refraction and the complicated background; and second, the lack of labeled pose data. This study tackles the pose estimation problem for surgical instruments in current endoscope systems using a single endoscopic image. Specifically, a weakly supervised method based on the instrument's image segmentation contour is proposed, assisted by synthesized endoscopic images. The method consists of three modules: a segmentation module that automatically detects the instrument in the input image, a point inference module that predicts the image locations of the instrument's implicit feature points, and a back-propagatable Perspective-n-Point (PnP) module that estimates the pose from the tentative 2D-3D point correspondences. To reduce over-reliance on point correspondence accuracy, the local errors of feature point matching and the global inconsistency of the corresponding contours are minimized simultaneously. The proposed method is validated on both real and synthetic images against current state-of-the-art methods.
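The PnP stage described in this abstract minimizes the discrepancy between observed 2D feature points and the projections of their 3D model counterparts. A minimal sketch of that reprojection objective, assuming a pinhole intrinsics matrix `K` and known 2D-3D correspondences (illustrative only; not the authors' back-propagatable implementation):

```python
import numpy as np

def project(K, R, t, pts3d):
    """Project 3D model points into the image with intrinsics K and pose (R, t)."""
    cam = pts3d @ R.T + t          # model frame -> camera frame
    uv = cam @ K.T                 # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]  # perspective divide

def reprojection_error(K, R, t, pts3d, pts2d):
    """Mean Euclidean distance between projected and observed image points --
    the per-point loss a PnP solver (or a differentiable PnP layer) minimizes."""
    return float(np.linalg.norm(project(K, R, t, pts3d) - pts2d, axis=1).mean())
```

A pose that explains the observations exactly drives this error to zero; a gradient-based PnP layer back-propagates through `project` to refine the pose.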
Affiliation(s)
- Lihua Hu
- College of Computer Sciences and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China; (L.H.); (S.F.)
- Shida Feng
- College of Computer Sciences and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China; (L.H.); (S.F.)
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Bo Wang
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
2. Shi D, Rahimpour A, Ghafourian A, Naddaf Shargh MM, Upadhyay D, Lasky TA, Soltani I. Deep Bayesian-Assisted Keypoint Detection for Pose Estimation in Assembly Automation. Sensors (Basel). 2023;23:6107. PMID: 37447956; DOI: 10.3390/s23136107. Received 2023-05-23; revised 2023-06-27; accepted 2023-06-28.
Abstract
Pose estimation is crucial for automating assembly tasks, yet achieving sufficient accuracy for assembly automation remains challenging and part-specific. This paper presents a novel, streamlined approach to pose estimation that facilitates the automation of assembly tasks. The proposed method employs deep learning on a limited number of annotated images to identify a set of keypoints on the parts of interest. To compensate for network shortcomings and enhance accuracy, a Bayesian updating stage leverages detailed knowledge of the assembly part design. This Bayesian step refines the network output, significantly improving pose estimation accuracy. For this purpose, a subset of network-generated keypoint positions of higher quality serves as measurements, while for the remaining keypoints the network outputs serve only as priors. The geometry data aid in constructing likelihood functions, which in turn yield enhanced posterior distributions of keypoint pixel positions. The maximum a posteriori (MAP) estimates of the keypoint locations then give a final pose, allowing the nominal assembly trajectory to be updated. The method is evaluated on a 14-point snap-fit dash trim assembly for a Ford Mustang dashboard, with promising results. The approach requires no tailoring to new applications, no extensive machine-learning expertise, and no large training sets, making it a scalable and adaptable solution for the production floor.
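When both the prior (network output) and the likelihood (high-quality measurement) are modeled as Gaussians, the MAP update described above reduces to a precision-weighted average per keypoint coordinate. A small sketch under that Gaussian assumption (the variances and coordinates here are hypothetical, not values from the paper):

```python
import numpy as np

def map_keypoint(prior_xy, prior_var, meas_xy, meas_var):
    """MAP estimate of a keypoint position given a Gaussian prior (network
    output) and a Gaussian measurement: the precision-weighted mean.
    Returns the posterior mean and the (reduced) posterior variance."""
    prior_xy = np.asarray(prior_xy, dtype=float)
    meas_xy = np.asarray(meas_xy, dtype=float)
    w_prior, w_meas = 1.0 / prior_var, 1.0 / meas_var   # precisions
    post_xy = (w_prior * prior_xy + w_meas * meas_xy) / (w_prior + w_meas)
    post_var = 1.0 / (w_prior + w_meas)
    return post_xy, post_var
```

The posterior variance is always smaller than either input variance, which is why fusing even a noisy measurement tightens the keypoint estimate.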
Affiliation(s)
- Debo Shi
- Department of Electrical and Computer Engineering, University of California Davis, Davis, CA 95616, USA
- Amin Ghafourian
- Department of Mechanical and Aerospace Engineering, University of California Davis, Davis, CA 95616, USA
- Devesh Upadhyay
- Greenfield Labs, Ford Motor Company, Palo Alto, CA 94304, USA
- Ty A Lasky
- Department of Mechanical and Aerospace Engineering, University of California Davis, Davis, CA 95616, USA
- Iman Soltani
- Department of Mechanical and Aerospace Engineering, University of California Davis, Davis, CA 95616, USA
3. Ren H, Lin L, Wang Y, Dong X. Robust 6-DoF Pose Estimation under Hybrid Constraints. Sensors (Basel). 2022;22:8758. PMID: 36433356; PMCID: PMC9695601; DOI: 10.3390/s22228758. Received 2022-09-19; revised 2022-11-04; accepted 2022-11-08.
Abstract
To address the insufficient accuracy and stability of two-stage, heatmap-based algorithms on occluded objects, a new robust 6-DoF pose estimation algorithm under hybrid constraints is proposed. First, a new loss function suited to heatmap regression improves the quality of the predicted heatmaps and increases keypoint accuracy in complex scenes. Second, the heatmap regression network is extended with a translation regression branch that further constrains the pose. Finally, a robust pose optimization module fuses the heatmap and translation estimates to improve pose accuracy. The proposed algorithm achieves ADD(-S) accuracy of 93.5% on the LINEMOD dataset and 46.2% on the Occlusion LINEMOD dataset, outperforming other state-of-the-art algorithms. Compared with conventional two-stage heatmap-based methods, the mean estimation error is greatly reduced and pose estimation is more stable. The algorithm runs at up to 22 FPS, making it both accurate and efficient.
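Two-stage heatmap methods like the one above decode each keypoint from a predicted heatmap; a differentiable decoding commonly used for this is the soft-argmax (softmax-weighted expectation of pixel coordinates). A minimal sketch of that decoding step (generic illustration, not the paper's exact network head):

```python
import numpy as np

def soft_argmax(heatmap):
    """Differentiable keypoint decoding: softmax over the heatmap, then the
    expected (x, y) pixel coordinate under that distribution."""
    h, w = heatmap.shape
    p = np.exp(heatmap - heatmap.max())  # stable softmax numerator
    p /= p.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    return float((p * xs).sum()), float((p * ys).sum())
```

Unlike a hard argmax, this decoding lets gradients flow from a coordinate-space loss back into the heatmap, which is what makes heatmap-quality losses effective.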
Affiliation(s)
- Hong Ren
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Lin Lin
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yanjie Wang
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Xin Dong
- State Key Laboratory on Integrated Optoelectronics, College of Electronic Science and Engineering, Jilin University, Changchun 130012, China
4. Deep learning based 3D target detection for indoor scenes. Appl Intell. 2022. DOI: 10.1007/s10489-022-03888-4.
5. Padovan E, Marullo G, Tanzi L, Piazzolla P, Moos S, Porpiglia F, Vezzetti E. A deep learning framework for real-time 3D model registration in robot-assisted laparoscopic surgery. Int J Med Robot. 2022;18:e2387. PMID: 35246913; PMCID: PMC9286374; DOI: 10.1002/rcs.2387. Received 2021-12-14; revised 2022-01-31; accepted 2022-03-03.
Abstract
Introduction: This study presents a deep learning framework that determines, in real time, the position and rotation of a target organ from an endoscopic video. The inferred data are used to overlay the 3D model of the patient's organ onto its real counterpart, and the resulting augmented video stream is returned to the surgeon as support during robot-assisted laparoscopic procedures. Methods: The framework first applies semantic segmentation; thereafter, two techniques, based on convolutional neural networks and on motion analysis, infer the rotation. Results: Segmentation accuracy is high, with a mean IoU score above 80% in all tests. Rotation performance varies with the surgical procedure. Discussion: Although the precision of the presented methodology depends on the testing scenario, this work is a first step toward adopting deep learning and augmented reality to generalize the automatic registration process.
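The IoU score used to evaluate the segmentation above is the ratio of the overlap between predicted and ground-truth masks to their union. A minimal sketch of that metric on boolean masks (generic, not tied to this paper's evaluation code):

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection over Union between two boolean segmentation masks.
    Returns 1.0 by convention when both masks are empty."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(a, b).sum() / union)
```

A mean IoU above 80%, as reported here, means the predicted organ masks overlap the annotations on at least four-fifths of their combined area on average.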
Affiliation(s)
- Erica Padovan
- Department of Management, Production and Design Engineering, Polytechnic University of Turin, Turin, Italy
- Giorgia Marullo
- Department of Management, Production and Design Engineering, Polytechnic University of Turin, Turin, Italy
- Leonardo Tanzi
- Department of Management, Production and Design Engineering, Polytechnic University of Turin, Turin, Italy
- Pietro Piazzolla
- Department of Oncology, Division of Urology, School of Medicine, University of Turin, Turin, Italy
- Sandro Moos
- Department of Management, Production and Design Engineering, Polytechnic University of Turin, Turin, Italy
- Francesco Porpiglia
- Department of Oncology, Division of Urology, School of Medicine, University of Turin, Turin, Italy
- Enrico Vezzetti
- Department of Management, Production and Design Engineering, Polytechnic University of Turin, Turin, Italy
6. An Improved Estimation Algorithm of Space Targets Pose Based on Multi-Modal Feature Fusion. Mathematics. 2021. DOI: 10.3390/math9172085.
Abstract
Traditional pose estimation methods for space targets rely on hand-crafted features to match the transformation between the image and the object model. With the rapid growth of deep learning, approaches based on deep neural networks (DNNs) have significantly improved pose estimation performance; however, current methods still suffer from complex computation, low accuracy, and poor real-time performance. A new pose estimation algorithm is therefore proposed in this paper. First, a mask image of the target is obtained with an instance segmentation algorithm, and a point cloud is computed from the depth map combined with the camera parameters. Finally, correlations among points are established to predict the pose via multi-modal feature fusion. Experiments on the YCB-Video dataset show that the proposed algorithm processes complex images at about 24 images per second with an accuracy above 80%. In conclusion, the proposed algorithm enables fast pose estimation for complex stacked objects and is stable across different objects.
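The depth-to-point-cloud step described above back-projects each pixel through the pinhole camera model using the intrinsics (focal lengths fx, fy and principal point cx, cy). A minimal sketch of that back-projection (generic pinhole geometry, not the paper's implementation; the intrinsic values below are hypothetical):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) into a 3D point cloud (H, W, 3)
    using the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    h, w = depth.shape
    vs, us = np.mgrid[0:h, 0:w]  # pixel row (v) and column (u) indices
    z = depth
    x = (us - cx) * z / fx
    y = (vs - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```

The resulting per-pixel 3D points can then be paired with per-pixel image features, which is the basis of the multi-modal (RGB + geometry) fusion the abstract describes.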
7. Du G, Wang K, Lian S, Zhao K. Vision-based robotic grasping from object localization, object pose estimation to grasp estimation for parallel grippers: a review. Artif Intell Rev. 2020. DOI: 10.1007/s10462-020-09888-5.