Lian R, Li W, Hao J, Zhang Y, Jia F. Stereo Endoscopic Camera Pose Optimal Estimation by Structure Similarity Index Measure Integration. Int J Med Robot 2025; 21:e70078. [PMID: 40413787 DOI: 10.1002/rcs.70078]
[Received: 09/13/2024] [Revised: 04/24/2025] [Accepted: 05/16/2025] [Indexed: 05/27/2025]
Abstract
BACKGROUND
Accurate endoscopic camera pose estimation is crucial for real-time augmented reality (AR) navigation systems. Current methods rely primarily on depth and optical flow and often ignore structural inconsistencies between images.
METHODS
Building on the RAFT framework, we process sequential stereo RGB pairs to extract optical flow and depth features for pose estimation. To address structural inconsistencies, we refine the weights of both the 2D and 3D residuals using structural similarity index measure (SSIM) values computed between the left and right views and between images before and after the optical flow transformation. SSIM is also used in the loss function.
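As an illustration of the weighting idea only, the following minimal sketch computes a single global SSIM value between two images and uses it to scale residual weights. It is not the authors' implementation: the function names `global_ssim` and `ssim_weighted_residuals`, the use of a whole-image SSIM (rather than a windowed one), and the clipping of the value to [0, 1] are all assumptions made for this example.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Whole-image SSIM between two equally sized grayscale images.

    Uses the standard SSIM formula with constants C1 = (0.01 L)^2 and
    C2 = (0.03 L)^2, where L is the data range. A windowed SSIM would
    average this quantity over local patches instead.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_weighted_residuals(residuals, base_weights, ssim_value):
    """Scale weighted residuals by an SSIM-derived confidence (hypothetical scheme).

    The SSIM value is clipped to [0, 1] so that structurally inconsistent
    image pairs contribute less to the pose optimisation.
    """
    scale = np.clip(ssim_value, 0.0, 1.0)
    return np.asarray(base_weights) * scale * np.asarray(residuals)
```

In this sketch, an image pair with high structural similarity leaves the residual weights nearly unchanged, while a dissimilar pair drives them toward zero.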
RESULTS
Experiments on the StereoMIS dataset show that our method estimates pose more accurately than rigid SLAM methods, with a lower accumulated trajectory error (ATE-RMSE: 18.5 mm). Ablation experiments additionally showed an 11.49% reduction in average error.
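For readers unfamiliar with the metric, the sketch below shows one common way an ATE-RMSE value can be computed: the root mean square of per-frame position errors between an estimated and a ground-truth trajectory. It assumes the two trajectories are already expressed in the same coordinate frame and units (trajectory alignment, which evaluations often apply first, is omitted), and the function name `ate_rmse` is hypothetical. The exact evaluation protocol behind the figure reported above is not stated in the abstract.

```python
import numpy as np


def ate_rmse(estimated_positions, ground_truth_positions):
    """RMSE of per-frame translational error between two trajectories.

    Both inputs are arrays of shape (N, 3) holding camera positions,
    assumed to be time-synchronised and already aligned.
    """
    est = np.asarray(estimated_positions, dtype=np.float64)
    gt = np.asarray(ground_truth_positions, dtype=np.float64)
    if est.shape != gt.shape:
        raise ValueError("trajectories must have the same shape")
    # Euclidean distance between estimated and true position at each frame.
    per_frame_error = np.linalg.norm(est - gt, axis=1)
    return float(np.sqrt(np.mean(per_frame_error ** 2)))
```

If the positions are given in millimetres, the returned value is in millimetres as well.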
CONCLUSION
Incorporating SSIM improves pose estimation accuracy. The code is available at: https://github.com/lianrq/pose-estimation-by-SSIM-Integration.