1
Yang S, Xiao D, Geng H, Ai D, Fan J, Fu T, Song H, Duan F, Yang J. Real-Time 3D Instrument Tip Tracking Using 2D X-Ray Fluoroscopy With Vessel Deformation Correction Under Free Breathing. IEEE Trans Biomed Eng 2025; 72:1422-1436. [PMID: 40117137 DOI: 10.1109/tbme.2024.3508840]
Abstract
OBJECTIVE Accurate localization of the instrument tip within the hepatic vein is crucial for the success of transjugular intrahepatic portosystemic shunt (TIPS) procedures. Real-time tracking of the instrument tip in X-ray images is strongly affected by vessel deformation due to the patient's pose variation, respiratory motion, and puncture manipulation, frequently resulting in failed punctures. METHOD We propose a novel framework called deformable instrument tip tracking (DITT) to obtain real-time tip positioning within the 3D deformable vasculature. First, we introduce a pose alignment module to improve the rigid matching between the preoperative vessel centerline and the intraoperative instrument centerline, in which accurate matching of 3D/2D centerline features is implemented with an adaptive point sampling strategy. Second, a respiration compensation module using monoplane X-ray image sequences is constructed to provide a motion prior for predicting intraoperative liver movement. Third, a deformation correction module is proposed to rectify vessel deformation during procedures, in which a manifold regularization and a maximum likelihood-based acceleration are introduced to achieve accurate and fast deformation learning. RESULTS Experimental results on simulated and clinical datasets show an average tracking error of 1.59 ± 0.57 mm and 1.67 ± 0.54 mm, respectively. CONCLUSION Our framework can track the tip in the 3D vessel tree and dynamically overlay the branch roadmap onto X-ray images to provide real-time guidance. SIGNIFICANCE Accurate and fast (43 ms per frame) tip tracking with the proposed framework shows good potential for improving the outcomes of TIPS treatment and minimizing the usage of contrast agent.
2
Zhang Y, Gao K, Yang Z, Li C, Cai M, Tian Y, Cheng H, Zhu Z. Parallax-Tolerant Weakly-Supervised Pixel-Wise Deep Color Correction for Image Stitching of Pinhole Camera Arrays. Sensors (Basel) 2025; 25:732. [PMID: 39943371 PMCID: PMC11820881 DOI: 10.3390/s25030732]
Abstract
Camera arrays typically use image-stitching algorithms to generate wide field-of-view panoramas, but parallax and color differences caused by varying viewing angles often produce noticeable artifacts in the stitching result. Existing solutions, however, address only specific color difference issues and are ineffective for pinhole images with parallax. To overcome these limitations, we propose a parallax-tolerant, weakly supervised, pixel-wise deep color correction framework for the image stitching of pinhole camera arrays. The framework consists of two stages. In the first stage, based on the differences between high-dimensional feature vectors extracted by a convolutional module, a parallax-tolerant color correction network with dynamic loss weights adaptively compensates for color differences in overlapping regions. In the second stage, we introduce a gradient-based Markov Random Field inference strategy for the correction coefficients of non-overlapping regions, harmonizing them with the overlapping regions. Additionally, we propose an evaluation metric called Color Differences Across the Seam to quantitatively measure the naturalness of transitions across the composition seam. Comparative experiments on popular datasets and authentic images demonstrate that our approach outperforms existing solutions in both qualitative and quantitative evaluations, effectively eliminating visible artifacts and producing natural-looking composite images.
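The learned pixel-wise correction above is beyond a short snippet, but the classical global baseline it generalizes — fitting a per-channel gain and offset on the overlap region, then applying the fitted transfer to the whole image — can be sketched as follows. All names are illustrative; this is not the authors' network:

```python
def fit_gain_offset(src_vals, ref_vals):
    # Least-squares fit of ref ≈ gain * src + offset over the overlap pixels
    # of one colour channel (closed-form simple linear regression).
    n = len(src_vals)
    ms = sum(src_vals) / n
    mr = sum(ref_vals) / n
    cov = sum((s - ms) * (r - mr) for s, r in zip(src_vals, ref_vals))
    var = sum((s - ms) ** 2 for s in src_vals) or 1e-12
    gain = cov / var
    return gain, mr - gain * ms

def correct_channel(pixels, gain, offset):
    # Apply the fitted transfer to a whole channel, clamped to the 8-bit range.
    return [min(255.0, max(0.0, gain * p + offset)) for p in pixels]
```

Fitting once per channel on the overlap and applying globally is exactly the limitation the paper targets: a single gain/offset cannot model spatially varying, parallax-induced differences, which motivates the pixel-wise network.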
Affiliation(s)
- Yanzheng Zhang
- Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing 100081, China
- Kun Gao
- Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing 100081, China
- Zhijia Yang
- Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing 100081, China
- Chenrui Li
- Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing 100081, China
- Mingfeng Cai
- Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing 100081, China
- Yuexin Tian
- School of Innovation and Entrepreneurship, Southern University of Science and Technology, Shenzhen 518055, China
- Haobo Cheng
- Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing 100081, China
- Zhenyu Zhu
- Key Laboratory of Metallurgical Equipment and Control Technology, Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China
3
Xia Y, Ma J. Grid-Guided Sparse Laplacian Consensus for Robust Feature Matching. IEEE Trans Image Process 2025; 34:1367-1381. [PMID: 40031433 DOI: 10.1109/tip.2025.3539469]
Abstract
Feature matching is a fundamental problem in many computer vision applications. This paper introduces a novel and effective method named Grid-guided Sparse Laplacian Consensus, rooted in the concept of smooth constraints. To address challenging scenes such as severe deformation and independent motions, we devise grid-based adaptive matching guidance to construct multiple transformations based on motion coherence. Specifically, we obtain a set of precise yet sparse seed correspondences through motion statistics, facilitating the generation of an adaptive number of candidate correspondence sets. In addition, we propose an innovative formulation grounded in the graph Laplacian for correspondence pruning, wherein mapping function estimation is formulated as a Bayesian model. We solve this with the EM algorithm, using seed correspondences as initialization for optimal convergence. Sparse approximation is leveraged to reduce the time and space burden. A comprehensive set of experiments demonstrates the superiority of our method over other state-of-the-art methods in robustness to serious deformations and in generalizability across various descriptors and multiple motions. Additionally, experiments in geometric estimation, image registration, loop closure detection, and visual localization highlight the significance of our method across diverse scenes for high-level tasks.
4
Yang J, Zhang X, Fan S, Ren C, Zhang Y. Mutual Voting for Ranking 3D Correspondences. IEEE Trans Pattern Anal Mach Intell 2024; 46:4041-4057. [PMID: 37074893 DOI: 10.1109/tpami.2023.3268297]
Abstract
Consistent correspondences between point clouds are vital to 3D vision tasks such as registration and recognition. In this paper, we present a mutual voting (MV) method for ranking 3D correspondences. The key insight is to achieve reliable scoring results for correspondences by refining both voters and candidates in a mutual voting scheme. First, a graph is constructed for the initial correspondence set with the pairwise compatibility constraint. Second, nodal clustering coefficients are introduced to preliminarily remove a portion of outliers and speed up the following voting process. Third, we model nodes and edges in the graph as candidates and voters, respectively. Mutual voting is then performed in the graph to score correspondences. Finally, the correspondences are ranked based on the voting scores and top-ranked ones are identified as inliers. Feature matching, 3D point cloud registration, and 3D object recognition experiments on various datasets with different nuisances and modalities verify that MV is robust to heavy outliers under different challenging settings, and can significantly boost 3D point cloud registration and 3D object recognition performance.
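The voting idea can be illustrated with a toy version: correspondences are graph nodes, pairwise rigidity (distance preservation) defines compatibilities, and scores are refined iteratively so that candidates supported by high-scoring voters rise. This is a simplified sketch of the general scheme, not the paper's exact MV algorithm:

```python
import math

def compatibility(c1, c2):
    # Pairwise rigidity cue: a pair of correct correspondences preserves
    # the distance between its keypoints across the two point sets.
    (p1, q1), (p2, q2) = c1, c2
    return math.exp(-abs(math.dist(p1, p2) - math.dist(q1, q2)))

def mutual_vote_scores(corrs, iters=5):
    # corrs: list of ((src_point), (dst_point)) pairs
    n = len(corrs)
    comp = [[0.0 if i == j else compatibility(corrs[i], corrs[j])
             for j in range(n)] for i in range(n)]
    score = [1.0] * n
    for _ in range(iters):
        # each compatibility edge votes for a candidate, weighted by the
        # current score of the correspondence at its other endpoint
        new = [sum(comp[i][j] * score[j] for j in range(n)) for i in range(n)]
        top = max(new) or 1.0
        score = [v / top for v in new]   # normalize to keep scores bounded
    return score
```

Ranking correspondences by the returned scores and keeping the top-ranked ones is the inlier-selection step the abstract describes.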
5
Zhang S, Ma J. ConvMatch: Rethinking Network Design for Two-View Correspondence Learning. IEEE Trans Pattern Anal Mach Intell 2024; 46:2920-2935. [PMID: 37983155 DOI: 10.1109/tpami.2023.3334515]
Abstract
The multilayer perceptron (MLP) has become the de facto backbone in two-view correspondence learning, since it can extract effective deep features from unordered correspondences individually. However, its inherent lack of context information limits performance, even though many context-capturing modules have been appended in follow-up studies. In this paper, from a novel perspective, we design a correspondence learning network called ConvMatch that, for the first time, can leverage a convolutional neural network (CNN) as the backbone, inherently capable of context aggregation. Specifically, observing that sparse motion vectors and a dense motion field can be converted into each other by interpolating and sampling, we regularize the putative motion vectors by estimating the dense motion field implicitly, then rectify the errors caused by outliers in local areas with the CNN, and finally obtain correct motion vectors from the rectified motion field. Moreover, we propose global information injection and bilateral convolution to better fit the overall spatial transformation and to accommodate discontinuities of the motion field in case of large scene disparity. Extensive experiments reveal that ConvMatch consistently outperforms state-of-the-art methods for relative pose estimation, homography estimation, and visual localization.
6
Lin S, Chen X, Xiao G, Wang H, Huang F, Weng J. Multi-Stage Network With Geometric Semantic Attention for Two-View Correspondence Learning. IEEE Trans Image Process 2024; 33:3031-3046. [PMID: 38656841 DOI: 10.1109/tip.2024.3391002]
Abstract
The removal of outliers is crucial for establishing correspondence between two images. However, when the proportion of outliers reaches nearly 90%, the task becomes highly challenging. Existing methods face limitations in effectively utilizing geometric transformation consistency (GTC) information and incorporating geometric semantic neighboring information. To address these challenges, we propose a Multi-Stage Geometric Semantic Attention (MSGSA) network. The MSGSA network consists of three key modules: the multi-branch (MB) module, the GTC module, and the geometric semantic attention (GSA) module. The MB module, structured with a multi-branch design, facilitates diverse and robust spatial transformations. The GTC module captures transformation consistency information from the preceding stage. The GSA module categorizes the input based on the prior stage's output, enabling efficient extraction of geometric semantic information through a graph-based representation and inter-category information interaction using a Transformer. Extensive experiments on the YFCC100M and SUN3D datasets demonstrate that MSGSA outperforms current state-of-the-art methods in outlier removal and camera pose estimation, particularly in scenarios with a high prevalence of outliers. Source code is available at https://github.com/shuyuanlin.
7
Liu S, Huang Z, Li J, Li A, Huang X. FILNet: Fast Image-Based Indoor Localization Using an Anchor Control Network. Sensors (Basel) 2023; 23:8140. [PMID: 37836972 PMCID: PMC10575192 DOI: 10.3390/s23198140]
Abstract
This paper designs a fast image-based indoor localization method based on an anchor control network (FILNet) to improve localization accuracy and shorten the duration of feature matching. Two stages are developed for the proposed algorithm. The offline stage constructs an anchor feature fingerprint database based on the concept of an anchor control network. Detailed surveys are introduced to infer anchor features from the information of control anchors, using visual-inertial odometry (VIO) based on Google ARCore. In addition, an affine invariance enhancement algorithm based on feature multi-angle screening and supplementation is developed to solve the image perspective transformation problem and to complete construction of the feature fingerprint database. In the online stage, a fast spatial indexing approach is adopted to improve the feature matching speed by searching for active anchors and matching only anchor features around the active anchors. Further, to improve the correct matching rate, a homography matrix filter model is used to verify feature matches and select the correct matching points. Extensive experiments in real-world scenarios are performed to evaluate the proposed FILNet. The experimental results show that in terms of affine invariance, compared with the initial local features, FILNet significantly improves the recall of feature matching from 26% to 57% when the angular deviation is less than 60 degrees. In the image feature matching stage, compared with the initial K-D tree algorithm, FILNet significantly improves the efficiency of feature matching, reducing the average time on the test image dataset from 30.3 ms to 12.7 ms. In terms of localization accuracy, compared with the image-based benchmark method, FILNet achieves a significant improvement: the percentage of images with a localization error of less than 0.1 m increases from 31.61% to 55.89%.
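The fast spatial indexing step — matching only anchor features near the active anchors instead of scanning the whole database — can be sketched with a simple grid hash. The cell size and all names here are illustrative assumptions, not the paper's implementation:

```python
from collections import defaultdict

CELL = 5.0  # grid cell size in metres (illustrative choice)

def build_grid(anchors):
    # anchors: {anchor_id: (x, y)}; bucket anchor ids by grid cell
    grid = defaultdict(list)
    for aid, (x, y) in anchors.items():
        grid[(int(x // CELL), int(y // CELL))].append(aid)
    return grid

def nearby_anchors(grid, x, y):
    # gather anchor ids from the 3x3 block of cells around the query position
    ci, cj = int(x // CELL), int(y // CELL)
    out = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out.extend(grid.get((ci + di, cj + dj), []))
    return out
```

Restricting descriptor comparison to `nearby_anchors(...)` turns a linear scan over all fingerprints into a lookup over a handful of cells, which is the source of the reported matching-time reduction.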
Affiliation(s)
- Sikang Liu
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
- State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430072, China
- Zhao Huang
- School of Electronic Engineering, Queen Mary University of London, London E1 4NS, UK
- Jiafeng Li
- State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430072, China
- Anna Li
- School of Electronic Engineering, Queen Mary University of London, London E1 4NS, UK
- Xingru Huang
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
8
Jia T, Taylor ZA, Chen X. Density-adaptive registration of pointclouds based on Dirichlet Process Gaussian Mixture Models. Phys Eng Sci Med 2023; 46:719-734. [PMID: 37014577 DOI: 10.1007/s13246-023-01245-4]
Abstract
We propose an algorithm for rigid registration of pre- and intra-operative patient anatomy, represented as pointclouds, during minimally invasive surgery. This capability is essential for the development of augmented reality systems for guiding such interventions. Key challenges in this context are differences in point density between the pre- and intra-operative pointclouds, and potentially low spatial overlap between the two; solutions must be robust to both phenomena. We formulate a pointcloud registration approach which considers the pointclouds after rigid transformation to be observations of a global non-parametric probabilistic model named the Dirichlet Process Gaussian Mixture Model. The registration problem is solved by minimizing the Kullback-Leibler divergence in a variational Bayesian inference framework. By this means, all unknown parameters are recursively inferred, including, importantly, the optimal number of mixture components, which ensures the model complexity efficiently matches that of the observed data. By representing the pointclouds as KD-trees, both the data and the model are expanded in a coarse-to-fine style. The scanning weight of each point is estimated from its neighborhood, imparting the algorithm with robustness to point density variations. Experiments on several datasets with different levels of noise, outliers and pointcloud overlap show that our method has comparable accuracy but higher efficiency than existing Gaussian Mixture Model methods, whose performance is sensitive to the number of model components.
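The underlying idea of mixture-model registration can be illustrated with a deliberately reduced toy: EM estimating a pure 2D translation, with one equally weighted Gaussian component per source point and a fixed isotropic bandwidth. The paper's method additionally infers the number of components via a Dirichlet Process prior and uses variational inference; none of that is shown here:

```python
import math

def em_translation(src, dst, sigma=1.0, iters=30):
    # Toy EM registration: estimate a translation (tx, ty) aligning src to
    # dst, modelling dst points as samples from a GMM whose components are
    # centred on the translated src points.
    tx, ty = 0.0, 0.0
    for _ in range(iters):
        num_x = num_y = den = 0.0
        for dx, dy in dst:
            # E-step: responsibility of each translated src point for (dx, dy)
            w = [math.exp(-((dx - sx - tx) ** 2 + (dy - sy - ty) ** 2)
                          / (2 * sigma ** 2)) for sx, sy in src]
            s = sum(w) or 1e-12
            for wi, (sx, sy) in zip(w, src):
                r = wi / s
                num_x += r * (dx - sx)
                num_y += r * (dy - sy)
                den += r
        # M-step: closed-form update of the translation
        tx, ty = num_x / den, num_y / den
    return tx, ty
```

Because correspondences are soft (responsibilities rather than hard assignments), the estimate degrades gracefully when densities differ, which is the property the full method extends to rigid transforms with adaptive model complexity.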
Affiliation(s)
- Tingting Jia
- School of Biomedical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- School of Mechanical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Zeike A Taylor
- CISTIB Centre for Computational Imaging and Simulation Technologies in Biomedicine and the Institute of Medical and Biological Engineering, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, UK
- Xiaojun Chen
- School of Mechanical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Institute of Medical Robotics, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
9
Liu X, Xiao G, Chen R, Ma J. PGFNet: Preference-Guided Filtering Network for Two-View Correspondence Learning. IEEE Trans Image Process 2023; 32:1367-1378. [PMID: 37022827 DOI: 10.1109/tip.2023.3242598]
Abstract
Accurate correspondence selection between two images is of great importance for numerous feature matching based vision tasks. The initial correspondences established by off-the-shelf feature extraction methods usually contain a large number of outliers, and this often makes it difficult to accurately and sufficiently capture contextual information for the correspondence learning task. In this paper, we propose a Preference-Guided Filtering Network (PGFNet) to address this problem. The proposed PGFNet is able to effectively select correct correspondences and simultaneously recover the accurate camera pose of the matching images. Specifically, we first design a novel iterative filtering structure to learn preference scores of correspondences for guiding the correspondence filtering strategy. This structure explicitly alleviates the negative effects of outliers, so that our network can capture more reliable contextual information encoded by the inliers for network learning. Then, to enhance the reliability of preference scores, we present a simple yet effective Grouped Residual Attention block as our network backbone, designing a feature grouping strategy, a hierarchical residual-like structure, and two grouped attention operations. We evaluate PGFNet by extensive ablation studies and comparative experiments on the tasks of outlier removal and camera pose estimation. The results demonstrate outstanding performance gains over the existing state-of-the-art methods on different challenging scenes. The code is available at https://github.com/guobaoxiao/PGFNet.
10
Bellavia F. SIFT Matching by Context Exposed. IEEE Trans Pattern Anal Mach Intell 2023; 45:2445-2457. [PMID: 35320089 DOI: 10.1109/tpami.2022.3161853]
Abstract
This paper investigates how to improve local image descriptor matching by exploiting matching context information. Two main contexts are identified, originating respectively from the descriptor space and from the keypoint space. The former is generally used to design the actual matching strategy, while the latter is used to filter matches according to local spatial consistency. On this basis, a new matching strategy and a novel local spatial filter, named respectively blob matching and Delaunay Triangulation Matching (DTM), are devised. Blob matching provides a general matching framework by merging several strategies, including rank-based pre-filtering as well as many-to-many and symmetric matching, achieving a global improvement upon each individual strategy. DTM alternates between Delaunay triangulation contractions and expansions to figure out and adjust keypoint neighborhood consistency. Experimental evaluation shows that DTM is comparable to or better than the state of the art in terms of matching accuracy and robustness. Evaluation is carried out according to a new benchmark devised for analyzing the matching pipeline in terms of correct correspondences on both planar and non-planar scenes, including several state-of-the-art methods as well as the common SIFT matching approach for reference. This evaluation can be of assistance for future research in this field.
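Two of the ingredients blob matching merges — the ratio test and symmetric (mutual) matching — can be sketched in a few lines. This is a simplified baseline component, not the full blob matching framework:

```python
import math

def _two_nearest(d, pool):
    # indices of the nearest and second-nearest descriptors in pool
    order = sorted(range(len(pool)), key=lambda j: math.dist(d, pool[j]))
    return order[0], order[1]

def symmetric_ratio_matches(desc_a, desc_b, ratio=0.8):
    # Lowe-style ratio test applied in both directions; a pair survives only
    # if each descriptor is the other's clear nearest neighbour.
    a_to_b = {}
    for i, d in enumerate(desc_a):
        j1, j2 = _two_nearest(d, desc_b)
        if math.dist(d, desc_b[j1]) < ratio * math.dist(d, desc_b[j2]):
            a_to_b[i] = j1
    matches = []
    for j, d in enumerate(desc_b):
        i1, i2 = _two_nearest(d, desc_a)
        if (math.dist(d, desc_a[i1]) < ratio * math.dist(d, desc_a[i2])
                and a_to_b.get(i1) == j):
            matches.append((i1, j))
    return matches
```

Each individual test (ratio, symmetry) trades recall for precision; the paper's contribution is a framework that combines such strategies so the union outperforms each one alone.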
11
Deng Y, Ma J. ReDFeat: Recoupling Detection and Description for Multimodal Feature Learning. IEEE Trans Image Process 2023; 32:591-602. [PMID: 37015497 DOI: 10.1109/tip.2022.3231135]
Abstract
Deep-learning-based local feature extraction algorithms that combine detection and description have made significant progress in visible image matching. However, the end-to-end training of such frameworks is notoriously unstable due to the lack of strong supervision of detection and the inappropriate coupling between detection and description. The problem is magnified in cross-modal scenarios, in which most methods heavily rely on pre-training. In this paper, we recouple independent constraints of detection and description for multimodal feature learning with a mutual weighting strategy, in which the detected probabilities of robust features are forced to be peaky and repeatable, while features with high detection scores are emphasized during optimization. Different from previous works, these weights are detached from back-propagation, so that the detected probability of indistinct features is not directly suppressed and training is more stable. Moreover, we propose the Super Detector, a detector that possesses a large receptive field and is equipped with learnable non-maximum suppression layers, to fulfill the harsh terms of detection. Finally, we build a benchmark containing cross-modal visible, infrared, near-infrared and synthetic aperture radar image pairs for evaluating the performance of features in feature matching and image registration tasks. Extensive experiments demonstrate that features trained with the recoupled detection and description, named ReDFeat, surpass previous state-of-the-art methods on the benchmark, while the model can be readily trained from scratch. The code is released at https://github.com/ACuOoOoO/ReDFeat.
12
Fan A, Ma J, Jiang X, Ling H. Efficient Deterministic Search With Robust Loss Functions for Geometric Model Fitting. IEEE Trans Pattern Anal Mach Intell 2022; 44:8212-8229. [PMID: 34473624 DOI: 10.1109/tpami.2021.3109784]
Abstract
Geometric model fitting is a fundamental task in computer vision, serving as a prerequisite for many downstream applications. While the problem has a simple intrinsic structure, where the solution can be parameterized within a few degrees of freedom, the ubiquitous outliers are the main challenge. In previous studies, random sampling techniques have been established as the practical choice, since optimization-based methods are usually too time-demanding. This study designs efficient algorithms that benefit from a general optimization-based view. In particular, two important types of loss functions are discussed, i.e., truncated and l1 losses, and efficient solvers are derived for both via specific approximations. Based on this philosophy, a class of algorithms is introduced to perform a deterministic search for the inliers or the geometric model. Recommendations are made based on theoretical and experimental analyses. Compared with existing solutions, the proposed methods are both computationally simple and robust to outliers. Extensive experiments on publicly available datasets for geometric estimation demonstrate the superiority of our methods over state-of-the-art ones. Additionally, we apply our method to the recent benchmark for wide-baseline stereo evaluation, leading to a significant improvement in performance. Our code is publicly available at https://github.com/AoxiangFan/EifficientDeterministicSearch.
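The optimization-based view with a truncated loss can be illustrated on the simplest model, a 1D line: iteratively reweighted least squares in which points outside an inlier band get zero weight, with the band gradually shrunk to avoid poor local minima. The graduated schedule is an illustrative choice for this sketch, not the paper's solver:

```python
def fit_line_truncated(xs, ys, tau=1.0, iters=20):
    # Robust line fit y ≈ a*x + b under a hard-truncated squared loss:
    # points whose residual exceeds the band get zero weight in the next
    # weighted least-squares refit; the band shrinks toward tau.
    a, b = 0.0, 0.0
    band = max(abs(y) for y in ys)
    for _ in range(iters):
        w = [1.0 if abs(a * x + b - y) <= band else 0.0
             for x, y in zip(xs, ys)]
        if sum(w) < 2:           # degenerate: fall back to using all points
            w = [1.0] * len(xs)
        sw = sum(w)
        # closed-form weighted least squares for (a, b)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        var = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs)) or 1e-12
        cov = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        a = cov / var
        b = my - a * mx
        band = max(tau, band / 2)
    return a, b
```

Unlike random sampling, every step here is deterministic: the same data always yields the same fit, which is the property the paper develops for full geometric models.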
13
Robust two-phase registration method for three-dimensional point set under the Bayesian mixture framework. Int J Mach Learn Cybern 2022. [DOI: 10.1007/s13042-022-01673-w]
14
Quan S, Yin K, Ye K, Nan K. Robust Feature Matching for 3D Point Clouds with Progressive Consistency Voting. Sensors (Basel) 2022; 22:7718. [PMID: 36298069 PMCID: PMC9610732 DOI: 10.3390/s22207718]
Abstract
Feature matching for 3D point clouds is a fundamental yet challenging problem in remote sensing and 3D computer vision. However, due to a number of nuisances, the initial feature correspondences generated by matching local keypoint descriptors may contain many outliers (incorrect correspondences). To remove outliers, this paper presents a robust method called progressive consistency voting (PCV). PCV aims at assigning a reliable confidence score to each correspondence such that reasonable correspondences can be achieved by simply finding top-scored ones. To compute the confidence score, we suggest fully utilizing the geometric consistency cue between correspondences and propose a voting-based scheme. In addition, we progressively mine convincing voters from the initial correspondence set and optimize the scoring result by considering top-scored correspondences at the last iteration. Experiments on several standard datasets verify that PCV outperforms five state-of-the-art methods under almost all tested conditions and is robust to noise, data decimation, clutter, occlusion, and data modality change. We also apply PCV to point cloud registration and show that it can significantly improve the registration performance.
15
Yuan X, Maharjan A. Non-rigid point set registration: recent trends and challenges. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10292-4]
16
Xia Y, Ma J. Locality-Guided Global-Preserving Optimization for Robust Feature Matching. IEEE Trans Image Process 2022; 31:5093-5108. [PMID: 35895644 DOI: 10.1109/tip.2022.3192993]
Abstract
Feature matching is a fundamental problem in many computer vision tasks. This paper proposes a novel and effective framework for mismatch removal, named LOcality-guided Global-preserving Optimization (LOGO). To identify inliers in a putative matching set generated by feature descriptor similarity, we introduce a fixed-point progressive approach to optimize a graph-based objective, which represents a two-class assignment problem over an affinity matrix encoding global structures. A small initial set with a high inlier ratio exploits the topology of the affinity matrix to elicit other inliers from their reliable geometry, which enhances robustness to outliers. Geometrically, we provide a locality-guided matching strategy that uses local topology consensus as the criterion to determine the initial set, which is then expanded to yield the final matching set. In addition, we apply local affine transformations based on reference points to determine the local consensus and the similarity scores of nodes and edges, ensuring validity and generality across various scenarios, including complex nonrigid transformations. Extensive experiments demonstrate the effectiveness and robustness of the proposed LOGO, which is competitive with current state-of-the-art methods. It also exhibits favorable potential for high-level vision tasks, such as essential and fundamental matrix estimation, image registration, and loop closure detection.
18
Feature Matching via Motion-Consistency Driven Probabilistic Graphical Model. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01644-2]
19
A Novel and Effective Cooperative RANSAC Image Matching Method Using Geometry Histogram-Based Constructed Reduced Correspondence Set. Remote Sensing 2022. [DOI: 10.3390/rs14143256]
Abstract
The success of many computer vision and pattern recognition applications depends on matching local features on two or more images. Because the initial correspondence set — i.e., the set of initial feature pairs — is often contaminated by mismatches, removing mismatches is a necessary task prior to image matching. In this paper, we first propose a fast geometry histogram-based (GH-based) mismatch removal strategy to construct a reduced correspondence set C_reduced,GH from the initial correspondence set C_ini. Next, we propose an effective cooperative random sample consensus (COOSAC) method for remote sensing image matching. COOSAC consists of a RANSAC, called RANSAC_ini, working on C_ini, and a tiny RANSAC, called RANSAC_tiny,GH, working on a randomly selected subset C_tiny,GH of C_reduced,GH. In RANSAC_tiny,GH, an iterative area constraint-based sampling strategy is proposed to estimate the model solution of C_tiny,GH until the specified confidence level is reached, and then RANSAC_ini utilizes the estimated model solution of C_tiny,GH to calculate the inlier rate of C_ini. COOSAC repeats this cooperation between RANSAC_tiny,GH and RANSAC_ini until the specified confidence level is reached, reporting the resultant model solution of C_ini. For convenience, our image matching method is called the GH-COOSAC method. Based on several testing datasets, thorough experimental results demonstrate that the proposed GH-COOSAC method achieves lower computational cost and higher matching accuracy than state-of-the-art image matching methods.
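The core RANSAC loop at the heart of COOSAC can be sketched for the simplest case, a translation-only model in which a single correspondence yields a hypothesis; the cooperative interplay between the tiny and full RANSAC instances is omitted here:

```python
import random

def ransac_translation(corrs, thresh=1.0, iters=100, seed=0):
    # Minimal RANSAC for a translation-only model: one sampled correspondence
    # hypothesizes (tx, ty); the score is how many correspondences it explains.
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        (px, py), (qx, qy) = rng.choice(corrs)
        tx, ty = qx - px, qy - py
        inliers = [c for c in corrs
                   if abs(c[1][0] - c[0][0] - tx) <= thresh
                   and abs(c[1][1] - c[0][1] - ty) <= thresh]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (tx, ty), inliers
    return best_model, best_inliers
```

COOSAC's speedup comes from running this loop on a small, pre-filtered subset (higher inlier rate means far fewer iterations for the same confidence) and only verifying the winning hypothesis against the full set.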
20. CAISOV: Collinear Affine Invariance and Scale-Orientation Voting for Reliable Feature Matching. Remote Sensing 2022. [DOI: 10.3390/rs14133175]
Abstract
Reliable feature matching plays an important role in the fields of computer vision and photogrammetry. Due to the complex transformation model caused by photometric and geometric deformations, and the limited discriminative power of local feature descriptors, initial matches with high outlier ratios cannot be addressed very well. This study proposes a reliable outlier-removal algorithm by combining two affine-invariant geometric constraints. First, a very simple geometric constraint, namely, CAI (collinear affine invariance) has been implemented, which is based on the observation that the collinear property of any two points is invariant under affine transformation. Second, after the first-step outlier removal based on the CAI constraint, the SOV (scale-orientation voting) scheme was then adopted to remove remaining outliers and recover the lost inliers, in which the peaks of both scale and orientation voting define the parameters of the geometric transformation model. Finally, match expansion was executed using the Delaunay triangulation of refined matches. By using close-range (rigid and non-rigid images) and UAV (unmanned aerial vehicle) datasets, comprehensive comparison and analysis are conducted in this study. The results demonstrate that the proposed outlier-removal algorithm achieves the best overall performance when compared with RANSAC-like and local geometric constraint-based methods, and it can also be applied to achieve reliable outlier removal in the workflow of SfM-based UAV image orientation.
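The CAI constraint rests on an elementary property: a nonsingular affine map preserves collinearity and multiplies every point triple's orientation sign by the sign of its determinant, so the keep-or-flip pattern must be uniform across all triples of true correspondences. A didactic sketch of this invariant (our illustration, not the paper's algorithm; names are hypothetical):

```python
# Orientation-consistency check for putative matches under one affine map.
# Didactic sketch of the invariant behind CAI-style constraints.
import numpy as np
from itertools import combinations

def orientation_sign(a, b, c):
    """Sign of the oriented area of triangle (a, b, c) (z of the 2D cross product)."""
    return np.sign((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))

def affine_orientation_consistent(pts1, pts2):
    """True iff all triples agree on whether orientation is kept or flipped,
    as required when pts2 is a single nonsingular affine image of pts1."""
    products = []
    for i, j, k in combinations(range(len(pts1)), 3):
        s1 = orientation_sign(pts1[i], pts1[j], pts1[k])
        s2 = orientation_sign(pts2[i], pts2[j], pts2[k])
        if s1 != 0 and s2 != 0:                 # skip collinear (degenerate) triples
            products.append(s1 * s2)
    if not products:
        return True
    return all(p == products[0] for p in products)
```

A mismatch that breaks the uniform keep/flip pattern is flagged by a single pass over the triples; the paper refines this idea with scale-orientation voting and match expansion.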
21. Rizzini DL, Fontana E. Rotation Estimation Based on Anisotropic Angular Radon Spectrum. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3182111]
Affiliation(s)
- Dario Lodi Rizzini: RIMLab - Robotics and Intelligent Machines Laboratory, Dipartimento di Ingegneria e Architettura, University of Parma, Italy
- Ernesto Fontana: RIMLab - Robotics and Intelligent Machines Laboratory, Dipartimento di Ingegneria e Architettura, University of Parma, Italy
22. A Robust Strategy for Large-Size Optical and SAR Image Registration. Remote Sensing 2022. [DOI: 10.3390/rs14133012]
Abstract
The traditional template matching strategy of optical and synthetic aperture radar (SAR) is sensitive to the nonlinear transformation between two images. In some cases, the optical and SAR image pairs do not conform to the affine transformation condition. To address this issue, this study presents a novel template matching strategy which uses the One-Class Support Vector Machine (SVM) to remove outliers. First, we propose a method to construct the similarity map dataset using the SEN1-2 dataset for training the One-Class SVM. Second, a four-step strategy for optical and SAR image registration is presented in this paper. In the first step, the optical image is divided into grids. In the second step, the strongest Harris response point is selected as the feature point in each grid. In the third step, we use the Gaussian pyramid features of oriented gradients (GPOG) descriptor to calculate the similarity map in the search region. The trained One-Class SVM is used to remove outliers through similarity maps in the fourth step. Furthermore, the number of improved matches (NIM) and the rate of improved matches (RIM) are designed to measure the effect of image registration. Finally, this paper designs two experiments to prove that the proposed strategy can correctly select the matching points through similarity maps. The experimental results of the One-Class SVM show that it can select the correct points in different datasets. The image registration results obtained in the second experiment show that the proposed strategy is robust to the nonlinear transformation between optical and SAR images.
23. Feature Matching for Remote-Sensing Image Registration via Neighborhood Topological and Affine Consistency. Remote Sensing 2022. [DOI: 10.3390/rs14112606]
Abstract
Feature matching is a key method of feature-based image registration, which refers to establishing reliable correspondence between feature points extracted from two images. In order to eliminate false matchings from the initial matchings, we propose a simple and efficient method. The key principle of our method is to maintain the topological and affine transformation consistency among the neighborhood matches. We formulate this problem as a mathematical model and derive a closed solution with linear time and space complexity. More specifically, our method can remove mismatches from thousands of hypothetical correspondences within a few milliseconds. We conduct qualitative and quantitative experiments on our method on different types of remote-sensing datasets. The experimental results show that our method is general, and it can deal with all kinds of remote-sensing image pairs, whether rigid or non-rigid image deformation or image pairs with various shadow, projection distortion, noise, and geometric distortion. Furthermore, it is two orders of magnitude faster and more accurate than state-of-the-art methods and can be used for real-time applications.
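The neighborhood-consistency principle can be illustrated in a few lines: a correct match should share most of its k-nearest-neighbor matches between the two images, while a mismatch lands among the wrong neighbors. The sketch below is our illustration of that generic idea, not the paper's closed-form solution; names and the scoring rule are our own:

```python
# Score putative matches by k-NN neighborhood overlap between two images.
# Generic illustration of neighborhood consistency, not the cited method.
import numpy as np

def knn_indices(pts, k):
    """Indices of each point's k nearest neighbors (self excluded)."""
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def neighborhood_overlap(pts1, pts2, k=3):
    """Per-match fraction of shared k-NN across the two images.
    True matches preserve their neighborhoods; mismatches do not."""
    n1 = knn_indices(pts1, k)
    n2 = knn_indices(pts2, k)
    return np.array([len(set(n1[i]) & set(n2[i])) / k for i in range(len(pts1))])
```

Thresholding such an overlap score removes matches whose local topology disagrees between the images, which is the intuition the paper formalizes with linear-time complexity.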
24. Wang HJ, Lee CY, Lai JH, Chang YC, Chen CM. Image registration method using representative feature detection and iterative coherent spatial mapping for infrared medical images with flat regions. Sci Rep 2022; 12:7932. [PMID: 35562370] [PMCID: PMC9106756] [DOI: 10.1038/s41598-022-11379-2]
Abstract
In the registration of medical images, nonrigid registration targets, images with large displacement caused by different postures of the human body, and frequent variations in image intensity due to physiological phenomena are substantial problems that make medical images less suitable for intensity-based image registration modes. These problems also greatly increase the difficulty and complexity of feature detection and matching for feature-based image registration modes. This research introduces an automatic image registration algorithm for infrared medical images that offers the following benefits: effective detection of feature points in flat regions (cold patterns) that appear due to changes in the human body’s thermal patterns, improved mismatch removal through coherent spatial mapping for improved feature point matching, and large-displacement optical flow for optimal transformation. This method was compared with various classical gold standard image registration methods to evaluate its performance. The models were compared for the three key steps of the registration process—feature detection, feature point matching, and image transformation—and the results are presented visually and quantitatively. The results demonstrate that the proposed method outperforms existing methods in all tasks, including in terms of the features detected, uniformity of feature points, matching accuracy, and control point sparsity, and achieves optimal image transformation. The performance of the proposed method with four common image types was also evaluated, and the results verify that the proposed method has a high degree of stability and can effectively register medical images under a variety of conditions.
Affiliation(s)
- Hao-Jen Wang: Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan; Department of Electrical Engineering, National United University, Taipei, Taiwan
- Chia-Yen Lee: Department of Electrical Engineering, National United University, Taipei, Taiwan
- Jhih-Hao Lai: Department of Electrical Engineering, National United University, Taipei, Taiwan
- Yeun-Chung Chang: Department of Medical Imaging, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
- Chung-Ming Chen: Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
25. Liu S, Fan J, Song D, Fu T, Lin Y, Xiao D, Song H, Wang Y, Yang J. Joint estimation of depth and motion from a monocular endoscopy image sequence using a multi-loss rebalancing network. Biomed Opt Express 2022; 13:2707-2727. [PMID: 35774318] [PMCID: PMC9203100] [DOI: 10.1364/boe.457475]
Abstract
Building an in vivo three-dimensional (3D) surface model from a monocular endoscopy is an effective technology for improving the intuitiveness and precision of clinical laparoscopic surgery. This paper proposes a multi-loss rebalancing-based method for joint estimation of depth and motion from a monocular endoscopy image sequence. Feature descriptors are used to provide monitoring signals for the depth estimation network and the motion estimation network. The epipolar constraints of the sequence frames are incorporated into the neighborhood spatial information by the depth estimation network to enhance the accuracy of depth estimation. The reprojection information of the depth estimation is used to reconstruct the camera motion by the motion estimation network with a multi-view relative pose fusion mechanism. Relative response loss, feature consistency loss, and epipolar consistency loss functions are defined to improve the robustness and accuracy of the proposed unsupervised learning-based method. Evaluations are implemented on public datasets. The error of motion estimation in three scenes decreased by 42.1%, 53.6%, and 50.2%, respectively, and the average error of 3D reconstruction is 6.456 ± 1.798 mm. This demonstrates the method's capability to generate reliable depth estimation and trajectory reconstruction results for endoscopy images, with meaningful clinical applications.
Affiliation(s)
- Shiyuan Liu, Jingfan Fan, Dengpan Song, Tianyu Fu, Yucong Lin, Deqiang Xiao, Yongtian Wang, Jian Yang: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Hong Song: School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
26. Liu S, Fan J, Ai D, Song H, Fu T, Wang Y, Yang J. Feature matching for texture-less endoscopy images via superpixel vector field consistency. Biomed Opt Express 2022; 13:2247-2265. [PMID: 35519251] [PMCID: PMC9045917] [DOI: 10.1364/boe.450259]
Abstract
Feature matching is an important technology for obtaining the surface morphology of soft tissues in intraoperative endoscopy images. The extraction of features from clinical endoscopy images is a difficult problem, especially for texture-less images, and the reduction of surface details makes the problem more challenging. We propose an adaptive gradient-preserving method to improve the visual features of texture-less images. For feature matching, we first constructed a spatial motion field using superpixel blocks and estimated its information entropy matching with the motion consistency algorithm to obtain an initial outlier feature screening. Second, we extended the superpixel spatial motion field to a vector field and constrained it with the vector feature to optimize the confidence of the initial matching set. Evaluations were implemented on public and undisclosed datasets. Our method increased the number of extracted feature points by an order of magnitude over the original images for all three feature point extraction methods. On the public dataset, the accuracy and F1-score increased to 92.6% and 91.5%, and the matching score improved by 1.92%. On the undisclosed dataset, the reconstructed surface integrity of the proposed method improved from 30% to 85%. Furthermore, we also present surface reconstruction results for differently sized images to validate the robustness of our method, which shows high-quality feature matching results. Overall, the experimental results prove the effectiveness of the proposed matching method and demonstrate its capability to extract sufficient visual feature points and generate reliable feature matches for 3D reconstruction, with meaningful clinical applications.
Affiliation(s)
- Shiyuan Liu, Jingfan Fan, Danni Ai, Tianyu Fu, Yongtian Wang, Jian Yang: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Hong Song: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China; School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
27. Augmented reality navigation with real-time tracking for facial repair surgery. Int J Comput Assist Radiol Surg 2022; 17:981-991. [PMID: 35286586] [DOI: 10.1007/s11548-022-02589-0]
Abstract
PURPOSE Facial repair surgeries (FRS) require accuracy for navigating the critical anatomy safely and quickly. The purpose of this paper is to develop a method to directly track the position of the patient using video data acquired from a single camera, achieving noninvasive, real-time tracking with high positioning accuracy in FRS. METHODS Our method first performs camera calibration and registers the surface segmented from computed tomography to the patient. Then, a two-step constraint algorithm, which includes the feature local constraint and the distance standard deviation constraint, is used to find the optimal feature matching pair quickly. Finally, the movements of the camera and the patient decomposed from the image motion matrix are used to track the camera and the patient, respectively. RESULTS The proposed method achieved fusion error RMS of 1.44 ± 0.35, 1.50 ± 0.15, and 1.63 ± 0.03 mm in skull phantom, cadaver mandible, and human experiments, respectively. These errors were lower than those of the optical tracking system-based method. Additionally, the proposed method could process video streams at up to 24 frames per second, which meets the real-time requirements of FRS. CONCLUSIONS The proposed method does not rely on tracking markers attached to the patient; it can be executed automatically to maintain the correct augmented reality scene and overcome the decrease in positioning accuracy caused by patient movement during surgery.
28.
Abstract
This paper introduces an Unmanned Aerial Vehicle (UAV) image stitching method, based on the optimal seam algorithm and half-projective warp, that can effectively retain the original information of the image and obtain the ideal stitching effect. The existing seam stitching algorithms can eliminate the ghosting and blurring problems on the stitched images, but the deformation and angle distortion caused by image registration will remain in the stitching results. To overcome this situation, we propose a stitching strategy based on optimal seam and half-projective warp. Firstly, we define a new difference matrix in the overlapping region of the aligned image, which includes the color, structural and line difference information. Then, we constrain the search range of the seam by the minimum energy, and propose a seam search algorithm based on the global minimum energy to obtain the seam. Finally, combined with the seam position and half-projective warp, the shape of the stitched image is rectified to keep more regions in their original shape. The experimental results of several groups of UAV images show that our method has a superior stitching effect.
29. Zhou H, Jayender J. EMDQ: Removal of Image Feature Mismatches in Real-Time. IEEE Trans Image Process 2021; 31:706-720. [PMID: 34914589] [PMCID: PMC8777235] [DOI: 10.1109/tip.2021.3134456]
Abstract
This paper proposes a novel method for removing image feature mismatches in real-time that can handle both rigid and smooth deforming environments. Image distortion, parallax and object deformation may cause the pixel coordinates of feature matches to have non-rigid deformations, which cannot be represented using a single analytical rigid transformation. To solve this problem, we propose an algorithm based on the re-weighting and 1-point RANSAC strategy (R1P-RNSC), which operates under the assumption that a non-rigid deformation can be approximately represented by multiple rigid transformations. R1P-RNSC is fast but suffers from the drawback that local smoothing information cannot be considered, thus limiting its accuracy. To solve this problem, we propose a non-parametric algorithm based on the expectation-maximization algorithm and the dual quaternion-based representation (EMDQ). EMDQ generates dense and smooth deformation fields by interpolating among the feature matches, simultaneously removing mismatches that are inconsistent with the deformation field. It relies on the rigid transformations obtained by R1P-RNSC to improve its accuracy. The experimental results demonstrate that EMDQ has superior accuracy compared to other state-of-the-art mismatch removal methods. The ability to build correspondences for all image pixels using the dense deformation field is another contribution of this paper.
30. Thermal Drift Correction for Laboratory Nano Computed Tomography via Outlier Elimination and Feature Point Adjustment. Sensors 2021; 21:8493. [PMID: 34960584] [PMCID: PMC8703391] [DOI: 10.3390/s21248493]
Abstract
Thermal drift of nano-computed tomography (CT) adversely affects the accurate reconstruction of objects. However, feature-based reference scan correction methods are sometimes unstable for images with similar texture and low contrast. In this study, based on the geometric position of features and the structural similarity (SSIM) of projections, a rough-to-refined rigid alignment method is proposed to align the projections. Using the proposed method, the thermal drift artifacts in reconstructed slices are reduced. Firstly, the initial features are obtained by speeded up robust features (SURF). Then, the outliers are roughly eliminated by the geometric position of global features. The features are refined by the SSIM between the main and reference projections. Subsequently, the SSIM between the neighborhood images of the features is used to relocate the features. Finally, the new features are used to align the projections. The two-dimensional (2D) transmission imaging experiments reveal that the proposed method provides more accurate and robust results than the random sample consensus (RANSAC) and locality preserving matching (LPM) methods. For three-dimensional (3D) imaging correction, the proposed method is compared with the commonly used enhanced correlation coefficient (ECC) method and the single-step discrete Fourier transform (DFT) algorithm. The results reveal that the proposed method retains details more faithfully.
31. Li X, Wen C, Wang L, Fang Y. Topology Constrained Shape Correspondence. IEEE Trans Vis Comput Graph 2021; 27:3926-3937. [PMID: 32406841] [DOI: 10.1109/tvcg.2020.2994013]
Abstract
To better address the deformation and structural variation challenges inherently present in 3D shapes, researchers have shifted their focus from designing handcrafted point descriptors to learning point descriptors and their correspondences in a data-driven manner. Recent studies have developed deep neural networks for robust point descriptor and shape correspondence learning in consideration of local structural information. In this article, we developed a novel shape correspondence learning network, called TC-NET, which further enhances performance by encouraging the topological consistency between the embedding feature space and the input shape space. Specifically, in this article, we first calculate the topology-associated edge weights to represent the topological structure of each point. Then, in order to preserve this topological structure in high-dimensional feature space, a structural regularization term is defined to minimize the topology-consistent feature reconstruction loss (Topo-Loss) during the correspondence learning process. Our proposed method achieved state-of-the-art performance on three shape correspondence benchmark datasets. In addition, the proposed topology preservation concept can be easily generalized to other learning-based shape analysis tasks to regularize the topological structure of high-dimensional feature spaces.
32.
33. Learning Two-View Correspondences and Geometry via Local Neighborhood Correlation. Entropy 2021; 23:1024. [PMID: 34441164] [PMCID: PMC8394602] [DOI: 10.3390/e23081024]
Abstract
Seeking quality feature correspondences (also known as matches) is a foundational step in computer vision. In our work, a novel and effective network with a stable local constraint, named the Local Neighborhood Correlation Network (LNCNet), is proposed to capture abundant contextual information of each correspondence in the local region, followed by calculating the essential matrix and camera pose estimation. Firstly, the k-Nearest Neighbor (KNN) algorithm is used to divide the local neighborhood roughly. Then, we calculate the local neighborhood correlation matrix (LNC) between the selected correspondence and other correspondences in the local region, which is used to filter outliers to obtain more accurate local neighborhood information. We cluster the filtered information into feature vectors containing richer neighborhood contextual information so that they can be used to more accurately determine the probability of correspondences as inliers. Extensive experiments have demonstrated that our proposed LNCNet performs better than some state-of-the-art networks to accomplish outlier rejection and camera pose estimation tasks in complex outdoor and indoor scenes.
34. Zhu H, Cui C, Deng L, Cheung RCC, Yan H. Elastic Net Constraint-Based Tensor Model for High-Order Graph Matching. IEEE Trans Cybern 2021; 51:4062-4074. [PMID: 31536028] [DOI: 10.1109/tcyb.2019.2936176]
Abstract
The procedure of establishing the correspondence between two sets of feature points is important in computer vision applications. In this article, an elastic net constraint-based tensor model is proposed for high-order graph matching. To control the tradeoff between the sparsity and the accuracy of the matching results, an elastic net constraint is introduced into the tensor-based graph matching model. Then, a nonmonotone spectral projected gradient (NSPG) method is derived to solve the proposed matching model. During the optimization of using NSPG, we propose an algorithm to calculate the projection on the feasible convex sets of elastic net constraint. Further, the global convergence of solving the proposed model using the NSPG method was proved. The superiority of the proposed method is verified through experiments on the synthetic data and natural images.
35. Lyu W, Chen L, Zhou Z, Wu W. Weakly supervised object-aware convolutional neural networks for semantic feature matching. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.03.052]
36. Yang L, Li Q, Song X, Cai W, Hou C, Xiong Z. An Improved Stereo Matching Algorithm for Vehicle Speed Measurement System Based on Spatial and Temporal Image Fusion. Entropy 2021; 23:866. [PMID: 34356407] [PMCID: PMC8305597] [DOI: 10.3390/e23070866]
Abstract
This paper proposes an improved stereo matching algorithm for a vehicle speed measurement system based on spatial and temporal image fusion (STIF). Firstly, matching point pairs in the license plate area with an obviously abnormal distance to the camera are roughly removed according to the characteristics of the license plate specification. Secondly, more mismatching point pairs are finely removed according to a local neighborhood consistency constraint (LNCC). Thirdly, the optimum speed measurement point pairs are selected for successive stereo frame pairs by STIF of the binocular stereo video, so that the 3D points corresponding to the matching point pairs for speed measurement in successive stereo frame pairs lie at the same position on the real vehicle, which significantly improves the vehicle speed measurement accuracy. LNCC and STIF can be used not only for license plates but also for vehicle logos, lights, mirrors, etc. Experimental results demonstrate that the vehicle speed measurement system with the proposed LNCC+STIF stereo matching algorithm significantly outperforms the state-of-the-art system in accuracy.
Affiliation(s)
- Lei Yang (corresponding author): School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007, China
- Qingyuan Li: School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007, China
- Xiaowei Song (corresponding author): School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007, China; Dongjing Avenue Campus, Kaifeng University, Kaifeng 475004, China
- Wenjing Cai: School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007, China
- Chunping Hou: School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
- Zixiang Xiong: Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
37. HOLBP: Remote Sensing Image Registration Based on Histogram of Oriented Local Binary Pattern Descriptor. Remote Sensing 2021. [DOI: 10.3390/rs13122328]
Abstract
Image registration has always been an important research topic. This paper proposes a novel method of constructing descriptors called the histogram of oriented local binary pattern descriptor (HOLBP) for fast and robust matching. There are three new components in our algorithm. First, we redefined the gradient and angle calculation template to make it more sensitive to edge information. Second, we proposed a new construction method of the HOLBP descriptor and improved the traditional local binary pattern (LBP) computation template. Third, the principle of uniform rotation-invariant LBP was applied to add 10-dimensional gradient direction information to form a 138-dimension HOLBP descriptor vector. The experimental results showed that our method is very stable in terms of accuracy and computational time for different test images.
38. OSN: Onion-ring support neighbors for correspondence selection. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.01.042]
|
39
|
Spherically Optimized RANSAC Aided by an IMU for Fisheye Image Matching. REMOTE SENSING 2021. [DOI: 10.3390/rs13102017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Fisheye cameras are widely used in visual localization owing to their wide field of view. However, the severe distortion in fisheye images leads to feature matching difficulties. This paper proposes an IMU-assisted fisheye image matching method called spherically optimized random sample consensus (So-RANSAC). We converted the putative correspondences into fisheye spherical coordinates and then used an inertial measurement unit (IMU) to provide relative rotation angles to assist fisheye image epipolar constraints and improve the accuracy of pose estimation and mismatch removal. To verify the performance of So-RANSAC, experiments were performed on fisheye images of urban drainage pipes and on public data sets. The experimental results showed that So-RANSAC can effectively improve mismatch removal accuracy, and its performance was superior to commonly used fisheye image matching methods in various experimental scenarios.
|
40
|
Liu T, Wang J, Yang B, Wang X. NGDNet: Nonuniform Gaussian-label distribution learning for infrared head pose estimation and on-task behavior understanding in the classroom. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.12.090] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
41
|
Shin BC, Seo JK. A Posteriori Outlier Rejection Approach Owing to the Well-ordering Property of a Sample Consensus Method for the Stitching of Drone-based Thermal Aerial Images. J Imaging Sci Technol 2021. [DOI: 10.2352/j.imagingsci.technol.2021.65.2.020504] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/01/2022]
|
42
|
Gong L, Zheng J, Ping Z, Wang Y, Wang S, Zuo S. Robust Mosaicing of Endomicroscopic Videos via Context-Weighted Correlation Ratio. IEEE Trans Biomed Eng 2021; 68:579-591. [PMID: 32746056 DOI: 10.1109/tbme.2020.3007768] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/07/2022]
Abstract
Probe-based confocal laser endomicroscopy (pCLE) is a promising imaging tool that provides in situ and in vivo optical imaging for real-time pathological assessment. However, due to the limited field of view, it is difficult for clinicians to get a full understanding of the scanned tissues. In this paper, we develop a novel mosaicing framework to assemble all frame sequences into a full-view image. First, a hybrid rigid registration that combines feature matching and template matching is presented to achieve a global alignment of all frames. Then, the parametric free-form deformation (FFD) model with a multiresolution architecture is implemented to accommodate non-rigid tissue distortions. More importantly, we devise a robust similarity metric called the context-weighted correlation ratio (CWCR) to promote registration accuracy, where spatial and geometric contexts are incorporated into the estimation of functional intensity dependence. Experiments on both a robotic setup and manual manipulation have demonstrated that the proposed scheme significantly outperforms some state-of-the-art mosaicing schemes in the presence of intensity fluctuations, insufficient overlap, and tissue distortions. Moreover, comparisons of the proposed CWCR metric with two other metrics have validated the effectiveness of the context-weighted strategy in quantifying the differences between two frames. Benefiting from more rational and delicate mosaics, the proposed scheme is well suited to guiding diagnosis and treatment during optical biopsies.
|
43
|
Liu Y, Li Y, Dai L, Yang C, Wei L, Lai T, Chen R. Robust feature matching via advanced neighborhood topology consensus. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.09.047] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
44
|
He X, Matsumaru T. Estimation of Flat Object Deformation Using RGB-D Sensor for Robot Reproduction. SENSORS 2020; 21:s21010105. [PMID: 33375309 PMCID: PMC7795394 DOI: 10.3390/s21010105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/26/2020] [Revised: 12/21/2020] [Accepted: 12/24/2020] [Indexed: 11/16/2022]
Abstract
This paper introduces a system that can estimate the deformation process of a deformed flat object (a folded plane) and generate the input data for a robot with human-like dexterous hands and fingers to reproduce the same deformation on another similar object. The system processes RGB data and depth data with three core techniques: a weighted graph clustering method for non-rigid point matching and clustering; a refined region growing method for plane detection on depth data, based on an offset error we define; and a novel sliding checking model to determine the bending line and the adjacency relationship between each pair of planes. Through evaluation experiments, we show the improvement of the core techniques over conventional approaches. Applying our approach to differently deformed papers, the entire system achieves around 1.59 degrees of average angular error, comparable to the smallest angular discrimination of the human eye. As a result, for the deformation of a flat object caused by folding, if our system can obtain at least one feature point cluster on each plane, it can recover the spatial information of each bending line and each plane with acceptable accuracy. The subject of this paper is a folded plane, but we will extend the work to robotic reproduction of general object deformation.
|
45
|
Deng L, Yuan X, Deng C, Chen J, Cai Y. Image Stitching Based on Nonrigid Warping for Urban Scene. SENSORS 2020; 20:s20247050. [PMID: 33317036 PMCID: PMC7763989 DOI: 10.3390/s20247050] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 12/05/2020] [Accepted: 12/06/2020] [Indexed: 11/16/2022]
Abstract
Image stitching based on a global alignment model is widely used in computer vision. However, the resulting stitched image may look blurry or ghosted due to parallax. To solve this problem, we propose a parallax-tolerant image stitching method based on nonrigid warping. Given a group of putative feature correspondences between overlapping images, we first use a semiparametric function fitting, which introduces a motion coherence constraint, to remove outliers. Then, the input images are warped according to a nonrigid warp model based on Gaussian radial basis functions. The nonrigid warping is a kind of elastic deformation that is flexible and smooth enough to eliminate moderate parallax errors. This leads to high-precision alignment in the overlapped region. For the nonoverlapping region, we use a rigid similarity model to reduce distortion. Through an effective transition, the nonrigid warping of the overlapped region and the rigid warping of the nonoverlapping region can be used jointly. Our method obtains more accurate local alignment while maintaining the overall shape of the image. Experimental results on several challenging data sets for urban scenes show that the proposed approach is better than state-of-the-art approaches in both qualitative and quantitative indicators.
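The entry above warps the overlap region with Gaussian radial basis functions. A minimal sketch of that model family in its exact-interpolation form, omitting the motion-coherence outlier removal and the rigid transition; all names and the `sigma` parameter are illustrative:

```python
import numpy as np

def rbf_warp(ctrl_src, ctrl_dst, pts, sigma=1.0):
    """Gaussian radial-basis-function warp: fit per-control-point weights
    so ctrl_src maps exactly onto ctrl_dst, then evaluate the interpolated
    displacement field at pts. Arrays are (N, 2) / (M, 2)."""
    d = ctrl_src[:, None, :] - ctrl_src[None, :, :]
    K = np.exp(-np.sum(d**2, axis=2) / (2 * sigma**2))   # control kernel
    W = np.linalg.solve(K, ctrl_dst - ctrl_src)          # displacement weights
    d2 = pts[:, None, :] - ctrl_src[None, :, :]
    Kp = np.exp(-np.sum(d2**2, axis=2) / (2 * sigma**2)) # query-to-control kernel
    return pts + Kp @ W
```

Because the Gaussian kernel matrix is positive definite for distinct control points, the fitted displacement interpolates the correspondences exactly while decaying smoothly away from them, which is what makes this family suitable for absorbing moderate parallax.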
|
46
|
Ultrarobust support vector registration. APPL INTELL 2020. [DOI: 10.1007/s10489-020-01967-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
47
|
Reliable and Efficient UAV Image Matching via Geometric Constraints Structured by Delaunay Triangulation. REMOTE SENSING 2020. [DOI: 10.3390/rs12203390] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Outlier removal is a crucial step in local feature-based unmanned aerial vehicle (UAV) image matching. Inspired by our previous work, this paper proposes a method for reliable and efficient outlier removal in UAV image matching. The inputs of the method are only two images, without any other auxiliary data. The core idea is to design local geometric constraints within the neighboring structure via Delaunay triangulation and to use a two-stage method for outlier removal and match refinement. In the filter stage, initial matches are first organized as a Delaunay triangulation (DT) and its corresponding graph, and their dissimilarity scores are computed from the affine-invariant spatial angular order (SAO), which is used to achieve hierarchical outlier removal. In addition, by using the triangle constraint between the refined Delaunay triangulation and its corresponding graph, missed inliers are recovered through match expansion. In the verification stage, retained matches are refined using a RANSAC-based global geometric constraint. The two-stage algorithm is therefore termed DTSAO-RANSAC. Finally, using four datasets, DTSAO-RANSAC is comprehensively analyzed and compared with other methods in feature matching and image orientation tests. The experimental results demonstrate that, compared with the LO-RANSAC algorithm, DTSAO-RANSAC can achieve efficient outlier removal with speedup ratios ranging from 4 to 16, and it can provide reliable matching results for image orientation of UAV datasets.
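The verification stage above relies on a RANSAC-based global geometric constraint. A minimal, generic sketch of that idea, filtering putative correspondences with a 2D affine model; this is plain RANSAC, not the paper's DTSAO-RANSAC, and all names and parameters are illustrative:

```python
import random
import numpy as np

def ransac_affine(src, dst, iters=500, tol=1.0, seed=0):
    """Classic RANSAC for correspondence filtering: repeatedly fit a 2D
    affine map from 3 random correspondences and keep the largest
    consensus set. src, dst: (N, 2) arrays of putative matches.
    Returns a boolean inlier mask."""
    rng = random.Random(seed)
    n = len(src)
    best_mask = np.zeros(n, dtype=bool)
    for _ in range(iters):
        idx = rng.sample(range(n), 3)
        # Solve [x y 1] @ A = [x' y'] for the 3 sampled pairs.
        P = np.hstack([src[idx], np.ones((3, 1))])
        try:
            A = np.linalg.solve(P, dst[idx])
        except np.linalg.LinAlgError:
            continue  # degenerate (collinear) sample
        proj = np.hstack([src, np.ones((n, 1))]) @ A
        mask = np.linalg.norm(proj - dst, axis=1) < tol
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask
```

The speedup the entry reports comes from running the cheap local DT/SAO filter first, so a global stage like this sees far fewer outliers.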
|
48
|
Abstract
Feature matching is the task of detecting and matching corresponding feature points in stereo pairs, and it is one of the key techniques in accurate camera orientation. However, several factors limit feature matching accuracy, e.g., image textures, the viewing angles of stereo cameras, and the resolutions of stereo pairs. To improve feature matching accuracy against these limiting factors, this paper imposes spatial smoothness constraints over the whole feature point set, under the assumption that feature points should have matching results similar to those of their surrounding high-confidence points, and proposes a robust feature matching method with spatial smoothness constraints (RMSS). The core algorithm constructs a graph structure from the feature point sets and then formulates the feature matching problem as the optimization of a global energy function with first-order spatial smoothness constraints based on the graph. For computational purposes, the global optimization of the energy function is broken into sub-optimizations for each feature point, and an approximate solution of the energy function is iteratively derived as the matching result for the whole feature point set. Experiments on close-range datasets exhibiting some of the above limiting factors show that the proposed method greatly improved the matching robustness and matching accuracy of some feature descriptors (e.g., scale-invariant feature transform (SIFT) and Speeded Up Robust Features (SURF)). After the optimization of the proposed method, the inlier numbers of SIFT and SURF increased by an average of 131.9% and 113.5%, the inlier percentages (inlier number over total match number) of SIFT and SURF increased by an average of 259.0% and 307.2%, and the absolute matching accuracy of SIFT and SURF improved by an average of 80.6% and 70.2%.
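The abstract above breaks a global smoothness energy into per-point sub-optimizations solved iteratively. A generic sketch of that pattern (iterated conditional modes: each point greedily re-picks its candidate match given its neighbors' current picks); this is one standard solver for such energies, not necessarily the paper's exact scheme, and all names are illustrative:

```python
def icm_matching(candidates, unary, neighbors, smooth, iters=10):
    """Iterated conditional modes over per-point match labels.

    candidates: point -> list of candidate match values
    unary:      point -> list of per-candidate data costs
    neighbors:  point -> iterable of neighboring points
    smooth:     pairwise disagreement cost between two match values
    Returns point -> chosen candidate index."""
    label = {p: 0 for p in candidates}
    for _ in range(iters):
        changed = False
        for p, cands in candidates.items():
            def cost(k):
                return unary[p][k] + sum(
                    smooth(cands[k], candidates[q][label[q]])
                    for q in neighbors.get(p, ()))
            best = min(range(len(cands)), key=cost)
            if best != label[p]:
                label[p], changed = best, True
        if not changed:
            break
    return label
```

Each sweep only ever lowers the global energy, so the iteration converges to a local minimum; the smoothness term is what pulls a point toward the matches of its high-confidence neighbors.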
|
49
|
Wang Y, Mei X, Ma Y, Huang J, Fan F, Ma J. Learning to find reliable correspondences with local neighborhood consensus. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.04.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
|
50
|
Ma J, Jiang X, Fan A, Jiang J, Yan J. Image Matching from Handcrafted to Deep Features: A Survey. Int J Comput Vis 2020. [DOI: 10.1007/s11263-020-01359-2] [Citation(s) in RCA: 230] [Impact Index Per Article: 46.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
As a fundamental and critical task in various visual applications, image matching can identify and then correspond the same or similar structure/content from two or more images. Over the past decades, a growing number and diversity of methods have been proposed for image matching, particularly with the development of deep learning techniques in recent years. However, this leaves several open questions about which method would be a suitable choice for specific applications with respect to different scenarios and task requirements, and about how to design better image matching methods with superior accuracy, robustness, and efficiency. This encourages us to conduct a comprehensive and systematic review and analysis of these classical and latest techniques. Following the feature-based image matching pipeline, we first introduce feature detection, description, and matching techniques, from handcrafted methods to trainable ones, and provide an analysis of the development of these methods in theory and practice. Secondly, we briefly introduce several typical image matching-based applications for a comprehensive understanding of the significance of image matching. In addition, we provide a comprehensive and objective comparison of these classical and latest techniques through extensive experiments on representative datasets. Finally, we conclude with the current status of image matching technologies and deliver insightful discussions and prospects for future work. This survey can serve as a reference for (but not limited to) researchers and engineers in image matching and related fields.
|