1
|
Yang Z, Zhao K, Yang S, Xiong Y, Zhang C, Deng L, Zhang D. Research on a Density-Based Clustering Method for Eliminating Inter-Frame Feature Mismatches in Visual SLAM Under Dynamic Scenes. SENSORS (BASEL, SWITZERLAND) 2025; 25:622. [PMID: 39943261 PMCID: PMC11820649 DOI: 10.3390/s25030622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2024] [Revised: 01/12/2025] [Accepted: 01/17/2025] [Indexed: 02/16/2025]
Abstract
Visual SLAM relies on the motion information of static feature points in keyframes for both localization and map construction. Dynamic feature points interfere with inter-frame motion pose estimation, thereby affecting the accuracy of map construction and the overall robustness of the visual SLAM system. To address this issue, this paper proposes a method for eliminating feature mismatches between frames in visual SLAM under dynamic scenes. First, a spatial clustering-based RANSAC method is introduced. This method eliminates mismatches by leveraging the distribution of dynamic and static feature points, clustering the points, and separating dynamic from static clusters, retaining only the static clusters to generate a high-quality dataset. Next, the RANSAC method is introduced to fit the geometric model of feature matches, eliminating local mismatches in the high-quality dataset with fewer iterations. The accuracy of the DSSAC-RANSAC method in eliminating feature mismatches between frames is then tested on both indoor and outdoor dynamic datasets, and the robustness of the proposed algorithm is further verified on self-collected outdoor datasets. Experimental results demonstrate that the proposed algorithm reduces the average reprojection error by 58.5% and 49.2%, respectively, when compared to traditional RANSAC and GMS-RANSAC methods. The reprojection error variance is reduced by 65.2% and 63.0%, while the processing time is reduced by 69.4% and 31.5%, respectively. Finally, the proposed algorithm is integrated into the initialization thread of ORB-SLAM2 and the tracking thread of ORB-SLAM3 to validate its effectiveness in eliminating feature mismatches between frames in visual SLAM.
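The cluster-then-RANSAC idea described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' DSSAC-RANSAC implementation: it clusters the inter-frame displacement vectors with a small DBSCAN-style routine, assumes that the largest cluster is the static one, and fits a pure-translation model with RANSAC. The function names, thresholds, and sample matches are all hypothetical.

```python
import math
import random


def density_cluster(vectors, eps, min_pts):
    """Minimal DBSCAN-style clustering; returns one label per vector (-1 = noise)."""
    labels = [None] * len(vectors)

    def neighbours(i):
        return [j for j, v in enumerate(vectors) if math.dist(vectors[i], v) <= eps]

    cluster = -1
    for i in range(len(vectors)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbours(j)
            if len(nj) >= min_pts:
                queue.extend(nj)
    return labels


def ransac_translation(motions, candidates, threshold, iterations, rng):
    """RANSAC with a one-point sample: the model is a single 2-D translation."""
    best = []
    for _ in range(iterations):
        model = motions[rng.choice(candidates)]
        inliers = [i for i in candidates if math.dist(motions[i], model) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    return best


def filter_matches(matches, eps=2.0, min_pts=3, threshold=1.0, iterations=20, seed=0):
    """Keep matches in the largest motion cluster that also agree with the RANSAC model."""
    motions = [(q[0] - p[0], q[1] - p[1]) for p, q in matches]
    labels = density_cluster(motions, eps, min_pts)
    clustered = [l for l in labels if l >= 0]
    if not clustered:
        return []
    # Assumption: the largest cluster corresponds to the static background.
    static_label = max(set(clustered), key=clustered.count)
    candidates = [i for i, l in enumerate(labels) if l == static_label]
    return sorted(ransac_translation(motions, candidates, threshold, iterations,
                                     random.Random(seed)))


# Hypothetical matches: ((x, y) in frame k, (x, y) in frame k+1).
matches = [
    ((0, 0), (5, 0)), ((10, 0), (15.2, 0.1)), ((20, 5), (25.1, 4.9)),   # static
    ((30, 5), (34.9, 5.0)), ((40, 9), (45.0, 9.2)), ((50, 2), (55.3, 2.0)),
    ((70, 3), (76.8, 3.0)),                                             # local mismatch
    ((60, 60), (80, 75)), ((62, 61), (82.2, 76.1)), ((64, 63), (83.9, 78)),  # moving object
    ((5, 5), (100, -40)),                                               # gross mismatch
]
kept = filter_matches(matches)
print(kept)
```

In this toy data the clustering step discards the moving object and the gross mismatch, and the RANSAC step then removes the remaining local mismatch from the static cluster.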
Affiliation(s)
- Zhiyong Yang
  - Engineering Research and Design Institute of Agricultural Equipment, Hubei University of Technology, Wuhan 430068, China
  - Hubei Engineering Research Center for Intellectualization of Agricultural Equipment, Wuhan 430068, China
  - School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
  - Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Kun Zhao
  - School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
  - Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Shengze Yang
  - School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
  - Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Yuhong Xiong
  - School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
  - Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Changjin Zhang
  - School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
  - Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Lielei Deng
  - School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
  - Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Daode Zhang
  - Engineering Research and Design Institute of Agricultural Equipment, Hubei University of Technology, Wuhan 430068, China
  - Hubei Engineering Research Center for Intellectualization of Agricultural Equipment, Wuhan 430068, China
  - School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
  - Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
|
2
|
Maharmeh E, Alsayed Z, Nashashibi F. A Comprehensive Survey on the Integrity of Localization Systems. SENSORS (BASEL, SWITZERLAND) 2025; 25:358. [PMID: 39860728 PMCID: PMC11768486 DOI: 10.3390/s25020358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2024] [Revised: 12/18/2024] [Accepted: 12/26/2024] [Indexed: 01/27/2025]
Abstract
This survey extends and refines the existing definitions of integrity and protection level in localization systems (localization as a broad term, i.e., not limited to GNSS-based localization). In our definition, we study integrity from two aspects: quality and quantity. Unlike existing reviews, this survey examines integrity methods covering various localization techniques and sensors. We classify localization techniques as optimization-based, fusion-based, and SLAM-based. A new classification of integrity methods is introduced, evaluating their applications, effectiveness, and limitations. Comparative tables summarize strengths and gaps across key criteria, such as algorithms, evaluation methods, sensor data, and more. The survey presents a general probabilistic model addressing diverse error types in localization systems. Findings reveal a significant research imbalance: 73.3% of surveyed papers focus on GNSS-based methods, while only 26.7% explore non-GNSS approaches like fusion, optimization, or SLAM, with few addressing protection level calculations. Robust modeling is highlighted as a promising integrity method, combining quantification and qualification to address critical gaps. This approach offers a unified framework for improving localization system reliability and safety. This survey provides key insights for developing more robust localization systems, contributing to safer and more efficient autonomous operations.
Affiliation(s)
- Elias Maharmeh
  - Valeo Mobility Tech Center (VMTC), 6 Rue Daniel Costantini, 94000 Créteil, France
  - Inria-ASTRA Team, 48 Rue Barrault, 75013 Paris, France
- Zayed Alsayed
  - Valeo Mobility Tech Center (VMTC), 6 Rue Daniel Costantini, 94000 Créteil, France
  - Inria-ASTRA Team, 48 Rue Barrault, 75013 Paris, France
|
3
|
Xia Y, Ma J. Grid-Guided Sparse Laplacian Consensus for Robust Feature Matching. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; 34:1367-1381. [PMID: 40031433 DOI: 10.1109/tip.2025.3539469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Feature matching is a fundamental concern widely employed in computer vision applications. This paper introduces a novel and efficacious method named Grid-guided Sparse Laplacian Consensus, rooted in the concept of smooth constraints. To address challenging scenes such as severe deformation and independent motions, we devise grid-based adaptive matching guidance to construct multiple transformations based on motion coherence. Specifically, we obtain a set of precise yet sparse seed correspondences through motion statistics, facilitating the generation of an adaptive number of candidate correspondence sets. In addition, we propose an innovative formulation grounded in the graph Laplacian for correspondence pruning, wherein mapping function estimation is formulated as a Bayesian model. We solve this using the EM algorithm, with seed correspondences as initialization for optimal convergence. Sparse approximation is leveraged to reduce the time-space burden. A comprehensive set of experiments is conducted to demonstrate the superiority of our method over other state-of-the-art methods in robustness to serious deformations, generalizability to various descriptors, and generalizability to multiple motions. Additionally, experiments in geometric estimation, image registration, loop closure detection, and visual localization highlight the significance of our method across diverse scenes for high-level tasks.
|
4
|
Yang J, Zhang X, Fan S, Ren C, Zhang Y. Mutual Voting for Ranking 3D Correspondences. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:4041-4057. [PMID: 37074893 DOI: 10.1109/tpami.2023.3268297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Consistent correspondences between point clouds are vital to 3D vision tasks such as registration and recognition. In this paper, we present a mutual voting method for ranking 3D correspondences. The key insight is to achieve reliable scoring results for correspondences by refining both voters and candidates in a mutual voting scheme. First, a graph is constructed for the initial correspondence set with the pairwise compatibility constraint. Second, nodal clustering coefficients are introduced to preliminarily remove a portion of outliers and speed up the following voting process. Third, we model nodes and edges in the graph as candidates and voters, respectively. Mutual voting is then performed in the graph to score correspondences. Finally, the correspondences are ranked based on the voting scores and top-ranked ones are identified as inliers. Feature matching, 3D point cloud registration, and 3D object recognition experiments on various datasets with different nuisances and modalities verify that MV is robust to heavy outliers under different challenging settings, and can significantly boost 3D point cloud registration and 3D object recognition performance.
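The first two steps of this abstract (a compatibility graph followed by nodal clustering coefficients) can be illustrated with a small sketch. It assumes that "pairwise compatibility" means length preservation under a rigid motion, which is a common choice but is not stated in the abstract, and it does not attempt the mutual voting step itself. The threshold `tau` and the sample points are hypothetical.

```python
import math
from itertools import combinations


def compatibility_graph(src, dst, tau):
    """Connect correspondences i and j when the distance between them is preserved."""
    adj = {i: set() for i in range(len(src))}
    for i, j in combinations(range(len(src)), 2):
        if abs(math.dist(src[i], src[j]) - math.dist(dst[i], dst[j])) < tau:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def clustering_coefficient(adj, i):
    """Fraction of neighbour pairs of node i that are themselves connected."""
    nbrs = sorted(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Four correspondences related by a translation of (1, 0, 0), plus two outliers.
src = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 2, 2), (3, 1, 0)]
dst = [(1, 0, 0), (2, 0, 0), (1, 1, 0), (1, 0, 1), (9, 9, 9), (-5, 4, 7)]

graph = compatibility_graph(src, dst, tau=0.1)
coefficients = [clustering_coefficient(graph, i) for i in range(len(src))]
survivors = [i for i, c in enumerate(coefficients) if c > 0.0]
print(coefficients, survivors)
```

Here the four consistent correspondences form a clique and score 1.0, while the two outliers are isolated and are removed before any further scoring would take place.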
|
5
|
Zhang S, Ma J. ConvMatch: Rethinking Network Design for Two-View Correspondence Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:2920-2935. [PMID: 37983155 DOI: 10.1109/tpami.2023.3334515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
The multilayer perceptron (MLP) has become the de facto backbone in two-view correspondence learning, for it can extract effective deep features from unordered correspondences individually. However, its inherent lack of context information limits its performance, although many context-capturing modules have been appended in follow-up studies. In this paper, from a novel perspective, we design a correspondence learning network called ConvMatch that for the first time can leverage a convolutional neural network (CNN) as the backbone, inherently capable of context aggregation. Specifically, with the observation that sparse motion vectors and a dense motion field can be converted into each other by interpolation and sampling, we regularize the putative motion vectors by estimating the dense motion field implicitly, then rectify the errors caused by outliers in local areas with the CNN, and finally obtain correct motion vectors from the rectified motion field. Moreover, we propose global information injection and bilateral convolution, to fit the overall spatial transformation better and accommodate the discontinuities of the motion field in case of large scene disparity. Extensive experiments reveal that ConvMatch consistently outperforms state-of-the-art methods for relative pose estimation, homography estimation, and visual localization.
|
6
|
Wang H, Song C, Wang J, Gao P. A raster-based spatial clustering method with robustness to spatial outliers. Sci Rep 2024; 14:4103. [PMID: 38374209 PMCID: PMC10876529 DOI: 10.1038/s41598-024-53066-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 01/27/2024] [Indexed: 02/21/2024]
Abstract
Spatial clustering is an essential method for the comprehensive understanding of a region. Spatial clustering divides all spatial units into different clusters. The attributes of each cluster of the spatial units are similar, and simultaneously, they are as continuous as spatially possible. In spatial clustering, the handling of spatial outliers is important. It is necessary to improve spatial integration so that each cluster is connected as much as possible, while protecting spatial outliers can help avoid the excessive masking of attribute differences. This paper proposes a new spatial clustering method for raster data robust to spatial outliers. The method employs a sliding window to scan the entire region to determine spatial outliers. Additionally, a mechanism based on the range and standard deviation of the spatial units in each window is designed to judge whether the spatial integration should be further improved or the spatial outliers should be protected. To demonstrate the usefulness of the proposed method, we applied it in two case study areas, namely, Changping District and Pinggu District in Beijing. The results show that the proposed method can retain the spatial outliers while ensuring that the clusters are roughly contiguous. This method can be used as a simple but powerful and easy-to-interpret alternative to existing geographical spatial clustering methods.
Affiliation(s)
- Haoyu Wang
  - Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
- Changqing Song
  - Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
- Jinfeng Wang
  - Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, 100101, China
- Peichao Gao
  - Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
  - State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing, 100875, China
|
7
|
El Saer A, Grammatikopoulos L, Sfikas G, Karras G, Petsa E. A Novel Framework for Image Matching and Stitching for Moving Car Inspection under Illumination Challenges. SENSORS (BASEL, SWITZERLAND) 2024; 24:1083. [PMID: 38400240 PMCID: PMC10891783 DOI: 10.3390/s24041083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 02/02/2024] [Accepted: 02/05/2024] [Indexed: 02/25/2024]
Abstract
Vehicle exterior inspection is a critical operation for identifying defects and ensuring the overall safety and integrity of vehicles. Visual-based inspection of moving objects, such as vehicles within dynamic environments abounding with reflections, presents significant challenges, especially when time and accuracy are of paramount importance. Conventional exterior inspections of vehicles require substantial labor, which is both costly and prone to errors. Recent advancements in deep learning have reduced labor work by enabling the use of segmentation algorithms for defect detection and description based on simple RGB camera acquisitions. Nonetheless, these processes struggle with issues of image orientation leading to difficulties in accurately differentiating between detected defects. This results in numerous false positives and additional labor effort. Estimating image poses enables precise localization of vehicle damages within a unified 3D reference system, following initial detections in the 2D imagery. A primary challenge in this field is the extraction of distinctive features and the establishment of accurate correspondences between them, a task that typical image matching techniques struggle to address for highly reflective moving objects. In this study, we introduce an innovative end-to-end pipeline tailored for efficient image matching and stitching, specifically addressing the challenges posed by moving objects in static uncalibrated camera setups. Extracting features from moving objects with strong reflections presents significant difficulties, beyond the capabilities of current image matching algorithms. To tackle this, we introduce a novel filtering scheme that can be applied to every image matching process, provided that the input features are sufficient. A critical aspect of this module involves the exclusion of points located in the background, effectively distinguishing them from points that pertain to the vehicle itself. This is essential for accurate feature extraction and subsequent analysis. Finally, we generate a high-quality image mosaic by employing a series of sequential stereo-rectified pairs.
Affiliation(s)
- Andreas El Saer
  - Department of Surveying and Geoinformatics Engineering, University of West Attica, 12243 Athens, Greece
- Lazaros Grammatikopoulos
  - Department of Surveying and Geoinformatics Engineering, University of West Attica, 12243 Athens, Greece
- Giorgos Sfikas
  - Department of Surveying and Geoinformatics Engineering, University of West Attica, 12243 Athens, Greece
- George Karras
  - School of Rural, Surveying and Geoinformatics Engineering, National Technical University of Athens, 15780 Athens, Greece
- Elli Petsa
  - Department of Surveying and Geoinformatics Engineering, University of West Attica, 12243 Athens, Greece
|
8
|
Misra I, Kumar Rohil M, Manthira Moorthi S, Dhar D. Enhanced Multispectral Band-to-Band Registration Using Co-Occurrence Scale Space and Spatial Confined RANSAC Guided Segmented Affine Transformation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:6521-6534. [PMID: 40030230 DOI: 10.1109/tip.2024.3494555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Band-to-Band Registration (BBR) is a pre-requisite image processing operation essential for specific remote sensing multispectral sensors. BBR aims to align spectral wavelength channels at sub-pixel level accuracy over each other. The paper presents a novel BBR technique utilizing Co-occurrence Scale Space (CSS) for feature point detection and Spatial Confined RANSAC (SC-RANSAC) for removing outlier matched control points. Additionally, the Segmented Affine Transformation (SAT) model reduces distortion and ensures consistent BBR. The methodology developed is evaluated with Nano-MX multispectral images onboard the Indian Nano Satellite (INS-2B) covering diverse landscapes. BBR performance using the proposed method is also verified visually at a 4X zoom level on satellite scenes dominated by cloud pixels. The band misregistration effect on the Normalized Difference Vegetation Index (NDVI) from INS-2B is analyzed and cross-validated with the closest acquisition Landsat-9 OLI NDVI map before and after BBR correction. The experimental evaluation shows that the proposed BBR approach outperforms the state-of-the-art image registration techniques.
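To see why band misregistration matters for NDVI, a toy calculation may help. NDVI is computed per pixel as (NIR - Red) / (NIR + Red), so if the red band is displaced relative to the near-infrared band, pixels near a land-cover boundary combine reflectances from different surfaces. The reflectance values and the one-pixel shift below are hypothetical and are not taken from the paper.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel."""
    return (nir - red) / (nir + red)


def ndvi_row(nir_row, red_row, shift=0):
    """NDVI along one image row, with the red band displaced by `shift` pixels."""
    out = []
    for i, nir in enumerate(nir_row):
        j = min(max(i + shift, 0), len(red_row) - 1)  # clamp at the row borders
        out.append(ndvi(nir, red_row[j]))
    return out


# Hypothetical reflectances: two vegetation pixels followed by two bare-soil pixels.
nir = [0.50, 0.50, 0.20, 0.20]
red = [0.10, 0.10, 0.15, 0.15]

aligned = ndvi_row(nir, red)
shifted = ndvi_row(nir, red, shift=1)
print([round(v, 3) for v in aligned])
print([round(v, 3) for v in shifted])
```

With these numbers only the pixel next to the vegetation/soil boundary changes, which is where a registration error of one pixel mixes the two surfaces.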
|
9
|
Liu S, Huang Z, Li J, Li A, Huang X. FILNet: Fast Image-Based Indoor Localization Using an Anchor Control Network. SENSORS (BASEL, SWITZERLAND) 2023; 23:8140. [PMID: 37836972 PMCID: PMC10575192 DOI: 10.3390/s23198140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 09/14/2023] [Accepted: 09/22/2023] [Indexed: 10/15/2023]
Abstract
This paper designs a fast image-based indoor localization method based on an anchor control network (FILNet) to improve localization accuracy and shorten the duration of feature matching. The proposed algorithm has two stages. The offline stage constructs an anchor feature fingerprint database based on the concept of an anchor control network. This introduces detailed surveys to infer anchor features according to the information of control anchors using visual-inertial odometry (VIO) based on Google ARCore. In addition, an affine invariance enhancement algorithm based on feature multi-angle screening and supplementation is developed to solve the image perspective transformation problem and complete the feature fingerprint database construction. In the online stage, a fast spatial indexing approach is adopted to improve the feature matching speed by searching for active anchors and matching only anchor features around the active anchors. Further, to improve the correct matching rate, a homography matrix filter model is used to verify the correctness of feature matching, and the correct matching points are selected. Extensive experiments in real-world scenarios are performed to evaluate the proposed FILNet. The experimental results show that in terms of affine invariance, compared with the initial local features, FILNet significantly improves the recall of feature matching from 26% to 57% when the angular deviation is less than 60 degrees. In the image feature matching stage, compared with the initial K-D tree algorithm, FILNet significantly improves the efficiency of feature matching, and the average time of the test image dataset is reduced from 30.3 ms to 12.7 ms. In terms of localization accuracy, compared with the benchmark method based on image localization, FILNet significantly improves the localization accuracy, and the percentage of images with a localization error of less than 0.1 m increases from 31.61% to 55.89%.
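A homography-based match filter of the kind mentioned in this abstract can be sketched as a reprojection check: each point is mapped through a 3x3 homography and the match is kept only if it lands close to its partner. The abstract does not say how the homography is obtained or which threshold is used, so the matrix `H`, the `max_error` value, and the sample matches below are assumptions for illustration.

```python
import math


def project(H, pt):
    """Apply a 3x3 homography (nested lists) to a 2-D point."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)


def homography_filter(H, matches, max_error):
    """Keep matches whose reprojection error under H is at most max_error pixels."""
    return [(p, q) for p, q in matches if math.dist(project(H, p), q) <= max_error]


# Hypothetical homography: scale by 2, then translate by (10, 5).
H = [[2.0, 0.0, 10.0],
     [0.0, 2.0, 5.0],
     [0.0, 0.0, 1.0]]

matches = [
    ((1.0, 1.0), (12.0, 7.0)),    # exact
    ((3.0, 4.0), (16.5, 13.2)),   # small localisation noise
    ((5.0, 5.0), (40.0, 40.0)),   # wrong match
]
verified = homography_filter(H, matches, max_error=1.0)
print(verified)
```

In practice the homography would typically be estimated from the matches themselves with a robust estimator; that step is omitted here.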
Affiliation(s)
- Sikang Liu
  - School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
  - State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430072, China
- Zhao Huang
  - School of Electronic Engineering, Queen Mary, University of London, London E1 4NS, UK
- Jiafeng Li
  - State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430072, China
- Anna Li
  - School of Electronic Engineering, Queen Mary, University of London, London E1 4NS, UK
- Xingru Huang
  - School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
|
10
|
Jurdana V, Lopac N, Vrankic M. Sparse Time-Frequency Distribution Reconstruction Using the Adaptive Compressed Sensed Area Optimized with the Multi-Objective Approach. SENSORS (BASEL, SWITZERLAND) 2023; 23:4148. [PMID: 37112488 PMCID: PMC10143442 DOI: 10.3390/s23084148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 04/17/2023] [Accepted: 04/19/2023] [Indexed: 06/19/2023]
Abstract
Compressive sensing (CS) of the signal ambiguity function (AF) and enforcing the sparsity constraint on the resulting signal time-frequency distribution (TFD) has been shown to be an efficient method for time-frequency signal processing. This paper proposes a method for adaptive CS-AF area selection, which extracts the magnitude-significant AF samples through a clustering approach using the density-based spatial clustering algorithm. Moreover, an appropriate criterion for the performance of the method is formalized, i.e., component concentration and preservation, as well as interference suppression, are measured utilizing the information obtained from the short-term and the narrow-band Rényi entropies, while component connectivity is evaluated using the number of regions with continuously-connected samples. The CS-AF area selection and reconstruction algorithm parameters are optimized using an automatic multi-objective meta-heuristic optimization method, minimizing the here-proposed combination of measures as objective functions. Consistent improvement in CS-AF area selection and TFD reconstruction performance has been achieved without requiring a priori knowledge of the input signal for multiple reconstruction algorithms. This was demonstrated for both noisy synthetic and real-life signals.
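The Rényi entropy used here as a concentration measure has a simple closed form: for a distribution p normalised to unit sum, R_alpha = log2(sum p^alpha) / (1 - alpha), with lower values indicating a more concentrated distribution. The sketch below computes it over a whole 2-D array; taking absolute values and using alpha = 3 are assumptions made for the illustration, and the short-term and narrow-band variants named in the abstract are not reproduced.

```python
import math


def renyi_entropy(tfd, alpha=3):
    """Rényi entropy in bits of a 2-D array, normalised to unit sum (alpha != 1)."""
    values = [abs(v) for row in tfd for v in row]  # assumption: magnitudes only
    total = sum(values)
    return math.log2(sum((v / total) ** alpha for v in values)) / (1 - alpha)


spread = [[1.0, 1.0],
          [1.0, 1.0]]        # energy spread evenly over four cells
concentrated = [[1.0, 0.0],
                [0.0, 0.0]]  # all energy in one cell

print(renyi_entropy(spread), renyi_entropy(concentrated))
```

Spreading the energy evenly over four cells gives 2 bits, while a single occupied cell gives 0 bits, which matches the reading of the entropy as a count (in bits) of effectively occupied cells.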
Affiliation(s)
- Vedran Jurdana
  - Faculty of Engineering, University of Rijeka, 51000 Rijeka, Croatia
- Nikola Lopac
  - Faculty of Maritime Studies, University of Rijeka, 51000 Rijeka, Croatia
  - Center for Artificial Intelligence and Cybersecurity, University of Rijeka, 51000 Rijeka, Croatia
- Miroslav Vrankic
  - Faculty of Engineering, University of Rijeka, 51000 Rijeka, Croatia
|
11
|
Bellavia F. SIFT Matching by Context Exposed. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:2445-2457. [PMID: 35320089 DOI: 10.1109/tpami.2022.3161853] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
This paper investigates how to step up local image descriptor matching by exploiting matching context information. Two main contexts are identified, originating respectively from the descriptor space and from the keypoint space. The former is generally used to design the actual matching strategy, while the latter is used to filter matches according to local spatial consistency. On this basis, a new matching strategy and a novel local spatial filter, named respectively blob matching and Delaunay Triangulation Matching (DTM), are devised. Blob matching provides a general matching framework by merging together several strategies, including rank-based pre-filtering as well as many-to-many and symmetric matching, which yields a global improvement over each individual strategy. DTM alternates between Delaunay triangulation contractions and expansions to figure out and adjust keypoint neighborhood consistency. Experimental evaluation shows that DTM is comparable to or better than the state-of-the-art in terms of matching accuracy and robustness. Evaluation is carried out according to a new benchmark devised for analyzing the matching pipeline in terms of correct correspondences on both planar and non-planar scenes, including several state-of-the-art methods as well as the common SIFT matching approach for reference. This evaluation can be of assistance for future research in this field.
|
12
|
DenseFilter: Feature Correspondence Filter Based on Dense Networks for VSLAM. J INTELL ROBOT SYST 2022. [DOI: 10.1007/s10846-022-01735-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
13
|
Xia Y, Ma J. Locality-Guided Global-Preserving Optimization for Robust Feature Matching. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:5093-5108. [PMID: 35895644 DOI: 10.1109/tip.2022.3192993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Feature matching is a fundamental problem in many computer vision tasks. This paper proposes a novel effective framework for mismatch removal, named LOcality-guided Global-preserving Optimization (LOGO). To identify inliers from a putative matching set generated by feature descriptor similarity, we introduce a fixed-point progressive approach to optimize a graph-based objective, which represents a two-class assignment problem regarding an affinity matrix containing global structures. We introduce a strategy that a small initial set with a high inlier ratio exploits the topology of the affinity matrix to elicit other inliers based on their reliable geometry, which enhances the robustness to outliers. Geometrically, we provide a locality-guided matching strategy, i.e., using local topology consensus as a criterion to determine the initial set, thus expanding to yield the final feature matching set. In addition, we apply local affine transformations based on reference points to determine the local consensus and similarity scores of nodes and edges, ensuring the validity and generality for various scenarios including complex nonrigid transformations. Extensive experiments demonstrate the effectiveness and robustness of the proposed LOGO, which is competitive with the current state-of-the-art methods. It also exhibits favorable potential for high-level vision tasks, such as essential and fundamental matrix estimation, image registration and loop closure detection.
|
14
|
|
15
|
A Novel and Effective Cooperative RANSAC Image Matching Method Using Geometry Histogram-Based Constructed Reduced Correspondence Set. REMOTE SENSING 2022. [DOI: 10.3390/rs14143256] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/27/2023]
Abstract
The success of many computer vision and pattern recognition applications depends on matching local features on two or more images. Because the initial correspondence set (i.e., the set of the initial feature pairs) is often contaminated by mismatches, removing mismatches is a necessary task prior to image matching. In this paper, we first propose a fast geometry histogram-based (GH-based) mismatch removal strategy to construct a reduced correspondence set $C_{\text{reduced,GH}}$ from the initial correspondence set $C_{\text{ini}}$. Next, we propose an effective cooperative random sample consensus (COOSAC) method for remote sensing image matching. COOSAC consists of a RANSAC, called $\text{RANSAC}_{\text{ini}}$, working on $C_{\text{ini}}$, and a tiny RANSAC, called $\text{RANSAC}_{\text{tiny,GH}}$, working on a randomly selected subset $C_{\text{tiny,GH}}$ of $C_{\text{reduced,GH}}$. In $\text{RANSAC}_{\text{tiny,GH}}$, an iterative area constraint-based sampling strategy is proposed to estimate the model solution of $C_{\text{tiny,GH}}$ until the specified confidence level is reached, and then $\text{RANSAC}_{\text{ini}}$ utilizes the estimated model solution of $C_{\text{tiny,GH}}$ to calculate the inlier rate of $C_{\text{ini}}$. COOSAC repeats the above cooperation between $\text{RANSAC}_{\text{tiny,GH}}$ and $\text{RANSAC}_{\text{ini}}$ until the specified confidence level is reached, reporting the resultant model solution of $C_{\text{ini}}$. For convenience, our image matching method is called the GH-COOSAC method. Based on several testing datasets, thorough experimental results demonstrate that the proposed GH-COOSAC method achieves lower computational cost and higher matching accuracy when compared with the state-of-the-art image matching methods.
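The benefit of running RANSAC on a reduced correspondence set with a higher inlier rate can be made concrete with the standard iteration bound: to draw at least one all-inlier sample of size s with confidence p when the inlier rate is w, about N = log(1 - p) / log(1 - w^s) iterations are needed. The sketch below evaluates that bound; the inlier rates and the sample size of 4 are hypothetical and are not figures from the paper, and COOSAC's own sampling strategy is not reproduced.

```python
import math


def ransac_iterations(inlier_rate, sample_size, confidence=0.99):
    """Iterations needed to draw one all-inlier sample with the given confidence."""
    p_good_sample = inlier_rate ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good_sample))


# Hypothetical inlier rates before and after a mismatch pre-filter,
# with a 4-point minimal sample (as used for a homography).
before = ransac_iterations(0.3, 4)
after = ransac_iterations(0.8, 4)
print(before, after)
```

Because the inlier rate enters the bound raised to the power of the sample size, even a moderate improvement from pre-filtering reduces the required iteration count sharply.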
|
16
|
Xia H, Weng J, Zhang JZ, Gao Y. Rural E-Commerce Model with Attention Mechanism: Role of Li Ziqi’s Short Videos from the Perspective of Heterogeneous Knowledge Management. JOURNAL OF GLOBAL INFORMATION TECHNOLOGY MANAGEMENT 2022. [DOI: 10.1080/1097198x.2022.2062992] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Affiliation(s)
- Huosong Xia
  - School of Management, Wuhan Textile University, Wuhan, Hubei, China
- Juan Weng
  - School of Management, Wuhan Textile University, Wuhan, Hubei, China
- Yangmei Gao
  - School of Management, Wuhan Textile University, Wuhan, Hubei, China
|
17
|
Chen J, Chen S, Chen X, Dai Y, Yang Y. CSR-Net: Learning Adaptive Context Structure Representation for Robust Feature Correspondence. IEEE TRANSACTIONS ON IMAGE PROCESSING: A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:3197-3210. [PMID: 35427222 DOI: 10.1109/tip.2022.3166284] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
Feature matching, which refers to identifying and then establishing correspondence between the same or similar visual patterns in two or more images, is a key technique in any image processing task that requires good correspondences between images. Given potential correspondences (matches) in two scenes, a novel whole-part deep learning framework, termed the Context Structure Representation Network (CSR-Net), is designed to infer the probability of each correspondence being an inlier. Traditional approaches commonly build the local relation between correspondences using manually engineered criteria. Different from existing attempts, the main idea of our work is to explicitly learn the neighborhood structure of each correspondence, allowing us to formulate the matching problem as a dynamic local structure consensus evaluation in an end-to-end fashion. For this purpose, we propose a permutation-invariant STructure Representation (STR) learning module, which can easily merge different types of networks into a unified architecture to deal with sparse matches directly. Through the collaborative use of STR, we introduce a Context-Aware Attention (CAA) mechanism to adaptively re-calibrate structure features via a rotation-invariant context-aware encoding and simple feature gating, thus enabling fine-grained pattern recognition. Moreover, to further reduce the cost of establishing reliable correspondences, CSR-Net is formulated as whole-part consensus learning, where the aim of the whole level is to compensate for rigid transformations. To demonstrate that CSR-Net can effectively boost the baselines, we conduct extensive experiments on image matching and other visual tasks. The experimental results confirm that the matching performance of CSR-Net significantly improves over nine state-of-the-art competitors.
|
18
|
Abstract
This paper introduces an Unmanned Aerial Vehicle (UAV) image stitching method, based on the optimal seam algorithm and half-projective warp, that can effectively retain the original information of the image and obtain the ideal stitching effect. The existing seam stitching algorithms can eliminate the ghosting and blurring problems on the stitched images, but the deformation and angle distortion caused by image registration will remain in the stitching results. To overcome this situation, we propose a stitching strategy based on optimal seam and half-projective warp. Firstly, we define a new difference matrix in the overlapping region of the aligned image, which includes the color, structural and line difference information. Then, we constrain the search range of the seam by the minimum energy, and propose a seam search algorithm based on the global minimum energy to obtain the seam. Finally, combined with the seam position and half-projective warp, the shape of the stitched image is rectified to keep more regions in their original shape. The experimental results of several groups of UAV images show that our method has a superior stitching effect.
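To make the idea of a minimum-energy seam concrete, the sketch below finds a top-to-bottom seam through a difference matrix with classic dynamic programming. This is a swapped-in, textbook technique, not the constrained global search the abstract describes; the function name `min_energy_seam` and the single combined difference matrix (standing in for the color, structural and line terms) are assumptions.

```python
def min_energy_seam(diff):
    """Return (total_energy, seam), where seam[r] is the column chosen in row r
    and consecutive rows differ by at most one column."""
    rows, cols = len(diff), len(diff[0])
    cost = [list(diff[0])]  # cost[r][c]: cheapest seam ending at (r, c)
    back = []               # back[r - 1][c]: predecessor column of (r, c)
    for r in range(1, rows):
        row_cost, row_back = [], []
        for c in range(cols):
            # Best of the up-left, up and up-right neighbours in the previous row.
            best_cost, best_col = min(
                (cost[r - 1][k], k) for k in (c - 1, c, c + 1) if 0 <= k < cols
            )
            row_cost.append(diff[r][c] + best_cost)
            row_back.append(best_col)
        cost.append(row_cost)
        back.append(row_back)

    # Start from the cheapest cell in the last row and walk back to the top.
    col = min(range(cols), key=lambda k: cost[-1][k])
    total = cost[-1][col]
    seam = [col]
    for r in range(rows - 2, -1, -1):
        col = back[r][col]
        seam.append(col)
    seam.reverse()
    return total, seam


# Toy difference matrix (assumed values): low entries mark where the two
# aligned images agree, so the seam should follow them.
toy_diff = [
    [5, 1, 5],
    [5, 5, 1],
    [5, 1, 5],
]
total_energy, seam_columns = min_energy_seam(toy_diff)
print(total_energy, seam_columns)
```

The seam follows the low-difference cells, which is why stitching along it tends to hide ghosting in the overlap region.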
|
19
|
Liu W, Wang C, Chen S, Bian X, Lai B, Shen X, Cheng M, Lai SH, Weng D, Li J. Y-Net: Learning Domain Robust Feature Representation for ground camera image and large-scale image-based point cloud registration. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.10.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/18/2023]
|
20
|
Learning Two-View Correspondences and Geometry via Local Neighborhood Correlation. ENTROPY 2021; 23:e23081024. [PMID: 34441164 PMCID: PMC8394602 DOI: 10.3390/e23081024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Revised: 07/23/2021] [Accepted: 07/23/2021] [Indexed: 11/17/2022]
Abstract
Seeking quality feature correspondences (also known as matches) is a foundational step in computer vision. In our work, a novel and effective network with a stable local constraint, named the Local Neighborhood Correlation Network (LNCNet), is proposed to capture abundant contextual information of each correspondence in the local region, followed by essential matrix calculation and camera pose estimation. Firstly, the k-Nearest Neighbor (KNN) algorithm is used to roughly delimit the local neighborhood. Then, we calculate the local neighborhood correlation matrix (LNC) between the selected correspondence and other correspondences in the local region, which is used to filter outliers and obtain more accurate local neighborhood information. We cluster the filtered information into feature vectors containing richer neighborhood contextual information so that they can be used to determine more accurately the probability that each correspondence is an inlier. Extensive experiments demonstrate that the proposed LNCNet outperforms some state-of-the-art networks on outlier rejection and camera pose estimation tasks in complex outdoor and indoor scenes.
|
21
|
Lyu W, Chen L, Zhou Z, Wu W. Weakly supervised object-aware convolutional neural networks for semantic feature matching. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.03.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
22
|
Liang B, Li N, He Z, Wang Z, Fu Y, Lu T. News Video Summarization Combining SURF and Color Histogram Features. ENTROPY 2021; 23:e23080982. [PMID: 34441122 PMCID: PMC8393319 DOI: 10.3390/e23080982] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/02/2021] [Revised: 07/21/2021] [Accepted: 07/27/2021] [Indexed: 11/16/2022]
Abstract
Because the data volume of news videos is increasing exponentially, a way to quickly browse a sketch of the video is important in various applications, such as news media, archives and publicity. This paper proposes a news video summarization method based on SURF features and an improved clustering algorithm, to overcome the defects in existing algorithms that fail to account for changes in shot complexity. Firstly, we extracted SURF features from the video sequences and matched the features between adjacent frames, and then detected the abrupt and gradual boundaries of the shot by calculating similarity scores between adjacent frames with the help of double thresholds. Secondly, we used an improved clustering algorithm to cluster the color histogram of the video frames within the shot, which merged the smaller clusters and then selected the frame closest to the cluster center as the key frame. The experimental results on both the public and self-built datasets show the superiority of our method over the alternatives in terms of accuracy and speed. Additionally, the extracted key frames demonstrate low redundancy and can credibly represent a sketch of news videos.
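The final step described here, choosing the frame closest to each cluster center as the key frame, can be sketched in a few lines. This is a minimal illustration under assumptions: the cluster labels are taken as given (the improved clustering and cluster merging are not reproduced), Euclidean distance is assumed as the similarity measure, and `select_key_frames` and the toy histograms are hypothetical.

```python
import math
from collections import defaultdict


def select_key_frames(histograms, labels):
    """Map each cluster label to the index of the frame whose color histogram
    is closest (Euclidean distance) to that cluster's mean histogram."""
    members = defaultdict(list)
    for frame_index, label in enumerate(labels):
        members[label].append(frame_index)

    key_frames = {}
    for label, indices in members.items():
        dim = len(histograms[indices[0]])
        # Cluster center: per-bin mean over all frames in the cluster.
        center = [
            sum(histograms[i][d] for i in indices) / len(indices) for d in range(dim)
        ]
        key_frames[label] = min(
            indices, key=lambda i: math.dist(histograms[i], center)
        )
    return key_frames


# Toy two-bin histograms for six frames of one shot (assumed values),
# already grouped into two clusters.
toy_histograms = [
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.3],
    [0.1, 0.9], [0.2, 0.8], [0.3, 0.7],
]
toy_labels = [0, 0, 0, 1, 1, 1]
key_frames = select_key_frames(toy_histograms, toy_labels)
print(key_frames)
```

Picking an actual frame rather than the mean histogram keeps the summary made of real images, at the cost of one extra pass over each cluster.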
Affiliation(s)
- Buyun Liang
  - School of Computer Science, Wuhan University, Wuhan 430072, China; (B.L.); (Z.W.); (Y.F.)
- Na Li
  - The Archives of Wuhan University, Wuhan University, Wuhan 430072, China
  - Correspondence: (N.L.); (Z.H.); Tel.: +86-18971302643 (N.L.); +86-18986213167 (Z.H.)
- Zheng He
  - School of Computer Science, Wuhan University, Wuhan 430072, China; (B.L.); (Z.W.); (Y.F.)
  - Correspondence: (N.L.); (Z.H.); Tel.: +86-18971302643 (N.L.); +86-18986213167 (Z.H.)
- Zhongyuan Wang
  - School of Computer Science, Wuhan University, Wuhan 430072, China; (B.L.); (Z.W.); (Y.F.)
- Youming Fu
  - School of Computer Science, Wuhan University, Wuhan 430072, China; (B.L.); (Z.W.); (Y.F.)
- Tao Lu
  - School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430073, China
|
23
|
Yang L, Li Q, Song X, Cai W, Hou C, Xiong Z. An Improved Stereo Matching Algorithm for Vehicle Speed Measurement System Based on Spatial and Temporal Image Fusion. ENTROPY 2021; 23:e23070866. [PMID: 34356407 PMCID: PMC8305597 DOI: 10.3390/e23070866] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/14/2021] [Revised: 07/04/2021] [Accepted: 07/05/2021] [Indexed: 11/20/2022]
Abstract
This paper proposes an improved stereo matching algorithm for a vehicle speed measurement system based on spatial and temporal image fusion (STIF). Firstly, matching point pairs in the license plate area whose distance to the camera is obviously abnormal are coarsely removed according to the license plate specification. Secondly, further mismatched point pairs are finely removed according to a local neighborhood consistency constraint (LNCC). Thirdly, the optimum speed measurement point pairs are selected for successive stereo frame pairs by STIF of binocular stereo video, so that the 3D points corresponding to the matching point pairs used for speed measurement in successive stereo frame pairs are at the same position on the real vehicle, which can significantly improve the vehicle speed measurement accuracy. LNCC and STIF can be used not only for the license plate, but also for the vehicle logo, lights, mirrors, etc. Experimental results demonstrate that the vehicle speed measurement system with the proposed LNCC+STIF stereo matching algorithm significantly outperforms the state-of-the-art system in accuracy.
Affiliation(s)
- Lei Yang
  - School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007, China; (Q.L.); (W.C.)
  - Correspondence: (L.Y.); (X.S.)
- Qingyuan Li
  - School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007, China; (Q.L.); (W.C.)
- Xiaowei Song
  - School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007, China; (Q.L.); (W.C.)
  - Dongjing Avenue Campus, Kaifeng University, Kaifeng 475004, China
  - Correspondence: (L.Y.); (X.S.)
- Wenjing Cai
  - School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007, China; (Q.L.); (W.C.)
- Chunping Hou
  - School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
- Zixiang Xiong
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
|
24
|
Automatic Sub-Pixel Co-Registration of Remote Sensing Images Using Phase Correlation and Harris Detector. REMOTE SENSING 2021. [DOI: 10.3390/rs13122314] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/16/2022]
Abstract
In this paper, we propose a new approach for sub-pixel co-registration based on Fourier phase correlation combined with the Harris detector. Because the standard phase correlation method achieves only pixel-level accuracy, another approach is required to reach sub-pixel matching precision. We first applied the Harris corner detector to extract corners from both the reference and sensed images. Then, we identified their corresponding points using phase correlation between the image pairs. To achieve sub-pixel registration accuracy, two optimization algorithms were used. The effectiveness of the proposed method was tested with very high-resolution (VHR) remote sensing images, including Pleiades satellite images and aerial imagery. Compared with the speeded-up robust features (SURF)-based method, phase correlation with the Blackman window function produced 91% more matches with high reliability. Moreover, the optimization analysis revealed that the Nelder–Mead algorithm performs better than the two-point step size gradient algorithm in terms of localization accuracy and computation time. The proposed approach achieves an accuracy better than 0.5 pixels and outperforms the SURF-based method. It can achieve sub-pixel accuracy in the presence of noise and produces large numbers of correct matching points.
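For readers unfamiliar with the starting point of this work, the sketch below shows plain Fourier phase correlation, which recovers only an integer-pixel translation (the limitation the abstract sets out to overcome). It assumes NumPy, equal-sized single-channel images and a purely translational, circular shift; the Blackman windowing, Harris corners and sub-pixel optimization from the abstract are deliberately omitted, and `phase_correlation_shift` is a hypothetical name.

```python
import numpy as np


def phase_correlation_shift(reference: np.ndarray, sensed: np.ndarray) -> tuple:
    """Estimate the integer (row, col) shift that maps `reference` onto `sensed`."""
    if reference.shape != sensed.shape:
        raise ValueError("images must have the same shape")
    ref_spectrum = np.fft.fft2(reference)
    sensed_spectrum = np.fft.fft2(sensed)

    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = sensed_spectrum * np.conj(ref_spectrum)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)

    # Its inverse transform peaks at the translation between the images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )


# Synthetic check with an assumed random image and a known circular shift.
rng = np.random.default_rng(0)
reference_image = rng.random((64, 64))
sensed_image = np.roll(reference_image, shift=(3, -5), axis=(0, 1))
estimated_shift = phase_correlation_shift(reference_image, sensed_image)
print(estimated_shift)
```

Real image pairs are not circularly shifted, which is one reason a window function such as Blackman is usually applied before the transform.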
|
25
|
Xiao G, Wang H, Ma J, Suter D. Segmentation by Continuous Latent Semantic Analysis for Multi-structure Model Fitting. Int J Comput Vis 2021. [DOI: 10.1007/s11263-021-01468-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/24/2022]
|
26
|
Liu Y, Li Y, Dai L, Yang C, Wei L, Lai T, Chen R. Robust feature matching via advanced neighborhood topology consensus. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.09.047] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 10/23/2022]
|
27
|
Wang Y, Mei X, Ma Y, Huang J, Fan F, Ma J. Learning to find reliable correspondences with local neighborhood consensus. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.04.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 10/24/2022]
|
28
|
Ma J, Jiang X, Fan A, Jiang J, Yan J. Image Matching from Handcrafted to Deep Features: A Survey. Int J Comput Vis 2020. [DOI: 10.1007/s11263-020-01359-2] [Citation(s) in RCA: 230] [Impact Index Per Article: 46.0] [Indexed: 11/27/2022]
Abstract
As a fundamental and critical task in various visual applications, image matching identifies and then establishes correspondence between the same or similar structure/content in two or more images. Over the past decades, a growing number and diversity of methods have been proposed for image matching, particularly with the development of deep learning techniques in recent years. However, several questions remain open: which method is a suitable choice for a specific application, given different scenarios and task requirements, and how to design better image matching methods with superior accuracy, robustness and efficiency. This encourages us to conduct a comprehensive and systematic review and analysis of both classical and recent techniques. Following the feature-based image matching pipeline, we first introduce feature detection, description, and matching techniques, from handcrafted methods to trainable ones, and analyze the development of these methods in theory and practice. Secondly, we briefly introduce several typical image matching-based applications for a comprehensive understanding of the significance of image matching. In addition, we provide a comprehensive and objective comparison of these classical and recent techniques through extensive experiments on representative datasets. Finally, we conclude with the current status of image matching technologies and offer insightful discussions and prospects for future work. This survey can serve as a reference for (but is not limited to) researchers and engineers in image matching and related fields.
|
29
|
Almasi R, Vafaei A, Ghasemi Z, Ommani MR, Dehghani AR, Rabbani H. Registration of fluorescein angiography and optical coherence tomography images of curved retina via scanning laser ophthalmoscopy photographs. BIOMEDICAL OPTICS EXPRESS 2020; 11:3455-3476. [PMID: 33014544 PMCID: PMC7510895 DOI: 10.1364/boe.395784] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/20/2020] [Revised: 05/27/2020] [Accepted: 05/27/2020] [Indexed: 05/18/2023]
Abstract
Accurate and automatic registration of multimodal retinal images such as fluorescein angiography (FA) and optical coherence tomography (OCT) enables utilization of supplementary information. FA is a gold standard imaging modality that depicts the neurovascular structure of the retina and is used for diagnosing neurovascular-related diseases such as diabetic retinopathy (DR). Unlike FA, OCT is a non-invasive retinal imaging modality that provides cross-sectional data of the retina. Due to differences in contrast, resolution and brightness of multimodal retinal images, the images resulting from vessel extraction of image pairs are not exactly the same. Also, prevalent feature detection, extraction and matching schemes do not produce perfect matches. In addition, the relationships between retinal image pairs are usually modeled by an affine transformation, which cannot generate accurate alignments because the retinal surface is non-planar. In this paper, a precise registration scheme is proposed to align FA and OCT images via scanning laser ophthalmoscopy (SLO) photographs as intermediate images. For this purpose, retinal vessel segmentation is first applied to extract the main blood vessels from the FA and SLO images. Next, a novel global registration is proposed based on a Gaussian model for the curved surface of the retina. To do so, a global rigid transformation is first applied to the FA vessel-map image using a new feature-based method to align it with the SLO vessel-map photograph, in such a way that outlier matched features resulting from imperfect vessel segmentation are completely eliminated. After that, the transformed image is globally registered again, using the Gaussian model for the curved surface of the retina, to improve the precision of the previous step. Finally, a local non-rigid transformation is applied to register the two images perfectly. The experimental results indicate that the presented scheme is more precise than other registration methods.
Affiliation(s)
- Ramin Almasi
  - Department of Computer Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran
- Abbas Vafaei
  - Department of Computer Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran
- Zeinab Ghasemi
  - Department of Electrical and Computer Engineering, University of Detroit Mercy, Detroit, MI 48202, USA
- Ali Reza Dehghani
  - Didavaran Eye Clinic, Isfahan, Iran
  - Department of Ophthalmology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Hossein Rabbani
  - Medical Image & Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
|