1. Wei W, Wei P, Liao Z, Qin J, Cheng X, Liu M, Zheng N. Semantic Consistency Reasoning for 3-D Object Detection in Point Clouds. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:3356-3369. PMID: 38113156. DOI: 10.1109/TNNLS.2023.3341097.
Abstract
Point cloud-based 3-D object detection is a significant and critical issue in numerous applications. While most existing methods attempt to capitalize on the geometric characteristics of point clouds, they neglect the internal semantic properties of points and the consistency between semantic and geometric cues. In this article, we introduce a semantic consistency (SC) mechanism for 3-D object detection that reasons about the semantic relations between 3-D object boxes and their internal points. This mechanism is based on a natural principle: the semantic category of a 3-D bounding box should be consistent with the categories of all points within the box. Driven by the SC mechanism, we propose a novel SC network (SCNet) to detect 3-D objects from point clouds. Specifically, the SCNet is composed of a feature extraction module, a detection decision module, and a semantic segmentation module. At inference, the feature extraction and detection decision modules are used to detect 3-D objects. During training, the semantic segmentation module is jointly trained with the other two modules to produce more robust and generalizable model parameters. Performance is greatly boosted by reasoning about the relations between the output 3-D object boxes and the segmented points. The proposed SC mechanism is model-agnostic and can be integrated into other base 3-D object detection models. We test the proposed model on three challenging indoor and outdoor benchmark datasets: ScanNetV2, SUN RGB-D, and KITTI. Furthermore, to validate the universality of the SC mechanism, we implement it in three different 3-D object detectors. The experiments show that performance is substantially improved, and extensive ablation studies further demonstrate the effectiveness of the proposed model.
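
As a rough illustration of the consistency principle described above, the following Python sketch (hypothetical names and shapes, not the authors' SCNet code) computes an auxiliary penalty between a box-level class prediction and the aggregated class predictions of the points that fall inside that box:

```python
import torch

def semantic_consistency_loss(box_logits, point_logits, point_in_box_mask):
    """Hypothetical sketch of an SC-style auxiliary loss: the predicted class
    distribution of each 3-D box should agree with the predicted classes of
    the points inside it.

    box_logits:        (B, C) class logits per predicted box
    point_logits:      (P, C) class logits per point (from a segmentation head)
    point_in_box_mask: (B, P) boolean mask, True if point p lies inside box b
    """
    box_prob = torch.softmax(box_logits, dim=-1)        # (B, C)
    point_prob = torch.softmax(point_logits, dim=-1)    # (P, C)

    # Average the per-point class distributions inside each box.
    counts = point_in_box_mask.sum(dim=1, keepdim=True).clamp(min=1)  # (B, 1)
    box_point_prob = point_in_box_mask.float() @ point_prob / counts  # (B, C)

    # Penalize disagreement between box-level and aggregated point-level predictions.
    return torch.mean(torch.sum((box_prob - box_point_prob) ** 2, dim=-1))
```
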
2. Chen G, Wang M, Zhang Q, Yuan L, Yue Y. Full Transformer Framework for Robust Point Cloud Registration With Deep Information Interaction. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:13368-13382. PMID: 37163402. DOI: 10.1109/TNNLS.2023.3267333.
Abstract
Point cloud registration is an essential technology in computer vision and robotics. Recently, transformer-based methods have achieved advanced performance in point cloud registration by exploiting the transformer's order invariance and its ability to model dependencies when aggregating information. However, they still suffer from indistinct feature extraction and sensitivity to noise and outliers, owing to three major limitations: 1) the adoption of CNNs, whose local receptive fields fail to model global relations, yields extracted features that are susceptible to noise; 2) the shallow-wide architecture of the transformers and the lack of positional information lead to indistinct feature extraction due to inefficient information interaction; and 3) insufficient consideration of geometric compatibility leads to ambiguous identification of incorrect correspondences. To address these limitations, a novel full transformer network for point cloud registration is proposed, named the deep interaction transformer (DIT), which incorporates: 1) a point cloud structure extractor (PSE) that retrieves structural information and models global relations with a local feature integrator (LFI) and transformer encoders; 2) a deep-narrow point feature transformer (PFT) that facilitates deep information interaction across a pair of point clouds with positional information, so that the transformers establish comprehensive associations and directly learn the relative positions between points; and 3) a geometric matching-based correspondence confidence evaluation (GMCCE) method that measures spatial consistency and estimates correspondence confidence with the designed triangulated descriptor. Extensive experiments on the ModelNet40, ScanObjectNN, and 3DMatch datasets demonstrate that our method precisely aligns point clouds and achieves superior performance compared with state-of-the-art methods. The code is publicly available at https://github.com/CGuangyan-BIT/DIT.
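
The idea of scoring correspondences by spatial consistency, as in GMCCE, can be illustrated with the following Python sketch; the random-triangle test and all parameter names are illustrative assumptions, not the paper's triangulated descriptor:

```python
import numpy as np

def correspondence_confidence(src_pts, tgt_pts, n_samples=32, tol=0.05, seed=0):
    """Hypothetical spatial-consistency score: src_pts[i] is assumed to
    correspond to tgt_pts[i]. For each correspondence, random triangles are
    formed with two other correspondences; confidence is the fraction of
    triangles whose side lengths are preserved by the putative matching.
    Requires at least 3 correspondences.
    """
    rng = np.random.default_rng(seed)
    src_pts = np.asarray(src_pts, dtype=float)
    tgt_pts = np.asarray(tgt_pts, dtype=float)
    n = len(src_pts)
    conf = np.zeros(n)
    idx_all = np.arange(n)
    for i in range(n):
        others = idx_all[idx_all != i]
        hits = 0
        for _ in range(n_samples):
            j, k = rng.choice(others, size=2, replace=False)
            # Side lengths of the triangle (i, j, k) in source and target frames.
            s = np.array([np.linalg.norm(src_pts[i] - src_pts[j]),
                          np.linalg.norm(src_pts[j] - src_pts[k]),
                          np.linalg.norm(src_pts[k] - src_pts[i])])
            t = np.array([np.linalg.norm(tgt_pts[i] - tgt_pts[j]),
                          np.linalg.norm(tgt_pts[j] - tgt_pts[k]),
                          np.linalg.norm(tgt_pts[k] - tgt_pts[i])])
            # Congruent (within a relative tolerance) means spatially consistent.
            if np.all(np.abs(s - t) <= tol * (s + 1e-9)):
                hits += 1
        conf[i] = hits / n_samples
    return conf
```
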
3. Zhang Z, Chen S, Wang Z, Yang J. PlaneSeg: Building a Plug-In for Boosting Planar Region Segmentation. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:11486-11500. PMID: 37027268. DOI: 10.1109/TNNLS.2023.3262544.
Abstract
Existing methods for planar region segmentation suffer from vague boundaries and fail to detect small-sized regions. To address these issues, this study presents an end-to-end framework, named PlaneSeg, which can easily be integrated into various plane segmentation models. Specifically, PlaneSeg contains three modules: an edge feature extraction module, a multiscale module, and a resolution-adaptation module. First, the edge feature extraction module produces edge-aware feature maps for finer segmentation boundaries; the learned edge information acts as a constraint that mitigates inaccurate boundaries. Second, the multiscale module combines feature maps from different layers to harvest spatial and semantic information about planar objects, and this variety of object information helps recognize small-sized objects and produce more accurate segmentation results. Third, the resolution-adaptation module fuses the feature maps produced by the two preceding modules; a pairwise feature fusion is adopted to resample the dropped pixels and extract more detailed features. Extensive experiments demonstrate that PlaneSeg outperforms other state-of-the-art approaches on three downstream tasks: plane segmentation, 3-D plane reconstruction, and depth prediction. Code is available at https://github.com/nku-zhichengzhang/PlaneSeg.
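
A minimal sketch of what an edge-aware feature map might look like, assuming a simple Sobel operator and channel concatenation (the actual PlaneSeg module is more elaborate):

```python
import torch
import torch.nn.functional as F

def edge_aware_features(feat, image):
    """Hypothetical edge-aware feature map in the spirit of an edge feature
    extraction module (assumptions, not the authors' code).

    feat:  (B, C, H, W) backbone feature map
    image: (B, 1, H, W) grayscale input resized to the feature resolution
    """
    # Fixed Sobel kernels approximate horizontal/vertical intensity gradients.
    sobel_x = torch.tensor([[[[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]]])
    sobel_y = sobel_x.transpose(2, 3)
    gx = F.conv2d(image, sobel_x, padding=1)
    gy = F.conv2d(image, sobel_y, padding=1)
    edges = torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)   # (B, 1, H, W) edge magnitude

    # Concatenate the edge map as an extra channel so later layers can use
    # boundary cues to sharpen plane-segmentation borders.
    return torch.cat([feat, edges], dim=1)
```
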
4. Dekamin A, Wahab MIM, Guergachi A, Keshavjee K. FIUS: Fixed partitioning undersampling method. Clin Chim Acta 2021; 522:174-183. PMID: 34425104. DOI: 10.1016/j.cca.2021.08.023.
Abstract
BACKGROUND AND OBJECTIVE In the medical field, data-driven techniques for predicting and finding patterns of prevalent diseases are of increasing interest. Classification is one such method, used to predict the future onset of type 2 diabetes in individuals at high risk of progressing from pre-diabetes to diabetes. When classification techniques are applied to real-world datasets, imbalanced class distribution has been one of the most significant limitations, leading to misclassification of patients. In this paper, we propose a novel balancing method to improve the prediction performance of type 2 diabetes mellitus on imbalanced electronic medical records (EMRs). METHODS A novel undersampling method is proposed that partitions the data with a fixed partitioning distribution scheme over a regular grid. The proposed approach retains valuable information when balancing methods are applied to datasets. RESULTS After balancing the data with the proposed undersampling method, the logistic regression (LR) classifier obtained the best AUC (80%) on the EMR data among the classifiers compared. The new method improved the performance of the LR classifier relative to existing undersampling methods used in the balancing stage. CONCLUSION The results demonstrate the effectiveness and high performance of the proposed method for predicting diabetes on an imbalanced Canadian dataset. Our methodology can be used in other areas to overcome the limitations of imbalanced class distributions.
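
A minimal sketch of grid-based undersampling in the spirit of the method described above, assuming a regular grid over two features and one representative kept per occupied cell (the actual FIUS scheme may differ):

```python
import numpy as np

def grid_undersample(X_major, grid_size=10):
    """Hypothetical fixed-partitioning undersampling of the majority class:
    samples are binned into a regular grid over two chosen features and one
    representative per non-empty cell is kept, preserving local structure
    while reducing the class imbalance.
    """
    X_major = np.asarray(X_major, dtype=float)

    # Normalize the first two features to [0, 1) and assign each sample to a cell.
    lo = X_major[:, :2].min(axis=0)
    hi = X_major[:, :2].max(axis=0)
    cells = np.floor((X_major[:, :2] - lo) / (hi - lo + 1e-9) * grid_size).astype(int)
    keys = cells[:, 0] * grid_size + cells[:, 1]

    # Keep one representative per occupied grid cell.
    keep = [np.where(keys == key)[0][0] for key in np.unique(keys)]
    return X_major[np.array(keep)]
```

The reduced majority-class set would then be concatenated with the minority class before training a classifier such as logistic regression.
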
Affiliations
- Azam Dekamin: Department of Mechanical and Industrial Engineering, Ryerson University, 350 Victoria Street, Toronto, ON M5B 2K3, Canada
- M I M Wahab: Department of Mechanical and Industrial Engineering, Ryerson University, 350 Victoria Street, Toronto, ON M5B 2K3, Canada
- Aziz Guergachi: Ted Rogers School of Information Technology Management, Ryerson University, 350 Victoria Street, Toronto, ON M5B 2K3, Canada
- Karim Keshavjee: Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON M5T 3M6, Canada
5. Outdoor Scene Understanding Based on Multi-Scale PBA Image Features and Point Cloud Features. Sensors 2019; 19:4546. PMID: 31635059. PMCID: PMC6832457. DOI: 10.3390/s19204546.
Abstract
Outdoor scene understanding based on point cloud classification plays an important role for mobile robots and autonomous vehicles equipped with a light detection and ranging (LiDAR) system. In this paper, a novel model named the Panoramic Bearing Angle (PBA) image, generated from 3-D point clouds, is proposed. In the PBA model, laser points are projected onto a spherical surface to establish the correspondence between laser ranging points and image pixels, and the relative spatial relationships of the laser points in 3-D space are then used to compute the gray value of each corresponding pixel. To extract robust features from 3-D laser point clouds, an image pyramid and a point cloud pyramid are used to extract multi-scale features from the PBA images and the original point clouds, respectively. A Random Forest classifier performs feature screening on the extracted high-dimensional features to obtain the initial classification results. Moreover, reclassification is carried out to correct misclassified points by remapping the classification results into the PBA images and applying superpixel segmentation, which makes full use of the contextual information between laser points. Within each superpixel block, reclassification based on the initial classification results corrects some misclassified points and improves classification accuracy. Two datasets published by ETH Zurich and MINES ParisTech are used to evaluate classification performance, and the results are reported in terms of the precision and recall of the proposed algorithms.
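
The projection idea can be illustrated with the following Python sketch, which builds a panoramic bearing-angle-style image from a point cloud; the resolution, binning, and gray-value mapping are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def panoramic_bearing_angle(points, height=64, width=900):
    """Hypothetical PBA-style image: points (N, 3) are projected onto a
    spherical panoramic grid, and each pixel's gray value is derived from the
    angle between the laser beam and the segment joining horizontally
    adjacent range measurements (a bearing-angle-like quantity).
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1) + 1e-9
    az = np.arctan2(y, x)                       # azimuth in [-pi, pi]
    el = np.arcsin(z / r)                       # elevation

    # Map angles to pixel coordinates of the panoramic image.
    u = ((az + np.pi) / (2 * np.pi) * (width - 1)).astype(int)
    v = ((el - el.min()) / (el.max() - el.min() + 1e-9) * (height - 1)).astype(int)

    rng_img = np.zeros((height, width))
    rng_img[v, u] = r                           # one range per pixel (last write wins)

    # Bearing angle between horizontally neighboring beams, mapped to gray levels.
    d_az = 2 * np.pi / width
    r1, r2 = rng_img[:, :-1], rng_img[:, 1:]
    ba = np.arctan2(r2 * np.sin(d_az), r1 - r2 * np.cos(d_az))
    gray = np.zeros_like(rng_img)
    gray[:, 1:] = np.clip(ba / np.pi, 0, 1) * 255
    return gray.astype(np.uint8)
```
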
6. Abiodun OI, Jantan A, Omolara AE, Dada KV, Mohamed NA, Arshad H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018; 4:e00938. PMID: 30519653. PMCID: PMC6260436. DOI: 10.1016/j.heliyon.2018.e00938.
Abstract
This is a survey of neural network applications in real-world scenarios. It provides a taxonomy of artificial neural networks (ANNs) and furnishes the reader with knowledge of current and emerging trends in ANN applications research and areas of focus for researchers. Additionally, the study presents ANN application challenges and contributions, compares performances, and critiques methods. The study covers many applications of ANN techniques in various disciplines, including computing, science, engineering, medicine, environmental science, agriculture, mining, technology, climate, business, arts, and nanotechnology. The study found that neural network models such as feedforward and feedback propagation ANNs perform better in their applications to human problems. Therefore, we propose feedforward and feedback propagation ANN models as a research focus, based on data analysis factors such as accuracy, processing speed, latency, fault tolerance, volume, scalability, convergence, and performance. Moreover, we recommend that, instead of applying a single method, future research focus on combining ANN models into one network-wide application.
Affiliations
- Oludare Isaac Abiodun: School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia; Department of Computer Science, Bingham University, Karu, Nigeria
- Aman Jantan: School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia
- Kemi Victoria Dada: Department of Mathematical Sciences, Nasarawa State University, Keffi, Nigeria
- Humaira Arshad: Department of Computer Science and Information Technology, Islamia University of Bahawalpur, Pakistan
7. Gao H, Zhang X, Fang Y, Yuan J. A line segment extraction algorithm using laser data based on seeded region growing. Int J Adv Robot Syst 2018. DOI: 10.1177/1729881418755245.
Abstract
This article presents a novel line segment extraction algorithm for two-dimensional (2-D) laser data, composed of four main procedures: seed-segment detection, region growing, overlap region processing, and endpoint generation. Unlike existing approaches, the proposed algorithm borrows the idea of seeded region growing from the field of image processing, which makes it more efficient and yields more precise endpoints for the extracted line segments. Comparative experiments against the well-known Split-and-Merge algorithm, using real 2-D data taken from our hallway and laboratory, show the superior performance of the proposed approach in terms of efficiency, correctness, and precision.
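
The seed-detection and region-growing steps can be illustrated with the following Python sketch for an ordered 2-D scan; the thresholds, the line-fitting method, and the omission of overlap processing and endpoint generation are simplifications, not the authors' implementation:

```python
import numpy as np

def extract_line_segments(points, seed_len=5, dist_thresh=0.02):
    """Hypothetical seed-detection plus region growing for a 2-D laser scan.

    points: (N, 2) scan points ordered by bearing angle.
    Returns a list of (start_index, end_index, line) tuples, where
    line = (a, b, c) are coefficients of a*x + b*y + c = 0 with a^2 + b^2 = 1.
    """
    def fit_line(pts):
        # Total least squares line fit via SVD of the centered points.
        centroid = pts.mean(axis=0)
        _, _, vt = np.linalg.svd(pts - centroid)
        direction = vt[0]
        normal = np.array([-direction[1], direction[0]])
        return normal[0], normal[1], -normal @ centroid

    def point_dist(line, p):
        a, b, c = line
        return abs(a * p[0] + b * p[1] + c)

    segments, i, n = [], 0, len(points)
    while i + seed_len <= n:
        seed = points[i:i + seed_len]
        line = fit_line(seed)
        # Accept the window as a seed only if every seed point lies on the line.
        if max(point_dist(line, p) for p in seed) > dist_thresh:
            i += 1
            continue
        # Grow the region forward while new points stay close to the refitted line.
        j = i + seed_len
        while j < n and point_dist(line, points[j]) <= dist_thresh:
            j += 1
            line = fit_line(points[i:j])
        segments.append((i, j - 1, line))
        i = j
    return segments
```
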
Affiliations
- Haiming Gao: Institute of Robotics and Automatic Information System, Tianjin Key Laboratory of Intelligent Robotics, Nankai University, Tianjin, People’s Republic of China
- Xuebo Zhang: Institute of Robotics and Automatic Information System, Tianjin Key Laboratory of Intelligent Robotics, Nankai University, Tianjin, People’s Republic of China
- Yongchun Fang: Institute of Robotics and Automatic Information System, Tianjin Key Laboratory of Intelligent Robotics, Nankai University, Tianjin, People’s Republic of China
- Jing Yuan: Institute of Robotics and Automatic Information System, Tianjin Key Laboratory of Intelligent Robotics, Nankai University, Tianjin, People’s Republic of China