1
Sun X, Zeng K. RST: Rough Set Transformer for Point Cloud Learning. Sensors (Basel) 2023;23:9042. PMID: 38005431; PMCID: PMC10674457; DOI: 10.3390/s23229042. Received 09/26/2023; revised 10/25/2023; accepted 11/06/2023.
Abstract
Point cloud data generated by LiDAR sensors play a critical role in 3D sensing systems, with applications encompassing object classification, part segmentation, and point cloud recognition. Leveraging the global learning capacity of dot product attention, transformers have recently exhibited outstanding performance in point cloud learning tasks. Nevertheless, existing transformer models inadequately address the uncertainty features in point clouds, which can introduce errors into the dot product attention mechanism. In response, our study introduces a novel global guidance approach that tolerates uncertainty and provides more reliable guidance. We redefine the granulation and lower-approximation operators based on neighborhood rough set theory. Furthermore, we introduce a rough set-based attention mechanism tailored for point cloud data and present the rough set transformer (RST) network. Our approach establishes concepts from granulation generated over clusters of tokens, so that relationships between concepts can be explored from an approximation perspective rather than through specific dot product or addition functions. Empirically, our work represents the first fusion of rough set theory and transformer networks for point cloud learning. Our experimental results on point cloud classification and segmentation tasks demonstrate the superior performance of our method.
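For context, the standard scaled dot-product attention that this abstract argues is error-prone under uncertain point features can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' code; all names are ours.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Standard dot-product attention over token features.
    q, k, v: (n_tokens, d) arrays; returns (n_tokens, d)."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                     # pairwise token similarity
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ v                                # weighted mix of values

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(q, q, q).shape)    # (4, 8)
```

Any noise in the raw features enters directly through the `q @ k.T` similarity, which is the step the rough set-based mechanism replaces with approximation operators over granulated token clusters.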
Affiliation(s)
- Kai Zeng
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China;
2
Keramatfar A, Rafiee M, Amirkhani H. Graph Neural Networks: A bibliometrics overview. Machine Learning with Applications 2022. DOI: 10.1016/j.mlwa.2022.100401.
3
Zhang G, Zhu D, Shi W, Li J, Zhang X. SemRegionNet: Region ensemble 3D semantic instance segmentation network with semantic spatial aware discriminative loss. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.09.110.
4
Lin F, Xu Y, Zhang Z, Gao C, Yamada KD. Cosmos Propagation Network: Deep learning model for point cloud completion. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.08.007.
5
Xu Y, Arai S, Liu D, Lin F, Kosuge K. FPCC: Fast point cloud clustering-based instance segmentation for industrial bin-picking. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.04.023.
6

7
Liu D, Tian Y, Zhang Y, Gelernter J, Wang X. Heterogeneous data fusion and loss function design for tooth point cloud segmentation. Neural Comput Appl 2022. DOI: 10.1007/s00521-022-07379-y.
8
Ye C, Pan H, Yu X, Gao H. A spatially enhanced network with camera-lidar fusion for 3D semantic segmentation. Neurocomputing 2022. DOI: 10.1016/j.neucom.2020.12.135.
9
Zhao Y, Zhang L, Liu Y, Meng D, Cui Z, Gao C, Gao X, Lian C, Shen D. Two-Stream Graph Convolutional Network for Intra-Oral Scanner Image Segmentation. IEEE Transactions on Medical Imaging 2022;41:826-835. PMID: 34714743; DOI: 10.1109/tmi.2021.3124217.
Abstract
Precise segmentation of teeth from intra-oral scanner images is an essential task in computer-aided orthodontic surgical planning. State-of-the-art deep learning-based methods often simply concatenate the raw geometric attributes (i.e., coordinates and normal vectors) of mesh cells to train a single-stream network for automatic intra-oral scanner image segmentation. However, since different raw attributes reveal completely different geometric information, naively concatenating them at the (low-level) input stage may introduce unnecessary confusion in describing and differentiating between mesh cells, hampering the learning of high-level geometric representations for the segmentation task. To address this issue, we design a two-stream graph convolutional network (TSGCN) that handles inter-view confusion between different raw attributes, fuses their complementary information, and learns discriminative multi-view geometric representations. Specifically, our TSGCN adopts two input-specific graph-learning streams to extract complementary high-level geometric representations from coordinates and normal vectors, respectively. These single-view representations are then fused by a self-attention module that adaptively balances the contributions of the different views, yielding more discriminative multi-view representations for accurate and fully automatic tooth segmentation. We have evaluated our TSGCN on a real-patient dataset of dental (mesh) models acquired by 3D intraoral scanners. Experimental results show that our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
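The adaptive two-stream fusion described above can be illustrated with a toy gating scheme: per-cell softmax weights decide how much of the coordinate stream versus the normal stream enters the fused feature. This is a minimal sketch under our own assumptions (a crude sum-based score instead of learned attention), not the paper's implementation.

```python
import numpy as np

def fuse_two_streams(f_coord, f_normal):
    """Blend per-cell features from two streams with softmax gating.
    f_coord, f_normal: (n_cells, d) arrays; returns (n_cells, d)."""
    stacked = np.stack([f_coord, f_normal], axis=1)   # (n, 2, d)
    scores = stacked.sum(axis=-1)                     # toy per-view score
    scores -= scores.max(axis=1, keepdims=True)       # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)                 # (n, 2) view weights
    return (w[..., None] * stacked).sum(axis=1)       # (n, d) fused feature
```

In the real network the gating scores would be produced by a trained self-attention module rather than a fixed sum.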
10

11
Zhao Y, Zhang L, Yang C, Tan Y, Liu Y, Li P, Huang T, Gao C. 3D Dental model segmentation with graph attentional convolution network. Pattern Recognit Lett 2021. DOI: 10.1016/j.patrec.2021.09.005.
12

13
Huo X, Cui G, Tan J, Shao K. Automatic measurement of axial vertebral rotation in 3D vertebral models. Biomed Phys Eng Express 2021;7. PMID: 34598167; DOI: 10.1088/2057-1976/ac2c55. Received 07/05/2021; accepted 10/01/2021.
Abstract
Axial Vertebral Rotation (AVR) is a significant indicator of adolescent idiopathic scoliosis (AIS). A host of methods have been proposed to measure AVR on coronal-plane radiographs or 3D vertebral models. This paper presents a method for automatic AVR measurement on 3D vertebral models, based on a point cloud segmentation neural network and a spinous-process-tip search algorithm. An improved PointNet with multiple inputs and an attention mechanism, named Multi-Input PointNet, is proposed; it accurately segments the upper and lower endplates of the vertebral model to determine its transverse plane. An algorithm is developed to locate the tip of the spinous process according to the special structure of the vertebra. The AVR angle is then measured automatically from the midline of the vertebral model and the projection of the y-axis onto the transverse plane, using the points obtained above. We compare automatic with manual measurements on different vertebral models. Experiments show that the automatic results match the accuracy of manual measurement, with a correlation coefficient of 0.986, demonstrating that our automatic AVR measurement method performs well.
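The final angle-measurement step the abstract describes, projecting the midline and the y-axis onto the transverse plane and taking the angle between them, is plain vector geometry and can be sketched as follows. The function name and inputs are illustrative assumptions, not the paper's code.

```python
import numpy as np

def axial_rotation_angle(midline, plane_normal):
    """Angle (degrees) between the vertebral midline and the y-axis,
    both projected onto the transverse plane given by plane_normal."""
    n = np.asarray(plane_normal, float)
    n /= np.linalg.norm(n)
    def project(v):
        v = v - v.dot(n) * n              # drop the out-of-plane component
        return v / np.linalg.norm(v)
    a = project(np.asarray(midline, float))
    b = project(np.array([0.0, 1.0, 0.0]))  # y-axis reference
    cos = np.clip(a.dot(b), -1.0, 1.0)      # guard arccos domain
    return np.degrees(np.arccos(cos))

# A midline rotated 30 degrees from the y-axis within the x-y plane (normal = z):
m = [np.sin(np.radians(30)), np.cos(np.radians(30)), 0.0]
print(round(axial_rotation_angle(m, [0, 0, 1]), 1))  # 30.0
```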
Affiliation(s)
- Xing Huo
- Hefei University of Technology, Hefei, Anhui, People's Republic of China
- Guangpeng Cui
- Hefei University of Technology, Hefei, Anhui, People's Republic of China
- Jieqing Tan
- Hefei University of Technology, Hefei, Anhui, People's Republic of China
- Kun Shao
- Hefei University of Technology, Hefei, Anhui, People's Republic of China
14

15
Zhu L, Wang B, Tian G, Wang W, Li C. Towards point cloud completion: Point Rank Sampling and Cross-Cascade Graph CNN. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.07.035.
16
Xu Y, Zheng C, Xu R, Quan Y, Ling H. Multi-View 3D Shape Recognition via Correspondence-Aware Deep Learning. IEEE Transactions on Image Processing 2021;30:5299-5312. PMID: 34038361; DOI: 10.1109/tip.2021.3082310.
Abstract
In recent years, multi-view learning has emerged as a promising approach for 3D shape recognition, which identifies a 3D shape from its 2D views taken from different viewpoints. Usually, the correspondences inside a view or across different views encode the spatial arrangement of object parts and the symmetry of the object, providing useful geometric cues for recognition. However, such view correspondences have not been explicitly and fully exploited in existing work. In this paper, we propose a correspondence-aware representation (CAR) module, which explicitly finds potential intra-view and cross-view correspondences via kNN search in semantic space and then aggregates the shape features from the correspondences via learned transforms. In particular, the spatial relations of correspondences, in terms of their viewpoint positions and intra-view locations, are taken into account for learning correspondence-aware features. Incorporating the CAR module into a ResNet-18 backbone, we propose an effective deep model called CAR-Net for 3D shape classification and retrieval. Extensive experiments have demonstrated the effectiveness of the CAR module as well as the excellent performance of CAR-Net.
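The correspondence search at the heart of the CAR module, kNN in semantic (feature) space rather than 3D space, can be sketched as a top-k lookup over a feature-affinity matrix. Names and the single pooled feature matrix are our simplifying assumptions, not the paper's interface.

```python
import numpy as np

def semantic_correspondences(patch_feats, k=2):
    """For each view patch, return the indices of its k nearest
    neighbours in feature space. patch_feats: (n_patches, d)."""
    sim = patch_feats @ patch_feats.T        # feature-space affinities
    np.fill_diagonal(sim, -np.inf)           # exclude trivial self-matches
    return np.argsort(-sim, axis=1)[:, :k]   # top-k most similar patches

feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
print(semantic_correspondences(feats, k=1)[0, 0])  # 1
```

In the full model, the features gathered at these correspondence indices are then aggregated through learned transforms rather than used directly.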
17
DNet: Dynamic Neighborhood Feature Learning in Point Cloud. Sensors 2021;21:2327. PMID: 33810586; PMCID: PMC8037691; DOI: 10.3390/s21072327. Received 02/20/2021; revised 03/16/2021; accepted 03/23/2021.
Abstract
Neighborhood selection is very important for local region feature learning in point cloud learning networks, and different neighborhood selection schemes may lead to quite different results in point cloud processing tasks. Existing point cloud learning networks mainly adopt hand-crafted neighborhoods, without considering whether the selected neighborhood is reasonable. To solve this problem, this paper proposes a new point cloud learning network, denoted Dynamic neighborhood Network (DNet), that dynamically selects the neighborhood and learns the features of each point. The proposed DNet has a multi-head structure with two important modules: the Feature Enhancement Layer (FELayer) and the masking mechanism. The FELayer enhances the manifold features of the point cloud, while the masking mechanism removes neighborhood points with low contribution. DNet can learn the manifold and spatial geometric features of a point cloud and, through the masking mechanism, obtain the relationship between each point and its effective neighborhood points, so that dynamic neighborhood features of each point can be obtained. Experimental results on three public datasets demonstrate that the proposed DNet outperforms state-of-the-art learning networks on point cloud processing tasks.
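The masking idea, keep only the high-contribution neighbours out of a fixed kNN set before aggregating, can be sketched as below. This is a toy stand-in under our own assumptions (a dot-product contribution score and mean aggregation), not DNet's learned mechanism.

```python
import numpy as np

def masked_knn_features(points, feats, k=4, keep=2):
    """For each point: find its k nearest neighbours, score them by
    feature similarity, mask out all but the `keep` highest-scoring
    ones, and aggregate the rest. points: (n, 3) or (n, 2); feats: (n, d)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]      # k nearest, excluding self
    out = np.empty_like(feats)
    for i, idx in enumerate(nn):
        scores = feats[idx] @ feats[i]           # toy contribution score
        best = idx[np.argsort(scores)[-keep:]]   # drop low contributors
        out[i] = feats[best].mean(axis=0)        # aggregate survivors
    return out

pts = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [3.0, 0.0], [3.0, 1.0]])
fts = np.eye(5)
print(masked_knn_features(pts, fts, k=3, keep=2).shape)  # (5, 5)
```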
18
KVGCN: A KNN Searching and VLAD Combined Graph Convolutional Network for Point Cloud Segmentation. Remote Sensing 2021. DOI: 10.3390/rs13051003.
Abstract
Semantic segmentation of sensed point cloud data plays a significant role in scene understanding and reconstruction, robot navigation, etc. This work presents a Graph Convolutional Network integrating K-Nearest Neighbor (KNN) searching and the Vector of Locally Aggregated Descriptors (VLAD). KNN searching is used to construct the topological graph of each point and its neighbors. We then perform convolution on the edges of the constructed graph to extract representative local features with multiple Multilayer Perceptrons (MLPs). Afterwards, a trainable VLAD layer, NetVLAD, is embedded in the feature encoder to aggregate local and global contextual features. The feature encoder is repeated multiple times, and the extracted features are concatenated in a skip-connection style to strengthen their distinctiveness and thereby improve segmentation. Experimental results on two datasets show that the proposed work addresses the shortcoming of insufficient local feature extraction and improves the accuracy of semantic segmentation (mIoU 60.9% and oAcc 87.4% on S3DIS) compared to existing models.
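The first step of this pipeline, build each point's kNN graph and form the edge features an edge-convolution MLP would consume, can be sketched as follows. The `[x_i, x_j - x_i]` edge encoding is the common edge-conv convention; treating it as KVGCN's exact input is our assumption.

```python
import numpy as np

def knn_edge_features(points, k=3):
    """Build each point's kNN graph and form edge features
    [x_i, x_j - x_i]. points: (n, d); returns (n, k, 2*d)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]            # k nearest, excluding self
    center = np.repeat(points[:, None, :], k, axis=1)  # x_i repeated per edge
    neighbor = points[nn]                              # x_j for each edge
    return np.concatenate([center, neighbor - center], axis=-1)

p = np.array([[0.0], [1.0], [2.0], [3.0]])
print(knn_edge_features(p, k=2).shape)  # (4, 2, 2)
```

In the full network these edge features pass through shared MLPs, and the resulting local descriptors are aggregated by a NetVLAD layer.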
19

20
Wen X, Han Z, Liu X, Liu YS. Point2SpatialCapsule: Aggregating Features and Spatial Relationships of Local Regions on Point Clouds using Spatial-aware Capsules. IEEE Transactions on Image Processing 2020;PP:8855-8869. PMID: 32894715; DOI: 10.1109/tip.2020.3019925.
Abstract
Learning discriminative shape representation directly on point clouds is still challenging in 3D shape analysis and understanding. Recent studies usually involve three steps: first splitting a point cloud into local regions, then extracting the feature of each local region, and finally aggregating all individual local region features into a global shape representation using simple max-pooling. However, such pooling-based feature aggregation does not adequately take the spatial relationships between local regions (e.g., their locations relative to other regions) into account, which greatly limits the ability to learn discriminative shape representation. To address this issue, we propose a novel deep learning network, named Point2SpatialCapsule, for aggregating features and spatial relationships of local regions on point clouds, which aims to learn more discriminative shape representation. Compared with traditional max-pooling based feature aggregation networks, Point2SpatialCapsule can explicitly learn not only the geometric features of local regions but also the spatial relationships among them. Point2SpatialCapsule consists of two main modules. To resolve the disorder problem of local regions, the first module, named geometric feature aggregation, aggregates the local region features into learnable cluster centers, which explicitly encodes the spatial locations from the original 3D space. The second module, named spatial relationship aggregation, further aggregates the clustered features and the spatial relationships among them in feature space, using the spatial-aware capsules developed in this paper. Compared to previous capsule network based methods, feature routing on the spatial-aware capsules can learn more discriminative spatial relationships among local regions, establishing a direct mapping between log priors and spatial locations through feature clusters. Experimental results demonstrate that Point2SpatialCapsule outperforms the state-of-the-art methods in 3D shape classification, retrieval, and segmentation tasks on the well-known ModelNet and ShapeNet datasets.
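The contrast the abstract draws, max-pooling that discards spatial relations versus soft assignment to learnable cluster centers, can be illustrated with two small functions. This is a simplified analogue under our own assumptions (fixed centers, softmax assignment); in the real network the centers are trained and the clustered features feed spatial-aware capsules.

```python
import numpy as np

def maxpool_aggregate(region_feats):
    """The plain max-pooling baseline: (n_regions, d) -> (d,).
    All information about which region contributed what is lost."""
    return region_feats.max(axis=0)

def cluster_aggregate(region_feats, centers):
    """Soft-assign region features to cluster centers before pooling.
    region_feats: (n, d); centers: (m, d); returns (m, d) per-cluster."""
    sim = region_feats @ centers.T                # (n, m) affinities
    sim -= sim.max(axis=1, keepdims=True)         # numerical stability
    w = np.exp(sim)
    w /= w.sum(axis=1, keepdims=True)             # soft assignment weights
    return w.T @ region_feats                     # weighted per-cluster sums
```

Because each cluster output is a weighted sum over regions, the per-cluster features retain which regions (and hence which spatial locations) contributed, unlike the single max-pooled vector.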