1. Fu Z, Guo Y, Chen M, Hu Q, Laga H, Boussaid F, Bennamoun M. WSSIC-Net: Weakly-Supervised Semantic Instance Completion of 3D Point Cloud Scenes. IEEE Transactions on Image Processing 2025; 34:2008-2019. PMID: 40146644. DOI: 10.1109/tip.2024.3520013.
Abstract
Semantic instance completion aims to recover the complete 3D shapes of foreground objects, together with their labels, from a partial 2.5D scan of a scene. Previous works have relied on full supervision, which requires ground-truth annotations in the form of bounding boxes and complete 3D objects. This has greatly limited their real-world application because the acquisition of ground-truth data is very costly and time-consuming. To address this bottleneck, we propose a Weakly-Supervised Semantic Instance Completion Network (WSSIC-Net), which learns real-world partial point cloud object completion without requiring ground-truth complete 3D objects. Instead, WSSIC-Net leverages 3D ground-truth bounding boxes, partial objects of a raw scene, and unpaired synthetic 3D point clouds. More specifically, a 3D detector encodes partial point clouds into proposal features, which are then fed into two branches. The first branch performs fully supervised box prediction based on the proposal features. The second branch, hereinafter called instance completion, treats the proposal features as partial object features to achieve weakly-supervised instance completion. A Generative Adversarial Network (GAN) completes the partial features of the 2.5D foreground objects of real-world scenes using only unpaired but semantically consistent complete synthetic point clouds. In our experiments, we demonstrate that the fully supervised 3D detection and the weakly supervised instance completion complement one another. The qualitative and quantitative evaluations on the ScanNet v2 dataset demonstrate that the proposed "weakly-supervised" approach consistently achieves performance comparable to state-of-the-art "fully supervised" methods.
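To make the two-branch design concrete, the following minimal PyTorch sketch shows how proposal features from a 3D detector could feed a fully supervised box head and a weakly supervised completion head whose outputs are judged by a discriminator against unpaired synthetic shapes. All layer sizes, the 7-value box encoding, and the module names are illustrative assumptions, not the WSSIC-Net architecture.

```python
import torch
import torch.nn as nn

class TwoBranchHead(nn.Module):
    def __init__(self, feat_dim=256, num_points=1024):
        super().__init__()
        # Branch 1: fully supervised box regression (e.g., center, size, heading).
        self.box_head = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                      nn.Linear(128, 7))
        # Branch 2: weakly supervised completion decoder (proposal feature -> dense points).
        self.completion_head = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(),
                                             nn.Linear(512, num_points * 3))
        # Discriminator scores completed objects against unpaired synthetic complete shapes.
        self.discriminator = nn.Sequential(nn.Linear(num_points * 3, 256), nn.ReLU(),
                                           nn.Linear(256, 1))

    def forward(self, proposal_feats):                # (B, feat_dim) from the 3D detector
        boxes = self.box_head(proposal_feats)         # trained with ground-truth boxes
        flat = self.completion_head(proposal_feats)   # trained adversarially (no paired GT)
        return boxes, flat.view(flat.size(0), -1, 3)  # (B, 7), (B, num_points, 3)
```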
2. Chen B, Lv X, Zhao Y, Yu L. TPDC: Point Cloud Completion by Triangular Pyramid Features and Divide-and-Conquer in Complex Environments. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:6029-6040. PMID: 38758619. DOI: 10.1109/tnnls.2024.3397988.
Abstract
Point cloud completion recovers complete point clouds from partial ones, providing rich point cloud information for downstream tasks such as 3-D reconstruction and target detection. However, previous methods usually suffer from the unstructured prediction of points in local regions and the discrete nature of the point cloud. To resolve these problems, we propose a point cloud completion network called TPDC. Representing the point cloud as a set of unordered point features with local geometric information, we devise a Triangular Pyramid Extractor (TPE), which uses the simplest 3-D structure, a triangular pyramid, to convert the point cloud into a sequence of local geometric information. Our insight for revealing local geometric information in a complex environment is to design a Divide-and-Conquer Splitting Module in a Divide-and-Conquer Splitting Decoder (DCSD) that learns the point-splitting patterns that best fit local regions. This module uses a divide-and-conquer approach to handle, in parallel, the tasks of fitting base points to the ground truth and predicting the displacement of split points, so that the base points align more closely with the ground-truth values while the displacement of split points relative to the base points is forecast. Furthermore, we propose a more realistic and challenging benchmark, ShapeNetMask, with more random point cloud input, more complex random item occlusion, and more realistic random environmental perturbations. The results show that our method outperforms existing approaches on both the widely used benchmarks and the new benchmark.
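As a rough illustration of the triangular-pyramid idea, the NumPy sketch below treats each point together with its three nearest neighbours as a small tetrahedron and records its edge lengths and volume as a local descriptor. The neighbourhood size and the descriptor itself are assumptions; the actual TPE learns features rather than computing these fixed quantities.

```python
import numpy as np

def triangular_pyramid_features(points, k=3):
    # points: (N, 3) array of xyz coordinates
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)  # (N, N) distances
    np.fill_diagonal(d, np.inf)
    nbr_idx = np.argsort(d, axis=1)[:, :k]                # 3 nearest neighbours per point
    feats = []
    for i, nbrs in enumerate(nbr_idx):
        apex, base = points[i], points[nbrs]              # pyramid apex and base triangle
        edges = base - apex                               # (3, 3) edge vectors from apex
        vol = np.abs(np.linalg.det(edges)) / 6.0          # tetrahedron volume
        feats.append(np.concatenate([np.linalg.norm(edges, axis=1), [vol]]))
    return np.array(feats)                                # (N, 4) local descriptors
```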
3. Jian L, Qiu W, Cheng Y. Accurate estimation of concrete consumption in tunnel lining using terrestrial laser scanning. Sci Rep 2024; 14:2705. PMID: 38302548. PMCID: PMC10834988. DOI: 10.1038/s41598-023-51132-x.
Abstract
Accurate estimation of concrete (including shotcrete) consumption plays a crucial role in tunnel construction. A novel method is introduced to accurately estimate concrete consumption with terrestrial laser scanning (TLS). The estimation requires TLS data of tunnel surfaces captured at different stages of construction. After unrolling the point clouds, a novel two-stage algorithm consisting of noise removal and hole filling is used to generate resampled points. The resampled points from two scans (before and after lining construction) are then combined into a computation model composed of multiple hexahedral elements, which is used to calculate volumes. The proposed technique was applied to the Tiantaishan highway tunnel and the Da Fang Shan high-speed railway tunnel. The relative error in the calculated rebound rate is 0.19%, and the average relative error in predicting the demand for secondary lining concrete is 0.15%. Compared with 3D Delaunay triangulation with curve fitting, the proposed technique offers a more straightforward operation and higher accuracy. Considering factors such as tunnel geometry, support design, and concrete properties, such a computational model provides valuable insights into optimizing resource allocation and reducing material waste during construction.
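For intuition about the hexahedral computation model, here is a small NumPy example that computes the volume of one hexahedral element from its eight corner points by splitting it into five tetrahedra. The vertex ordering and element construction are assumptions rather than the paper's exact model; the total consumption would be the sum over all elements.

```python
import numpy as np

def tet_volume(a, b, c, d):
    # Volume of a tetrahedron from its four vertices.
    return abs(np.linalg.det(np.stack([b - a, c - a, d - a]))) / 6.0

def hexahedron_volume(v):
    # v: (8, 3) corners, bottom face v[0..3], top face v[4..7] (v[4] above v[0], ...).
    tets = [(0, 1, 3, 4), (1, 2, 3, 6), (1, 4, 5, 6), (3, 4, 6, 7), (1, 3, 4, 6)]
    return sum(tet_volume(v[i], v[j], v[k], v[l]) for i, j, k, l in tets)

# Sanity check: a unit cube has volume 1.
cube = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],
                 [0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]], float)
print(hexahedron_volume(cube))  # 1.0
```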
Affiliation(s)
- Liao Jian
- School of Civil Engineering, Key Laboratory of Transportation Tunnel Engineering, Ministry of Education, Southwest Jiaotong University, No. 111, North Section, Second Ring Road, Jinniu District, Chengdu, 610031, Sichuan, China
- Wenge Qiu
- School of Civil Engineering, Key Laboratory of Transportation Tunnel Engineering, Ministry of Education, Southwest Jiaotong University, No. 111, North Section, Second Ring Road, Jinniu District, Chengdu, 610031, Sichuan, China
- Chengdu Tianyou Tunnelkey Co., Ltd., Chengdu, 610031, Sichuan, China
- Yunjian Cheng
- School of Civil Engineering and Geomatics, Southwest Petroleum University, Chengdu, 610500, Sichuan, China.
4. Liao W, Subpa-Asa A, Asano Y, Zheng Y, Kajita H, Imanishi N, Yagi T, Aiso S, Kishi K, Sato I. Reliability-Aware Restoration Framework for 4D Spectral Photoacoustic Data. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:15445-15461. PMID: 37651493. DOI: 10.1109/tpami.2023.3310981.
Abstract
Spectral photoacoustic imaging (PAI) is a new technology that can provide the 3D geometric structure of the interior of a target, together with 1D wavelength-dependent absorption information, in a non-invasive manner. It has potentially broad applications in clinical and medical diagnosis. Unfortunately, the usability of spectral PAI is severely affected by a time-consuming data scanning process and complex noise. Therefore, in this study, we propose a reliability-aware restoration framework to recover clean 4D data from incomplete and noisy observations. To the best of our knowledge, this is the first attempt at the 4D spectral PA data restoration problem that solves data completion and denoising simultaneously. We first present a sequence of analyses, including modeling data reliability in the depth and spectral domains, developing an adaptive correlation graph, and analyzing local patch orientation. On the basis of these analyses, we exploit global sparsity and local self-similarity for restoration. We demonstrate the effectiveness of the proposed approach through experiments on real data captured from patients, where our approach outperforms state-of-the-art methods in both objective evaluation and subjective assessment.
5. Xiang P, Wen X, Liu YS, Cao YP, Wan P, Zheng W, Han Z. Snowflake Point Deconvolution for Point Cloud Completion and Generation With Skip-Transformer. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:6320-6338. PMID: 36282830. DOI: 10.1109/tpami.2022.3217161.
Abstract
Most existing point cloud completion methods suffer from the discrete nature of point clouds and the unstructured prediction of points in local regions, which makes it difficult to reveal fine local geometric details. To resolve this issue, we propose SnowflakeNet with snowflake point deconvolution (SPD) to generate complete point clouds. SPD models the generation of point clouds as the snowflake-like growth of points, where child points are generated progressively by splitting their parent points after each SPD. Our insight into the detailed geometry is to introduce a skip-transformer in the SPD to learn the point splitting patterns that best fit the local regions. The skip-transformer leverages an attention mechanism to summarize the splitting patterns used in the previous SPD layer and produce the splitting in the current layer. The locally compact and structured point clouds generated by SPD precisely reveal the structural characteristics of the 3D shape in local patches, which enables us to predict highly detailed geometries. Moreover, since SPD is a general operation that is not limited to completion, we explore its applications in other generative tasks, including point cloud auto-encoding, generation, single image reconstruction, and upsampling. Experimental results show that our method outperforms state-of-the-art methods on widely used benchmarks.
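The sketch below illustrates the basic point-splitting operation in PyTorch: each parent point is duplicated k times and offset by predicted displacements, so the cloud grows coarse-to-fine. The displacement predictor here is a plain MLP placeholder; the actual SPD conditions the splitting on the previous layer through a skip-transformer.

```python
import torch
import torch.nn as nn

class PointSplit(nn.Module):
    def __init__(self, feat_dim=128, k=4):
        super().__init__()
        self.k = k
        # Predict k small 3D offsets per parent point (placeholder for SPD's predictor).
        self.offset = nn.Sequential(nn.Linear(feat_dim + 3, 128), nn.ReLU(),
                                    nn.Linear(128, 3 * k))

    def forward(self, parents, feats):
        # parents: (B, N, 3) points, feats: (B, N, feat_dim) per-point features
        disp = self.offset(torch.cat([parents, feats], dim=-1))   # (B, N, 3k)
        disp = disp.view(*parents.shape[:2], self.k, 3)           # (B, N, k, 3)
        children = parents.unsqueeze(2) + disp                    # split around each parent
        return children.reshape(parents.size(0), -1, 3)           # (B, N*k, 3)
```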
6. Qayyum A, Ilahi I, Shamshad F, Boussaid F, Bennamoun M, Qadir J. Untrained Neural Network Priors for Inverse Imaging Problems: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:6511-6536. PMID: 36063506. DOI: 10.1109/tpami.2022.3204527.
Abstract
In recent years, advances in machine learning (ML), and in particular deep learning (DL), have gained considerable momentum in solving inverse imaging problems, often surpassing the performance of hand-crafted approaches. Traditionally, analytical methods have been used to solve inverse imaging problems such as image restoration, inpainting, and super-resolution. Unlike analytical methods, for which the problem is explicitly defined and domain knowledge is carefully engineered into the solution, DL models do not benefit from such prior knowledge and instead use large datasets to predict an unknown solution to the inverse problem. Recently, a new paradigm of training deep models on a single image, named the untrained neural network prior (UNNP), has been proposed to solve a variety of inverse tasks, e.g., restoration and inpainting. Since then, many researchers have proposed various applications and variants of UNNP. In this paper, we present a comprehensive review of such studies and of various UNNP applications for different tasks, and we highlight open research problems that require further investigation.
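The following minimal PyTorch sketch conveys the UNNP idea in its simplest, deep-image-prior-style form: an untrained CNN with a fixed random input is optimized to reproduce a single degraded observation, and early stopping acts as the regularizer. The architecture, learning rate, and iteration count are placeholders, not any specific method covered by the survey.

```python
import torch
import torch.nn as nn

noisy = torch.rand(1, 3, 64, 64)                    # single degraded observation
z = torch.randn(1, 32, 64, 64)                      # fixed random input code
net = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 3, 3, padding=1))  # untrained prior network
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(500):                             # early stopping is the implicit prior
    opt.zero_grad()
    loss = ((net(z) - noisy) ** 2).mean()           # fit the single observation only
    loss.backward()
    opt.step()

restored = net(z).detach()                          # output before overfitting to the noise
```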
7. Wen X, Xiang P, Han Z, Cao YP, Wan P, Zheng W, Liu YS. PMP-Net++: Point Cloud Completion by Transformer-Enhanced Multi-Step Point Moving Paths. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:852-867. PMID: 35290184. DOI: 10.1109/tpami.2022.3159003.
Abstract
Point cloud completion concerns predicting the missing parts of incomplete 3D shapes. A common strategy is to generate the complete shape from the incomplete input. However, the unordered nature of point clouds degrades the generation of high-quality 3D shapes, as the detailed topology and structure of unordered points are hard to capture during the generative process using an extracted latent code. We address this problem by formulating completion as a point cloud deformation process. Specifically, we design a novel neural network, named PMP-Net++, to mimic the behavior of an earth mover. It moves each point of the incomplete input to obtain a complete point cloud, where the total distance of the point moving paths (PMPs) should be the shortest. Therefore, PMP-Net++ predicts a unique PMP for each point according to the constraint on point moving distances. The network learns a strict and unique correspondence at the point level, and thus improves the quality of the predicted complete shape. Moreover, since moving points relies heavily on the per-point features learned by the network, we further introduce a transformer-enhanced representation learning network, which significantly improves the completion performance of PMP-Net++. We conduct comprehensive experiments on shape completion and further explore the application to point cloud up-sampling, which demonstrate the non-trivial improvement of PMP-Net++ over state-of-the-art point cloud completion and up-sampling methods.
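A minimal PyTorch sketch of the point-moving-path formulation is given below: each input point is moved over several small steps, and the total moving distance is penalized so the deformation stays short and point-wise unique. The per-step predictor is a placeholder MLP, not the transformer-enhanced network of PMP-Net++.

```python
import torch
import torch.nn as nn

class PointMover(nn.Module):
    def __init__(self, steps=3):
        super().__init__()
        self.steps = steps
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, pts):                      # pts: (B, N, 3) partial input
        path_len = 0.0
        for _ in range(self.steps):
            disp = self.mlp(pts)                 # per-point displacement for this step
            path_len = path_len + disp.norm(dim=-1).mean()
            pts = pts + disp                     # move points toward the complete shape
        return pts, path_len                     # path_len joins the training loss
```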
8. Lv T, Pan Z, Wei W, Yang G, Song J, Wang X, Sun L, Li Q, Sun X. Iterative deep neural networks based on proximal gradient descent for image restoration. PLoS One 2022; 17:e0276373. PMID: 36331931. PMCID: PMC9635693. DOI: 10.1371/journal.pone.0276373.
Abstract
Algorithm unfolding networks, which combine the explainability of classical algorithms with the efficiency of deep neural networks (DNNs), have received considerable attention for solving ill-posed inverse problems. Within the algorithm unfolding framework, we propose a novel end-to-end iterative deep neural network and its fast variant for image restoration. The first network is designed from the proximal gradient descent algorithm of variational models and consists of denoiser and reconstruction sub-networks. The second is its accelerated version with momentum factors. For the denoiser sub-network, we embed the Convolutional Block Attention Module (CBAM) in a previously used U-Net for adaptive feature refinement. Experiments on image denoising and deblurring demonstrate that the proposed networks achieve competitive quality and efficiency compared with several state-of-the-art networks for image restoration. The proposed unfolding DNNs can be easily extended to other similar image restoration tasks, such as image super-resolution and image demosaicking.
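The sketch below shows one unfolded proximal-gradient stage in PyTorch, x_{k+1} = D_k(x_k - alpha_k * A^T(A x_k - y)), with the proximal operator replaced by a small learned denoiser. The denoiser and the forward operator are placeholders rather than the paper's CBAM U-Net, and the accelerated variant would add a momentum extrapolation before the gradient step.

```python
import torch
import torch.nn as nn

class PGDStage(nn.Module):
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.1))           # learned step size
        # Learned proximal operator (placeholder denoiser, single-channel images).
        self.denoiser = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, x, y, A, At):
        grad = At(A(x) - y)                                     # data-fidelity gradient
        return self.denoiser(x - self.alpha * grad)             # proximal step as denoising

# Unrolled network: a fixed number of stages applied in sequence.
stages = nn.ModuleList([PGDStage() for _ in range(5)])
```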
Affiliation(s)
- Ting Lv
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
- Zhenkuan Pan
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
- Weibo Wei
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
- Guangyu Yang
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
- Jintao Song
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
- Xuqing Wang
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
- Lu Sun
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
- Qian Li
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
- Xiatao Sun
- School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
9. Dinesh C, Cheung G, Bajic IV. Point Cloud Video Super-Resolution via Partial Point Coupling and Graph Smoothness. IEEE Transactions on Image Processing 2022; 31:4117-4132. PMID: 35696478. DOI: 10.1109/tip.2022.3166644.
Abstract
A point cloud (PC) is a collection of discrete geometric samples of a physical object in 3D space. A PC video consists of temporal frames evenly spaced in time, each containing a static PC at one time instant. PCs in adjacent frames typically do not have point-to-point (P2P) correspondence, and thus exploiting temporal redundancy for PC restoration across frames is difficult. In this paper, we focus on the super-resolution (SR) problem for PC video: increasing the point density of PCs in video frames while preserving salient geometric features consistently across time. We accomplish this with two ideas. First, we establish partial P2P coupling between PCs of adjacent frames by interpolating interior points in a low-resolution PC patch in frame t and translating them to a corresponding patch in frame t+1, via a motion model computed by iterative closest point (ICP). Second, we promote piecewise smoothness of the 3D geometry in each patch using a feature graph Laplacian regularizer (FGLR) in an easily computable quadratic form. The two ideas translate to an unconstrained quadratic programming (QP) problem whose solution is given by a system of linear equations, for which we ensure numerical stability by upper-bounding the condition number of the coefficient matrix. Finally, to improve the accuracy of the ICP motion model, we re-sample points in a super-resolved patch at time t to better match a low-resolution patch at time t+1 via bipartite graph matching after each SR iteration. Experimental results show temporally consistent super-resolved PC videos generated by our scheme, outperforming SR competitors that optimize on a per-frame basis, in two established PC metrics.
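To illustrate the graph-smoothness step, the NumPy sketch below minimizes a quadratic objective ||Hx - b||^2 + mu * x^T L x with a combinatorial graph Laplacian L, which reduces to a single linear system. H, b, the toy graph, and mu are placeholders; the paper builds L from a feature graph and explicitly bounds the condition number of the coefficient matrix.

```python
import numpy as np

def graph_laplacian(W):
    # Combinatorial Laplacian L = D - W from a symmetric adjacency matrix W.
    return np.diag(W.sum(axis=1)) - W

def fglr_solve(H, b, W, mu=0.5):
    L = graph_laplacian(W)
    A = H.T @ H + mu * L            # coefficient matrix of the quadratic program
    return np.linalg.solve(A, H.T @ b)  # closed-form minimizer

# Toy example: 4 coupled coordinates on a path graph.
W = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
H = np.eye(4)
b = np.array([0.0, 1.2, 0.9, 2.1])
print(fglr_solve(H, b, W))          # graph-smoothed coordinates
```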
10. Uddin K, Jeong TH, Oh BT. Incomplete Region Estimation and Restoration of 3D Point Cloud Human Face Datasets. Sensors (Basel) 2022; 22:723. PMID: 35161471. PMCID: PMC8840486. DOI: 10.3390/s22030723.
Abstract
Owing to imperfect scans, occlusions, low reflectance of the scanned surface, and packet loss, there may be several incomplete regions in a 3D point cloud dataset. These missing regions can degrade the performance of recognition, classification, segmentation, or upsampling methods on point cloud datasets. In this study, we propose a new approach to estimate the incomplete regions of 3D point cloud human face datasets using a masking method. First, we perform preprocessing on the input point cloud, such as rotation to left and right angles. Then, we project the preprocessed point cloud onto a 2D surface and generate masks. Finally, we interpolate the 2D projection and the mask to produce the estimated point cloud. We also designed a deep learning model to restore the estimated point cloud and improve its quality. We use the chamfer distance (CD) and Hausdorff distance (HD) to evaluate the proposed method on our own human face dataset and the large-scale facial model (LSFM) dataset. The proposed method achieves average CD and HD values of 1.30 and 21.46 on our own dataset and 1.35 and 9.08 on the LSFM dataset, respectively. The proposed method shows better results than existing methods.
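A minimal NumPy sketch of the projection-and-masking step is shown below: the point cloud is projected onto a 2D grid to form a depth image, and empty cells become the mask of incomplete regions to be filled. The grid resolution, coordinate normalization, and nearest-depth rule are assumptions.

```python
import numpy as np

def project_with_mask(points, res=64):
    # points: (N, 3); x, y assumed normalized to [0, 1), z is depth along the view axis
    depth = np.zeros((res, res))
    filled = np.zeros((res, res), dtype=bool)
    u = (points[:, 0] * res).astype(int).clip(0, res - 1)
    v = (points[:, 1] * res).astype(int).clip(0, res - 1)
    for ui, vi, zi in zip(u, v, points[:, 2]):
        if not filled[vi, ui] or zi < depth[vi, ui]:   # keep the nearest depth per cell
            depth[vi, ui], filled[vi, ui] = zi, True
    return depth, ~filled                              # mask marks cells with no points
```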
11. Chen J, Yi JSK, Kahoush M, Cho ES, Cho YK. Point Cloud Scene Completion of Obstructed Building Facades with Generative Adversarial Inpainting. Sensors (Basel) 2020; 20:5029. PMID: 32899749. PMCID: PMC7571037. DOI: 10.3390/s20185029.
Abstract
Collecting 3D point cloud data of buildings is important for many applications such as urban mapping, renovation, preservation, and energy simulation. However, laser-scanned point clouds are often difficult to analyze, visualize, and interpret due to incompletely scanned building facades caused by numerous sources of defects such as noise, occlusions, and moving objects. Several point cloud scene completion algorithms have been proposed in the literature, but they have been mostly applied to individual objects or small-scale indoor environments and not on large-scale scans of building facades. This paper introduces a method of performing point cloud scene completion of building facades using orthographic projection and generative adversarial inpainting methods. The point cloud is first converted into the 2D structured representation of depth and color images using an orthographic projection approach. Then, a data-driven 2D inpainting approach is used to predict the complete version of the scene, given the incomplete scene in the image domain. The 2D inpainting process is fully automated and uses a customized generative-adversarial network based on Pix2Pix that is trainable end-to-end. The inpainted 2D image is finally converted back into a 3D point cloud using depth remapping. The proposed method is compared against several baseline methods, including geometric methods such as Poisson reconstruction and hole-filling, as well as learning-based methods such as the point completion network (PCN) and TopNet. Performance evaluation is carried out based on the task of reconstructing real-world building facades from partial laser-scanned point clouds. Experimental results using the performance metrics of voxel precision, voxel recall, position error, and color error showed that the proposed method has the best performance overall.
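As a rough sketch of the final depth-remapping step, the NumPy function below maps each pixel of an orthographic depth image back to a 3D point using the projection plane's origin, axes, and pixel size. All of these parameters, and the function itself, are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def depth_image_to_points(depth, origin, u_axis, v_axis, normal, pixel_size):
    # depth: (H, W) orthographic depth along `normal`; u_axis, v_axis, normal are unit vectors
    H, W = depth.shape
    vv, uu = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pts = (origin
           + uu[..., None] * pixel_size * u_axis      # step along the plane's u direction
           + vv[..., None] * pixel_size * v_axis      # step along the plane's v direction
           + depth[..., None] * normal)               # push out of the plane by the depth
    return pts.reshape(-1, 3)                         # one 3D point per inpainted pixel
```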
Affiliation(s)
- Jingdao Chen
- Institute for Robotics and Intelligent Machines, Georgia Institute of Technology, 801 Atlantic Dr. N.W., Atlanta, GA 30332, USA
- John Seon Keun Yi
- School of Computer Science, Georgia Institute of Technology, 801 Atlantic Dr. N.W., Atlanta, GA 30332, USA
- Mark Kahoush
- School of Computer Science, Georgia Institute of Technology, 801 Atlantic Dr. N.W., Atlanta, GA 30332, USA
- Erin S. Cho
- Alpharetta High School, 3595 Webb Bridge Rd, Alpharetta, GA 30005, USA
- Yong K. Cho
- School of Civil and Environmental Engineering, Georgia Institute of Technology, 790 Atlantic Dr. N.W., Atlanta, GA 30332, USA