1.
Su H, Liu Q, Yuan H, Cheng Q, Hamzaoui R. Support Vector Regression-Based Reduced-Reference Perceptual Quality Model for Compressed Point Clouds. IEEE Transactions on Multimedia 2023;26:6238-6249. PMID: 39600490; PMCID: PMC11586859; DOI: 10.1109/tmm.2023.3347638.
Abstract
Video-based point cloud compression (V-PCC) is a state-of-the-art Moving Picture Experts Group (MPEG) standard for point cloud compression. V-PCC can compress both static and dynamic point clouds in a lossless, near-lossless, or lossy way. Many objective quality metrics have been proposed for distorted point clouds. Most of these are full-reference metrics that require both the original point cloud and the distorted one. However, in some real-time applications, the original point cloud is not available, and no-reference or reduced-reference quality metrics are needed. Three main challenges in the design of a reduced-reference quality metric are how to build a set of features that characterize the visual quality of the distorted point cloud, how to select the most effective features from this set, and how to map the selected features to a perceptual quality score. We address the first challenge by proposing a comprehensive set of features consisting of compression, geometry, normal, curvature, and luminance features. To deal with the second challenge, we use the least absolute shrinkage and selection operator (LASSO), a variable selection method for regression problems. Finally, we map the selected features to the mean opinion score in a nonlinear space. Although we use only 19 features in our current implementation, our metric is flexible enough to accommodate any number of features, including more effective ones in the future. Experimental results on the Waterloo point cloud dataset version 2 (WPC2.0) and the MPEG point cloud compression dataset (M-PCCD) show that our method, named PCQAML, outperforms state-of-the-art full-reference and reduced-reference quality metrics in terms of Pearson linear correlation coefficient, Spearman rank-order correlation coefficient, Kendall's rank-order correlation coefficient, and root mean squared error.
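The LASSO selection step described in this abstract can be sketched with a minimal iterative soft-thresholding (ISTA) solver on synthetic data. This is not the authors' implementation — they select among 19 point-cloud features against real mean opinion scores — but it shows how the L1 penalty zeroes out uninformative features:

```python
import numpy as np

def lasso_ista(X, y, lam=0.1, n_iter=500):
    # Iterative soft-thresholding for the LASSO objective
    # 0.5 * ||y - Xw||^2 + lam * ||w||_1
    n, d = X.shape
    w = np.zeros(d)
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        z = w - step * grad
        # soft-thresholding: shrinks small coefficients to exactly zero
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w

# Toy example: 6 candidate features, only features 0 and 3 are informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 0.01 * rng.standard_normal(200)
w = lasso_ista(X, y, lam=5.0)
selected = np.flatnonzero(np.abs(w) > 1e-2)
print(selected)  # indices of the features the L1 penalty keeps
```

The surviving features would then be fed to the nonlinear regression stage (support vector regression, per the paper's title).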
Affiliation(s)
- Honglei Su
- College of Electronics and Information, Qingdao University, Qingdao 266071, China
- Qi Liu
- College of Electronics and Information, Qingdao University, Qingdao 266237, China
- Hui Yuan
- School of Control Science and Engineering, Shandong University, Ji'nan 250061, China
- Qiang Cheng
- Department of Computer Science, University of Kentucky, USA
- Raouf Hamzaoui
- School of Engineering and Sustainable Development, De Montfort University, Leicester, UK
2.
Quality Assessment of View Synthesis Based on Visual Saliency and Texture Naturalness. Electronics 2022. DOI: 10.3390/electronics11091384.
Abstract
Depth-image-based rendering (DIBR) is one of the core techniques for generating new views in 3D video applications. However, the distortion characteristics of DIBR-synthesized views differ from those of 2D images. It is therefore necessary to study the unique distortion characteristics of DIBR views and to design effective, efficient algorithms that evaluate DIBR-synthesized images and guide DIBR algorithms. In this work, visual saliency and texture naturalness features are extracted to evaluate the quality of DIBR views. After feature extraction, we adopt a machine learning method to map the extracted features to the quality score of the DIBR views. Experiments conducted on two synthesized-view databases, IETR and IRCCyN/IVC, show that our proposed algorithm outperforms the compared synthesized-view quality evaluation methods.
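The feature-to-score mapping stage can be illustrated with a small learned regressor. Kernel ridge regression is used below as a generic stand-in (the abstract does not name the exact learner), and the two-dimensional features and MOS values are synthetic:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between row-vector sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_krr(X, y, lam=1e-4, gamma=1.0):
    # Closed-form kernel ridge regression: alpha = (K + lam*I)^-1 y
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict_krr(alpha, X_train, X_new, gamma=1.0):
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy setup: two per-image features (e.g. saliency and naturalness scores)
# mapped to a synthetic "ground-truth" MOS.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (50, 2))
mos = 1 + 4 * X[:, 0] * (1 - X[:, 1])
alpha = fit_krr(X, mos, lam=1e-4, gamma=5.0)
pred = predict_krr(alpha, X, X, gamma=5.0)
print(round(float(np.corrcoef(pred, mos)[0, 1]), 4))  # training-fit correlation
```

In practice the regressor would be trained on one split of a subjective database and evaluated on held-out images.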
3.
Jakhetiya V, Chaudhary S, Subudhi BN, Lin W, Guntuku SC. Perceptually Unimportant Information Reduction and Cosine Similarity-Based Quality Assessment of 3D-Synthesized Images. IEEE Transactions on Image Processing 2022;31:2027-2039. PMID: 35167450; DOI: 10.1109/tip.2022.3147981.
Abstract
Quality assessment of 3D-synthesized images has traditionally been based on detecting specific categories of distortions such as stretching, black holes, and blurring. However, such approaches cannot detect all distortions in 3D-synthesized images, which limits their performance. This work proposes an algorithm to efficiently detect distortions and subsequently evaluate the perceptual quality of 3D-synthesized images. The generation of 3D-synthesized images produces a shift of a few pixels between the reference and the 3D-synthesized image, so the two are not properly aligned. To address this, we propose applying a morphological opening operation to the residual image to reduce perceptually unimportant information between the reference and the distorted 3D-synthesized image. The resulting residual image suppresses perceptually unimportant information and highlights the geometric distortions that significantly affect the overall quality of 3D-synthesized images. We use the information in the residual image to quantify a perceptual quality measure and name this algorithm Perceptually Unimportant Information Reduction (PU-IR). At the same time, the residual image cannot capture minor structural and geometric distortions because of the erosion operation. To address this, we extract perceptually important deep features from a pre-trained VGG-16 architecture on the Laplacian pyramid. The distortions in 3D-synthesized images occur in patches, and the human visual system perceives even small levels of these distortions. With this in mind, we propose comparing these deep features between the reference and distorted images using cosine similarity, and name this algorithm Deep Features extraction and comparison using Cosine Similarity (DF-CS). Cosine similarity measures the agreement between the deep features rather than the magnitude of their difference. Finally, the objective quality score is obtained by pooling the PU-IR and DF-CS scores through simple multiplication. Our source code is available online: https://github.com/sadbhawnathakur/3D-Image-Quality-Assessment.
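The two stages can be sketched in a few lines of numpy. Here the 3×3 grey opening is built from min/max filters, random vectors stand in for the VGG-16 deep features, and the final multiplication is only an illustrative pooling, not the paper's exact formula:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def opening3(img):
    # Morphological (grey) opening with a 3x3 window: erosion (min filter)
    # followed by dilation (max filter); removes small bright residual specks.
    def filt(x, reduce_fn):
        padded = np.pad(x, 1, mode="edge")
        return reduce_fn(sliding_window_view(padded, (3, 3)), axis=(2, 3))
    return filt(filt(img, np.min), np.max)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)
ref = rng.uniform(0, 1, (64, 64))
syn = ref + 0.02 * rng.standard_normal((64, 64))  # misalignment-like noise
syn[20:30, 20:30] += 0.5                          # a geometric distortion patch

# PU-IR step: open the residual so the misalignment noise is suppressed
# while the geometric distortion survives.
residual = np.abs(ref - syn)
opened = opening3(residual)
score_puir = opened.mean()

# DF-CS step: cosine similarity between feature vectors (random vectors
# here stand in for VGG-16 features on the Laplacian pyramid).
f_ref = rng.standard_normal(512)
f_dist = f_ref + 0.1 * rng.standard_normal(512)
score_dfcs = cosine_similarity(f_ref, f_dist)

quality = score_puir * score_dfcs  # illustrative pooling by multiplication
```

After the opening, the distorted patch dominates the residual while the sub-pixel noise almost vanishes, which is the effect the PU-IR stage relies on.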
4.
Jakhetiya V, Mumtaz D, Subudhi BN, Guntuku SC. Stretching Artifacts Identification for Quality Assessment of 3D-Synthesized Views. IEEE Transactions on Image Processing 2022;31:1737-1750. PMID: 35100114; DOI: 10.1109/tip.2022.3145997.
Abstract
Existing quality assessment (QA) algorithms rely on identifying "black holes" to assess the perceptual quality of 3D-synthesized views. However, advances in rendering and inpainting techniques have made black-hole artifacts nearly obsolete. Furthermore, 3D-synthesized views frequently suffer from stretching artifacts caused by occlusion, which in turn affect perceptual quality. Existing QA algorithms are inefficient at identifying these artifacts, as seen from their performance on the IETR dataset. We found, empirically, that there is a relationship between the number of blocks with stretching artifacts in a view and its overall perceptual quality. Building on this observation, we propose a convolutional neural network (CNN) based algorithm that identifies the blocks with stretching artifacts and uses the number of such blocks to predict the quality of 3D-synthesized views. To address the small sample size of the existing 3D-synthesized-view dataset, we collect images from related datasets to increase the number of samples and improve generalization while training our CNN-based algorithm. The proposed algorithm identifies blocks with stretching distortions and then fuses them to predict perceptual quality without a reference, improving on existing no-reference QA algorithms that are not trained on the IETR dataset. The proposed algorithm can also identify blocks with stretching artifacts efficiently, which can be used in downstream applications to improve the quality of 3D views. Our source code is available online: https://github.com/sadbhawnathakur/3D-Image-Quality-Assessment.
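The block-counting idea can be sketched without the CNN. Below, a trivial variance heuristic stands in for the trained classifier: a block whose columns barely change looks horizontally "stretched" (replicated pixel columns), and the fraction of flagged blocks serves as the quality-related quantity:

```python
import numpy as np

def block_stretching_score(img, block=8, var_thresh=1e-3):
    # Stand-in for the paper's CNN classifier: flag a block as "stretched"
    # when its column-to-column differences have near-zero variance.
    h, w = img.shape
    flagged, total = 0, 0
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            patch = img[i:i + block, j:j + block]
            total += 1
            if np.var(np.diff(patch, axis=1)) < var_thresh:
                flagged += 1
    return flagged / total  # fraction of stretched blocks (higher = worse)

rng = np.random.default_rng(3)
img = rng.uniform(0, 1, (64, 64))
img[:, 32:48] = img[:, 32:33]  # simulate stretching: one column replicated
score = block_stretching_score(img)
print(score)  # 16 of the 64 blocks fall inside the replicated band
```

In the paper this per-block decision is made by a CNN, and the block counts are fused into the final no-reference quality prediction.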
5.
Wang G, Shi Q, Shao Y, Tang L. DIBR-Synthesized Image Quality Assessment With Texture and Depth Information. Front Neurosci 2021;15:761610. PMID: 34803593; PMCID: PMC8597928; DOI: 10.3389/fnins.2021.761610.
Abstract
Accurately predicting the quality of depth-image-based-rendering (DIBR) synthesized images is of great significance in promoting DIBR techniques. Recently, many DIBR-synthesized image quality assessment (IQA) algorithms have been proposed to quantify the distortions in texture images. However, these methods ignore the damage that DIBR algorithms inflict on the depth structure of synthesized images and thus fail to accurately evaluate their visual quality. To this end, this paper presents a DIBR-synthesized IQA metric based on texture and depth information, dubbed TDI. TDI predicts the quality of DIBR-synthesized images by jointly measuring the synthesized image's colorfulness, texture structure, and depth structure. The design rests on two observations: (1) DIBR technologies introduce color deviation into synthesized images, so measuring colorfulness can effectively predict their quality; and (2) in the hole-filling process, DIBR technologies introduce local geometric distortion, which destroys the texture structure of synthesized images and disturbs the relationship between their foreground and background. Thus, DIBR-synthesized image quality can be accurately evaluated through a joint representation of texture and depth structures. Experiments show that TDI outperforms competing state-of-the-art algorithms in predicting the visual quality of DIBR-synthesized images.
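The colorfulness cue in observation (1) can be computed, for instance, with the well-known Hasler–Süsstrunk statistic on the opponent color channels; the paper's exact colorfulness measure may differ, so treat this as an illustrative choice:

```python
import numpy as np

def colorfulness(img):
    # Hasler-Susstrunk colorfulness on an RGB image in [0, 255]:
    # combines spread and magnitude of the rg and yb opponent channels.
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    rg = r - g
    yb = 0.5 * (r + g) - b
    std = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    mean = np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)
    return std + 0.3 * mean

rng = np.random.default_rng(4)
vivid = rng.uniform(0, 255, (32, 32, 3))                  # saturated random colors
gray = np.repeat(rng.uniform(0, 255, (32, 32, 1)), 3, 2)  # achromatic image
print(colorfulness(vivid) > colorfulness(gray))
```

A color-deviated synthesized view would shift this statistic relative to natural content, which is what makes it usable as a quality feature.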
Affiliation(s)
- Guangcheng Wang
- School of Transportation and Civil Engineering, Nantong University, Nantong, China
- Quan Shi
- School of Transportation and Civil Engineering, Nantong University, Nantong, China
- Yeqin Shao
- School of Transportation and Civil Engineering, Nantong University, Nantong, China
- Lijuan Tang
- School of Electronics and Information, Jiangsu Vocational College of Business, Nantong, China
6.
Jin C, Peng Z, Zou W, Chen F, Jiang G, Yu M. No-Reference Quality Assessment for 3D Synthesized Images Based on Visual-Entropy-Guided Multi-Layer Features Analysis. Entropy 2021;23:770. PMID: 34207229; PMCID: PMC8233917; DOI: 10.3390/e23060770.
Abstract
Multiview video plus depth is one of the mainstream representations of 3D scenes in emerging free-viewpoint video, which generates virtual 3D synthesized images through a depth-image-based-rendering (DIBR) technique. However, inaccurate depth maps and imperfect DIBR techniques result in various geometric distortions that seriously deteriorate the users' visual perception. An effective 3D synthesized image quality assessment (IQA) metric can simulate human visual perception and determine the application feasibility of the synthesized content. In this paper, a no-reference IQA metric based on visual-entropy-guided multi-layer feature analysis for 3D synthesized images is proposed. According to the energy entropy, the geometric distortions are divided into two visual attention layers, namely a bottom-up layer and a top-down layer. The feature of salient distortion is measured by regional proportion plus a transition threshold on the bottom-up layer. In parallel, the key distribution regions of insignificant geometric distortion are extracted by a relative total variation model, and the features of these distortions are measured by the interaction of decentralized attention and concentrated attention on the top-down layer. By integrating the features of both layers, a more visually perceptive quality evaluation model is built. Experimental results show that the proposed method is superior to the state of the art in assessing the quality of 3D synthesized images.
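The energy-entropy criterion that routes regions to the bottom-up or top-down layer can be illustrated with a simple block-level measure: the Shannon entropy of the gradient-energy histogram. This is only a stand-in for the paper's exact definition:

```python
import numpy as np

def energy_entropy(block, bins=16):
    # Shannon entropy of the gradient-energy histogram of an image block:
    # flat regions concentrate energy in one bin (low entropy), textured or
    # distorted regions spread it across bins (high entropy).
    gx = np.diff(block, axis=1)
    gy = np.diff(block, axis=0)
    energy = gx[:-1] ** 2 + gy[:, :-1] ** 2
    hist, _ = np.histogram(energy, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(7)
flat = np.full((16, 16), 0.5)
textured = rng.uniform(0, 1, (16, 16))
print(energy_entropy(flat), energy_entropy(textured))
```

A thresholded version of such a per-region score could then split the distortion map into the two attention layers the abstract describes.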
Affiliation(s)
- Chongchong Jin
- Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China
- Zongju Peng
- Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China
- School of Electrical and Electronic Engineering, Chongqing University of Technology, Chongqing 400054, China
- Wenhui Zou
- Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China
- Fen Chen
- Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China
- School of Electrical and Electronic Engineering, Chongqing University of Technology, Chongqing 400054, China
- Gangyi Jiang
- Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China
- Mei Yu
- Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China
7.
Tian S, Zhang L, Zou W, Li X, Su T, Morin L, Déforges O. Quality assessment of DIBR-synthesized views: An overview. Neurocomputing 2021. DOI: 10.1016/j.neucom.2020.09.062.
8.
Sandic-Stankovic DD, Kukolj DD, Le Callet P. Fast Blind Quality Assessment of DIBR-Synthesized Video Based on High-High Wavelet Subband. IEEE Transactions on Image Processing 2019;28:5524-5536. PMID: 31180890; DOI: 10.1109/tip.2019.2919416.
Abstract
Free-viewpoint video, the development direction of next-generation video technologies, uses the depth-image-based rendering (DIBR) technique to synthesize video sequences at viewpoints where real captured videos are missing. As reference videos at multiple viewpoints are not available, a reliable, real-time blind quality metric for the synthesized video is needed. Although no-reference quality metrics dedicated to synthesized views successfully evaluate synthesized images, they are less effective for synthesized video because of the additional temporal flicker distortion typical only of video. In this paper, a new fast no-reference quality metric for synthesized video with synthesis distortions is proposed. It is guided by the fact that DIBR-synthesized images are characterized by increased high-frequency content. The metric is designed under the assumption that the perceived quality of DIBR-synthesized video can be estimated by quantifying selected areas in the high-high wavelet subband. A threshold is used to select the most distortion-sensitive regions. The proposed No-Reference Morphological Wavelet with Threshold (NR_MWT) metric is computationally extremely efficient, comparable to PSNR, as the morphological wavelet transformation uses very short filters and only integer arithmetic. It is completely blind and uses no machine learning techniques. Tested on the publicly available dataset of synthesized video sequences characterized by synthesis distortions, the metric achieves better performance and higher computational efficiency than the state-of-the-art metrics dedicated to DIBR-synthesized images and videos.
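The core measurement can be sketched with a plain one-level Haar transform. The paper uses a morphological wavelet with integer arithmetic, so this floating-point Haar version only approximates the idea of thresholding the high-high (HH) subband:

```python
import numpy as np

def haar_hh(img):
    # One-level Haar HH (high-high) subband: captures the diagonal
    # high-frequency content that synthesis distortions amplify.
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    return (a - b - c + d) / 2.0

def nr_mwt_like(img, thresh=0.1):
    # Average |HH| over coefficients above a threshold -- a stand-in for
    # the paper's selection of distortion-sensitive regions.
    hh = np.abs(haar_hh(img))
    sel = hh[hh > thresh]
    return float(sel.mean()) if sel.size else 0.0

rng = np.random.default_rng(5)
smooth = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))  # clean frame
noisy = smooth + 0.3 * rng.standard_normal((64, 64))  # synthesis-noise proxy
print(nr_mwt_like(smooth), nr_mwt_like(noisy))
```

A smooth frame contributes nothing above the threshold, while a frame with high-frequency synthesis distortion produces a clearly positive score.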
9.
Wang G, Wang Z, Gu K, Li L, Xia Z, Wu L. Blind Quality Metric of DIBR-Synthesized Images in the Discrete Wavelet Transform Domain. IEEE Transactions on Image Processing 2019;29:1802-1814. PMID: 31613757; DOI: 10.1109/tip.2019.2945675.
Abstract
Free-viewpoint video (FVV) has received considerable attention owing to its widespread applications in areas such as immersive entertainment, remote surveillance, and distance education. Since FVV images are synthesized via a depth-image-based rendering (DIBR) procedure in a "blind" environment (without reference images), a real-time and reliable blind quality assessment metric is urgently required. However, existing image quality assessment metrics are insensitive to the geometric distortions engendered by DIBR. In this research, a novel blind quality method for DIBR-synthesized images is proposed based on measuring geometric distortion, global sharpness, and image complexity. First, a DIBR-synthesized image is decomposed into wavelet subbands using the discrete wavelet transform. The Canny operator is then employed to detect the edges of the binarized low-frequency and high-frequency subbands, and the edge similarities between them are computed to quantify geometric distortions in DIBR-synthesized images. Second, the log-energies of the wavelet subbands are calculated to evaluate global sharpness. Third, a hybrid filter combining autoregressive and bilateral filters is adopted to compute image complexity. Finally, the overall quality score is derived by normalizing the geometric distortion and global sharpness measures by the image complexity. Experiments show that the proposed method is superior to competing reference-free state-of-the-art DIBR-synthesized image quality models.
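The global-sharpness component (log-energies of the wavelet subbands) can be sketched with a one-level Haar decomposition; the Canny-based edge-similarity step and the specific log-energy weighting of the paper are omitted, and the scale factor below is an arbitrary choice for the toy value range:

```python
import numpy as np

def haar_subbands(img):
    # One-level Haar decomposition into LL, LH, HL, HH subbands.
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    ll = (a + b + c + d) / 4.0
    lh = (a + b - c - d) / 4.0
    hl = (a - b + c - d) / 4.0
    hh = (a - b - c + d) / 4.0
    return ll, lh, hl, hh

def global_sharpness(img):
    # Sum of log-energies of the high-frequency subbands (sharpness proxy);
    # the 1e4 factor just scales the toy [0, 1] pixel range.
    _, lh, hl, hh = haar_subbands(img)
    return sum(np.log10(1.0 + 1e4 * np.mean(s ** 2)) for s in (lh, hl, hh))

rng = np.random.default_rng(8)
sharp = rng.uniform(0, 1, (64, 64))
# Crude plus-shaped low-pass blur on the interior to remove fine detail.
blurred = sharp.copy()
blurred[1:-1, 1:-1] = (sharp[:-2, 1:-1] + sharp[2:, 1:-1] +
                       sharp[1:-1, :-2] + sharp[1:-1, 2:] +
                       sharp[1:-1, 1:-1]) / 5.0
print(global_sharpness(sharp) > global_sharpness(blurred))
```

Blurring drains energy from the LH/HL/HH subbands, so the log-energy score drops, which is the behavior the sharpness term exploits.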
10.
Zhang Y, Zhang H, Yu M, Kwong S, Ho YS. Sparse Representation Based Video Quality Assessment for Synthesized 3D Videos. IEEE Transactions on Image Processing 2019;29:509-524. PMID: 31369374; DOI: 10.1109/tip.2019.2929433.
Abstract
Temporal flicker distortion is one of the most annoying artifacts in synthesized virtual-view videos rendered from compressed multi-view video plus depth in a three-dimensional (3D) video system. To assess synthesized-view video quality and further optimize the compression techniques in 3D video systems, an objective video quality assessment method that can accurately measure flicker distortion is highly needed. In this paper, we propose a full-reference sparse-representation-based video quality assessment method for synthesized 3D videos. First, a synthesized video, treated as a 3D volume with spatial (X-Y) and temporal (T) dimensions, is reformed and decomposed into a number of spatially neighboring temporal layers, i.e., X-T or Y-T planes. Gradient features in the temporal layers of the synthesized video and strong edges of the depth maps are used as key features in locating flicker distortions. Second, dictionary learning and sparse representation for the temporal layers are derived and applied to effectively represent the temporal flicker distortion. Third, a rank pooling method pools all the temporal-layer scores into a flicker distortion score. Finally, the temporal flicker distortion measurement is combined with a conventional spatial distortion measurement to assess the quality of synthesized 3D videos. Experimental results on a synthesized video quality database demonstrate that our proposed method is significantly superior to other state-of-the-art methods, especially on view synthesis distortions induced by depth videos.
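The temporal-layer decomposition can be sketched in numpy. Flicker shows up as strong temporal gradients in X-T planes; the sparse-representation and rank-pooling stages are simplified here to a temporal-gradient energy per layer and an average over the worst layers, so this is only a sketch of the layering idea:

```python
import numpy as np

# Treat a synthesized video as a T x H x W volume. A static scene is tiled
# over time, then a band of rows is given frame-to-frame noise to mimic
# the flicker distortion of DIBR-synthesized video.
rng = np.random.default_rng(6)
T, H, W = 30, 32, 32
video = np.tile(rng.uniform(0, 1, (1, H, W)), (T, 1, 1))
video[:, 10:14, :] += 0.2 * rng.standard_normal((T, 4, W))  # flickering rows

layer_scores = []
for y in range(H):                 # each row index y gives an X-T layer
    layer = video[:, y, :]         # shape (T, W)
    # mean absolute temporal gradient of the layer
    layer_scores.append(np.abs(np.diff(layer, axis=0)).mean())

layer_scores = np.sort(np.array(layer_scores))[::-1]
flicker_score = layer_scores[:4].mean()  # pool the worst (flickering) layers
print(flicker_score)
```

Only the four flickering X-T layers carry temporal energy; the static layers score exactly zero, so pooling the worst layers isolates the flicker.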