1. Sun W, Wen W, Min X, Lan L, Zhai G, Ma K. Analysis of Video Quality Datasets via Design of Minimalistic Video Quality Models. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:7056-7071. [PMID: 38625773] [DOI: 10.1109/tpami.2024.3385364]
Abstract
Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving end-users' viewing experience in many real-world video-enabled media applications. As an experimental field, BVQA has measured its progress primarily on a few human-rated VQA datasets, so a better understanding of these datasets is crucial for properly evaluating the current state of BVQA. Toward this goal, we conduct a first-of-its-kind computational analysis of VQA datasets by designing minimalistic BVQA models. By minimalistic, we mean that our family of BVQA models is built only from basic blocks: a video preprocessor (for aggressive spatiotemporal downsampling), a spatial quality analyzer, an optional temporal quality analyzer, and a quality regressor, each with the simplest possible instantiation. By comparing the quality prediction performance of different model variants on eight VQA datasets with realistic distortions, we find that nearly all datasets suffer, to varying degrees, from the easy-dataset problem, and some even admit blind image quality assessment (BIQA) solutions. We further support these claims by comparing the generalization capabilities of our models across these VQA datasets and by ablating a dizzying set of BVQA design choices related to the basic building blocks. Our results cast doubt on the current progress in BVQA and shed light on good practices for constructing next-generation VQA datasets and models.
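For orientation, here is a minimal sketch of the four basic blocks named in the abstract, written in PyTorch. This is not the authors' released code; the downsampling factors and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MinimalBVQA(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # Spatial quality analyzer: a tiny CNN standing in for any backbone.
        self.analyzer = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Quality regressor: a single linear layer.
        self.regressor = nn.Linear(feat_dim, 1)

    def forward(self, video):  # video: (T, 3, H, W)
        # Video preprocessor: keep every 8th frame, downsample 4x spatially.
        frames = video[::8]
        frames = F.interpolate(frames, scale_factor=0.25, mode='bilinear',
                               align_corners=False)
        feats = self.analyzer(frames).flatten(1)   # (T', feat_dim)
        # Mean pooling over time (the optional temporal analyzer is omitted).
        return self.regressor(feats.mean(0, keepdim=True)).squeeze()

score = MinimalBVQA()(torch.rand(64, 3, 256, 256))  # random 64-frame clip
```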
2. Cao Y, Min X, Sun W, Zhai G. Subjective and Objective Audio-Visual Quality Assessment for User Generated Content. IEEE Transactions on Image Processing 2023; 32:3847-3861. [PMID: 37428674] [DOI: 10.1109/tip.2023.3290528]
Abstract
In recent years, user-generated content (UGC) has grown dramatically on video sharing platforms, and service providers need video quality assessment (VQA) to monitor and control users' quality of experience when watching UGC videos. However, most existing UGC VQA studies focus only on the visual distortions of videos, ignoring that perceptual quality also depends on the accompanying audio signals. In this paper, we conduct a comprehensive study of UGC audio-visual quality assessment (AVQA) from both subjective and objective perspectives. Specifically, we construct the first UGC AVQA database, named the SJTU-UAV database, which includes 520 in-the-wild UGC audio-visual (A/V) sequences collected from the YFCC100m database. A subjective AVQA experiment is conducted on the database to obtain mean opinion scores (MOSs) for the A/V sequences. To demonstrate the content diversity of the SJTU-UAV database, we give a detailed analysis of it, two synthetically distorted AVQA databases, and one authentically distorted VQA database, from both the audio and video perspectives. Then, to facilitate the development of the AVQA field, we build a benchmark of AVQA models on the proposed SJTU-UAV database and the two other AVQA databases; the benchmark models consist of AVQA models designed for synthetically distorted A/V sequences and AVQA models built by combining popular VQA methods and audio features via support vector regression (SVR). Finally, since the benchmark AVQA models perform poorly on in-the-wild UGC videos, we further propose an effective AVQA model that jointly learns quality-aware audio and visual feature representations in the temporal domain, an aspect seldom investigated by existing AVQA models. Our proposed model outperforms the benchmark AVQA models on the SJTU-UAV database and the two synthetically distorted AVQA databases. The SJTU-UAV database and the code of the proposed model will be released to facilitate further research.
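A minimal sketch of the SVR-based fusion baseline the abstract describes: visual quality features and audio features are concatenated and regressed onto MOS with a support vector regressor. The feature dimensions and data below are random placeholders, not the SJTU-UAV database.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
video_feats = rng.normal(size=(200, 36))   # e.g., per-video VQA features
audio_feats = rng.normal(size=(200, 13))   # e.g., time-averaged audio features
mos = rng.uniform(1, 5, size=200)          # mean opinion scores

X = np.hstack([video_feats, audio_feats])  # late fusion by concatenation
model = make_pipeline(StandardScaler(), SVR(kernel='rbf', C=10.0))
model.fit(X[:150], mos[:150])
pred = model.predict(X[150:])              # predicted quality for held-out clips
```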
3. Han Z, Liu Y, Xie R, Zhai G. Image Quality Assessment for Realistic Zoom Photos. Sensors (Basel) 2023; 23:4724. [PMID: 37430638] [DOI: 10.3390/s23104724]
Abstract
New CMOS image sensor (CIS) techniques in smartphones have helped user-generated content displace traditional DSLR photography in our daily lives. However, tiny sensor sizes and fixed focal lengths also lead to grainier detail, especially in zoom photos. Moreover, multi-frame stacking and post-sharpening algorithms produce zigzag textures and over-sharpened appearances, whose quality traditional image-quality metrics may overestimate. To solve this problem, we first construct a real-world zoom photo database, which includes 900 tele-photos from 20 different mobile sensors and ISPs. We then propose a novel no-reference zoom quality metric that incorporates both a traditional estimate of sharpness and the concept of image naturalness. More specifically, for the measurement of image sharpness, we are the first to combine the total energy of the predicted gradient image with the entropy of the residual term under the framework of free-energy theory. To further compensate for over-sharpening effects and other artifacts, a set of model parameters of mean subtracted contrast normalized (MSCN) coefficients is used as the natural statistics representative. Finally, the two measures are combined linearly. Experimental results on the zoom photo database demonstrate that our quality metric achieves SROCC and PLCC over 0.91, while a single sharpness or naturalness index reaches only about 0.85. Moreover, compared with the best tested general-purpose and sharpness models, our zoom metric outperforms them by 0.072 and 0.064 in SROCC, respectively.
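A minimal sketch of the two-branch structure the abstract describes. The free-energy "predictor" here is a plain Gaussian-smoothed image, the naturalness statistic is a crude stand-in for the paper's fitted MSCN model parameters, and the combination weights are made-up placeholders.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpness_term(img):
    pred = gaussian_filter(img, sigma=2.0)        # internal prediction of the scene
    gy, gx = np.gradient(pred)
    grad_energy = np.mean(gx ** 2 + gy ** 2)      # energy of the predicted gradient image
    residual = img - pred                         # free-energy residual
    hist, _ = np.histogram(residual, bins=64)
    p = hist[hist > 0] / hist.sum()
    return grad_energy - np.sum(p * np.log2(p))   # energy plus residual entropy

def naturalness_term(img, C=1.0):
    mu = gaussian_filter(img, sigma=7 / 6)
    sigma = np.sqrt(np.abs(gaussian_filter(img ** 2, sigma=7 / 6) - mu ** 2))
    mscn = (img - mu) / (sigma + C)               # MSCN coefficients
    return -abs(np.std(mscn) - 1.0)               # distance from "natural" statistics

img = np.random.rand(256, 256) * 255
score = 0.5 * sharpness_term(img) + 0.5 * naturalness_term(img)  # linear combination
```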
Affiliation(s)
- Zongxi Han: Institute of Image Communication and Information Processing, Shanghai Jiao Tong University, Shanghai 200240, China
- Yutao Liu: School of Computer Science and Technology, Ocean University of China, Qingdao 266100, China
- Rong Xie: Institute of Image Communication and Information Processing, Shanghai Jiao Tong University, Shanghai 200240, China
- Guangtao Zhai: Institute of Image Communication and Information Processing, Shanghai Jiao Tong University, Shanghai 200240, China
4. Zhang W, Li D, Ma C, Zhai G, Yang X, Ma K. Continual Learning for Blind Image Quality Assessment. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:2864-2878. [PMID: 35635807] [DOI: 10.1109/tpami.2022.3178874]
Abstract
The explosive growth of image data facilitates the fast development of image processing and computer vision methods for emerging visual applications, while also introducing novel distortions into the processed images. This poses a grand challenge to existing blind image quality assessment (BIQA) models, which are weak at adapting to subpopulation shift. Recent work suggests training BIQA methods on the combination of all available human-rated IQA datasets. However, this approach does not scale to a large number of datasets and makes it cumbersome to incorporate newly created datasets. In this paper, we formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets, building on what was learned from previously seen data. We first identify five desiderata for the continual setting, with three criteria quantifying prediction accuracy, plasticity, and stability, respectively. We then propose a simple yet effective continual learning method for BIQA. Specifically, on top of a shared backbone network, we add a prediction head for each new dataset and enforce a regularizer that allows all prediction heads to evolve with new data while resisting catastrophic forgetting of old data. The overall quality score is computed as a weighted summation of the predictions from all heads. Extensive experiments demonstrate the promise of the proposed continual learning method compared with standard training techniques for BIQA, with and without experience replay. The code is publicly available at https://github.com/zwx8981/BIQA_CL.
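A minimal sketch of the multi-head structure described above: a shared backbone, one prediction head per IQA dataset, and a weighted sum over heads. The paper's regularizer and its head-weighting scheme are omitted here; the uniform weights and layer sizes are assumptions (see the authors' repository for the real method).

```python
import torch
import torch.nn as nn

class ContinualBIQA(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.feat_dim = feat_dim
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleList()               # one prediction head per dataset

    def add_dataset(self):
        # Called whenever a new IQA dataset arrives in the stream.
        self.heads.append(nn.Linear(self.feat_dim, 1))

    def forward(self, x):
        feat = self.backbone(x)                                   # (B, feat_dim)
        scores = torch.cat([h(feat) for h in self.heads], dim=1)  # (B, K)
        weights = torch.full((len(self.heads),), 1.0 / len(self.heads))
        return scores @ weights                    # (B,) weighted sum over heads

model = ContinualBIQA()
model.add_dataset(); model.add_dataset()           # two datasets seen so far
q = model(torch.rand(4, 3, 224, 224))              # (B,) quality scores
```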
5. Chen J, Qin F, Lu F, Guo L, Li C, Yan K, Zhou X. CSPP-IQA: a multi-scale spatial pyramid pooling-based approach for blind image quality assessment. Neural Computing and Applications 2022:1-12. [PMID: 36276656] [PMCID: PMC9573815] [DOI: 10.1007/s00521-022-07874-2]
Abstract
Traditional image quality assessment (IQA) methods are usually based on convolutional neural networks (CNNs). In these CNN-based methods, the fixed feature size of the fully connected layer requires the input image to be resized to a pre-defined resolution, which usually destroys the original structure and content of the image and thus reduces the accuracy of quality assessment. In this paper, a blind image quality assessment method based on multi-scale spatial pyramid pooling, named CSPP-IQA, is proposed. CSPP-IQA accepts the original image when assessing quality, without any resizing. Moreover, by incorporating a convolutional block attention module and an image understanding module, CSPP-IQA achieves better accuracy, generalization, and efficiency than traditional IQA methods. Experiments on real-scene IQA datasets verify the effectiveness and efficiency of CSPP-IQA.
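A minimal sketch of a multi-scale spatial pyramid pooling layer of the kind the abstract describes: it maps feature maps of any spatial size to a fixed-length vector, so the network can accept un-resized images. The pyramid levels (1, 2, 4) are an illustrative choice, not necessarily the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialPyramidPooling(nn.Module):
    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x):                           # x: (B, C, H, W), any H, W
        pooled = [F.adaptive_max_pool2d(x, l).flatten(1) for l in self.levels]
        return torch.cat(pooled, dim=1)             # (B, C * sum(l*l))

spp = SpatialPyramidPooling()
for hw in [(224, 224), (317, 481)]:                 # different input sizes...
    print(spp(torch.rand(1, 64, *hw)).shape)        # ...same output: (1, 64*21)
```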
Affiliation(s)
- Jingjing Chen: Zhejiang University City College, Hangzhou, China; School of Economics, Fudan University, Shanghai, China
- Feng Qin: College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai, China
- Fangfang Lu: College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai, China; Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China
- Lingling Guo: College of Chemical Engineering, Zhejiang University of Technology, Hangzhou, China
- Chao Li: Zhijiang College, Zhejiang University of Technology, Shaoxing, China
- Ke Yan: Department of the Built Environment, National University of Singapore, Singapore
- Xiaokang Zhou: Faculty of Data Science, Shiga University, Hikone 522-8522, Japan; RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
6. Yang W, Wu J, Tian S, Li L, Dong W, Shi G. Fine-Grained Image Quality Caption With Hierarchical Semantics Degradation. IEEE Transactions on Image Processing 2022; 31:3578-3590. [PMID: 35511851] [DOI: 10.1109/tip.2022.3171445]
Abstract
Blind image quality assessment (BIQA), which precisely and automatically estimates human-perceived image quality without a pristine image for comparison, has attracted extensive attention and has wide applications. Most existing BIQA methods represent image quality with a single quantitative value, which is inconsistent with human cognition: humans are generally better at perceiving image quality in terms of semantic descriptions than quantitative values. Moreover, cognition is a needs-oriented task in which humans extract image content at local-to-global semantic levels as needed. A single quality value captures only coarse, holistic image quality and fails to reflect degradation across hierarchical semantics. In this paper, to better comply with human cognition, a novel quality captioning model is proposed that measures fine-grained image quality through hierarchical semantics degradation. Research on the human visual system indicates that there are hierarchy and reverse-hierarchy correlations between hierarchical semantics, and empirical evidence shows bi-directional degradation dependencies between them as well. Thus, a novel bi-directional relationship-based network (BDRNet) is proposed for describing semantics degradation, adaptively exploiting these correlations and degradation dependencies in a bi-directional manner. Extensive experiments demonstrate that our method outperforms state-of-the-art methods in terms of both evaluation performance and generalization ability.
7. Song T, Li L, Zhu H, Qian J. IE-IQA: Intelligibility Enriched Generalizable No-Reference Image Quality Assessment. Frontiers in Neuroscience 2021; 15:739138. [PMID: 34744610] [PMCID: PMC8566698] [DOI: 10.3389/fnins.2021.739138]
Abstract
Image quality assessment (IQA) for authentic distortions in the wild is challenging. Although current IQA metrics achieve decent performance on synthetic distortions, they still cannot be satisfactorily applied to realistic distortions because of the generalization problem. Improving generalization ability is an urgent task for making IQA algorithms serviceable in real-world applications, yet relevant research remains rare. Fundamentally, image quality is determined by both distortion degree and intelligibility. However, current IQA metrics mostly focus on the distortion aspect and do not fully investigate intelligibility, which is crucial for robust quality estimation. Motivated by this, this paper presents a new framework for building a highly generalizable image quality model by integrating intelligibility. We first analyze the relation between intelligibility and image quality, and then propose a bilateral network to integrate these two aspects of image quality. During the fusion process, a feature selection strategy is further devised to avoid negative transfer. The framework not only captures conventional distortion features but also properly integrates intelligibility features, yielding a highly generalizable no-reference image quality model. Extensive experiments are conducted on five intelligibility tasks, and the results demonstrate that the proposed approach outperforms state-of-the-art metrics and that the intelligibility task consistently improves metric performance and generalization ability.
Affiliation(s)
- Tianshu Song: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Leida Li: School of Artificial Intelligence, Xidian University, Xi'an, China; Pazhou Lab, Guangzhou, China
- Hancheng Zhu: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
- Jiansheng Qian: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
8. Zhang W, Ma K, Zhai G, Yang X. Uncertainty-Aware Blind Image Quality Assessment in the Laboratory and Wild. IEEE Transactions on Image Processing 2021; 30:3474-3486. [PMID: 33661733] [DOI: 10.1109/tip.2021.3061932]
Abstract
Performance of blind image quality assessment (BIQA) models has been significantly boosted by end-to-end optimization of feature engineering and quality regression. Nevertheless, due to the distributional shift between images simulated in the laboratory and those captured in the wild, models trained on databases with synthetic distortions remain particularly weak at handling realistic distortions (and vice versa). To confront this cross-distortion-scenario challenge, we develop a unified BIQA model and an approach to training it for both synthetic and realistic distortions. We first sample pairs of images from individual IQA databases and compute the probability that the first image of each pair is of higher quality. We then employ the fidelity loss to optimize a deep neural network for BIQA over a large number of such image pairs, and explicitly enforce a hinge constraint to regularize uncertainty estimation during optimization. Extensive experiments on six IQA databases show the promise of the learned method in blindly assessing image quality in the laboratory and the wild. In addition, we demonstrate the universality of the proposed training strategy by using it to improve existing BIQA models.
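A minimal sketch of the pairwise objective described above: a Thurstone-style probability that image 1 beats image 2, computed from predicted quality means and uncertainties, is compared against the empirical probability p with the fidelity loss. The Gaussian comparison model and constants below are common formulations and may differ in detail from the paper's; shapes and data are illustrative.

```python
import torch

def fidelity_loss(q1, s1, q2, s2, p, eps=1e-8):
    # Probability that image 1 is of higher quality under a Gaussian model.
    z = (q1 - q2) / torch.sqrt(s1 ** 2 + s2 ** 2 + eps)
    p_hat = 0.5 * (1 + torch.erf(z / 2 ** 0.5))    # standard normal CDF
    # Fidelity loss between Bernoulli(p) and Bernoulli(p_hat).
    return (1 - torch.sqrt(p * p_hat + eps)
              - torch.sqrt((1 - p) * (1 - p_hat) + eps)).mean()

q1, q2 = torch.randn(8), torch.randn(8)            # predicted quality means
s1, s2 = torch.rand(8) + 0.5, torch.rand(8) + 0.5  # predicted uncertainties
p = (torch.rand(8) > 0.5).float()                  # empirical preference labels
loss = fidelity_loss(q1, s1, q2, s2, p)            # differentiable training loss
```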
9. Hosseini MS, Zhang Y, Plataniotis KN. Encoding Visual Sensitivity by MaxPol Convolution Filters for Image Sharpness Assessment. IEEE Transactions on Image Processing 2019; 28:4510-4525. [PMID: 30908222] [DOI: 10.1109/tip.2019.2906582]
Abstract
In this paper, we propose a novel design of the Human Visual System (HVS) response in convolutional-filter form to decompose meaningful features that are closely tied to image sharpness level. No-reference (NR) image sharpness assessment (ISA) techniques have emerged as a standard of image quality assessment in diverse imaging applications. Despite their high correlation with subjective scoring, they remain challenging for practical use due to high computational cost and a lack of scalability across different image blurs. We bridge this gap by synthesizing the HVS response as a linear combination of Finite Impulse Response (FIR) derivative filters that boost the falloff of high-band frequency magnitudes in the natural imaging paradigm. The numerical implementation of the HVS filter is carried out with the MaxPol filter library, which can be arbitrarily configured for any differential order and cutoff frequency to balance the estimation of informative features against noise sensitivity. Using this HVS filter, we design a novel NR-ISA metric called "HVS-MaxPol" that (a) requires minimal computational cost, (b) correlates highly with image sharpness level, and (c) scales to assess both synthetic and natural image blur. Specifically, synthetically blurred images are constructed by blurring raw images with a Gaussian filter, while natural blur is observed in real-life settings such as motion, out-of-focus capture, and luminance contrast. Furthermore, we create a natural benchmark database in digital pathology, called "FocusPath", consisting of 864 blurred images for validating image focus quality in whole-slide imaging systems. Thorough experiments are designed to test and validate the efficiency of HVS-MaxPol across different blur databases against state-of-the-art NR-ISA metrics. The results indicate that our metric has the best overall performance with respect to speed, accuracy, and scalability.
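A minimal sketch of the idea above: approximate an HVS-like band-pass response as a linear combination of FIR derivative filters and pool the filtered energy into a sharpness index. The central-difference kernels and mixing weight below are simple illustrative choices, not the MaxPol filters.

```python
import numpy as np
from scipy.ndimage import convolve1d, gaussian_filter

d1 = np.array([-0.5, 0.0, 0.5])                 # 1st-order FIR derivative kernel
d3 = np.array([-0.5, 1.0, 0.0, -1.0, 0.5])      # 3rd-order FIR derivative kernel

def hvs_sharpness(img, alpha=0.7):
    # Mix derivative orders to mimic a band-pass HVS-like response.
    resp = sum(w * (convolve1d(img, k, axis=0) + convolve1d(img, k, axis=1))
               for w, k in [(alpha, d1), (1 - alpha, d3)])
    return float(np.mean(np.abs(resp)))          # pool by mean magnitude

img = np.random.rand(128, 128)
print(hvs_sharpness(img), hvs_sharpness(gaussian_filter(img, sigma=2.0)))
# Blurring suppresses the high-band response, so the second score is lower.
```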
10. No-Reference Image Blur Assessment Based on Response Function of Singular Values. Symmetry (Basel) 2018. [DOI: 10.3390/sym10080304]
Abstract
Blur is an important factor affecting image quality. This paper presents an efficient no-reference (NR) image blur assessment method based on a response function of singular values. For a given image, the grayscale image is computed to acquire spatial information, the gradient map is computed to acquire shape information, and a saliency map is obtained using the scale-invariant feature transform (SIFT). The grayscale image, gradient map, and saliency map are then divided into blocks of the same size. The blocks of the gradient map are converted into discrete cosine transform (DCT) coefficients, from which the response function of singular values (RFSV) is generated. The sum of the RFSV is then used to characterize image blur. The variance of the grayscale image and the DCT-domain entropy of the gradient map are used to reduce the impact of image content, and SIFT-dependent weights computed in the saliency map are assigned to the image blocks. Finally, the blur score is the normalized sum of the RFSV. Extensive experiments are conducted on four synthetic databases and two real blur databases. The experimental results indicate that the blur scores produced by our method are highly correlated with subjective evaluations, and that the proposed method is superior to six state-of-the-art methods.
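A minimal sketch of the block-wise pipeline the abstract outlines: gradient map, fixed-size blocks, 2-D DCT, singular values, and a scalar response summed over blocks. The paper's exact RFSV, content-compensation terms, and SIFT-based saliency weighting are not specified here, so the response below is a crude placeholder.

```python
import numpy as np
from scipy.fft import dctn

def blur_score(gray, block=32):
    gy, gx = np.gradient(gray.astype(float))
    grad = np.hypot(gx, gy)                       # gradient (shape) map
    h, w = grad.shape
    responses = []
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            coeffs = dctn(grad[i:i + block, j:j + block], norm='ortho')
            s = np.linalg.svd(coeffs, compute_uv=False)  # singular values of DCT block
            responses.append(s[0] / (s.sum() + 1e-8))    # placeholder response function
    return float(np.sum(responses) / len(responses))     # normalized sum over blocks

print(blur_score(np.random.rand(256, 256)))
```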
11. Wang H, Wang J, Chen W, Xu L. Automatic illumination planning for robot vision inspection system. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.05.015]
12. Gao F, Wang Y, Li P, Tan M, Yu J, Zhu Y. DeepSim: Deep similarity for image quality assessment. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.01.054]
13. Yu S, Wu S, Wang L, Jiang F, Xie Y, Li L. A shallow convolutional neural network for blind image sharpness assessment. PLoS One 2017; 12:e0176632. [PMID: 28459832] [PMCID: PMC5436206] [DOI: 10.1371/journal.pone.0176632]
Abstract
Blind image quality assessment can be modeled as feature extraction followed by score prediction. Handcrafting features for an optimal representation of perceptual image quality requires considerable expertise and effort. This paper addresses blind image sharpness assessment with a shallow convolutional neural network (CNN). The network uses a single feature layer to unearth intrinsic features for image sharpness representation and a multilayer perceptron (MLP) to rate image quality. Unlike traditional methods, the CNN integrates feature extraction and score prediction into a single optimization procedure and retrieves features automatically from raw images. Moreover, its prediction performance can be enhanced by replacing the MLP with a general regression neural network (GRNN) or support vector regression (SVR). Experiments on Gaussian blur images from LIVE-II, CSIQ, TID2008, and TID2013 demonstrate that CNN features with SVR achieve the best overall performance, indicating high correlation with human subjective judgment.
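A minimal sketch of the pipeline the abstract describes: a shallow CNN extracts features from raw grayscale patches, and an SVR maps them to quality scores. The architecture sizes and random training data are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import numpy as np
from sklearn.svm import SVR

cnn = nn.Sequential(                       # single feature layer plus pooling
    nn.Conv2d(1, 50, kernel_size=7),
    nn.AdaptiveMaxPool2d(1), nn.Flatten(),
)

def features(patches):                     # patches: (N, 1, 32, 32)
    with torch.no_grad():
        return cnn(patches).numpy()

train = torch.rand(100, 1, 32, 32)         # stand-in blurred patches
mos = np.random.uniform(1, 5, 100)         # stand-in subjective scores
svr = SVR(kernel='rbf').fit(features(train), mos)
print(svr.predict(features(torch.rand(5, 1, 32, 32))))
```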
Affiliation(s)
- Shaode Yu: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Shibin Wu: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Lei Wang: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Fan Jiang: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, China
- Yaoqin Xie (corresponding author): Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Leida Li (corresponding author): School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu, China
14. Statistical Evaluation of No-Reference Image Quality Assessment Metrics for Remote Sensing Images. ISPRS International Journal of Geo-Information 2017. [DOI: 10.3390/ijgi6050133]
15. An experimental survey of no-reference video quality assessment methods. International Journal of Pervasive Computing and Communications 2016. [DOI: 10.1108/ijpcc-01-2016-0008]
Abstract
Purpose
The Video Quality Metric (VQM) is one of the most widely used objective methods for assessing video quality because of its high correlation with the human visual system (HVS). VQM is, however, not viable in real-time deployments such as mobile streaming, not only because of its high computational demands but also because, as a Full Reference (FR) metric, it requires both the original video and its impaired counterpart. In contrast, No Reference (NR) objective algorithms operate directly on the impaired video and are considerably faster, but lose out in accuracy. The purpose of this paper is to study how differently NR metrics perform in the presence of network impairments.
Design/methodology/approach
The authors assess eight NR metrics, alongside a lightweight FR metric, using VQM as a benchmark on a self-developed network-impaired video data set. The paper covers a range of methods, a diverse set of video types and encoding conditions, and a variety of network impairment test cases.
Findings
The authors show the extent to which packet loss affects different video types, correlating the accuracy of NR metrics against the FR benchmark. The paper helps identify the conditions under which simple metrics may be used effectively and indicates an avenue for controlling the quality of streaming systems.
Originality/value
Most studies in the literature have assessed streams that are either unaffected by the network (e.g., looking at the effects of video compression algorithms) or affected by synthetic network impairments (i.e., simulated network conditions). The authors show that when streams are affected by real network conditions, assessing Quality of Experience becomes even harder, as existing metrics perform poorly.
16. Yan R, Shao L. Blind Image Blur Estimation via Deep Learning. IEEE Transactions on Image Processing 2016; 25:1910-1921. [PMID: 26930680] [DOI: 10.1109/tip.2016.2535273]
Abstract
Image blur kernel estimation is critical to blind image deblurring. Most existing approaches exploit handcrafted blur features optimized for a certain uniform blur across the image, which is unrealistic in a real blind deconvolution setting where the blur type is often unknown. To address this issue, we aim to identify the blur type of each input image patch and then estimate the kernel parameters. We propose a learning-based method using a pre-trained deep neural network (DNN) and a general regression neural network (GRNN) that first classifies the blur type and then estimates its parameters, taking advantage of both the classification ability of the DNN and the regression ability of the GRNN. To the best of our knowledge, this is the first time that a pre-trained DNN and a GRNN have been applied to the problem of blur analysis. First, our method identifies the blur type from a mixed input of image patches corrupted by various blurs with different parameters; to this end, a supervised DNN is trained to project the input samples into a discriminative feature space in which the blur type can be easily classified. Then, for each blur type, the proposed GRNN estimates the blur parameters with very high accuracy. Experiments demonstrate the effectiveness of the proposed method in several tasks, with better or competitive results compared with the state of the art on two standard image data sets, the Berkeley segmentation data set and the Pascal VOC 2007 data set. In addition, blur region segmentation and deblurring on a number of real photographs show that our method outperforms previous techniques even for non-uniformly blurred images.
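A minimal sketch of the GRNN regression stage mentioned above. A general regression neural network is essentially Nadaraya-Watson kernel regression over the training set; the DNN-based blur-type classifier that precedes it is not reproduced here, and the Gaussian bandwidth is an arbitrary choice.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_test, sigma=1.0):
    # Gaussian kernel weights between each test and training sample.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    return (w @ y_train) / (w.sum(axis=1) + 1e-12)   # weighted average of targets

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))             # e.g., DNN features of blurred patches
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)   # e.g., a blur kernel parameter
print(grnn_predict(X, y, X[:5]), y[:5])   # predictions track the true parameters
```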
17. Zhang C, Pan J, Chen S, Wang T, Sun D. No reference image quality assessment using sparse feature representation in two dimensions spatial correlation. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.01.105]
18. Yousaf S, Qin S. Closed-Loop Restoration Approach to Blurry Images Based on Machine Learning and Feedback Optimization. IEEE Transactions on Image Processing 2015; 24:5928-5941. [PMID: 26513786] [DOI: 10.1109/tip.2015.2492825]
Abstract
Blind image deconvolution (BID) aims to remove or reduce degradations introduced during acquisition or processing. It is a challenging, ill-posed problem because a degraded image does not carry enough information for unambiguous recovery of both the point spread function (PSF) and the clear image. Although many powerful algorithms have appeared recently, BID remains an active research area due to the diversity of degraded images and degradations. Closed-loop control systems are characterized by their ability to stabilize a system's response and overcome external disturbances through effective feedback optimization. In this paper, we employ feedback control to enhance the stability of BID, driving the current estimate of PSF quality to a desired level without manually selected restoration parameters, using an effective combination of machine learning and feedback optimization. The foremost challenge in designing such a feedback structure is constructing or choosing a suitable performance metric to serve as the controlled index and feedback signal. Our proposed quality metric is based on blur assessment of deconvolved patches, identifying the best PSF and computing its relative quality. A Kalman-filter-based extremum-seeking approach is employed to find the optimum value of the controlled variable. To obtain better restoration parameters, learning algorithms such as multilayer perceptrons and bagged decision trees estimate the generic PSF support size instead of trial-and-error methods; the problem is modeled as a combination of pattern classification and regression over multiple training features, including noise metrics, blur metrics, and low-level statistics. A multi-objective genetic algorithm selects key patches from multiple saliency maps, which improves performance and avoids wasted computation on ineffectual regions of the image. The proposed scheme is shown to outperform corresponding open-loop schemes, which often fail or require many assumptions about the images, yielding sub-optimal results.
19.
20. Virtanen T, Nuutinen M, Vaahteranoksa M, Oittinen P, Hakkinen J. CID2013: a database for evaluating no-reference image quality assessment algorithms. IEEE Transactions on Image Processing 2015; 24:390-402. [PMID: 25494511] [DOI: 10.1109/tip.2014.2378061]
Abstract
This paper presents a new database, CID2013, to address the use of no-reference (NR) image quality assessment algorithms on images with multiple distortions. Current NR algorithms struggle to handle images with many concurrent distortion types, such as real photographic images captured by different digital cameras. The database consists of six image sets; on average, 30 subjects evaluated 12-14 devices depicting eight different scenes, for a total of 79 different cameras, 480 images, and 188 subjects (67% female). The subjective evaluation method was a hybrid absolute category rating-pair comparison developed for the study and presented in this paper. This method uses a slideshow of all images within a scene so that the test images serve as references for each other. In addition to mean opinion scores, the images are rated on sharpness, graininess, lightness, and color saturation scales. The CID2013 database contains the images used in the experiments with the full subjective data plus extensive background information on the subjects, and is made freely available to the research community.
21. Sang QB, Wu XJ, Li CF, Lu Y. Blind image blur assessment using singular value similarity and blur comparisons. PLoS One 2014; 9:e108073. [PMID: 25247555] [PMCID: PMC4172683] [DOI: 10.1371/journal.pone.0108073]
Abstract
The increasing number of demanding consumer image applications has led to increased interest in no-reference objective image quality assessment (IQA) algorithms. In this paper, we propose a new blind blur index for still images based on singular value similarity. The algorithm consists of three steps. First, a re-blurred image is produced by applying a Gaussian blur to the test image. Second, a singular value decomposition is performed on the test image and the re-blurred image. Finally, an image blur index is constructed from the similarity of their singular values. Experimental results on four simulated databases demonstrate that the proposed algorithm correlates highly with human judgment when assessing the blur or noise distortion of images.
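A minimal sketch of the three-step algorithm the abstract lists: re-blur the test image with a Gaussian, take the SVD of both images, and score blur by the similarity of their singular-value spectra. The cosine similarity used here is a plausible stand-in for the paper's exact similarity measure.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_index(img, sigma=3.0):
    reblurred = gaussian_filter(img, sigma)                 # step 1: re-blur
    s1 = np.linalg.svd(img, compute_uv=False)               # step 2: SVD of both
    s2 = np.linalg.svd(reblurred, compute_uv=False)
    # step 3: similarity of the singular-value spectra
    return float(s1 @ s2 / (np.linalg.norm(s1) * np.linalg.norm(s2)))

sharp = np.random.rand(128, 128)
print(blur_index(sharp), blur_index(gaussian_filter(sharp, 2.0)))
# A sharp image changes more under re-blurring, so its similarity is lower.
```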
Affiliation(s)
- Qing-Bing Sang: Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu, China
- Xiao-Jun Wu: Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu, China
- Chao-Feng Li: Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu, China
- Yin Lu: Computer Science Department, Texas Tech University, Lubbock, Texas, United States of America
22. Gao X, Gao F, Tao D, Li X. Universal blind image quality assessment metrics via natural scene statistics and multiple kernel learning. IEEE Transactions on Neural Networks and Learning Systems 2013; 24:2013-2026. [PMID: 24805219] [DOI: 10.1109/tnnls.2013.2271356]
Abstract
Universal blind image quality assessment (IQA) metrics that can handle various distortions are of great importance to image processing systems, because in practice neither ground truths nor the distortion types are always available. Existing state-of-the-art universal blind IQA algorithms are based on natural scene statistics (NSS). Although NSS-based metrics have achieved promising performance, they have several limitations. First, they use either the Gaussian scale mixture model or the generalized Gaussian density to predict the non-Gaussian marginal distribution of wavelet, Gabor, or discrete cosine transform coefficients, and the prediction error prevents the extracted features from accurately reflecting changes in non-Gaussianity (NG). Second, existing algorithms model local dependency (LD) with joint statistical models and structural similarity; although LD essentially encodes the information redundancy in natural images, these models do not measure it with information divergence. Third, the exponential decay characteristic (EDC), the property of natural images that large or small wavelet coefficient magnitudes tend to persist across scales and which is highly correlated with image degradation, has not previously been applied to universal blind IQA metrics. Finally, all existing universal blind IQA metrics use the same similarity measure for different features, even though those features have different properties. To address these problems, we construct new universal blind quality indicators using all three types of NSS, namely NG, LD, and EDC, and incorporate the heterogeneity-handling capability of multiple kernel learning (MKL). By analyzing how different distortions affect these statistical properties, we present two universal blind quality assessment models: an NSS global scheme and an NSS two-step scheme. In the proposed metrics, 1) we exploit the NG of natural images using the original marginal distribution of wavelet coefficients; 2) we measure correlations between wavelet coefficients using mutual information as defined in information theory; 3) we use EDC features directly in universal blind image quality prediction; and 4) we introduce MKL to measure the similarity of different features with different kernels. Thorough experimental results on the Laboratory for Image and Video Engineering database II and the Tampere Image Database 2008 demonstrate that both metrics are in remarkably high consistency with human perception, and outperform representative universal blind algorithms as well as some standard full-reference quality indexes across various types of distortion.
23. Dong L, Su J, Izquierdo E. Scene-oriented hierarchical classification of blurry and noisy images. IEEE Transactions on Image Processing 2012; 21:2534-2545. [PMID: 22334004] [DOI: 10.1109/tip.2012.2187528]
Abstract
A system for scene-oriented hierarchical classification of blurry and noisy images is proposed that attempts to simulate important features of human visual perception. The underlying approach is based on three strategies: extraction of essential signatures captured from a global context, simulating the global pathway; highlight detection based on locally conspicuous features of the reconstructed image, simulating the local pathway; and hierarchical classification of the extracted features using probabilistic techniques. The hierarchical classification draws on input from both the local and global pathways. Visual context is exploited by combining Gabor filtering with principal component analysis. In parallel, a pseudo-restoration process is applied together with an affine-invariant approach to improve the accuracy of detecting locally conspicuous features. Subsequently, the locally conspicuous features and the global essential signature are combined and clustered by a Monte Carlo approach. Finally, the clustered features are fed to a self-organizing tree algorithm to generate the final hierarchical classification results. Selected representative results of a comprehensive experimental evaluation validate the proposed system.
Affiliation(s)
- Le Dong: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China