51
Picetti F, Mandelli S, Bestagini P, Lipari V, Tubaro S. DIPPAS: a deep image prior PRNU anonymization scheme. EURASIP Journal on Information Security 2022. DOI: 10.1186/s13635-022-00128-7.
Abstract
Source device identification is an important topic in image forensics since it allows tracing back the origin of an image. Its forensic counterpart is source device anonymization, that is, masking any trace on the image that can be useful for identifying the source device. A typical trace exploited for source device identification is the photo response non-uniformity (PRNU), a noise pattern left by the device on the acquired images. In this paper, we devise a methodology for suppressing such a trace from natural images without a significant impact on image quality. Specifically, we turn PRNU anonymization into the combination of a global optimization problem in a deep image prior (DIP) framework followed by local post-processing operations. In a nutshell, a convolutional neural network (CNN) acts as a generator and iteratively returns several images with attenuated PRNU traces. By exploiting straightforward local post-processing and assembly on these images, we produce a final image that is anonymized with respect to the source PRNU while still maintaining high visual quality. In contrast to widely adopted deep learning paradigms, the CNN is not trained on a set of input-target pairs of images. Instead, it is optimized to reconstruct output images from the original image under analysis itself. This makes the approach particularly suitable in scenarios where large heterogeneous databases are analyzed, and it prevents any problem due to a lack of generalization. Through numerical examples on publicly available datasets, we show our methodology to be effective compared to state-of-the-art techniques.
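As a rough illustration of the deep image prior optimization that this entry builds on, the sketch below fits an untrained generator to the single image under analysis; the `unet` module, the fixed random input, the pure-MSE loss, and the snapshot schedule are illustrative assumptions rather than the DIPPAS configuration described in the paper.

```python
import torch

def dip_reconstruct(image, unet, n_iters=2000, lr=0.01, snapshot_every=100):
    """Deep image prior loop: optimize an untrained CNN to reproduce one image.

    image: 1xCxHxW tensor (the photo whose PRNU trace is to be attenuated).
    unet:  an untrained encoder-decoder generator (architecture is a free choice here).
    Returns intermediate reconstructions; a DIPPAS-style pipeline would then apply
    local post-processing and assembly to these snapshots.
    """
    z = torch.randn_like(image)                       # fixed random code fed to the generator
    opt = torch.optim.Adam(unet.parameters(), lr=lr)
    snapshots = []
    for it in range(n_iters):
        opt.zero_grad()
        x_hat = unet(z)
        loss = torch.nn.functional.mse_loss(x_hat, image)   # fidelity to the original image only
        loss.backward()
        opt.step()
        if (it + 1) % snapshot_every == 0:            # keep intermediate reconstructions
            snapshots.append(x_hat.detach().clone())
    return snapshots
```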
52
High-Resolution ISAR Imaging Based on Plug-and-Play 2D ADMM-Net. Remote Sensing 2022. DOI: 10.3390/rs14040901.
Abstract
We propose a deep learning architecture, dubbed Plug-and-play 2D ADMM-Net (PAN), by combining model-driven deep networks and data-driven deep networks for effective high-resolution 2D inverse synthetic aperture radar (ISAR) imaging with various signal-to-noise ratios (SNR) and incomplete data scenarios. First, a sparse observation model of 2D ISAR imaging is established, and a 2D ADMM algorithm is presented. On this basis, using the plug and play (PnP) technique, PnP 2D ADMM is proposed, by combining the 2D ADMM algorithm and the deep denoising network DnCNN. Then, we unroll and generalize the PnP 2D ADMM to the PAN architecture, in which all adjustable parameters in the reconstruction layers, denoising layers, and multiplier update layers are learned by end-to-end training through back-propagation. Experimental results showed that the PAN with a single parameter set can achieve noise-robust ISAR imaging with superior reconstruction performance on incomplete simulated and measured data under different SNRs.
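For orientation, a generic plug-and-play ADMM iteration of the kind that PAN unrolls is sketched below; the `A`/`At` callables stand in for the 2D ISAR observation operator and its adjoint, `denoiser` stands in for DnCNN, and the inner gradient steps are a simplification of the paper's reconstruction layer, not its actual update.

```python
import numpy as np

def pnp_admm(y, A, At, denoiser, rho=1.0, n_outer=30, n_inner=5, step=0.5):
    """Plug-and-play ADMM sketch for 0.5*||A(x) - y||^2 with a denoiser-defined prior.

    A / At    : callables applying the observation operator and its adjoint.
    denoiser  : callable denoiser(v, sigma), e.g. a wrapper around a pretrained DnCNN.
    """
    x = At(y)
    z = x.copy()
    u = np.zeros_like(x)
    for _ in range(n_outer):
        # x-update: approximately solve min_x 0.5||A(x)-y||^2 + 0.5*rho*||x - (z - u)||^2
        for _ in range(n_inner):
            grad = At(A(x) - y) + rho * (x - (z - u))
            x = x - step * grad
        # z-update: the proximal step is replaced by the plug-in denoiser
        z = denoiser(x + u, sigma=np.sqrt(1.0 / rho))
        # scaled dual (multiplier) update
        u = u + x - z
    return x
```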
53
Multiframe blind restoration with image quality prior. Appl Soft Comput 2022. DOI: 10.1016/j.asoc.2022.108632.
54
Zhang Y, Li J, Li X, Wang B, Li T. Image Stripe Noise Removal Based on Compressed Sensing. Int J Pattern Recogn 2022. DOI: 10.1142/s0218001422540040.
Abstract
The sensors or electronic components are vulnerable to interference in the camera’s imaging process, usually leading to random directional stripes. Therefore, a method of stripe noise removal based on compressed sensing is proposed. First, the measurement matrix of the image with stripe noise is established, which makes the stripe images equivalent to the observation of the original image. Second, the relationships between the corresponding coefficients of adjacent scales are defined. On this basis, the bivariate threshold function is set in the curvelet sparse domain to represent the features of images. Finally, the Landweber iteration algorithm of alternating convex projection and filtering operation is achieved. Furthermore, to accelerate the noise removal at the initial stage of iteration and preserve the image details later, the exponential threshold function is utilized. This method does not need many samples, which is different from the current deep learning method. The experimental results show that the proposed algorithm represents excellent performance in removing the stripes and preserving the texture details. In addition, the PSNR of the denoised image has been dramatically improved compared with similar algorithms.
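A bare-bones version of the iteration described in this abstract, under simplifying assumptions: the curvelet transform and the bivariate threshold are replaced by a generic transform pair and plain soft-thresholding with a decaying threshold, so this is only a sketch of the general Landweber-plus-shrinkage scheme, not the authors' algorithm.

```python
import numpy as np

def landweber_destripe(y, H, Ht, transform, inv_transform, n_iters=50, tau=1.0):
    """Landweber iteration with transform-domain shrinkage.

    H / Ht: striping observation operator and its adjoint (placeholders).
    transform / inv_transform: a sparsifying transform pair (placeholder for curvelets).
    """
    x = Ht(y)
    for k in range(n_iters):
        x = x + tau * Ht(y - H(x))                    # Landweber (gradient) step
        coeffs = transform(x)
        thr = 0.1 * np.exp(-k / n_iters)              # threshold decays over the iterations
        coeffs = np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)  # soft shrinkage
        x = inv_transform(coeffs)
    return x
```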
Affiliation(s)
- Yan Zhang, School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, P. R. China
- Jie Li, School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, P. R. China
- Xinyue Li, School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, P. R. China
- Bin Wang, School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, P. R. China
- Tiange Li, Natural Gas Branch Company of Daqing Oilfield Limited Company, Daqing, Heilongjiang 163453, P. R. China
55
Liu R, Ma L, Yuan X, Zeng S, Zhang J. Task-Oriented Convex Bilevel Optimization With Latent Feasibility. IEEE Transactions on Image Processing 2022; 31:1190-1203. PMID: 35015638. DOI: 10.1109/tip.2022.3140607.
Abstract
This paper first proposes a convex bilevel optimization paradigm to formulate and optimize popular learning and vision problems in real-world scenarios. Different from conventional approaches, which directly design their iteration schemes based on a given problem formulation, we introduce a task-oriented energy as our latent constraint, which integrates richer task information. By explicitly re-characterizing the feasibility, we establish an efficient and flexible algorithmic framework to tackle convex models with both a shrunken solution space and powerful auxiliary information (based on domain knowledge and the data distribution of the task). In theory, we present the convergence analysis of our latent-feasibility re-characterization based numerical strategy. We also analyze the stability of the theoretical convergence under computational error perturbation. Extensive numerical experiments are conducted to verify our theoretical findings and evaluate the practical performance of our method on different applications.
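For readers unfamiliar with the setting, a generic convex bilevel template (standard notation, not the paper's exact formulation) reads:

```latex
\begin{aligned}
\min_{x}\;\; & f(x) \\
\text{s.t.}\;\; & x \in \operatorname*{arg\,min}_{z}\; g(z),
\end{aligned}
```

where the upper-level loss $f$ encodes the task objective and the lower-level energy $g$ defines the latent feasible set over which $f$ is minimized.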
56
Zhou C, Kong Y, Zhang C, Sun L, Wu D, Zhou C. A Hybrid Sparse Representation Model for Image Restoration. Sensors (Basel) 2022; 22:537. PMID: 35062497. PMCID: PMC8778763. DOI: 10.3390/s22020537.
Abstract
Group-based sparse representation (GSR) uses the image nonlocal self-similarity (NSS) prior to group similar image patches, and then performs sparse representation. However, the traditional GSR model restores the image by training on degraded images, which leads to inevitable over-fitting of the data in the training model, resulting in poor image restoration results. In this paper, we propose a new hybrid sparse representation model (HSR) for image restoration. The proposed HSR model is improved in two aspects. On the one hand, the proposed HSR model exploits the NSS priors of both degraded images and external image datasets, making the model complementary in feature space and the image plane. On the other hand, we introduce a joint sparse representation model to make better use of the local sparsity and NSS characteristics of the images. This joint model integrates the patch-based sparse representation (PSR) model and the GSR model while retaining the advantages of both, so that the sparse representation model is unified. Extensive experimental results show that the proposed hybrid model outperforms several existing image recovery algorithms in both objective and subjective evaluations.
Affiliation(s)
- Caiyue Zhou, School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Yanfen Kong, Department of Information Engineering, Weihai Ocean Vocational College, Rongcheng 264300, China
- Chuanyong Zhang, Department of Information Engineering, Weihai Ocean Vocational College, Rongcheng 264300, China
- Lin Sun, School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Dongmei Wu, School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Chongbo Zhou, School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China; Department of Information Engineering, Weihai Ocean Vocational College, Rongcheng 264300, China
57
Kong S, Wang W, Feng X, Jia X. Deep RED Unfolding Network for Image Restoration. IEEE Transactions on Image Processing 2022; 31:852-867. PMID: 34951845. DOI: 10.1109/tip.2021.3136623.
Abstract
The deep unfolding network (DUN) provides an efficient framework for image restoration. It consists of a regularization module and a data-fitting module. In existing DUN models, it is common to directly use a deep convolutional neural network (DCNN) as the regularization module and to perform data fitting before regularization in each iteration/stage. In this work, we present a DUN that incorporates a new regularization module and places the regularization module before the data-fitting module. The proposed regularization model is derived by using regularization by denoising (RED) and plugging into it a newly designed DCNN. For the data-fitting module, we use the closed-form solution with the fast Fourier transform (FFT). The resulting DRED-DUN model has some major advantages. First, the regularization model inherits the flexibility of learned, image-adaptive regularization and the interpretability of RED. Second, the DRED-DUN model is an end-to-end trainable DUN, which learns the regularization network and other parameters jointly, thus leading to better restoration performance than the plug-and-play framework. Third, extensive experiments show that our proposed model significantly outperforms state-of-the-art model-based and learning-based methods in terms of PSNR as well as visual quality. In particular, our method has much better capability in recovering salient image components such as edges and small-scale textures.
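A minimal sketch of one regularization-then-data-fitting stage in the spirit described above, assuming a deconvolution forward model so that the data-fitting step has an FFT closed form; the learned DCNN denoiser and the exact DRED-DUN unfolding are not reproduced here.

```python
import numpy as np

def red_stage(x, y, kernel_fft, denoiser, mu=0.1, rho=1.0):
    """One stage: RED-style regularization step, then closed-form FFT data fitting.

    Assumes y = k * x + noise (circular convolution); kernel_fft is the FFT of the
    blur kernel padded to the image size, denoiser is any Gaussian denoiser.
    """
    # Regularization module: RED gradient step, pulling x toward its denoised version.
    z = x - mu * (x - denoiser(x))
    # Data-fitting module: argmin_x 0.5||k*x - y||^2 + 0.5*rho*||x - z||^2 in Fourier space.
    num = np.conj(kernel_fft) * np.fft.fft2(y) + rho * np.fft.fft2(z)
    den = np.abs(kernel_fft) ** 2 + rho
    return np.real(np.fft.ifft2(num / den))
```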
58
Su T, Cui Z, Yang J, Zhang Y, Liu J, Zhu J, Gao X, Fang S, Zheng H, Ge Y, Liang D. Generalized deep iterative reconstruction for sparse-view CT imaging. Phys Med Biol 2021; 67. PMID: 34847538. DOI: 10.1088/1361-6560/ac3eae.
Abstract
Sparse-view CT is a promising approach for reducing the X-ray radiation dose in clinical CT imaging. However, the CT images reconstructed by the conventional filtered backprojection (FBP) algorithm suffer from severe streaking artifacts. Iterative reconstruction (IR) algorithms have been widely adopted to mitigate these streaking artifacts, but they may prolong the CT imaging time due to intense data-specific computations. Recently, model-driven deep learning (DL) CT image reconstruction methods, which unroll the iterative optimization procedure into a deep neural network, have shown exciting prospects for improving image quality and shortening the reconstruction time. In this work, we explore a generalized unrolling scheme for such iterative models to further enhance their performance on sparse-view CT imaging. In this scheme, the iteration parameters, regularization term, data-fidelity term, and even the mathematical operations are all learned and optimized via network training. Results from numerical and experimental sparse-view CT imaging demonstrate that the newly proposed network with the maximum degree of generalization provides the best reconstruction performance.
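A skeleton of an unrolled iterative reconstruction network of the general kind discussed above; the projector callables `A`/`At`, the number of stages, the tiny CNN regularizer, and the nonnegativity clamp are illustrative stand-ins, not the authors' generalized architecture.

```python
import torch
import torch.nn as nn

class UnrolledRecon(nn.Module):
    """K unrolled stages of x <- x - step * At(A(x) - y) - reg_net(x), all parameters learned."""

    def __init__(self, A, At, n_stages=10):
        super().__init__()
        self.A, self.At = A, At                       # forward projector and its adjoint (callables)
        self.steps = nn.Parameter(torch.full((n_stages,), 0.1))   # learned step sizes
        self.regs = nn.ModuleList([
            nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(32, 1, 3, padding=1))
            for _ in range(n_stages)
        ])

    def forward(self, y, x0):
        x = x0                                        # e.g. an FBP initialization, shape (B, 1, H, W)
        for step, reg in zip(self.steps, self.regs):
            x = x - step * self.At(self.A(x) - y) - reg(x)   # data-fidelity step + learned regularizer
            x = torch.clamp(x, min=0.0)               # simple nonnegativity constraint
        return x
```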
Affiliation(s)
- Ting Su, Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Zhuoxu Cui, Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jiecheng Yang, Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Yunxin Zhang, Beijing Jishuitan Hospital, Beijing, China
- Jian Liu, Beijing Tiantan Hospital, Beijing, China
- Jiongtao Zhu, Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Xiang Gao, Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Shibo Fang, Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Hairong Zheng, Paul C. Lauterbur Research Centre for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, China
- Yongshuai Ge, Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Dong Liang, Paul C. Lauterbur Research Centre for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen 518055, China
59
Image Denoising Using Nonlocal Regularized Deep Image Prior. Symmetry (Basel) 2021. DOI: 10.3390/sym13112114.
Abstract
Deep neural networks have shown great potential in various low-level vision tasks, leading to several state-of-the-art image denoising techniques. Training a deep neural network in a supervised fashion usually requires the collection of a great number of examples and the consumption of a significant amount of time. However, the collection of training samples is very difficult for some application scenarios, such as fully sampled magnetic resonance imaging data and satellite remote sensing data. In this paper, we overcome the problem of a lack of training data by using an unsupervised deep-learning-based method. Specifically, we propose a method based on the deep image prior (DIP), which only requires the noisy image itself as training data, without any clean data. It infers the natural image from a random input and the corrupted observation by performing the correction via a convolutional network. We improve the original DIP method as follows. First, the original optimization objective function is modified by adding nonlocal regularizers, consisting of a spatial filter and a frequency-domain filter, to promote the gradient sparsity of the solution. Second, we solve the optimization problem with the alternating direction method of multipliers (ADMM) framework, resulting in two separate optimization problems, including a symmetric U-Net training step and a plug-and-play proximal denoising step. As such, the proposed method exploits the powerful denoising ability of both deep neural networks and nonlocal regularizations. Experiments validate the effectiveness of leveraging a combination of DIP and nonlocal regularizers, and demonstrate the superior performance of the proposed method both quantitatively and visually compared with the original DIP method.
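The alternation described above can be summarized as in the sketch below: a network-fitting (DIP) step coupled to a plug-and-play proximal step through an ADMM-style quadratic penalty; the paper's spatial and frequency-domain nonlocal filters are abstracted into a single `prox_denoiser` placeholder, so this is only a structural outline.

```python
import torch

def dip_admm_denoise(noisy, net, prox_denoiser, n_outer=20, n_inner=100, rho=0.5, lr=0.01):
    """Alternate (i) DIP fitting with a quadratic coupling term and (ii) a plug-and-play
    proximal denoising step, with a scaled dual variable u."""
    z = noisy.clone()
    u = torch.zeros_like(noisy)
    seed = torch.randn_like(noisy)                    # fixed random input to the generator
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(n_outer):
        # (i) network (U-Net / DIP) training step
        for _ in range(n_inner):
            opt.zero_grad()
            x = net(seed)
            loss = torch.nn.functional.mse_loss(x, noisy) \
                 + 0.5 * rho * torch.mean((x - z + u) ** 2)
            loss.backward()
            opt.step()
        x = net(seed).detach()
        # (ii) plug-and-play proximal step (stands in for the nonlocal regularizers)
        z = prox_denoiser(x + u)
        u = u + x - z
    return z
```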
60
Cheng J, Cui ZX, Huang W, Ke Z, Ying L, Wang H, Zhu Y, Liang D. Learning Data Consistency and its Application to Dynamic MR Imaging. IEEE Transactions on Medical Imaging 2021; 40:3140-3153. PMID: 34252025. DOI: 10.1109/tmi.2021.3096232.
Abstract
Magnetic resonance (MR) image reconstruction from undersampled k-space data can be formulated as a minimization problem involving data consistency and an image prior. Existing deep learning (DL)-based methods for MR reconstruction employ deep networks to exploit the prior information and integrate the prior knowledge into the reconstruction under the explicit constraint of data consistency, without considering the real distribution of the noise. In this work, we propose a new DL-based approach termed Learned DC that implicitly learns the data consistency with deep networks, corresponding to the actual probability distribution of the system noise. The data consistency term and the prior knowledge are both embedded in the weights of the networks, which provides an entirely implicit manner of learning the reconstruction model. We evaluated the proposed approach with highly undersampled dynamic data, including dynamic cardiac cine data with up to 24-fold acceleration and dynamic rectum data with an acceleration factor equal to the number of phases. Experimental results demonstrate the superior performance of Learned DC, both quantitatively and qualitatively, compared with the state-of-the-art.
61
Liu R, Mu P, Zhang J. Investigating Customization Strategies and Convergence Behaviors of Task-Specific ADMM. IEEE Transactions on Image Processing 2021; 30:8278-8292. PMID: 34559653. DOI: 10.1109/tip.2021.3113796.
Abstract
The Alternating Direction Method of Multipliers (ADMM) has been a popular algorithmic framework for separable optimization problems with linear constraints. Because numerical ADMM exploits neither the particular structure of the problem at hand nor the input data information, leveraging task-specific modules (e.g., neural networks and other data-driven architectures) to extend ADMM is a significant but challenging task. This work focuses on designing a flexible algorithmic framework to incorporate various task-specific modules (with no additional constraints) to improve the performance of ADMM in real-world applications. Specifically, we propose Guidance from Optimality (GO), a new customization strategy, to embed task-specific modules into ADMM (GO-ADMM). By introducing an optimality-based criterion to guide the propagation, GO-ADMM establishes an updating scheme agnostic to the choice of additional modules. Existing task-specific methods simply plug their task-specific modules into the numerical iterations in a straightforward manner, and even with some restrictive constraints on the plug-in modules, they can only obtain relatively weak convergence properties for the resulting ADMM iterations. Fortunately, without any restrictions on the embedded modules, we prove the convergence of GO-ADMM in terms of objective values and constraint violations, and derive the worst-case convergence rate measured by iteration complexity. Extensive experiments are conducted to verify the theoretical results and demonstrate the efficiency of GO-ADMM.
62
Jung H, Kim Y, Jang H, Ha N, Sohn K. Multi-Task Learning Framework for Motion Estimation and Dynamic Scene Deblurring. IEEE Transactions on Image Processing 2021; 30:8170-8183. PMID: 34550887. DOI: 10.1109/tip.2021.3113185.
Abstract
Motion blur, which disturbs human and machine perceptions of a scene, has been considered an unnecessary artifact that should be removed. However, the blur can be a useful clue to understanding the dynamic scene, since various sources of motion generate different types of artifacts. Motivated by the relationship between motion and blur, we propose a motion-aware feature learning framework for dynamic scene deblurring through multi-task learning. Our multi-task framework simultaneously estimates a deblurred image and a motion field from a blurred image. We design the encoder-decoder architectures for two tasks, and the encoder part is shared between them. Our motion estimation network could effectively distinguish between different types of blur, which facilitates image deblurring. Understanding implicit motion information through image deblurring could improve the performance of motion estimation. In addition to sharing the network between two tasks, we propose a reblurring loss function to optimize the overall parameters in our multi-task architecture. We provide an intensive analysis of complementary tasks to show the effectiveness of our multi-task framework. Furthermore, the experimental results demonstrate that the proposed method outperforms the state-of-the-art deblurring methods with respect to both qualitative and quantitative evaluations.
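A toy rendering of the shared-encoder, two-decoder layout described above; the modules are deliberately minimal, and the paper's actual encoder-decoder designs and reblurring loss are not reproduced. In training, a reblurring loss would additionally synthesize a blurry image from the predicted sharp image and motion field and compare it against the input.

```python
import torch.nn as nn

class MultiTaskDeblur(nn.Module):
    """One encoder shared by a deblurring head and a motion-field head."""

    def __init__(self, feat=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
        self.deblur_head = nn.Conv2d(feat, 3, 3, padding=1)   # restored sharp image
        self.motion_head = nn.Conv2d(feat, 2, 3, padding=1)   # per-pixel 2D motion field

    def forward(self, blurry):
        f = self.encoder(blurry)          # features shared by both tasks
        return self.deblur_head(f), self.motion_head(f)
```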
63
Bian L, Wang Y, Zhang J. Generalized MSFA Engineering With Structural and Adaptive Nonlocal Demosaicing. IEEE Transactions on Image Processing 2021; 30:7867-7877. PMID: 34487494. DOI: 10.1109/tip.2021.3108913.
Abstract
The emerging multispectral-filter-array (MSFA) cameras require generalized demosaicing for MSFA engineering. The existing interpolation, compressive sensing and deep learning based methods suffer from either limited reconstruction accuracy or poor generalization. In this work, we report a generalized demosaicing method with structural and adaptive nonlocal optimization, enabling boosted reconstruction accuracy for different MSFAs. The advantages lie in the following three aspects. First, the nonlocal low-rank optimization is applied and extended to the multiple spatial-spectral-temporal dimensions to exploit more crucial details. Second, the block matching accuracy is promoted by employing a novel structural similarity metric instead of the conventional Euclidean distance. Third, the running efficiency is boosted by an adaptive iteration strategy. We built a prototype system to capture raw mosaic images under different MSFAs, and used the technique as an off-the-shelf tool to demonstrate MSFA engineering. The experiments show that the binary tree (BT) based filter array produces higher accuracy than the random and regular ones for different number of channels.
64
65
Zhou H, Feng H, Xu W, Xu Z, Li Q, Chen Y. Deep denoiser prior based deep analytic network for lensless image restoration. Optics Express 2021; 29:27237-27253. PMID: 34615144. DOI: 10.1364/oe.432544.
Abstract
Mask based lensless imagers have huge application prospects due to their ultra-thin body. However, the visual perception of the restored images is poor due to the ill conditioned nature of the system. In this work, we proposed a deep analytic network by imitating the traditional optimization process as an end-to-end network. Our network combines analytic updates with a deep denoiser prior to progressively improve lensless image quality over a few iterations. The convergence is proven mathematically and verified in the results. In addition, our method is universal in non-blind restoration. We detailed the solution for the general inverse problem and conducted five groups of deblurring experiments as examples. Both experimental results demonstrate that our method achieves superior performance against the existing state-of-the-art methods.
66
Jing L, Lv S. Art Image Processing and Color Objective Evaluation Based on Multicolor Space Convolutional Neural Network. Computational Intelligence and Neuroscience 2021; 2021:4273963. PMID: 34413888. PMCID: PMC8369161. DOI: 10.1155/2021/4273963.
Abstract
A convolutional neural network's weight sharing feature can significantly reduce the cumbersome degree of the network structure and reduce the number of weights that need to be trained. The model can directly input the original image, without the process of feature extraction and data reconstruction in common classification algorithms. This kind of network structure has got a good performance in image processing and recognition. Based on the color objective evaluation method of the convolutional neural network, this paper proposes a convolutional neural network model based on multicolor space and builds a convolutional neural network based on VGGNet (Visual Geometry Group Net) in three different color spaces, namely, RGB (Red Green Blue), LAB (Luminosity a b), and HSV (Hue Saturation Value) color spaces. We carry out research on data input processing and model output selection and perform feature extraction and prediction of color images. After a model output selection judger, the prediction results of different color spaces are merged and the final prediction category is output. This article starts with the multidimensional correlation for visual art image processing and color objective evaluation. Considering the relationship between the evolution of artistic painting style and the color of artistic images, this article explores the characteristics of artistic image dimensions. In view of different factors, corresponding knowledge extraction strategies are designed to generate color label distribution, provide supplementary information of art history for input images, and train the model on a multitask learning framework. In this paper, experiments on multiple art painting data sets prove that this method is superior to single-color label classification methods.
Affiliation(s)
- Liang Jing, Hubei Institute of Fine Arts, Wuhan 430205, Hubei, China
- Shifeng Lv, Luxshare Precision Industry Co., Ltd., Shanghai 200126, China
67
Sun L, Dong W, Li X, Wu J, Li L, Shi G. Deep Maximum a Posterior Estimator for Video Denoising. Int J Comput Vis 2021. DOI: 10.1007/s11263-021-01510-7.
68
69
Back Propagation Neural Network-Based Ultrasound Image for Diagnosis of Cartilage Lesions in Knee Osteoarthritis. Journal of Healthcare Engineering 2021; 2021:2584291. PMID: 34373773. PMCID: PMC8349257. DOI: 10.1155/2021/2584291.
Abstract
Objective: To explore the application value of ultrasound imaging based on the back-propagation (BP) neural network algorithm in knee osteoarthritis (KOA) and to evaluate its effect in the diagnosis of KOA cartilage lesions, 98 patients who had been admitted to our hospital, diagnosed with KOA, and had undergone arthroscopic soft tissue examination were randomly selected. According to whether image processing was performed, the ultrasound images of all patients were divided into two groups: the control group consisted of the images before processing, and the experimental group consisted of the images after processing and optimization. The consistency of the ultrasound findings before and after processing with the arthroscopy results was compared. The results showed that the staging accuracy of the control group was 68.3% and that of the experimental group was 76.9%. The accuracy of staging cartilage degeneration in the experimental group was higher than that in the control group, but the difference was not significant (P > 0.05). The kappa coefficient of the experimental group was 0.61, and that of the control group was 0.40; the kappa coefficient of the experimental group was higher, and the difference was significant (P < 0.05). Conclusion: The diagnostic performance of ultrasound images processed by the BP neural network was superior to that of conventional ultrasound images, reflecting the good application prospects of neural networks in image processing.
70
Deep low-rank plus sparse network for dynamic MR imaging. Med Image Anal 2021; 73:102190. PMID: 34340107. DOI: 10.1016/j.media.2021.102190.
Abstract
In dynamic magnetic resonance (MR) imaging, low-rank plus sparse (L+S) decomposition, or robust principal component analysis (PCA), has achieved stunning performance. However, the selection of the parameters of L+S is empirical, and the acceleration rate is limited, which are common failings of iterative compressed sensing MR imaging (CS-MRI) reconstruction methods. Many deep learning approaches have been proposed to address these issues, but few of them use a low-rank prior. In this paper, a model-based low-rank plus sparse network, dubbed L+S-Net, is proposed for dynamic MR reconstruction. In particular, we use an alternating linearized minimization method to solve the optimization problem with low-rank and sparse regularization. Learned soft singular value thresholding is introduced to ensure the clear separation of the L component and S component. Then, the iterative steps are unrolled into a network in which the regularization parameters are learnable. We prove that the proposed L+S-Net achieves global convergence under two standard assumptions. Experiments on retrospective and prospective cardiac cine datasets show that the proposed model outperforms state-of-the-art CS and existing deep learning methods and has great potential for extremely high acceleration factors (up to 24×).
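The soft singular-value thresholding mentioned above is, in its hand-crafted form, the operator sketched below; in L+S-Net the thresholds become learnable and the update is embedded in an unrolled network, which is not shown here.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(M, tau):
    """Element-wise soft thresholding: proximal operator of tau * (l1 norm)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def ls_update(L, S, X, tau_l=1.0, tau_s=0.05):
    """One naive L+S split of a data-consistent estimate X (space-time matrix, frames as columns):
    the low-rank part absorbs the slowly varying background, the sparse part the dynamics."""
    L = svt(X - S, tau_l)
    S = soft(X - L, tau_s)
    return L, S
```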
71
You D, Zhang J, Xie J, Chen B, Ma S. COAST: COntrollable Arbitrary-Sampling NeTwork for Compressive Sensing. IEEE Transactions on Image Processing 2021; 30:6066-6080. PMID: 34185643. DOI: 10.1109/tip.2021.3091834.
Abstract
Recent deep network-based compressive sensing (CS) methods have achieved great success. However, most of them regard different sampling matrices as different independent tasks and need to train a specific model for each target sampling matrix. Such practices give rise to inefficiency in computing and suffer from poor generalization ability. In this paper, we propose a novel COntrollable Arbitrary-Sampling neTwork, dubbed COAST, to solve CS problems of arbitrary-sampling matrices (including unseen sampling matrices) with one single model. Under the optimization-inspired deep unfolding framework, our COAST exhibits good interpretability. In COAST, a random projection augmentation (RPA) strategy is proposed to promote the training diversity in the sampling space to enable arbitrary sampling, and a controllable proximal mapping module (CPMM) and a plug-and-play deblocking (PnP-D) strategy are further developed to dynamically modulate the network features and effectively eliminate the blocking artifacts, respectively. Extensive experiments on widely used benchmark datasets demonstrate that our proposed COAST is not only able to handle arbitrary sampling matrices with one single model but also to achieve state-of-the-art performance with fast speed.
72
Zhang Y, Tian Y, Kong Y, Zhong B, Fu Y. Residual Dense Network for Image Restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021; 43:2480-2495. PMID: 31985406. DOI: 10.1109/tpami.2020.2968521.
Abstract
Recently, deep convolutional neural network (CNN) has achieved great success for image restoration (IR) and provided hierarchical features at the same time. However, most deep CNN based IR models do not make full use of the hierarchical features from the original low-quality images; thereby, resulting in relatively-low performance. In this work, we propose a novel and efficient residual dense network (RDN) to address this problem in IR, by making a better tradeoff between efficiency and effectiveness in exploiting the hierarchical features from all the convolutional layers. Specifically, we propose residual dense block (RDB) to extract abundant local features via densely connected convolutional layers. RDB further allows direct connections from the state of preceding RDB to all the layers of current RDB, leading to a contiguous memory mechanism. To adaptively learn more effective features from preceding and current local features and stabilize the training of wider network, we proposed local feature fusion in RDB. After fully obtaining dense local features, we use global feature fusion to jointly and adaptively learn global hierarchical features in a holistic way. We demonstrate the effectiveness of RDN with several representative IR applications, single image super-resolution, Gaussian image denoising, image compression artifact reduction, and image deblurring. Experiments on benchmark and real-world datasets show that our RDN achieves favorable performance against state-of-the-art methods for each IR task quantitatively and visually.
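A compact PyTorch rendering of the residual dense block described above (densely connected convolutions, 1x1 local feature fusion, local residual learning); the layer count and growth rate are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """RDB: densely connected convs -> 1x1 local feature fusion -> local residual learning."""

    def __init__(self, channels=64, growth=32, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels + i * growth, growth, 3, padding=1),
                          nn.ReLU(inplace=True))
            for i in range(n_layers)
        ])
        self.fusion = nn.Conv2d(channels + n_layers * growth, channels, 1)  # local feature fusion

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))   # dense connections to all preceding outputs
        return x + self.fusion(torch.cat(feats, dim=1))    # local residual learning
```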
73
Zha Z, Wen B, Yuan X, Zhou JT, Zhou J, Zhu C. Triply Complementary Priors for Image Restoration. IEEE Transactions on Image Processing 2021; 30:5819-5834. PMID: 34133279. DOI: 10.1109/tip.2021.3086049.
Abstract
Recent works that utilize deep models have achieved superior results in various image restoration (IR) applications. Such approaches are typically supervised, requiring a corpus of training images with distributions similar to the images to be recovered. On the other hand, shallow methods, which are usually unsupervised, maintain promising performance in many inverse problems, e.g., image deblurring and image compressive sensing (CS), as they can effectively leverage the nonlocal self-similarity priors of natural images. However, most such methods are patch-based, leading to restored images with various artifacts due to naive patch aggregation, in addition to slow speed. Using either approach alone usually limits performance and generalizability in IR tasks. In this paper, we propose a joint low-rank and deep (LRD) image model, which contains a pair of triply complementary priors, namely internal and external, shallow and deep, and non-local and local priors. We then propose a novel hybrid plug-and-play (H-PnP) framework based on the LRD model for IR. Following this, a simple yet effective algorithm is developed to solve the proposed H-PnP based IR problems. Extensive experimental results on several representative IR tasks, including image deblurring, image CS and image deblocking, demonstrate that the proposed H-PnP algorithm achieves favorable performance compared to many popular or state-of-the-art IR methods in terms of both objective quality and visual perception.
74
Dong W, Zhou C, Wu F, Wu J, Shi G, Li X. Model-Guided Deep Hyperspectral Image Super-Resolution. IEEE Transactions on Image Processing 2021; 30:5754-5768. PMID: 33979283. DOI: 10.1109/tip.2021.3078058.
Abstract
The trade-off between spatial and spectral resolution is one of the fundamental issues in hyperspectral images (HSI). Given the challenges of directly acquiring high-resolution hyperspectral images (HR-HSI), a compromised solution is to fuse a pair of images: one has high-resolution (HR) in the spatial domain but low-resolution (LR) in spectral-domain and the other vice versa. Model-based image fusion methods including pan-sharpening aim at reconstructing HR-HSI by solving manually designed objective functions. However, such hand-crafted prior often leads to inevitable performance degradation due to a lack of end-to-end optimization. Although several deep learning-based methods have been proposed for hyperspectral pan-sharpening, HR-HSI related domain knowledge has not been fully exploited, leaving room for further improvement. In this paper, we propose an iterative Hyperspectral Image Super-Resolution (HSISR) algorithm based on a deep HSI denoiser to leverage both domain knowledge likelihood and deep image prior. By taking the observation matrix of HSI into account during the end-to-end optimization, we show how to unfold an iterative HSISR algorithm into a novel model-guided deep convolutional network (MoG-DCN). The representation of the observation matrix by subnetworks also allows the unfolded deep HSISR network to work with different HSI situations, which enhances the flexibility of MoG-DCN. Extensive experimental results are reported to demonstrate that the proposed MoG-DCN outperforms several leading HSISR methods in terms of both implementation cost and visual quality. The code is available at https://see.xidian.edu.cn/faculty/wsdong/Projects/MoG-DCN.htm.
75
Buchlak QD, Esmaili N, Leveque JC, Bennett C, Farrokhi F, Piccardi M. Machine learning applications to neuroimaging for glioma detection and classification: An artificial intelligence augmented systematic review. J Clin Neurosci 2021; 89:177-198. PMID: 34119265. DOI: 10.1016/j.jocn.2021.04.043.
Abstract
Glioma is the most common primary intraparenchymal tumor of the brain, and the 5-year survival rate of high-grade glioma is poor. Magnetic resonance imaging (MRI) is essential for detecting, characterizing and monitoring brain tumors, but definitive diagnosis still relies on surgical pathology. Machine learning has been applied to the analysis of MRI data in glioma research and has the potential to change clinical practice and improve patient outcomes. This systematic review synthesizes and analyzes the current state of machine learning applications to glioma MRI data and explores the use of machine learning for systematic review automation. Various datapoints were extracted from the 153 studies that met the inclusion criteria and analyzed. Natural language processing (NLP) analysis involved keyword extraction, topic modeling and document classification. Machine learning has been applied to tumor grading and diagnosis, tumor segmentation, non-invasive genomic biomarker identification, detection of progression and patient survival prediction. Model performance was generally strong (AUC = 0.87 ± 0.09; sensitivity = 0.87 ± 0.10; specificity = 0.86 ± 0.10; precision = 0.88 ± 0.11). Convolutional neural network, support vector machine and random forest algorithms were top performers. Deep learning document classifiers yielded acceptable performance (mean 5-fold cross-validation AUC = 0.71). Machine learning tools and data resources were synthesized and summarized to facilitate future research. Machine learning has been widely applied to the processing of MRI data in glioma research and has demonstrated substantial utility. NLP and transfer learning resources enabled the successful development of a replicable method for automating the systematic review article screening process, which has potential for shortening the time from discovery to clinical application in medicine.
Affiliation(s)
- Quinlan D Buchlak, School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
- Nazanin Esmaili, School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia; Faculty of Engineering and IT, University of Technology Sydney, Ultimo, NSW, Australia
- Christine Bennett, School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
- Farrokh Farrokhi, Neuroscience Institute, Virginia Mason Medical Center, Seattle, WA, USA
- Massimo Piccardi, Faculty of Engineering and IT, University of Technology Sydney, Ultimo, NSW, Australia
76
Gavaskar RG, Athalye CD, Chaudhury KN. On Plug-and-Play Regularization Using Linear Denoisers. IEEE Transactions on Image Processing 2021; 30:4802-4813. PMID: 33909564. DOI: 10.1109/tip.2021.3075092.
Abstract
In plug-and-play (PnP) regularization, the knowledge of the forward model is combined with a powerful denoiser to obtain state-of-the-art image reconstructions. This is typically done by taking a proximal algorithm such as FISTA or ADMM, and formally replacing the proximal map associated with a regularizer by nonlocal means, BM3D or a CNN denoiser. Each iterate of the resulting PnP algorithm involves some kind of inversion of the forward model followed by denoiser-induced regularization. A natural question in this regard is that of optimality, namely, do the PnP iterations minimize some f+g , where f is a loss function associated with the forward model and g is a regularizer? This has a straightforward solution if the denoiser can be expressed as a proximal map, as was shown to be the case for a class of linear symmetric denoisers. However, this result excludes kernel denoisers such as nonlocal means that are inherently non-symmetric. In this paper, we prove that a broader class of linear denoisers (including symmetric denoisers and kernel denoisers) can be expressed as a proximal map of some convex regularizer g . An algorithmic implication of this result for non-symmetric denoisers is that it necessitates appropriate modifications in the PnP updates to ensure convergence to a minimum of f+g . Apart from the convergence guarantee, the modified PnP algorithms are shown to produce good restorations.
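In standard notation, the question above is whether the plug-in denoiser $D$ can be written as a proximal map,

```latex
\operatorname{prox}_{g}(y) \;=\; \operatorname*{arg\,min}_{x}\; \tfrac{1}{2}\,\|x-y\|_{2}^{2} + g(x),
\qquad \text{PnP replaces } \operatorname{prox}_{g} \text{ by } D,
```

so that the PnP iterations provably minimize $f + g$ for some convex $g$; the paper answers this affirmatively for a broad class of linear denoisers $D(y) = Wy$, including non-symmetric kernel filters such as nonlocal means.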
77
Adversarial Gaussian Denoiser for Multiple-Level Image Denoising. Sensors (Basel) 2021; 21:2998. PMID: 33923320. PMCID: PMC8123214. DOI: 10.3390/s21092998.
Abstract
Image denoising is a challenging task that is essential in numerous computer vision and image processing problems. This study proposes and applies a generative adversarial network-based image denoising training architecture to multiple-level Gaussian image denoising tasks. Convolutional neural network-based denoising approaches suffer from a blurriness issue that leaves denoised images blurry in their texture details. To resolve the blurriness issue, we first performed a theoretical study of the cause of the problem. Subsequently, we proposed an adversarial Gaussian denoiser network, which uses the generative adversarial network-based adversarial learning process for image denoising tasks. This framework resolves the blurriness problem by encouraging the denoiser network to find the distribution of sharp noise-free images instead of blurry images. Experimental results demonstrate that the proposed framework can effectively resolve the blurriness problem and achieves significantly better denoising performance than state-of-the-art denoising methods.
78
Zhou Y, Yu K, Wang M, Ma Y, Peng Y, Chen Z, Zhu W, Shi F, Chen X. Speckle Noise Reduction for OCT Images based on Image Style Transfer and Conditional GAN. IEEE J Biomed Health Inform 2021; 26:139-150. PMID: 33882009. DOI: 10.1109/jbhi.2021.3074852.
Abstract
Raw optical coherence tomography (OCT) images typically are of low quality because speckle noise blurs retinal structures, severely compromising visual quality and degrading performances of subsequent image analysis tasks. In our previous study, we have developed a Conditional Generative Adversarial Network (cGAN) for speckle noise removal in OCT images collected by several commercial OCT scanners, which we collectively refer to as scanner T. In this paper, we improve the cGAN model and apply it to our in-house OCT scanner (scanner B) for speckle noise suppression. The proposed model consists of two steps: 1) We train a Cycle-Consistent GAN (CycleGAN) to learn style transfer between two OCT image datasets collected by different scanners. The purpose of the CycleGAN is to leverage the ground truth dataset created in our previous study. 2) We train a mini-cGAN model based on the PatchGAN mechanism with the ground truth dataset to suppress speckle noise in OCT images. After training, we first apply the CycleGAN model to convert raw images collected by scanner B to match the style of the images from scanner T, and subsequently use the mini-cGAN model to suppress speckle noise in the style transferred images. We evaluate the proposed method on a dataset collected by scanner B. Experimental results show that the improved model outperforms our previous method and other state-of-the-art models in speckle noise removal, retinal structure preservation and contrast enhancement.
79
Zhang Z, Tang Z, Wang Y, Zhang Z, Zhan C, Zha Z, Wang M. Dense Residual Network: Enhancing global dense feature flow for character recognition. Neural Netw 2021; 139:77-85. PMID: 33684611. DOI: 10.1016/j.neunet.2021.02.005.
Abstract
Deep Convolutional Neural Networks (CNNs), such as Dense Convolutional Network (DenseNet), have achieved great success for image representation learning by capturing deep hierarchical features. However, most existing network architectures of simply stacking the convolutional layers fail to enable them to fully discover local and global feature information between layers. In this paper, we mainly investigate how to enhance the local and global feature learning abilities of DenseNet by fully exploiting the hierarchical features from all convolutional layers. Technically, we propose an effective convolutional deep model termed Dense Residual Network (DRN) for the task of optical character recognition. To define DRN, we propose a refined residual dense block (r-RDB) to retain the ability of local feature fusion and local residual learning of original RDB, which can reduce the computing efforts of inner layers at the same time. After fully capturing local residual dense features, we utilize the sum operation and several r-RDBs to construct a new block termed global dense block (GDB) by imitating the construction of dense blocks to adaptively learn global dense residual features in a holistic way. Finally, we use two convolutional layers to design a down-sampling block to reduce the global feature size and extract more informative deeper features. Extensive results show that our DRN can deliver enhanced results, compared with other related deep models.
Affiliation(s)
- Zhao Zhang, School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; Key Laboratory of Knowledge Engineering with Big Data (Ministry of Education) & Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei University of Technology, Hefei 230009, China
- Zemin Tang, School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Yang Wang, School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; Key Laboratory of Knowledge Engineering with Big Data (Ministry of Education) & Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei University of Technology, Hefei 230009, China
- Zheng Zhang, Bio-Computing Research Center, Harbin Institute of Technology (Shenzhen), Shenzhen, China
- Choujun Zhan, School of Computer, South China Normal University, Guangzhou 510631, China
- Zhengjun Zha, Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China
- Meng Wang, School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; Key Laboratory of Knowledge Engineering with Big Data (Ministry of Education) & Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei University of Technology, Hefei 230009, China
80
An Efficient and Accurate Depth-Wise Separable Convolutional Neural Network for Cybersecurity Vulnerability Assessment Based on CAPTCHA Breaking. Electronics 2021. DOI: 10.3390/electronics10040480.
Abstract
Cybersecurity practitioners generate Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) as a security mechanism in website applications, in order to differentiate between human end-users and machine bots. They tend to use standard security to implement CAPTCHAs in order to prevent hackers from writing malicious automated programs that make false website registrations, and to restrict them from stealing end-users' private information. Among the categories of CAPTCHAs, the text-based CAPTCHA is the most widely used. However, the evolution of deep learning has been so dramatic that tasks previously thought not easily addressable by computers, and therefore used as CAPTCHAs to prevent spam, can now be broken. The workflow of CAPTCHA breaking combines several approaches with the development of computation-efficient Convolutional Neural Network (CNN) models that attempt to increase accuracy. In contrast to breaking whole CAPTCHA images simultaneously, this study split four-character CAPTCHA images into individual characters with a 2-pixel margin around the edges to build a new training dataset, and then proposed an efficient and accurate depth-wise separable convolutional neural network for breaking text-based CAPTCHAs. Most importantly, to the best of our knowledge, this is the first CAPTCHA breaking study to use the depth-wise separable convolution layer to build an efficient CNN model to break text-based CAPTCHAs. We have evaluated and compared the performance of our proposed model to that of fine-tuning other popular CNN image recognition architectures on the generated CAPTCHA image dataset. In real time, our proposed model used less time to break the text-based CAPTCHAs, with an accuracy of more than 99% on the testing dataset. We observed that our proposed CNN model efficiently improved the CAPTCHA breaking accuracy and streamlined the structure of the CAPTCHA breaking network compared to other CAPTCHA breaking techniques.
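For reference, the building block named in the title factors a standard convolution into a per-channel (depthwise) convolution followed by a 1x1 pointwise convolution; a minimal PyTorch version is shown below (the paper's full CAPTCHA classifier is not reproduced).

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 convolution (groups = in_ch) followed by a 1x1 pointwise convolution."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```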
81
Multi-system fusion based on deep neural network and cloud edge computing and its application in intelligent manufacturing. Neural Comput Appl 2021. DOI: 10.1007/s00521-021-05735-y.
82
Wang Y, Wei Y, Qian X, Zhu L, Yang Y. Sketch-Guided Scenery Image Outpainting. IEEE Transactions on Image Processing 2021; 30:2643-2655. PMID: 33523812. DOI: 10.1109/tip.2021.3054477.
Abstract
The outpainting results produced by existing approaches are often too random to meet users' requirements. In this work, we take image outpainting one step further by allowing users to harvest personalized custom outpainting results using sketches as guidance. To this end, we propose an encoder-decoder based network to conduct sketch-guided outpainting, where two alignment modules are adopted to constrain the generated content to be realistic and consistent with the provided sketches. First, we apply a holistic alignment module to make the synthesized part similar to the real one from a global view. Second, we reversely produce sketches from the synthesized part and encourage them to be consistent with the ground-truth ones using a sketch alignment module. In this way, the learned generator is encouraged to pay more attention to fine details and to be sensitive to the guiding sketches. To our knowledge, this work is the first attempt to explore the challenging yet meaningful task of conditional scenery image outpainting. We conduct extensive experiments on two collected benchmarks to qualitatively and quantitatively validate the effectiveness of our approach compared with other state-of-the-art generative models.
83
Wang C, Ren C, He X, Qing L. Deep recursive network for image denoising with global non-linear smoothness constraint prior. Neurocomputing 2021. DOI: 10.1016/j.neucom.2020.09.070.
84
Zhang H, Liu B, Yu H, Dong B. MetaInv-Net: Meta Inversion Network for Sparse View CT Image Reconstruction. IEEE Transactions on Medical Imaging 2021; 40:621-634. PMID: 33104506. DOI: 10.1109/tmi.2020.3033541.
Abstract
X-ray Computed Tomography (CT) is widely used in clinical applications such as diagnosis and image-guided interventions. In this paper, we propose a new deep learning based model for CT image reconstruction whose backbone network architecture is built by unrolling an iterative algorithm. However, unlike the existing strategy of including as many data-adaptive components in the unrolled dynamics model as possible, we find that it is enough to learn only the parts where traditional designs mostly rely on intuition and experience. More specifically, we propose to learn an initializer for the conjugate gradient (CG) algorithm that is involved in one of the subproblems of the backbone model. Other components, such as image priors and hyperparameters, are kept as in the original design. Since a hypernetwork is introduced to infer the initialization of the CG module, the proposed model can be regarded as a meta-learning model; we therefore call it the meta-inversion network (MetaInv-Net). The proposed MetaInv-Net can be designed with far fewer trainable parameters while still preserving superior image reconstruction performance compared with some state-of-the-art deep models in CT imaging. In simulated and real data experiments, MetaInv-Net performs very well and generalizes beyond the training setting, i.e., to other scanning settings, noise levels, and data sets.
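The core idea of learning only the CG initialization can be illustrated with a small sketch. The following PyTorch code keeps a standard conjugate-gradient solver fixed and lets a tiny network propose the starting point; the toy linear system and the initializer network are assumptions for illustration, not the paper's CT pipeline.

```python
# Sketch (PyTorch) of the idea above: a standard conjugate-gradient (CG) solver is
# kept fixed and only its initial guess is produced by a small learnable network.
# The toy SPD system and the tiny initializer are illustrative assumptions.
import torch
import torch.nn as nn

def conjugate_gradient(A, b, x0, n_iters=10):
    """Standard CG for the SPD system A x = b, started from a (learned) x0."""
    x = x0
    r = b - A @ x
    p = r.clone()
    rs_old = torch.dot(r, r)
    for _ in range(n_iters):
        Ap = A @ p
        alpha = rs_old / torch.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = torch.dot(r, r)
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

class CGInitializer(nn.Module):
    """Tiny network mapping the right-hand side b to an initial guess x0."""
    def __init__(self, n):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n, n), nn.ReLU(), nn.Linear(n, n))

    def forward(self, b):
        return self.net(b)

n = 32
M = torch.randn(n, n)
A = M @ M.T + n * torch.eye(n)        # symmetric positive definite toy operator
x_true = torch.randn(n)
b = A @ x_true

init = CGInitializer(n)
x0 = init(b)                           # learned initialization (trainable end-to-end)
x_hat = conjugate_gradient(A, b, x0)   # CG itself has no trainable parameters
print(torch.norm(x_hat - x_true))
```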
Collapse
|
85
|
Nair P, Gavaskar RG, Chaudhury KN. Fixed-Point and Objective Convergence of Plug-and-Play Algorithms. IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING 2021; 7:337-348. [DOI: 10.1109/tci.2021.3066053] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/19/2023]
|
86
|
Zhang Z, Liu Y, Liu J, Wen F, Zhu C. AMP-Net: Denoising-Based Deep Unfolding for Compressive Image Sensing. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 30:1487-1500. [PMID: 33338019 DOI: 10.1109/tip.2020.3044472] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Most compressive sensing (CS) reconstruction methods can be divided into two categories, i.e., model-based methods and classical deep network methods. By unfolding the iterative optimization algorithms of model-based methods onto networks, deep unfolding methods combine the interpretability of model-based methods with the high speed of classical deep network methods. In this article, to solve the visual image CS problem, we propose a deep unfolding model dubbed AMP-Net. Rather than learning regularization terms, it is established by unfolding the iterative denoising process of the well-known approximate message passing algorithm. Furthermore, AMP-Net integrates deblocking modules to eliminate the blocking artifacts that usually appear in CS of visual images. In addition, the sampling matrix is trained jointly with the other network parameters to enhance reconstruction performance. Experimental results show that the proposed AMP-Net achieves better reconstruction accuracy than other state-of-the-art methods, with high reconstruction speed and a small number of network parameters.
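As a rough illustration of denoising-based deep unfolding, the sketch below alternates a data-consistency gradient step with a small learned denoiser in each unrolled layer. The Onsager correction, deblocking modules and jointly trained sampling matrix of the actual model are omitted; all sizes are illustrative.

```python
# Simplified sketch (PyTorch) of denoising-based deep unfolding: each "layer" is a
# data-consistency gradient step followed by a small learned denoiser. The Onsager
# correction, deblocking modules and joint sampling-matrix training described in
# the paper are omitted here; sizes are illustrative.
import torch
import torch.nn as nn

class DenoiserBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, x):
        return x - self.net(x)           # residual denoising

class UnfoldedCS(nn.Module):
    def __init__(self, A, n_layers=5, img_size=32):
        super().__init__()
        self.A = A                        # (m, n) sampling matrix, n = img_size**2
        self.img_size = img_size
        self.steps = nn.Parameter(torch.full((n_layers,), 0.5))
        self.denoisers = nn.ModuleList(DenoiserBlock() for _ in range(n_layers))

    def forward(self, y):
        B = y.shape[0]
        x = y @ self.A                    # initial estimate A^T y, shape (B, n)
        for step, D in zip(self.steps, self.denoisers):
            grad = (x @ self.A.T - y) @ self.A           # A^T (A x - y)
            x = x - step * grad
            img = x.view(B, 1, self.img_size, self.img_size)
            x = D(img).view(B, -1)
        return x.view(B, 1, self.img_size, self.img_size)

n, m = 32 * 32, 256
A = torch.randn(m, n) / m ** 0.5
model = UnfoldedCS(A)
y = torch.randn(8, m)                     # simulated measurements
print(model(y).shape)                     # -> torch.Size([8, 1, 32, 32])
```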
Collapse
|
87
|
Unni VS, Nair P, Chaudhury KN. Plug-And-Play Registration And Fusion. 2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) 2020. [DOI: 10.1109/icip40778.2020.9190847] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/19/2023]
|
88
|
Speckle Noise Reduction in Sonar Image Based on Adaptive Redundant Dictionary. JOURNAL OF MARINE SCIENCE AND ENGINEERING 2020. [DOI: 10.3390/jmse8100761] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
As acoustic waves are affected by channel characteristics such as scattering and reverberation when propagating in water, sonar images often exhibit speckle noise, which degrades their visual quality. Denoising is therefore a crucial preprocessing step in sonar image applications. Speckle noise is mainly caused by sediment echo signals, which are related to the seafloor sediment background and can be characterized by prior modeling. Although deep learning-based denoising algorithms are currently a research hotspot, they are not well suited to this application: they demand heavy computation and large amounts of original training images, while the sonar is carried by Autonomous Underwater Vehicles (AUVs) that must both collect the images and perform the computation. In contrast, dictionary learning-based denoising is more suitable and easier to model. Compared with deep learning, it greatly reduces the computational load, is more easily integrated into AUV systems, and, through image sparse representation, can achieve comparable denoising performance. To address these problems, we propose a new adaptive dictionary learning method based on multi-resolution characteristics, which combines K-SVD dictionary learning with the wavelet transform. Our method retains the characteristics of dictionary learning while inheriting the features of wavelet analysis. Compared with several classical methods, the proposed method is better at speckle noise reduction and edge detail preservation, while the computation time is greatly reduced and the efficiency significantly improved.
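For readers unfamiliar with patch-based dictionary denoising, the following scikit-learn sketch shows the generic pipeline (patch extraction, dictionary learning, sparse coding, reconstruction) that this family of methods builds on. It uses a generic mini-batch dictionary learner on a synthetic speckled image; the K-SVD/wavelet combination proposed above is not reproduced.

```python
# Generic sketch (scikit-learn) of patch-based dictionary-learning denoising, the
# family of methods the adaptive K-SVD/wavelet approach belongs to. A mini-batch
# dictionary learner is used on a synthetic speckled image; the paper's K-SVD and
# wavelet-domain processing are not reproduced here.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d

rng = np.random.default_rng(0)
clean = np.outer(np.sin(np.linspace(0, 3, 64)), np.cos(np.linspace(0, 3, 64)))
noisy = clean * (1.0 + 0.3 * rng.normal(size=clean.shape))   # multiplicative (speckle-like) noise

patches = extract_patches_2d(noisy, (8, 8))                  # all overlapping 8x8 patches
data = patches.reshape(len(patches), -1)
mean = data.mean(axis=1, keepdims=True)
data = data - mean                                           # remove patch means before coding

dico = MiniBatchDictionaryLearning(n_components=64, alpha=1.0, random_state=0)
dico.fit(data)                                               # learn the dictionary atoms
code = dico.transform(data)                                  # sparse coding (OMP by default)
recon = (code @ dico.components_) + mean
denoised = reconstruct_from_patches_2d(recon.reshape(patches.shape), noisy.shape)
print(float(np.mean((denoised - clean) ** 2)))               # toy reconstruction error
```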
Collapse
|
89
|
Zhou H, Feng H, Hu Z, Xu Z, Li Q, Chen Y. Lensless cameras using a mask based on almost perfect sequence through deep learning. OPTICS EXPRESS 2020; 28:30248-30262. [PMID: 33114908 DOI: 10.1364/oe.400486] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/18/2020] [Accepted: 08/16/2020] [Indexed: 06/11/2023]
Abstract
Mask-based lensless imaging cameras have many applications due to their smaller volume and lower cost. However, due to the ill-posed nature of the inverse problem, the reconstructed images have low resolution and poor quality. In this article, we use a mask based on an almost perfect sequence, which has an excellent autocorrelation property, for lensless imaging, and we propose a Learned Analytic Solution Net for image reconstruction within the framework of unrolled optimization. Our network combines a physical imaging model with deep learning to achieve high-quality image reconstruction. The experimental results indicate that our reconstructed images at a resolution of 512 × 512 perform well in both visual quality and objective evaluations.
Collapse
|
90
|
Abstract
Websites can increase their security and prevent harmful Internet attacks by providing CAPTCHA verification to determine whether the end-user is a human or a robot. Text-based CAPTCHA is the most common type; it is designed to be easily recognized by humans yet difficult to identify by machines or robots. However, with the dramatic advancements in deep learning, it has become much easier to build convolutional neural network (CNN) models that can efficiently recognize text-based CAPTCHAs. In this study, we introduce an efficient CNN model that uses attached binary images to recognize CAPTCHAs. By making a number of copies of the input CAPTCHA image equal to the number of characters in that image and attaching a distinct binary image to each copy, we build a new CNN model that recognizes CAPTCHAs effectively. The model has a simple structure and small storage size and does not require segmenting CAPTCHAs into individual characters. After training and testing the proposed CAPTCHA recognition CNN model, the experimental results reveal the strength of the model in CAPTCHA character recognition.
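The input-construction idea (one copy of the CAPTCHA per character, each with a distinct binary position image attached) can be sketched in a few lines of NumPy; the image size and the concatenation layout below are illustrative assumptions, not the paper's exact encoding.

```python
# Sketch (NumPy) of the attached-binary-image input construction: the CAPTCHA is
# copied once per character and each copy gets a distinct binary "position" image
# attached, so one classifier can be asked for one character at a time without
# segmentation. Sizes and the side-by-side concatenation are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
captcha = rng.random((40, 100))              # stand-in for a 4-character CAPTCHA image
num_chars = 4

inputs = []
for pos in range(num_chars):
    indicator = np.zeros((40, num_chars))    # binary image encoding the target position
    indicator[:, pos] = 1.0
    inputs.append(np.concatenate([captcha, indicator], axis=1))

batch = np.stack(inputs)                      # one sample per character position
print(batch.shape)                            # -> (4, 40, 104)
```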
Collapse
|
91
|
Zha Z, Yuan X, Wen B, Zhou J, Zhu C. Group Sparsity Residual Constraint with Non-Local Priors for Image Restoration. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; PP:8960-8975. [PMID: 32903181 DOI: 10.1109/tip.2020.3021291] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Group sparse representation (GSR) has made great strides in image restoration, producing superior performance through a powerful mechanism that integrates the local sparsity and nonlocal self-similarity of images. However, due to degradation (e.g., noise, down-sampling, or missing pixels), traditional GSR models may fail to faithfully estimate the sparsity of each group in an image, resulting in a distorted reconstruction of the original image. This motivates us to design a simple yet effective model that addresses this problem. Specifically, we propose a group sparsity residual constraint with nonlocal priors (GSRC-NLP) for image restoration. By introducing the group sparsity residual constraint, the image restoration problem is reformulated as reducing the group sparsity residual. To this end, we first obtain a good estimate of the group sparse coefficients of each original image group by exploiting the image nonlocal self-similarity (NSS) prior together with a self-supervised learning scheme, and then enforce the group sparse coefficients of the corresponding degraded image group to approximate this estimate. To make the proposed scheme tractable and robust, two algorithms, i.e., iterative shrinkage/thresholding (IST) and the alternating direction method of multipliers (ADMM), are employed to solve the resulting optimization problems for different image restoration tasks. Experimental results on image denoising, image inpainting, and image compressive sensing (CS) recovery demonstrate that the proposed GSRC-NLP based image restoration algorithm is comparable to state-of-the-art denoising methods and outperforms several state-of-the-art image inpainting and image CS recovery methods in terms of both objective and perceptual quality metrics.
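The notion of shrinking a sparsity residual, rather than the coefficients themselves, can be shown with a tiny NumPy example: given a reference estimate of a group's sparse coefficients, the observed coefficients are pulled towards it by soft-thresholding their difference. The dictionary, patch grouping and the full IST/ADMM solvers of the paper are not reproduced here.

```python
# Illustrative sketch (NumPy) of the group sparsity residual idea: soft-threshold
# the residual between the degraded group's coefficients and a reference estimate
# (obtained in the paper from non-local similar patches). This is the closed-form
# solution of min_a 0.5*||a - a_obs||^2 + tau*||a - a_ref||_1 ; all values are toy data.
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(1)
alpha_ref = np.where(rng.random(20) < 0.3, rng.normal(size=20), 0.0)  # reference estimate
alpha_obs = alpha_ref + 0.2 * rng.normal(size=20)                     # degraded group's coefficients

residual = alpha_obs - alpha_ref
alpha_hat = alpha_ref + soft_threshold(residual, 0.15)   # shrink the residual, not the coefficients
print(np.round(alpha_hat - alpha_ref, 3))                 # most residual entries are zeroed out
```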
Collapse
|
92
|
Li J, Wang Y, Xie H, Ma KK. Learning a Single Model with a Wide Range of Quality Factors for JPEG Image Artifacts Removal. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; PP:8842-8854. [PMID: 32886610 DOI: 10.1109/tip.2020.3020389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Lossy compression introduces artifacts into the compressed image and degrades visual quality. In recent years, many compression artifact removal methods based on convolutional neural networks (CNN) have been developed with great success. However, these methods usually train a model for one specific quality factor or a small range of quality factors; if the quality factor of a test image falls outside the assumed range, performance degrades. With this motivation, and with practical usage in mind, a highly robust compression artifact removal network is proposed in this paper. The proposed network is a single-model approach that can be trained to handle a wide range of quality factors while consistently delivering superior or comparable artifact removal performance. To demonstrate this, we focus on JPEG compression with quality factors ranging from 1 to 60. A key to the success of our network lies in the novel use of the quantization tables as part of the training data. Furthermore, the network has two parallel branches, a restoration branch and a global branch. The former effectively removes local artifacts such as ringing, while the latter extracts global features of the entire image that provide substantial quality improvement, especially for global artifacts such as blocking and color shifting. Extensive experiments on color and grayscale images clearly demonstrate the effectiveness and efficiency of our single-model approach for removing compression artifacts from decoded images.
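A minimal sketch of feeding the quantization table to the restoration network is given below, using Pillow to read the table from an in-memory JPEG and tiling it as an extra input channel. The tiny network and the tiling scheme are illustrative assumptions rather than the paper's architecture.

```python
# Sketch (Pillow + PyTorch) of the key input-construction idea: give the restoration
# network the JPEG quantization table alongside the decoded image, so a single model
# can cover a wide range of quality factors. The tiny network and the tiled-channel
# encoding of the table are illustrative assumptions, not the paper's design.
import io
import numpy as np
import torch
import torch.nn as nn
from PIL import Image

# Make an in-memory JPEG at a low quality factor and read back its quantization table.
rng = np.random.default_rng(0)
img = Image.fromarray(rng.integers(0, 256, (64, 64), dtype=np.uint8), mode="L")
buf = io.BytesIO()
img.save(buf, format="JPEG", quality=20)
buf.seek(0)
decoded = Image.open(buf)
qtable = np.array(decoded.quantization[0], dtype=np.float32).reshape(8, 8)

# Tile the 8x8 table over the image plane and stack it as a second input channel.
x = torch.from_numpy(np.asarray(decoded, dtype=np.float32) / 255.0)[None, None]
q = torch.from_numpy(np.tile(qtable / 255.0, (8, 8)))[None, None]
inp = torch.cat([x, q], dim=1)           # shape (1, 2, 64, 64)

restorer = nn.Sequential(
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(32, 1, 3, padding=1))
print(restorer(inp).shape)               # -> torch.Size([1, 1, 64, 64])
```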
Collapse
|
93
|
Zhou Q, Ding M, Zhang X. Image Deblurring Using Multi-Stream Bottom-Top-Bottom Attention Network and Global Information-Based Fusion and Reconstruction Network. SENSORS 2020; 20:s20133724. [PMID: 32635206 PMCID: PMC7374418 DOI: 10.3390/s20133724] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/29/2020] [Revised: 06/25/2020] [Accepted: 06/30/2020] [Indexed: 11/25/2022]
Abstract
Image deblurring has long been a challenging ill-posed problem in computer vision, and Gaussian blur is a common model for image and signal degradation. Deep learning-based deblurring methods have attracted much attention due to their advantages over traditional methods relying on hand-designed features, yet existing deep learning-based techniques still struggle to restore fine details and reconstruct sharp edges. To address this issue, we have designed an effective end-to-end deep learning-based non-blind image deblurring algorithm. In the proposed method, a multi-stream bottom-top-bottom attention network (MBANet) with an encoder-to-decoder structure is designed to integrate low-level cues and high-level semantic information, which facilitates more effective feature extraction and improves the computational efficiency of the network. Moreover, the MBANet adopts a coarse-to-fine multi-scale strategy on the input images to improve deblurring performance. Furthermore, a global information-based fusion and reconstruction network is proposed to fuse the multi-scale output maps, enrich global spatial information, and recurrently refine the output deblurred image. Experiments were conducted on the public GoPro dataset and the realistic and dynamic scenes (REDS) dataset to evaluate the effectiveness and robustness of the proposed method. The results show that it generally outperforms traditional deblurring methods and deep learning-based state-of-the-art methods such as the scale-recurrent network (SRN) and the denoising prior driven deep neural network (DPDNN), in terms of quantitative indices such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) as well as human visual assessment.
Collapse
|
94
|
Abstract
In recent years, convolutional neural networks (CNN) have been widely used in image denoising for their high performance. One difficulty in applying CNNs to medical image denoising, such as speckle reduction in optical coherence tomography (OCT) images, is that a large amount of high-quality data is required for training, which is an inherent limitation for OCT despeckling. Recently, deep image prior (DIP) networks have been proposed for image restoration without pre-training, since CNN structures have an intrinsic ability to capture the low-level statistics of a single image. However, DIP has difficulty finding a good balance between maintaining details and suppressing speckle noise. Inspired by DIP, in this paper a sorted non-local statistics term, which measures the signal autocorrelation in the difference between the reconstructed image and the input image, is proposed for OCT image restoration. By adding the sorted non-local statistics as a regularization loss in DIP learning, more low-level image statistics are captured by the CNN during OCT image restoration. The experimental results demonstrate the superior performance of the proposed method over other state-of-the-art despeckling methods, in terms of objective metrics and visual quality.
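A bare-bones deep-image-prior loop with an extra residual regularizer looks roughly as follows. The autocorrelation penalty here is only a placeholder standing in for the sorted non-local statistics loss described above; the generator, image size and weights are illustrative assumptions.

```python
# Minimal deep-image-prior style loop (PyTorch): a CNN is fitted to a single noisy
# image from a fixed random input, with an extra regularization term on the residual
# (reconstruction minus input). The one-pixel-shift autocorrelation penalty below is
# only a placeholder for the paper's sorted non-local statistics loss.
import torch
import torch.nn as nn

def residual_regularizer(residual):
    # Placeholder: penalize autocorrelation of the residual at a one-pixel shift,
    # pushing the residual towards uncorrelated (noise-like) behaviour.
    r = residual - residual.mean()
    acf_x = (r[..., :, 1:] * r[..., :, :-1]).mean()
    acf_y = (r[..., 1:, :] * r[..., :-1, :]).mean()
    return acf_x.abs() + acf_y.abs()

generator = nn.Sequential(
    nn.Conv2d(8, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1))

noisy = torch.rand(1, 1, 64, 64)           # stand-in for a speckled OCT image
z = torch.randn(1, 8, 64, 64)              # fixed random code, never updated
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
lam = 0.1

for _ in range(200):                        # early stopping is the usual DIP safeguard
    opt.zero_grad()
    recon = generator(z)
    residual = recon - noisy
    loss = (residual ** 2).mean() + lam * residual_regularizer(residual)
    loss.backward()
    opt.step()
```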
Collapse
|
95
|
Jiao J, Tu WC, Liu D, He S, Lau RWH, Huang TS. FormNet: Formatted Learning for Image Restoration. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:6302-6314. [PMID: 32365031 DOI: 10.1109/tip.2020.2990603] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In this paper, we propose a deep CNN to tackle the image restoration problem by learning formatted information. Previous deep learning based methods directly learn the mapping from corrupted images to clean images and may suffer from the gradient exploding/vanishing problems of deep neural networks. We propose to address image restoration by learning the structured details and recovering the latent clean image together, from the information shared between the corrupted image and the latent image. In addition, instead of learning the pure difference (corruption), we propose to add a residual formatting layer and an adversarial block to format the information into a structured form, which allows the network to converge faster and boosts performance. Furthermore, we propose a cross-level loss net to ensure both pixel-level accuracy and semantic-level visual quality. Evaluations on public datasets show that the proposed method performs favorably against existing approaches, both quantitatively and qualitatively.
Collapse
|
96
|
Gavaskar RG, Chaudhury KN. Plug-and-Play ISTA Converges With Kernel Denoisers. IEEE SIGNAL PROCESSING LETTERS 2020; 27:610-614. [DOI: 10.1109/lsp.2020.2986643] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/19/2023]
|
97
|
Cao J, Liu S, Liu H, Lu H. CS-MRI reconstruction based on analysis dictionary learning and manifold structure regularization. Neural Netw 2019; 123:217-233. [PMID: 31884182 DOI: 10.1016/j.neunet.2019.12.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2019] [Revised: 11/28/2019] [Accepted: 12/10/2019] [Indexed: 11/28/2022]
Abstract
Compressed sensing (CS) significantly accelerates magnetic resonance imaging (MRI) by allowing exact reconstruction of images from highly undersampled k-space data. In this process, the high sparsity obtained with a learned dictionary and the exploitation of correlation among patches are essential to the reconstructed image quality. In this paper, making use of these two aspects, we propose a novel CS-MRI model based on analysis dictionary learning and manifold structure regularization (ADMS). Furthermore, a proper tight frame constraint is used to obtain an effective overcomplete analysis dictionary with high sparsifying capacity. The constructed manifold structure regularization nonuniformly enforces the correlation of each group formed by similar patches, which is more consistent with the diverse nonlocal similarity in realistic images. The proposed model is efficiently solved by the alternating direction method of multipliers (ADMM), with a fast algorithm developed separately for each sub-problem. The experimental results show that the main components of the proposed method contribute to the final reconstruction performance and demonstrate the effectiveness of the proposed model.
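For orientation, the backbone optimization that analysis-dictionary CS models of this kind rely on is an ADMM applied to an analysis-sparsity objective. The NumPy sketch below solves min_x 0.5*||Ax - y||^2 + lam*||Dx||_1 with random stand-ins for the operators; the learned tight-frame dictionary, manifold regularization and Fourier sampling of the actual model are not reproduced.

```python
# Generic ADMM sketch (NumPy) for an analysis-sparsity problem of the form
#   min_x 0.5*||A x - y||^2 + lam*||D x||_1
# with the splitting z = D x. A, D and the data are small random stand-ins.
import numpy as np

def soft(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(0)
n, m, p = 40, 20, 60
A = rng.normal(size=(m, n)) / np.sqrt(m)    # measurement operator (stand-in)
D = rng.normal(size=(p, n)) / np.sqrt(p)    # analysis dictionary (stand-in)
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = 1.0
y = A @ x_true

lam, rho = 0.01, 1.0
x, z, u = np.zeros(n), np.zeros(p), np.zeros(p)
lhs = A.T @ A + rho * D.T @ D               # fixed normal-equation matrix
for _ in range(200):
    x = np.linalg.solve(lhs, A.T @ y + rho * D.T @ (z - u))   # x-update
    z = soft(D @ x + u, lam / rho)                            # z-update (shrinkage)
    u = u + D @ x - z                                         # multiplier update

print(round(float(np.linalg.norm(A @ x - y)), 4))              # data-fidelity residual
```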
Collapse
Affiliation(s)
- Jianxin Cao, School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
- Shujun Liu, School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
- Hongqing Liu, Chongqing Key Lab of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Hongwei Lu, Department of Orthopaedics, Southwest Hospital, Army Medical University, Chongqing, 400038, China
Collapse
|
98
|
Gavaskar RG, Chaudhury KN. On the Proof of Fixed-Point Convergence for Plug-and-Play ADMM. IEEE SIGNAL PROCESSING LETTERS 2019; 26:1817-1821. [DOI: 10.1109/lsp.2019.2950611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/19/2023]
|
99
|
Brifman A, Romano Y, Elad M. Unified Single-Image and Video Super-Resolution via Denoising Algorithms. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:6063-6076. [PMID: 31251189 DOI: 10.1109/tip.2019.2924173] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Single image super-resolution (SISR) aims to recover a high-resolution image from a given low-resolution version of it. Video super-resolution (VSR) targets a series of given images, aiming to fuse them to create a higher-resolution outcome. Although SISR and VSR seem to have a lot in common, most SISR algorithms do not have a simple and direct extension to VSR. VSR is considered a more challenging inverse problem, mainly due to its reliance on sub-pixel-accurate motion estimation, which has no parallel in SISR. Another complication is the dynamics of the video, often addressed by simply generating a single frame instead of a complete output sequence. In this paper, we suggest a simple and robust super-resolution framework that can be applied to single images and easily extended to video. Our work relies on the observation that denoising of images and videos is well managed and very effectively treated by a variety of methods. We exploit the plug-and-play-prior framework and the regularization-by-denoising (RED) approach that extends it, and show how to use such denoisers to handle the SISR and VSR problems with a unified formulation and framework. In this way, we benefit from the effectiveness and efficiency of existing image/video denoising algorithms while solving much more challenging problems. More specifically, harnessing the VBM3D video denoiser, we obtain a strongly competitive motion-estimation-free VSR algorithm that tends to produce high-quality output with fast processing.
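The regularization-by-denoising iteration at the heart of this framework can be sketched in a few lines: each step combines a data-fidelity gradient with the RED prior gradient x - D(x). In the sketch below, a Gaussian filter stands in for the strong image/video denoiser (e.g., VBM3D), and the toy inpainting operator, step size and weights are illustrative assumptions.

```python
# Sketch (NumPy/SciPy) of a regularization-by-denoising (RED) steepest-descent loop
# on a toy inpainting problem. The Gaussian filter is a placeholder denoiser D(x);
# the degradation operator, step size and prior weight are illustrative.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
x_true = gaussian_filter(rng.random((64, 64)), 3)    # smooth toy ground truth
mask = rng.random((64, 64)) > 0.5                    # keep roughly half the pixels
y = mask * x_true                                    # observed, masked image

def denoise(x):
    return gaussian_filter(x, sigma=1.0)             # placeholder denoiser D(x)

x = y.copy()
mu, lam = 0.5, 0.2
for _ in range(100):
    data_grad = mask * (mask * x - y)                # gradient of 0.5*||M x - y||^2
    red_grad = lam * (x - denoise(x))                # RED prior gradient x - D(x)
    x = x - mu * (data_grad + red_grad)

print(float(np.mean((x - x_true) ** 2)))             # toy reconstruction error
```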
Collapse
|
100
|
Abstract
In this paper, we propose a new dimensionality reduction method named Discriminative Sparsity Graph Embedding (DSGE), which considers local structure information and global distribution information simultaneously. First, we adopt an intra-class compactness constraint to automatically construct the intrinsic adjacency graph, which strengthens the reconstruction relationship between a given sample and non-neighbor samples of the same class. Meanwhile, an inter-class compactness constraint is exploited to construct the penalty adjacency graph, which weakens the reconstruction influence between a given sample and pseudo-neighbor samples of different classes. Then, global distribution constraints are introduced into the projection objective function to seek an optimal subspace that compacts intra-class samples and separates inter-class samples at the same time. Extensive experiments are carried out on the AR, Extended Yale B, LFW, and PubFig databases, four representative face datasets, and the corresponding experimental results illustrate the effectiveness of the proposed method.
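Graph-embedding methods of this kind typically reduce to a generalized eigenvalue problem that compacts an intrinsic (within-class) graph while separating a penalty (between-class) graph. The SciPy sketch below uses simple class-membership graphs as stand-ins; the sparsity-based graph construction and global distribution constraints of DSGE are not reproduced.

```python
# Generic graph-embedding sketch (NumPy/SciPy): find projection directions that
# compact an intrinsic (within-class) graph and separate a penalty (between-class)
# graph via a generalized eigenproblem. The class-membership graphs below are
# illustrative; DSGE's sparsity-based graph construction is not reproduced.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 10)), rng.normal(3, 1, (20, 10))])   # (n_samples, n_features)
y = np.array([0] * 20 + [1] * 20)

same = (y[:, None] == y[None, :]).astype(float)       # intrinsic graph: same class
diff = 1.0 - same                                     # penalty graph: different class
L_w = np.diag(same.sum(1)) - same                     # graph Laplacians
L_b = np.diag(diff.sum(1)) - diff

Sw = X.T @ L_w @ X + 1e-6 * np.eye(X.shape[1])        # within-class spread (regularized)
Sb = X.T @ L_b @ X                                    # between-class spread
vals, vecs = eigh(Sb, Sw)                             # maximize between / within ratio
W = vecs[:, ::-1][:, :2]                              # top-2 projection directions
print((X @ W).shape)                                  # -> (40, 2)
```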
Collapse
|