1. Song Z, Lin Y, Xiong L, Li Z. Super-resolution algorithm for the characterization of sweat glands in fingerprint OCT images. J Opt Soc Am A Opt Image Sci Vis 2023; 40:2068-2077. PMID: 38038073. DOI: 10.1364/josaa.503212.
Abstract
Optical coherence tomography (OCT) is a noninvasive optical imaging technique that can produce three-dimensional images of fingerprints. However, the low quality and poor resolution of the regions of interest (ROIs) in OCT images make it challenging to segment small tissues accurately. To address this issue, a super-resolution (SR) network called ESRNet has been developed to enhance the quality of OCT images and facilitate their use in research. First, the SR images produced by ESRNet are evaluated against those generated by five other SR methods at three upscale factors (2×, 3×, and 4×). Across the three datasets, ESRNet outperforms current advanced networks in SR performance. Furthermore, the SR images significantly improve the segmentation accuracy of sweat glands: the number of sweat glands detected in the top view increased from 102 to 117, further substantiating ESRNet's performance. The spiral structure of the sweat glands is clearly visible, and has been verified by similar counts of left- and right-handed spirals. Finally, a sweat gland recognition method for the SR 3D images is proposed.
2. Jensen PM, Jeppesen N, Dahl AB, Dahl VA. Review of serial and parallel min-cut/max-flow algorithms for computer vision. IEEE Trans Pattern Anal Mach Intell 2023; 45:2310-2329. PMID: 35471866. DOI: 10.1109/tpami.2022.3170096.
Abstract
Minimum cut/maximum flow (min-cut/max-flow) algorithms solve a variety of problems in computer vision, and significant effort has therefore been put into developing fast min-cut/max-flow algorithms. As a result, it is difficult to choose an ideal algorithm for a given problem, and parallel algorithms in particular have not been thoroughly compared. In this paper, we evaluate the state-of-the-art serial and parallel min-cut/max-flow algorithms on the largest set of computer vision problems yet. We focus on generic algorithms, i.e., those for unstructured graphs, but also compare with the specialized GridCut implementation. When applicable, GridCut performs best. Otherwise, the two pseudoflow algorithms, Hochbaum pseudoflow and excesses incremental breadth-first search, achieve the overall best performance. The most memory-efficient implementation tested is the Boykov-Kolmogorov algorithm. Among generic parallel algorithms, we find the bottom-up merging approach by Liu and Sun to be best, but no method is dominant: only the parallel preflow push-relabel algorithm scales efficiently with many processors across problem sizes, and no generic parallel method consistently outperforms serial algorithms. Finally, we provide and evaluate strategies for algorithm selection to obtain good expected performance. We make our dataset and implementations publicly available for further research.
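The augmenting-path family that the benchmarked Boykov-Kolmogorov algorithm belongs to can be illustrated with the classic Edmonds-Karp scheme (shortest augmenting paths found by BFS). This is a minimal sketch for small dict-of-dicts graphs, not one of the optimized implementations compared in the paper:

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly push flow along shortest augmenting paths.

    capacity: dict-of-dicts, capacity[u][v] = edge capacity.
    Returns the max-flow value, which by duality equals the min-cut weight.
    """
    # Build a residual graph with reverse edges initialised to 0.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # BFS for the shortest path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left
        # Find the bottleneck along the path.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        # Push the bottleneck through, updating residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
```

In vision problems the graph nodes are pixels plus two terminals, and the resulting min cut is the segmentation boundary.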
3. Post-trained convolution networks for single image super-resolution. Artif Intell 2023. DOI: 10.1016/j.artint.2023.103882.
4. Lightweight single image super-resolution with selective channel processing network. Sensors (Basel) 2022; 22:5586. PMID: 35898091. PMCID: PMC9332725. DOI: 10.3390/s22155586.
Abstract
With the development of deep learning, considerable progress has been made in image restoration, and many state-of-the-art single image super-resolution (SR) methods have been proposed. However, most of them contain many parameters, which leads to heavy computation in the inference phase. To make current SR networks more lightweight and resource-friendly, we present a convolutional neural network with a selective channel processing strategy (SCPN). Specifically, the selective channel processing module (SCPM) is designed to dynamically learn the significance of each channel in the feature map using a channel selection matrix during training. Correspondingly, in the inference phase, only the essential channels indicated by the channel selection matrices need to be processed further, which significantly reduces the parameter count and computation. Moreover, a differential channel attention (DCA) block is proposed that takes the data distribution of the channels in feature maps into account to restore more high-frequency information. Extensive experiments on natural-image super-resolution benchmarks (Set5, Set14, B100, Urban100, Manga109) and remote-sensing benchmarks (UCTest and RESISCTest) show that our method achieves superior results to other state-of-the-art methods while keeping a slim size of fewer than 1M parameters. Owing to the proposed SCPM and DCA, SCPN achieves a better trade-off between computational cost and performance in both general and remote-sensing SR applications, and the method can be extended to other computer vision tasks.
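The inference-time pruning idea behind the SCPM can be sketched outside any deep-learning framework. The snippet below uses a hypothetical magnitude-based importance score in place of the paper's learned channel selection matrix, keeping only the strongest channels for further processing:

```python
def select_channels(feature_map, keep_ratio=0.5):
    """Toy version of selective channel processing: rank channels by mean
    absolute activation and keep only the strongest fraction.

    feature_map: list of channels, each a flat list of activations.
    Returns (kept_indices, selection_mask).
    """
    importance = [sum(abs(x) for x in ch) / len(ch) for ch in feature_map]
    k = max(1, int(len(feature_map) * keep_ratio))
    # Indices of the k most important channels.
    kept = sorted(range(len(feature_map)), key=lambda i: -importance[i])[:k]
    mask = [1 if i in kept else 0 for i in range(len(feature_map))]
    return sorted(kept), mask
```

In the real network the mask is learned during training rather than computed from activations, but the effect is the same: masked-out channels are skipped at inference, saving parameters and computation.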
5. Datta R, Mandal S, Umer S, AlZubi AA, Alharbi A, Alanazi JM. Single-image reconstruction using novel super-resolution technique for large-scaled images. Soft Comput 2022; 26:8089-8103. PMID: 35582159. PMCID: PMC9099350. DOI: 10.1007/s00500-022-07142-4.
Abstract
This paper proposes a fast single-image reconstruction method based on the super-resolution (SR) technique, organized into three components. In the first component, a low-resolution image is divided into homogeneous and non-homogeneous regions based on an analysis of the texture patterns within each region. In the second component, only the non-homogeneous regions undergo sparse representation for SR reconstruction. In the third component, each reconstructed region is passed through a statistical prediction model to generate a further enhanced version. The remaining homogeneous regions are bicubic-interpolated, and together these outputs form the required high-resolution image. The proposed technique is applied to large-scale electrical, mechanical, and civil architectural design images, which are huge in size and therefore time-consuming to process. The proposed SR technique reconstructs a better SR image from its lower-resolution version with low time complexity. Its performance on these design images is compared with state-of-the-art methods, and the proposed scheme is shown to outperform the competing methods.
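The first component's homogeneous/non-homogeneous split can be approximated with a simple block-variance test. The block size and threshold below are illustrative stand-ins for the paper's texture-pattern analysis, not its actual criterion:

```python
def classify_blocks(image, block=2, var_threshold=0.01):
    """Split an image into blocks and flag each as homogeneous ('H',
    cheap bicubic upscaling suffices) or non-homogeneous ('N', worth the
    more expensive sparse-coding SR path).

    image: 2D list of floats; returns a 2D list of 'H'/'N' labels.
    """
    labels = []
    for r in range(0, len(image), block):
        row_labels = []
        for c in range(0, len(image[0]), block):
            vals = [image[i][j]
                    for i in range(r, r + block)
                    for j in range(c, c + block)]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            row_labels.append('H' if var < var_threshold else 'N')
        labels.append(row_labels)
    return labels
```

Routing only the 'N' blocks through the expensive SR path is what gives the method its low time complexity on large design images.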
Affiliation(s)
- Ramanath Datta: Department of Electronics and Communication Engineering, St.Thomas' College of Engineering and Technology, Kolkata, India
- Sekhar Mandal: Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology, Shibpur, Kolkata, India
- Saiyed Umer: Department of Computer Science and Engineering, Aliah University, Kolkata, India
- Ahmad Ali AlZubi: Computer Science Department, Community College, King Saud University, Riyadh, Saudi Arabia
- Abdullah Alharbi: Computer Science Department, Community College, King Saud University, Riyadh, Saudi Arabia
- Jazem Mutared Alanazi: Computer Science Department, Community College, King Saud University, Riyadh, Saudi Arabia
6. Qiu D, Cheng Y, Wang X. Dual U-Net residual networks for cardiac magnetic resonance images super-resolution. Comput Methods Programs Biomed 2022; 218:106707. PMID: 35255374. DOI: 10.1016/j.cmpb.2022.106707.
Abstract
BACKGROUND AND OBJECTIVE: Heart disease is a serious threat to human health and a leading cause of death, and its incidence continues to rise. Cardiac magnetic resonance (CMR) imaging can provide a full range of structural and functional information about the heart and has become an important tool for the diagnosis and treatment of heart disease. Improving the resolution of CMR images therefore has significant medical value for diagnosis and condition assessment. At present, most single-image super-resolution (SISR) reconstruction methods suffer from serious problems, such as insufficient mining of feature information, difficulty in determining the interdependence of feature-map channels, and reconstruction errors in the high-resolution output. METHODS: To solve these problems, we propose a dual U-Net residual network (DURN) for super-resolution of CMR images. We first propose a U-Net residual network (URN) model with an up-branch and a down-branch: the up-branch is composed of residual blocks and up-blocks that extract and upsample deep features, while the down-branch is composed of residual blocks and down-blocks that extract and downsample deep features. Building on the URN, the DURN combines the deep features extracted at the same positions of the first and second URN through residual connections, making full use of the features extracted by the first URN to extract still deeper features from low-resolution images.
RESULTS: At scale factors of 2, 3, and 4, DURN obtains 37.86 dB, 33.96 dB, and 31.65 dB on the Set5 dataset, which represents (i) a maximum improvement of 4.17 dB, 3.55 dB, and 3.22 dB over the bicubic algorithm, and (ii) a minimum improvement of 0.34 dB, 0.14 dB, and 0.11 dB over the LapSRN algorithm. CONCLUSION: Comprehensive experiments on benchmark datasets demonstrate that the proposed DURN not only achieves better peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) values than other state-of-the-art SR algorithms, but also reconstructs clearer super-resolution CMR images with richer details, edges, and texture.
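The PSNR figures reported above follow directly from the mean squared error between the reference and reconstructed images; a minimal implementation:

```python
import math

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized images
    (given here as flat lists of pixel values). Higher is better;
    identical images give infinite PSNR."""
    mse = sum((a - b) ** 2
              for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)
```

A gain of a few dB, as in the table of results, corresponds to a substantial reduction in mean squared error, since the scale is logarithmic.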
Affiliation(s)
- Defu Qiu: Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221116, China; School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Yuhu Cheng: Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221116, China; School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Xuesong Wang: Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221116, China; School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
7. Shah STH, Xuezhi X, Ahmed W. Optical flow estimation with convolutional neural nets. Pattern Recognit Image Anal 2021. DOI: 10.1134/s1054661821040210.
8. Feng H, Wang L, Cheng S, Du A, Li Y. Dynamic dual attention iterative network for image super-resolution. Appl Intell 2021. DOI: 10.1007/s10489-021-02816-2.
9.
Abstract
This paper proposes a robust multi-frame video super-resolution (SR) scheme to obtain high SR performance under large upscaling factors. Although the reference low-resolution frames can provide complementary information for the high-resolution frame, an effective regularizer is required to rectify unreliable information from the reference frames. As high-frequency information is mostly contained in the image gradient field, we propose to learn the gradient-mapping function between the high-resolution (HR) and low-resolution (LR) images to regularize the fusion of multiple frames. In contrast to existing spatial-domain networks, we train a deep gradient-mapping network to learn the horizontal and vertical gradients. We found that adding the low-frequency information (mainly from the LR image) to the gradient-learning network can boost its performance. A forward-and-backward motion field prior is used to regularize the estimation of the motion flow between frames, and a weighting scheme is proposed to exclude outlier data for robust SR reconstruction. Visual and quantitative evaluations on benchmark datasets demonstrate that our method is superior to many state-of-the-art methods and recovers better details with fewer artifacts.
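The horizontal and vertical gradient fields such a network is trained to predict can be computed with forward differences. This is a minimal stand-in for the gradient-extraction step; the zero-padding convention at the borders is an assumption:

```python
def gradients(image):
    """Forward-difference horizontal (gx) and vertical (gy) gradients of
    a 2D image given as a list of rows. The last column/row gets a zero
    gradient so the outputs match the input size."""
    h, w = len(image), len(image[0])
    gx = [[image[i][j + 1] - image[i][j] if j + 1 < w else 0.0
           for j in range(w)] for i in range(h)]
    gy = [[image[i + 1][j] - image[i][j] if i + 1 < h else 0.0
           for j in range(w)] for i in range(h)]
    return gx, gy
```

Learning in this gradient domain emphasizes edges and fine texture, which is exactly the high-frequency content that multi-frame fusion must get right.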
10. Improvement of signal and noise performance using single image super-resolution based on deep learning in single photon-emission computed tomography imaging system. Nucl Eng Technol 2021. DOI: 10.1016/j.net.2021.01.011.
11. Yuan YH, Li J, Li Y, Qiang J, Li B, Yang W, Peng F. OPLS-SR: a novel face super-resolution learning method using orthonormalized coherent features. Inf Sci 2021. DOI: 10.1016/j.ins.2021.01.082.
12
13. Guo Y, Lu M, Zuo W, Zhang C, Chen Y. Deep likelihood network for image restoration with multiple degradation levels. IEEE Trans Image Process 2021; 30:2669-2681. PMID: 33476270. DOI: 10.1109/tip.2021.3051767.
Abstract
Convolutional neural networks have been proven effective in a variety of image restoration tasks. Most state-of-the-art solutions, however, are trained using images with a single particular degradation level, and their performance deteriorates drastically when applied to other degradation settings. In this paper, we propose deep likelihood network (DL-Net), aiming at generalizing off-the-shelf image restoration networks to succeed over a spectrum of degradation levels. We slightly modify an off-the-shelf network by appending a simple recursive module, which is derived from a fidelity term, for disentangling the computation for multiple degradation levels. Extensive experimental results on image inpainting, interpolation, and super-resolution show the effectiveness of our DL-Net.
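The fidelity term that DL-Net's recursive module is derived from can be illustrated by a plain gradient step on 0.5·||A(x) − y||², where A is the degradation operator. The masking operator in the test below is a hypothetical inpainting example, not the paper's actual module:

```python
def fidelity_step(x, y, degrade, degrade_adjoint, step=0.5):
    """One gradient step on the data-fidelity term 0.5*||A(x) - y||^2:
    x <- x - step * A^T(A(x) - y).

    x, y: flat lists; degrade is the degradation operator A and
    degrade_adjoint its adjoint A^T (for a binary mask, the same map).
    """
    residual = [a - b for a, b in zip(degrade(x), y)]
    correction = degrade_adjoint(residual)
    return [xi - step * ci for xi, ci in zip(x, correction)]
```

Iterating such a step pulls the estimate toward consistency with the observation at whatever degradation level generated y, which is the intuition behind disentangling multiple degradation levels with one recursive module.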
14. Bai K, Liao X, Zhang Q, Jia X, Liu S. Survey of learning based single image super-resolution reconstruction technology. Pattern Recognit Image Anal 2021. DOI: 10.1134/s1054661820040045.
15. Awasthi N, Jain G, Kalva SK, Pramanik M, Yalavarthy PK. Deep neural network-based sinogram super-resolution and bandwidth enhancement for limited-data photoacoustic tomography. IEEE Trans Ultrason Ferroelectr Freq Control 2020; 67:2660-2673. PMID: 32142429. DOI: 10.1109/tuffc.2020.2977210.
Abstract
Photoacoustic tomography (PAT) is a noninvasive imaging modality that combines optical contrast with ultrasonic resolution. Analytical reconstruction algorithms for photoacoustic (PA) signals require a large number of data points for accurate image reconstruction. In practical scenarios, however, data are collected with a limited number of transducers and are often corrupted by noise, resulting in only qualitative images. Furthermore, the collected boundary data are band-limited by the transducer's limited bandwidth (BW), further restricting limited-data PA imaging to qualitative results. In this work, a deep neural network-based model with a scaled root-mean-squared-error loss function is proposed for super-resolution, denoising, and BW enhancement of the PA signals collected at the boundary of the domain. The proposed network is compared with traditional as well as other popular deep-learning methods in numerical and experimental cases, and is shown to improve the collected boundary data and, in turn, provide a superior-quality reconstructed PA image. The improvement in Pearson correlation, structural similarity index metric, and root-mean-square error was as high as 35.62%, 33.81%, and 41.07%, respectively, for phantom cases, and the signal-to-noise ratio improvement in the reconstructed PA images was as high as 11.65 dB for in vivo cases, compared with images reconstructed from the original limited-BW data. Code is available at https://sites.google.com/site/sercmig/home/dnnpat.
16. El-Mawass N, Honeine P, Vercouter L. SimilCatch: enhanced social spammers detection on Twitter using Markov random fields. Inf Process Manag 2020. DOI: 10.1016/j.ipm.2020.102317.
17. Super resolution with kernel estimation and dual attention mechanism. Information 2020. DOI: 10.3390/info11110508.
Abstract
Convolutional neural networks (CNNs) have led to promising performance in super-resolution (SR). Most SR methods are trained and evaluated on datasets with predefined blur kernels (e.g., bicubic). However, the blur kernels of real-world LR images are much more complex, so SR models trained on simulated data become less effective when applied to real scenarios. In this paper, we propose a novel super-resolution framework based on blur kernel estimation and a dual attention mechanism. Our network learns internal relations from the input image itself and can therefore quickly adapt to any input image. We add a blur kernel estimation structure to the network, correcting inaccurate blur kernels to generate high-quality images. Meanwhile, we propose a dual attention mechanism to restore the texture details of the image, adaptively adjusting its features by considering interdependencies in both the channel and spatial dimensions. The combination of blur kernel estimation and the attention mechanism makes our network perform well on complex blurred images in practice. Extensive experiments show that our method (KASR) achieves promising accuracy and visual improvements over most existing methods.
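The degradation model that blur-kernel estimation tries to invert is usually written y = (x ⊛ k)↓s: blur the HR signal with kernel k, then subsample by the scale factor. A 1D sketch of this forward model (a generic formulation, not the KASR network itself):

```python
def degrade(signal, kernel, scale):
    """1D LR formation model behind blind SR: same-size zero-padded
    convolution of the HR signal with `kernel`, then subsampling every
    `scale`-th sample. Blind SR tries to recover `kernel` from the LR
    observation alone."""
    n, m = len(signal), len(kernel)
    half = m // 2
    blurred = []
    for i in range(n):
        acc = 0.0
        for j, kj in enumerate(kernel):
            idx = i + j - half  # centre the kernel on sample i
            if 0 <= idx < n:    # zero padding outside the signal
                acc += kj * signal[idx]
        blurred.append(acc)
    return blurred[::scale]
```

When the assumed kernel (e.g., bicubic) does not match the true one in this model, SR outputs degrade sharply, which is the mismatch the paper's kernel-estimation branch corrects.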
18. Multi-frame labeled faces database: towards face super-resolution from realistic video sequences. Appl Sci (Basel) 2020. DOI: 10.3390/app10207213.
Abstract
Forensically trained facial reviewers are still considered one of the most accurate approaches to person identification from video records. The human brain can utilize information not just from a single image but from a sequence of images (i.e., video), and even with low-quality records or a long distance from the camera it can accurately identify a given person. Unfortunately, in many cases a single still image is needed; an example is a police search that is about to be announced in newspapers. This paper introduces a face database obtained from real environments, comprising 17,426 sequences of images. The dataset includes persons of various races and ages as well as different environments, lighting conditions, and camera device types. This paper also introduces a new multi-frame face super-resolution method and compares it with state-of-the-art single-frame and multi-frame super-resolution methods. We show that the proposed method increases the quality of face images even for low-resolution, low-quality inputs, and provides better results than single-frame approaches that are still considered the best in this area. Image quality was evaluated using several objective mathematical methods as well as subjectively by several volunteers. The source code and dataset have been released, and the experiment is fully reproducible.
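The benefit of multiple frames over a single still can be seen even with the simplest fusion rule, pixel-wise averaging of already-aligned face crops; real multi-frame SR adds registration and learned per-frame weighting on top of this:

```python
def fuse_frames(frames, weights=None):
    """Naive multi-frame fusion: a (weighted) pixel-wise average of
    aligned frames, each a flat list of pixel values. Averaging alone
    already suppresses noise that is independent across frames."""
    if weights is None:
        weights = [1.0] * len(frames)
    total = sum(weights)
    n = len(frames[0])
    return [sum(w * f[p] for f, w in zip(frames, weights)) / total
            for p in range(n)]
```

Giving sharper or better-exposed frames larger weights is one simple way a multi-frame method can beat any single-frame approach on the same footage.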
19. Peng C, Wang N, Li J, Gao X. Universal face photo-sketch style transfer via multiview domain translation. IEEE Trans Image Process 2020; PP:8519-8534. PMID: 32813659. DOI: 10.1109/tip.2020.3016502.
Abstract
Face photo-sketch style transfer aims to convert a representation of a face from the photo (or sketch) domain to the sketch (respectively, photo) domain while preserving the character of the subject. It has wide-ranging applications in law enforcement, forensic investigation, and digital entertainment. However, conventional face photo-sketch synthesis methods usually require training images from both the source domain and the target domain, and are limited in that they cannot be applied to universal conditions where collecting source-domain training images that match the style of the test image is impractical. This problem entails two major challenges: 1) designing an effective and robust domain translation model for the universal situation in which source-domain training images are unavailable, and 2) preserving the facial character while transferring to the style of an entire image collection in the target domain. To this end, we present a novel universal face photo-sketch style transfer method that does not need any image from the source domain for training. The regression relationship between an input test image and the entire training image collection in the target domain is inferred via a deep domain translation framework, in which a domain-wise adaption term and a local consistency adaption term are developed. To improve the robustness of the style transfer process, we propose a multiview domain translation method that flexibly combines a convolutional neural network representation with hand-crafted features in an optimal way. Qualitative and quantitative comparisons under universal unconstrained conditions, where source-domain training images are unavailable, demonstrate the effectiveness and superiority of our method for universal face photo-sketch style transfer.
20. Murray RF. A model of lightness perception guided by probabilistic assumptions about lighting and reflectance. J Vis 2020; 20:28. PMID: 32725175. PMCID: PMC7424934. DOI: 10.1167/jov.20.7.28.
Abstract
Lightness perception is the ability to perceive black, white, and gray surface colors in a wide range of lighting conditions and contexts. This ability is fundamental for any biological or artificial visual system, but it poses a difficult computational problem, and how the human visual system computes lightness is not well understood. Here I show that several key phenomena in lightness perception can be explained by a probabilistic graphical model that makes a few simple assumptions about local patterns of lighting and reflectance, and infers globally optimal interpretations of stimulus images. Like human observers, the model exhibits partial lightness constancy, codetermination, contrast, glow, and articulation effects. It also arrives at human-like interpretations of strong lightness illusions that have challenged previous models. The model's assumptions are reasonable and generic, including, for example, that lighting intensity spans a much wider range than surface reflectance and that shadow boundaries tend to be straighter than reflectance edges. Thus, a probabilistic model based on simple assumptions about lighting and reflectance gives a good computational account of lightness perception over a wide range of conditions. This work also shows how graphical models can be extended to develop more powerful models of constancy that incorporate features such as color and depth.
Affiliation(s)
- Richard F Murray: Department of Psychology and Centre for Vision Research, York University, Toronto, Ontario, Canada
21. Wang T, Zhou C, Yu H, Sun Y, Xie X, Liu C. Analysis and improvement of image segmentation algorithm based on fuzzy edge compensation. J Intell Fuzzy Syst 2020. DOI: 10.3233/jifs-179858.
Affiliation(s)
- Tianqi Wang: School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing, China
- Changjie Zhou: School of Computer Science and Technology, Guizhou University, Guiyang, China
- Hui Yu: Research Center of Maritime Security Technology, China Waterborne Transport Research Institute, Beijing, China
- Yi Sun: School of Software Engineering, Beijing University of Posts and Telecommunications, Beijing, China
- Xuemei Xie: School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing, China
- Chuanchang Liu: Institute of Network Technology, Beijing University of Posts and Telecommunications, Beijing, China
22. Sun D, Yang X, Liu MY, Kautz J. Models matter, so does training: an empirical study of CNNs for optical flow estimation. IEEE Trans Pattern Anal Mach Intell 2020; 42:1408-1423. PMID: 30676944. DOI: 10.1109/tpami.2019.2894353.
Abstract
We investigate two crucial and closely-related aspects of CNNs for optical flow estimation: models and training. First, we design a compact but effective CNN model, called PWC-Net, according to simple and well-established principles: pyramidal processing, warping, and cost volume processing. PWC-Net is 17 times smaller in size, 2 times faster in inference, and 11 percent more accurate on Sintel final than the recent FlowNet2 model. It is the winning entry in the optical flow competition of the robust vision challenge. Next, we experimentally analyze the sources of our performance gains. In particular, we use the same training procedure for PWC-Net to retrain FlowNetC, a sub-network of FlowNet2. The retrained FlowNetC is 56 percent more accurate on Sintel final than the previously trained one and even 5 percent more accurate than the FlowNet2 model. We further improve the training procedure and increase the accuracy of PWC-Net on Sintel by 10 percent and on KITTI 2012 and 2015 by 20 percent. Our newly trained model parameters and training protocols are available on https://github.com/NVlabs/PWC-Net.
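The cost-volume step at the heart of PWC-Net pairs each feature vector in one frame with candidate matches in the other. The 1D correlation sketch below conveys the idea; the real network searches a 2D displacement window over warped pyramid features, which is omitted here:

```python
def cost_volume(feat1, feat2, max_disp):
    """1D correlation cost volume: for every position in feat1,
    dot-product its feature vector with feat2's vector at each candidate
    displacement in [-max_disp, max_disp].

    feat1, feat2: lists of equal-length feature vectors.
    Returns cost[i][d], where column d corresponds to displacement
    d - max_disp; out-of-range displacements score zero.
    """
    n = len(feat1)
    volume = []
    for i in range(n):
        row = []
        for d in range(-max_disp, max_disp + 1):
            j = i + d
            if 0 <= j < n:
                row.append(sum(a * b for a, b in zip(feat1[i], feat2[j])))
            else:
                row.append(0.0)
        volume.append(row)
    return volume
```

The flow estimator then reads off, at each position, which displacement column scores highest, refining the estimate from coarse to fine pyramid levels.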
23. Bresler G, Karzand M. Learning a tree-structured Ising model in order to make predictions. Ann Stat 2020. DOI: 10.1214/19-aos1808.
24. Abd El-Samie FE, Ashiba HI, Shendy H, Mansour HM, Ahmed HM, Taha TE, Dessouky MI, Elkordy MF, Abd-Elnaby M, El-Fishawy AS. Enhancement of infrared images using super resolution techniques based on big data processing. Multimed Tools Appl 2020; 79:5671-5692. DOI: 10.1007/s11042-019-7634-0.
25. Yang Q, Li N, Zhao Z, Fan X, Chang EIC, Xu Y. MRI cross-modality image-to-image translation. Sci Rep 2020; 10:3753. PMID: 32111966. PMCID: PMC7048849. DOI: 10.1038/s41598-020-60520-6.
Abstract
We present a cross-modality generation framework that learns to generate translated modalities from given modalities in MR images. Our proposed method performs Image Modality Translation (abbreviated as IMT) by means of a deep learning model that leverages conditional generative adversarial networks (cGANs). Our framework jointly exploits the low-level features (pixel-wise information) and high-level representations (e.g. brain tumors, brain structure like gray matter, etc.) between cross modalities which are important for resolving the challenging complexity in brain structures. Our framework can serve as an auxiliary method in medical use and has great application potential. Based on our proposed framework, we first propose a method for cross-modality registration by fusing the deformation fields to adopt the cross-modality information from translated modalities. Second, we propose an approach for MRI segmentation, translated multichannel segmentation (TMS), where given modalities, along with translated modalities, are segmented by fully convolutional networks (FCN) in a multichannel manner. Both of these two methods successfully adopt the cross-modality information to improve the performance without adding any extra data. Experiments demonstrate that our proposed framework advances the state-of-the-art on five brain MRI datasets. We also observe encouraging results in cross-modality registration and segmentation on some widely adopted brain datasets. Overall, our work can serve as an auxiliary method in medical use and be applied to various tasks in medical fields.
Collapse
Grants
- This work is supported by Microsoft Research under the eHealth program, the National Natural Science Foundation in China under Grant 81771910, the National Science and Technology Major Project of the Ministry of Science and Technology in China under Grant 2017YFC0110903, the Beijing Natural Science Foundation in China under Grant 4152033, the Technology and Innovation Commission of Shenzhen in China under Grant shenfagai2016-627, Beijing Young Talent Project in China, the Fundamental Research Funds for the Central Universities of China under Grant SKLSDE-2017ZX-08 from the State Key Laboratory of Software Development Environment in Beihang University in China, the 111 Project in China under Grant B13003.
Collapse
Affiliation(s)
- Qianye Yang
- State Key Laboratory of Software Development Environment and Key Laboratory of Biomechanics and Mechanobiology of Ministry of Education and Research Institute of Beihang University in Shenzhen, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100191, China
- Nannan Li
- State Key Laboratory of Software Development Environment and Key Laboratory of Biomechanics and Mechanobiology of Ministry of Education and Research Institute of Beihang University in Shenzhen, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100191, China
- Ping An Technology (Shenzhen) Co., Ltd., Shanghai, 200030, China
- Zixu Zhao
- State Key Laboratory of Software Development Environment and Key Laboratory of Biomechanics and Mechanobiology of Ministry of Education and Research Institute of Beihang University in Shenzhen, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100191, China
- Xingyu Fan
- Bioengineering College of Chongqing University, Chongqing, 400044, China
- Yan Xu
- State Key Laboratory of Software Development Environment and Key Laboratory of Biomechanics and Mechanobiology of Ministry of Education and Research Institute of Beihang University in Shenzhen, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100191, China.
- Microsoft Research Asia, Beijing, 100080, China.
Collapse
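The translated multichannel segmentation (TMS) idea above — segmenting a given modality together with its translated modality as one multichannel input — can be sketched in a few lines of NumPy. This is only an illustrative sketch of the input-assembly step; the function name, array names, and shapes are assumptions, not the authors' code.

```python
import numpy as np

def tms_input(given_modality, translated_modality):
    """Stack a given MR modality with its translated counterpart into one
    channels-first multichannel array, the form a fully convolutional
    segmenter (FCN) would consume."""
    assert given_modality.shape == translated_modality.shape
    return np.stack([given_modality, translated_modality], axis=0)

# Toy 2D slices: an acquired T1 slice plus a synthesized T2 slice.
t1 = np.random.rand(64, 64)
t2_translated = np.random.rand(64, 64)
x = tms_input(t1, t2_translated)  # shape (2, 64, 64)
```

Because the translated channel is derived from the given one, the segmenter sees cross-modality information without any extra acquired data, which is the point the abstract makes.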
|
26
|
Single image super resolution using dictionary learning and sparse coding with multi-scale and multi-directional Gabor feature representation. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2019.10.040] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
27
|
Qin J, Huang Y, Wen W. Multi-scale feature fusion residual network for Single Image Super-Resolution. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.10.076] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
28
|
Image Super-Resolution Based on CNN Using Multilabel Gene Expression Programming. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10030854] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Current mainstream super-resolution algorithms based on deep learning use a deep convolutional neural network (CNN) framework to realize end-to-end learning from low-resolution (LR) images to high-resolution (HR) images, and have achieved good image restoration effects. However, as the number of layers in the network increases, better results are not necessarily obtained, and problems arise such as slow training convergence, mismatched sample blocks, and unstable image restoration results. We propose a preclassified deep-learning algorithm (MGEP-SRCNN) using Multilabel Gene Expression Programming (MGEP), which screens out a sample sub-bank with high relevance to the target image before image block extraction, preclassifies samples in a multilabel framework, and then performs nonlinear mapping and image reconstruction. The algorithm is verified on standard images and obtains better objective image quality, with better restoration under different magnification conditions.
Collapse
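The abstract reports "better objective image quality"; the standard objective measure in this super-resolution literature is PSNR, which can be computed as below. A generic sketch — the function name and the 8-bit peak value of 255 are assumptions, not taken from the paper.

```python
import numpy as np

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means the restored
    image is numerically closer to the reference."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# Example: every pixel off by exactly one 8-bit level -> MSE of 1.
ref = np.zeros((8, 8))
noisy = np.ones((8, 8))
score = psnr(ref, noisy)  # ~48.13 dB
```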
|
29
|
Ranjan A, Hoffmann DT, Tzionas D, Tang S, Romero J, Black MJ. Learning Multi-human Optical Flow. Int J Comput Vis 2020. [DOI: 10.1007/s11263-019-01279-w] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data used by them does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body and motion capture data to synthesize realistic flow fields in both single- and multi-person images. We then train optical flow networks to estimate human flow fields from pairs of images. We demonstrate that our trained networks are more accurate than a wide range of top methods on held-out test data and that they can generalize well to real image sequences. The code, trained models and the dataset are available for research.
Collapse
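Flow accuracy on held-out test data, as mentioned above, is conventionally scored by average endpoint error (EPE). A minimal sketch, assuming flow fields stored as (H, W, 2) arrays of (u, v) vectors; the function name is hypothetical.

```python
import numpy as np

def endpoint_error(flow_pred, flow_gt):
    """Average endpoint error: mean Euclidean distance between
    predicted and ground-truth flow vectors (last axis = (u, v))."""
    diff = flow_pred - flow_gt
    return np.sqrt((diff ** 2).sum(axis=-1)).mean()

# Example: prediction is zero flow, ground truth is (3, 4) everywhere,
# so every endpoint error is 5 pixels.
pred = np.zeros((2, 2, 2))
gt = np.tile([3.0, 4.0], (2, 2, 1))
err = endpoint_error(pred, gt)
```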
|
30
|
Ouyang Y. Strong-Structural Convolution Neural Network for Semantic Segmentation. PATTERN RECOGNITION AND IMAGE ANALYSIS 2019. [DOI: 10.1134/s1054661819040126] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
|
31
|
Fu B, Li Y, Wang XH, Ren YG. Image super-resolution using TV priori guided convolutional network. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2019.06.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
32
|
Zhao Y, Wang R, Jia W, Zuo W, Liu X, Gao W. Deep Reconstruction of Least Significant Bits for Bit-Depth Expansion. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:2847-2859. [PMID: 30624217 DOI: 10.1109/tip.2019.2891131] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Bit-depth expansion (BDE) is important for displaying a low bit-depth image in a high bit-depth monitor. Current BDE algorithms often utilize traditional methods to fill the missing least significant bits and suffer from multiple kinds of perceivable artifacts. In this paper, we present a deep residual network-based method for BDE. Based on the different properties of flat and non-flat areas, two channels are proposed to reconstruct these two kinds of areas, respectively. Moreover, a simple yet efficient local adaptive adjustment preprocessing is presented in the flat-area-channel. By combining the benefits of both the traditional debanding strategy and network-based reconstruction, the proposed method can further promote the subjective quality of the flat area. Experimental results on several image sets demonstrate that the proposed BDE network can obtain favorable visual quality as well as decent quantitative performance.
Collapse
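For context on the "traditional methods to fill the missing least significant bits" that the network above improves on: the classical bit-replication baseline is easy to state. Below is a sketch of that baseline (not the paper's deep residual network); the function name and default bit depths are assumptions.

```python
def bit_replicate(value, low=4, high=8):
    """Classical bit-replication baseline for bit-depth expansion:
    fill the missing least significant bits by repeating the most
    significant bits of the low bit-depth value."""
    out, filled = 0, 0
    while filled < high:
        take = min(low, high - filled)
        # Append the top `take` bits of the low bit-depth value.
        out = (out << take) | (value >> (low - take))
        filled += take
    return out

# 4-bit 1010 expands to 8-bit 10101010 (170).
expanded = bit_replicate(0b1010)
```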
|
33
|
Analyzing Perception-Distortion Tradeoff Using Enhanced Perceptual Super-Resolution Network. LECTURE NOTES IN COMPUTER SCIENCE 2019. [DOI: 10.1007/978-3-030-11021-5_8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]
|
34
|
Liu Z, Wang L, Hua G, Zhang Q, Niu Z, Wu Y, Zheng N. Joint Video Object Discovery and Segmentation by Coupled Dynamic Markov Networks. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:5840-5853. [PMID: 30059300 DOI: 10.1109/tip.2018.2859622] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
It is a challenging task to extract the segmentation mask of a target from a single noisy video, which involves object discovery coupled with segmentation. To address this challenge, we present a method to jointly discover and segment an object from a noisy video, where the target disappears intermittently throughout the video. Previous methods either only fulfill video object discovery, or perform video object segmentation presuming the existence of the object in each frame. We argue that jointly conducting the two tasks in a unified way is beneficial; in other words, video object discovery and video object segmentation can facilitate each other. To validate this hypothesis, we propose a principled probabilistic model in which two dynamic Markov networks are coupled, one for discovery and the other for segmentation. When conducting Bayesian inference on this model using belief propagation, the bi-directional message passing reveals a clear collaboration between these two inference tasks. We validated the proposed method on five datasets. The first three video datasets, i.e., the SegTrack dataset, the YouTube-Objects dataset, and the DAVIS dataset, are not noisy, as all video frames contain the objects. The two noisy datasets, i.e., the XJTU-Stevens dataset and the Noisy-ViDiSeg dataset, newly introduced in this paper, both have many frames that do not contain the objects. When compared with the state of the art, although our method produces inferior results on video datasets without noisy frames, it obtains better results on video datasets with noisy frames.
Collapse
|
35
|
Face image super-resolution with pose via nuclear norm regularized structural orthogonal Procrustes regression. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3826-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
36
|
Infrared Image Super-Resolution Reconstruction Based on Quaternion Fractional Order Total Variation with Lp Quasinorm. APPLIED SCIENCES-BASEL 2018. [DOI: 10.3390/app8101864] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Owing to the limitations of the imaging principle as well as the properties of imaging systems, infrared images often have some drawbacks, including low resolution, a lack of detail, and indistinct edges. Therefore, it is essential to improve infrared image quality. Considering the information of neighbors, a description of sparse edges, and by avoiding staircase artifacts, a new super-resolution reconstruction (SRR) method is proposed for infrared images, which is based on fractional order total variation (FTV) with quaternion total variation and the Lp quasinorm. Our proposed method improves the sparsity exploitation of FTV, and efficiently preserves image structures. Furthermore, we adopt the plug-and-play alternating direction method of multipliers (ADMM) and the fast Fourier transform (FFT) theory for the proposed method to improve the efficiency and robustness of our algorithm; in addition, an accelerated step is adopted. Our experimental results show that the proposed method leads to excellent performance in terms of an objective evaluation and the subjective visual effect.
Collapse
|
37
|
Robust face super-resolution via iterative sparsity and locality-constrained representation. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.06.050] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
38
|
Liu Z, Li X, Luo P, Loy CC, Tang X. Deep Learning Markov Random Field for Semantic Segmentation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2018; 40:1814-1828. [PMID: 28796610 DOI: 10.1109/tpami.2017.2737535] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Semantic segmentation tasks can be well modeled by Markov Random Field (MRF). This paper addresses semantic segmentation by incorporating high-order relations and mixture of label contexts into MRF. Unlike previous works that optimized MRFs using iterative algorithm, we solve MRF by proposing a Convolutional Neural Network (CNN), namely Deep Parsing Network (DPN), which enables deterministic end-to-end computation in a single forward pass. Specifically, DPN extends a contemporary CNN to model unary terms and additional layers are devised to approximate the mean field (MF) algorithm for pairwise terms. It has several appealing properties. First, different from the recent works that required many iterations of MF during back-propagation, DPN is able to achieve high performance by approximating one iteration of MF. Second, DPN represents various types of pairwise terms, making many existing models as its special cases. Furthermore, pairwise terms in DPN provide a unified framework to encode rich contextual information in high-dimensional data, such as images and videos. Third, DPN makes MF easier to be parallelized and speeded up, thus enabling efficient inference. DPN is thoroughly evaluated on standard semantic image/video segmentation benchmarks, where a single DPN model yields state-of-the-art segmentation accuracies on PASCAL VOC 2012, Cityscapes dataset and CamVid dataset.
Collapse
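DPN's key move, per the abstract, is approximating one iteration of the mean-field (MF) algorithm for the pairwise terms. A generic dense MF update with a Potts compatibility — an assumption standing in for DPN's learned pairwise terms — looks like this:

```python
import numpy as np

def meanfield_step(unary, W, Q):
    """One mean-field update for a pairwise MRF with Potts compatibility
    (penalty 1 when neighboring labels differ).
    unary: (N, L) energies; W: (N, N) pairwise weights;
    Q: (N, L) current label beliefs, rows summing to 1."""
    neigh = W @ Q                                      # expected neighbor label mass
    potts = neigh.sum(axis=1, keepdims=True) - neigh   # mass on *disagreeing* labels
    energy = unary + potts
    e = np.exp(-(energy - energy.min(axis=1, keepdims=True)))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

# Tiny example: 3 nodes in a chain, 2 labels, uniform initial beliefs.
unary = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
Q1 = meanfield_step(unary, W, np.full((3, 2), 0.5))
```

One such update is cheap and differentiable, which is why it can be unrolled inside a network and run in a single forward pass.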
|
42
|
Super-Resolution for "Jilin-1" Satellite Video Imagery via a Convolutional Network. SENSORS 2018; 18:s18041194. [PMID: 29652838 PMCID: PMC5948634 DOI: 10.3390/s18041194] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/13/2018] [Revised: 04/10/2018] [Accepted: 04/11/2018] [Indexed: 11/17/2022]
Abstract
Super-resolution for satellite video is of great significance for earth observation accuracy, and the special imaging and transmission conditions on the video satellite pose great challenges to this task. Existing deep convolutional neural-network-based methods require pre-processing or post-processing to adapt to a high-resolution size or pixel format, leading to reduced performance and extra complexity. To this end, this paper proposes a five-layer end-to-end network structure without any pre-processing or post-processing, instead imposing a reshape or deconvolution layer at the end of the network to retain the distribution of ground objects within the image. Meanwhile, we formulate a joint loss function by combining the output and high-dimensional features of a non-linear mapping network to precisely learn the desirable mapping relationship between low-resolution images and their high-resolution counterparts. Also, we use the satellite video data itself as the training set, which favors consistency between training and testing images and promotes the method’s practicality. Experimental results on “Jilin-1” satellite video imagery show that this method demonstrates superior performance in terms of both visual effects and quantitative metrics over competing methods.
Collapse
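The "reshape ... layer at the end of the network" mentioned above is, in its standard formulation, a sub-pixel rearrangement: the last feature map holds r² channels per output channel, which are interleaved into an r×-larger image. A NumPy sketch of that standard rearrangement (an assumption; it may differ in detail from the paper's exact layer):

```python
import numpy as np

def pixel_shuffle(feat, r):
    """Sub-pixel reshape: rearrange a (C*r*r, H, W) feature map into a
    (C, H*r, W*r) image, so upscaling happens at the network's end
    instead of via pre-interpolation of the input."""
    c_r2, h, w = feat.shape
    assert c_r2 % (r * r) == 0
    c = c_r2 // (r * r)
    x = feat.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# 4 sub-pixel channels with r=2 tile into one 2x2 output block.
feat = np.arange(1.0, 5.0).reshape(4, 1, 1)
img = pixel_shuffle(feat, 2)
```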
|
43
|
Zhao Y, Wang R, Jia W, Yang J, Wang W, Gao W. Local patch encoding-based method for single image super-resolution. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2017.12.032] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
44
|
Song Q, Xiong R, Liu D, Xiong Z, Wu F, Gao W. Fast Image Super-Resolution via Local Adaptive Gradient Field Sharpening Transform. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:1966-1980. [PMID: 33156782 DOI: 10.1109/tip.2017.2789323] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
This paper proposes a single-image super-resolution scheme by introducing a gradient field sharpening transform that converts the blurry gradient field of upsampled low-resolution (LR) image to a much sharper gradient field of original high-resolution (HR) image. Different from the existing methods that need to figure out the whole gradient profile structure and locate the edge points, we derive a new approach that sharpens the gradient field adaptively only based on the pixels in a small neighborhood. To maintain image contrast, image gradient is adaptively scaled to keep the integral of gradient field stable. Finally, the HR image is reconstructed by fusing the LR image with the sharpened HR gradient field. Experimental results demonstrate that the proposed algorithm can generate more accurate gradient field and produce super-resolved images with better objective and visual qualities. Another advantage is that the proposed gradient sharpening transform is very fast and suitable for low-complexity applications.
Collapse
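Two ingredients of the abstract above — steepening the gradient field while scaling it so its integral stays stable — can be mimicked in a toy form. This is a hypothetical sketch (power-law sharpening with L1 renormalization), not the authors' local adaptive transform.

```python
import numpy as np

def sharpen_gradient(grad, p=1.5, eps=1e-12):
    """Raise gradient magnitudes to a power > 1 so edges steepen, then
    rescale so the total L1 gradient mass is unchanged, which keeps
    overall image contrast roughly stable."""
    mag = np.abs(grad)
    sharp = np.sign(grad) * mag ** p
    scale = mag.sum() / (np.abs(sharp).sum() + eps)
    return sharp * scale

# Example 1D gradient profile: the strong edge (0.8) gains relative
# weight, the weak ones shrink, and the L1 mass is preserved.
g = np.array([0.1, 0.2, 0.8, -0.4])
s = sharpen_gradient(g)
```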
|
45
|
Huang Y, Wang W, Wang L. Video Super-Resolution via Bidirectional Recurrent Convolutional Networks. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2018; 40:1015-1028. [PMID: 28489532 DOI: 10.1109/tpami.2017.2701380] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Super resolving a low-resolution video, namely video super-resolution (SR), is usually handled by either single-image SR or multi-frame SR. Single-image SR deals with each video frame independently, and ignores the intrinsic temporal dependency of video frames which actually plays a very important role in video SR. Multi-frame SR generally extracts motion information, e.g., optical flow, to model the temporal dependency, but often shows high computational cost. Considering that recurrent neural networks (RNNs) can model long-term temporal dependency of video sequences well, we propose a fully convolutional RNN named bidirectional recurrent convolutional network for efficient multi-frame SR. Different from vanilla RNNs, 1) the commonly-used full feedforward and recurrent connections are replaced with weight-sharing convolutional connections, which greatly reduce the number of network parameters and model the temporal dependency at a finer level, i.e., patch-based rather than frame-based, and 2) connections from input layers at previous timesteps to the current hidden layer are added by 3D feedforward convolutions, which aim to capture discriminative spatio-temporal patterns for short-term fast-varying motions in local adjacent frames. Due to the cheap convolutional operations, our model has a low computational complexity and runs orders of magnitude faster than other multi-frame SR methods. With the powerful temporal dependency modeling, our model can super resolve videos with complex motions and achieve good performance.
Collapse
|
46
|
Abbasi A, Monadjemi A, Fang L, Rabbani H. Optical coherence tomography retinal image reconstruction via nonlocal weighted sparse representation. JOURNAL OF BIOMEDICAL OPTICS 2018; 23:1-11. [PMID: 29575829 DOI: 10.1117/1.jbo.23.3.036011] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/12/2018] [Accepted: 03/06/2018] [Indexed: 06/08/2023]
Abstract
We present a nonlocal weighted sparse representation (NWSR) method for reconstruction of retinal optical coherence tomography (OCT) images. To reconstruct high signal-to-noise ratio, high-resolution OCT images, efficient denoising and interpolation algorithms are necessary, especially when the original data were subsampled during acquisition. However, OCT images suffer from a high level of noise, which makes the estimation of sparse representations a difficult task. Thus, the proposed NWSR method merges sparse representations of multiple similar noisy and denoised patches to better estimate a sparse representation for each patch. First, the sparse representation of each patch is independently computed over an overcomplete dictionary, and then a nonlocal weighted sparse coefficient is computed by averaging representations of similar patches. Since the sparsity can reveal relevant information from noisy patches, combining noisy and denoised patches' representations is beneficial to obtain a more robust estimate of the unknown sparse representation. The denoised patches are obtained by applying an off-the-shelf image denoising method, and our method provides an efficient way to exploit information from noisy and denoised patches' representations. The experimental results on denoising and interpolation of spectral domain OCT images demonstrated the effectiveness of the proposed NWSR method over existing state-of-the-art methods.
Collapse
Affiliation(s)
- Ashkan Abbasi
- University of Isfahan, Department of Artificial Intelligence, Faculty of Computer Engineering, Isfah, Iran
- Amirhassan Monadjemi
- University of Isfahan, Department of Artificial Intelligence, Faculty of Computer Engineering, Isfah, Iran
- Leyuan Fang
- Hunan University, College of Electrical and Information Engineering, Changsha, China
- Hossein Rabbani
- Isfahan University of Medical Sciences, School of Advanced Technologies in Medicine, Medical Image a, Iran
Collapse
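The merging step described in the NWSR abstract — averaging the sparse codes of similar patches, with closer patches counting more — can be sketched as a weighted average. The exponential weighting kernel and the function name here are assumptions; the paper's exact weights may differ.

```python
import numpy as np

def nonlocal_weighted_code(codes, distances, h=0.5):
    """Merge sparse codes of similar patches into one robust estimate:
    patches at smaller distance from the target patch get larger weights.
    codes: (K, M) sparse coefficient vectors; distances: (K,) patch distances."""
    w = np.exp(-np.asarray(distances, dtype=np.float64) / h)
    w /= w.sum()                      # normalize weights to sum to 1
    return w @ np.asarray(codes, dtype=np.float64)

# Two candidate patches: equal distances give a plain average, while a
# very distant patch is effectively ignored.
codes = np.array([[1.0, 0.0], [3.0, 2.0]])
merged_equal = nonlocal_weighted_code(codes, [1.0, 1.0])
merged_near = nonlocal_weighted_code(codes, [0.0, 100.0])
```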
|
47
|
Zeng X, Huang H, Qi C. Expanding Training Data for Facial Image Super-Resolution. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:716-729. [PMID: 28166514 DOI: 10.1109/tcyb.2017.2655027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
The quality of training data is very important for learning-based facial image super-resolution (SR). The more similar the training data is to the testing input, the better the SR results will be. To generate a better training set of low/high-resolution facial images for a particular testing input, this paper is the first work that proposes expanding the training data for improving facial image SR. To this end, observing that facial images are highly structured, we propose three constraints, i.e., the local structure constraint, the correspondence constraint and the similarity constraint, to generate new training data, where local patches are expanded with different expansion parameters. The expanded training data can be used for both patch-based facial SR methods and global facial SR methods. Extensive testing on benchmark databases and real-world images validates the effectiveness of training data expansion in improving SR quality.
Collapse
|
48
|
Cai J, Gu S, Zhang L, Zhang L. Learning a Deep Single Image Contrast Enhancer from Multi-Exposure Images. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:2049-2062. [PMID: 29994747 DOI: 10.1109/tip.2018.2794218] [Citation(s) in RCA: 104] [Impact Index Per Article: 17.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Due to poor lighting conditions and the limited dynamic range of digital imaging devices, recorded images are often under-/over-exposed and have low contrast. Most previous single image contrast enhancement (SICE) methods adjust the tone curve to correct the contrast of an input image. Those methods, however, often fail to reveal image details because of the limited information in a single image. On the other hand, the SICE task can be better accomplished if we can learn extra information from appropriately collected training data. In this work, we propose to use a convolutional neural network (CNN) to train a SICE enhancer. One key issue is how to construct a training dataset of low-contrast and high-contrast image pairs for end-to-end CNN learning. To this end, we build a large-scale multi-exposure image dataset, which contains 589 elaborately selected high-resolution multi-exposure sequences with 4,413 images. Thirteen representative multi-exposure image fusion and stack-based high dynamic range imaging algorithms are employed to generate the contrast-enhanced images for each sequence, and subjective experiments are conducted to screen the best quality one as the reference image of each scene. With the constructed dataset, a CNN can be easily trained as the SICE enhancer to improve the contrast of an under-/over-exposed image. Experimental results demonstrate the advantages of our method over existing SICE methods by a significant margin.
Collapse
|
49
|
SRFeat: Single Image Super-Resolution with Feature Discrimination. COMPUTER VISION – ECCV 2018 2018. [DOI: 10.1007/978-3-030-01270-0_27] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/02/2022]
|
50
|
Zhang Y, Li K, Li K, Wang L, Zhong B, Fu Y. Image Super-Resolution Using Very Deep Residual Channel Attention Networks. COMPUTER VISION – ECCV 2018 2018. [DOI: 10.1007/978-3-030-01234-2_18] [Citation(s) in RCA: 910] [Impact Index Per Article: 151.7] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
|