1
Balmez R, Brateanu A, Orhei C, Ancuti CO, Ancuti C. DepthLux: Employing Depthwise Separable Convolutions for Low-Light Image Enhancement. Sensors (Basel) 2025; 25:1530. [PMID: 40096403] [PMCID: PMC11902424] [DOI: 10.3390/s25051530]
Abstract
Low-light image enhancement is an important task in computer vision, often made challenging by the limitations of image sensors, such as noise, low contrast, and color distortion. These challenges are further exacerbated by the computational demands of processing spatial dependencies under such conditions. We present a novel transformer-based framework that enhances efficiency by utilizing depthwise separable convolutions instead of conventional approaches. Additionally, an original feed-forward network design reduces the computational overhead while maintaining high performance. Experimental results demonstrate that this method achieves competitive results, providing a practical and effective solution for enhancing images captured in low-light environments.
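For readers unfamiliar with the building block, a depthwise separable convolution factorizes a standard convolution into a per-channel spatial convolution followed by a 1x1 channel-mixing convolution. Below is a minimal PyTorch sketch of this generic block; it illustrates the factorization only and is not the authors' DepthLux implementation.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Generic depthwise separable convolution: a per-channel (depthwise)
    spatial filter followed by a 1x1 (pointwise) channel-mixing filter.
    Illustrative sketch, not the DepthLux implementation."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        # groups=in_ch: each depthwise filter sees exactly one input channel
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 64, 64)
print(DepthwiseSeparableConv(32, 64)(x).shape)  # torch.Size([1, 64, 64, 64])
```

For a k x k kernel, this reduces the per-position cost from k^2 * C_in * C_out multiplications to k^2 * C_in + C_in * C_out, which is the efficiency gain the abstract leverages.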
Affiliation(s)
- Raul Balmez
- Department of Computer Science, University of Manchester, Manchester M13 9PL, UK
- Alexandru Brateanu
- Department of Computer Science, University of Manchester, Manchester M13 9PL, UK
- Ciprian Orhei
- Faculty of Electronics, Telecommunications and Information Technologies, Polytechnic University Timisoara, 300223 Timisoara, Romania
- Codruta O. Ancuti
- Faculty of Electronics, Telecommunications and Information Technologies, Polytechnic University Timisoara, 300223 Timisoara, Romania
- Cosmin Ancuti
- Faculty of Electronics, Telecommunications and Information Technologies, Polytechnic University Timisoara, 300223 Timisoara, Romania
2
Xie C, Fei L, Tao H, Hu Y, Zhou W, Hoe JT, Hu W, Tan YP. Residual Quotient Learning for Zero-Reference Low-Light Image Enhancement. IEEE Transactions on Image Processing 2024; PP:365-378. [PMID: 40030647] [DOI: 10.1109/tip.2024.3519997]
Abstract
Recently, neural networks have become the dominant approach to low-light image enhancement (LLIE), with at least one-third of recent methods adopting a Retinex-related architecture. However, through in-depth analysis, we contend that this most widely accepted LLIE structure is suboptimal, particularly when addressing the non-uniform illumination commonly observed in natural images. In this paper, we present a novel learning framework, termed residual quotient learning, that substantially alleviates this issue. Instead of following the existing Retinex-related decomposition-enhancement-reconstruction process, our basic idea is to explicitly reformulate light enhancement as adaptively predicting the latent quotient with reference to the original low-light input, in a residual learning fashion. By leveraging the proposed residual quotient learning, we develop a lightweight yet effective network called ResQ-Net. This network features enhanced non-uniform illumination modeling capabilities, making it more suitable for real-world LLIE tasks. Moreover, owing to its well-designed structure and reference-free loss function, ResQ-Net is flexible in training: it allows for zero-reference optimization, which further enhances the generalization and adaptability of the entire framework. Extensive experiments on various benchmark datasets demonstrate the merits and effectiveness of the proposed residual quotient learning, and our trained ResQ-Net outperforms state-of-the-art methods both qualitatively and quantitatively. Furthermore, a practical application in dark face detection is explored, and the preliminary results confirm the potential and feasibility of our method in real-world scenarios.
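As background, Retinex-based LLIE assumes a multiplicative image model and recovers reflectance as a quotient; the sketch below contrasts that with one possible reading of the residual quotient idea. The parameterization of Q via a network f_theta is our hypothetical illustration, not the paper's exact formulation.

```latex
% Retinex model: low-light image = reflectance (x) illumination, element-wise
I_{\mathrm{low}} = R \odot L, \qquad \hat{I} \approx R = I_{\mathrm{low}} \oslash L
% Residual quotient learning (schematic reading; f_\theta is hypothetical):
% predict the latent quotient Q relative to the input in a residual fashion,
% then enhance by division instead of decompose-enhance-reconstruct
\hat{I} = I_{\mathrm{low}} \oslash Q, \qquad Q = \mathbf{1} + f_\theta(I_{\mathrm{low}})
```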
3
Jourlin M. Image Enhancement Thanks to Negative Grey Levels in the Logarithmic Image Processing Framework. Sensors (Basel) 2024; 24:4969. [PMID: 39124018] [PMCID: PMC11314874] [DOI: 10.3390/s24154969]
Abstract
The present study deals with image enhancement, a very common problem in image processing. This issue has been addressed in many works using different methods, most with the sole purpose of improving perceived quality. Our goal is to propose an approach with a strong physical justification that can model the human visual system, which is why the Logarithmic Image Processing (LIP) framework was chosen. Within this model, initially dedicated to images acquired in transmission, it is possible to introduce the novel concept of negative grey levels, interpreted as light intensifiers. Such an approach permits extending the dynamic range of a low-light image to the full grey scale in "real time", i.e., at camera speed. In addition, the method generalizes easily to colour images, is reversible (bijective in the mathematical sense), and, thanks to the consistency of the LIP framework with human vision, can also be applied to images acquired in reflection. Various application examples are presented, as well as prospects for extending this work.
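For context, the core LIP operations this work builds on are standard in the LIP literature (grey-tone functions f, g take values in [0, M[, with M the grey-scale bound, e.g. M = 256); the "negative grey level" is the LIP opposite:

```latex
f \oplus g = f + g - \frac{f\,g}{M}
\qquad \text{(LIP addition)}
\lambda \otimes f = M - M\left(1 - \frac{f}{M}\right)^{\lambda}
\qquad \text{(LIP scalar multiplication)}
\ominus f = \frac{-f}{1 - f/M} = \frac{-M f}{M - f}
\qquad \text{(LIP opposite: a negative grey level)}
```

Adding an opposite gives f \oplus (\ominus g) = (f - g)/(1 - g/M), which lowers the grey-tone value; since in the transmission interpretation a lower grey tone means more transmitted light, a negative grey level acts as a light intensifier, the interpretation the abstract exploits.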
Affiliation(s)
- Michel Jourlin
- Laboratoire Hubert Curien, UMR CNRS 5516, 18 Rue Professeur Benoît Lauras, 42000 Saint-Étienne, France
4
Tian Z, Qu P, Li J, Sun Y, Li G, Liang Z, Zhang W. A Survey of Deep Learning-Based Low-Light Image Enhancement. Sensors (Basel) 2023; 23:7763. [PMID: 37765817] [PMCID: PMC10535564] [DOI: 10.3390/s23187763]
Abstract
Images captured under poor lighting conditions often suffer from low brightness, low contrast, color distortion, and noise. Low-light image enhancement aims to improve the visual quality of such images for subsequent processing. With the development of artificial intelligence, deep learning has become increasingly widely used in image processing, and in this paper we provide a comprehensive review of deep learning-based low-light image enhancement in terms of network structure, training data, and evaluation metrics. We organize the survey into four aspects: first, we introduce the deep learning-based enhancement methods; we then describe the quality evaluation methods for low-light images, organize the available low-light image datasets, and finally compare and analyze the advantages and disadvantages of the related methods and give an outlook on future research directions.
Affiliation(s)
- Zhen Tian
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Peixin Qu
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Jielin Li
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Yukun Sun
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Guohou Li
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Zheng Liang
- School of Internet, Anhui University, Hefei 230039, China
- Weidong Zhang
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
5
Lv X, Zhang S, Wang C, Zhang W, Yao H, Huang Q. Unsupervised Low-Light Video Enhancement With Spatial-Temporal Co-Attention Transformer. IEEE Transactions on Image Processing 2023; 32:4701-4715. [PMID: 37549080] [DOI: 10.1109/tip.2023.3301332]
Abstract
Existing low-light video enhancement methods are dominated by Convolutional Neural Networks (CNNs) trained in a supervised manner. Because paired dynamic low-/normal-light videos are difficult to collect in real-world scenes, such methods are usually trained on synthetic, static, and uniform-motion videos, which undermines their generalization to real-world scenes. Additionally, these methods typically suffer from temporal inconsistency (e.g., flickering artifacts and motion blur) when handling large-scale motions, since the local perception property of CNNs limits their ability to model long-range dependencies in both the spatial and temporal domains. To address these problems, we propose, to the best of our knowledge, the first unsupervised method for low-light video enhancement, named LightenFormer, which models long-range intra- and inter-frame dependencies with a spatial-temporal co-attention transformer to enhance brightness while maintaining temporal consistency. Specifically, an effective but lightweight S-curve Estimation Network (SCENet) is first proposed to estimate pixel-wise S-shaped non-linear curves (S-curves) that adaptively adjust the dynamic range of an input video. Next, to model the temporal consistency of the video, we present a Spatial-Temporal Refinement Network (STRNet) to refine the enhanced video. The core module of STRNet is a novel Spatial-Temporal Co-attention Transformer (STCAT), which exploits multi-scale self- and cross-attention interactions to capture long-range correlations in both the spatial and temporal domains among frames for implicit motion estimation. To achieve unsupervised training, we further propose two non-reference loss functions based on the invertibility of the S-curve and the noise independence among frames. Extensive experiments on the SDSD and LLIV-Phone datasets demonstrate that LightenFormer outperforms state-of-the-art methods.
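The paper's exact curve parameterization is not reproduced here; as an illustration of what "pixel-wise invertible S-curves" means, the classical gain function below is one S-shaped, strictly monotone mapping on [0, 1] whose inverse belongs to the same family, so a per-pixel exponent map (a stand-in for what an SCENet-style network would predict) can be undone exactly, the kind of invertibility a non-reference loss can rely on.

```python
import torch

def s_curve(x: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
    """Classical gain-function S-curve on [0, 1]; a > 1 steepens mid-tones.
    Illustrative stand-in for LightenFormer's curves, not the paper's form.
    `a` may be a scalar or a per-pixel map (as SCENet-style networks predict)."""
    x = x.clamp(1e-6, 1 - 1e-6)          # keep the powers well-defined
    num = x.pow(a)
    return num / (num + (1 - x).pow(a))

def s_curve_inv(y: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
    # The inverse of this family is the same curve with exponent 1/a.
    return s_curve(y, 1.0 / a)

x = torch.rand(1, 3, 8, 8)
a = torch.full_like(x, 2.0)              # toy per-pixel exponent map
assert torch.allclose(s_curve_inv(s_curve(x, a), a), x, atol=1e-4)
```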
6
Li C, Guo C, Han L, Jiang J, Cheng MM, Gu J, Loy CC. Low-Light Image and Video Enhancement Using Deep Learning: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:9396-9416. [PMID: 34752382] [DOI: 10.1109/tpami.2021.3126387]
Abstract
Low-light image enhancement (LLIE) aims at improving the perception or interpretability of an image captured in an environment with poor illumination. Recent advances in this area are dominated by deep learning-based solutions, employing many different learning strategies, network structures, loss functions, and training data. In this paper, we provide a comprehensive survey covering various aspects, from algorithm taxonomy to unsolved open issues. To examine the generalization of existing methods, we propose a low-light image and video dataset in which the images and videos are taken by the cameras of different mobile phones under diverse illumination conditions. In addition, we provide, for the first time, a unified online platform that covers many popular LLIE methods, whose results can be produced through a user-friendly web interface. Beyond qualitative and quantitative evaluation of existing methods on publicly available datasets and on our proposed dataset, we also validate their performance on face detection in the dark. This survey, together with the proposed dataset and online platform, can serve as a reference source for future study and promote the development of this research field. The proposed platform and dataset, as well as the collected methods, datasets, and evaluation metrics, are publicly available and will be regularly updated. Project page: https://www.mmlab-ntu.com/project/lliv_survey/index.html.
7
Lu H, Gong J, Liu Z, Lan R, Pan X. FDMLNet: A Frequency-Division and Multiscale Learning Network for Enhancing Low-Light Image. Sensors (Basel) 2022; 22:8244. [PMID: 36365942] [PMCID: PMC9657473] [DOI: 10.3390/s22218244]
Abstract
Low-illumination images exhibit low brightness, blurry details, and color casts, which give an unnatural visual experience and degrade downstream visual applications. Data-driven approaches show tremendous potential for lighting up image brightness while preserving visual naturalness, but existing methods tend to introduce artifacts such as amplified noise, over- or under-enhancement, and color deviation. To mitigate these issues, this paper presents a frequency-division and multiscale learning network named FDMLNet, comprising two subnets, DetNet and StruNet. The design first applies a guided filter to separate the high- and low-frequency components of the input image; DetNet and StruNet then process these components respectively, fully exploiting the information carried at each frequency. In StruNet, a feature extraction module (FFEM), composed of a multiscale learning block (MSL) and a dual-branch channel attention mechanism (DCAM), is introduced to promote multiscale representation, and three FFEMs are linked by a new dense connectivity to exploit multilevel features. Extensive quantitative and qualitative experiments on public benchmarks demonstrate that FDMLNet outperforms state-of-the-art approaches, benefiting from its stronger multiscale feature expression and extraction ability.
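As a concrete picture of the frequency-division step, the snippet below performs a self-guided edge-preserving split with OpenCV's guided filter (requires the opencv-contrib package); the file path, radius, and eps values are illustrative, not the paper's settings.

```python
import cv2
import numpy as np

# Load and normalize a low-light image (path is illustrative).
img = cv2.imread("low_light.png").astype(np.float32) / 255.0

# Self-guided filtering: the image guides its own smoothing, so the
# low-frequency output keeps strong edges while flattening texture.
low_freq = cv2.ximgproc.guidedFilter(guide=img, src=img, radius=8, eps=1e-3)

# The residual carries fine detail and noise: the high-frequency band.
high_freq = img - low_freq

# In FDMLNet terms, StruNet would enhance low_freq (structure) and DetNet
# high_freq (detail) before the two streams are recombined.
```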
8
Wang X, Hu R, Xu X. Single low-light image brightening using learning-based intensity mapping. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.08.042]
9
Ye J, Chen X, Qiu C, Zhang Z. Low-Light Image Enhancement Using Photometric Alignment with Hierarchy Pyramid Network. Sensors (Basel) 2022; 22:6799. [PMID: 36146148] [PMCID: PMC9505311] [DOI: 10.3390/s22186799]
Abstract
Low-light image enhancement can effectively assist high-level vision tasks that often fail under poor illumination. Most previous data-driven methods, however, perform enhancement directly on severely degraded low-light images, which can yield undesirable results, including blurred detail, intensive noise, and distorted color. In this paper, inspired by a coarse-to-fine strategy, we propose an end-to-end pipeline for low-light image enhancement that combines image-level alignment with pixel-wise perceptual information enhancement. A coarse adaptive global photometric alignment sub-network is first constructed to reduce style differences, improving illumination and revealing information in under-exposed areas. A hierarchy pyramid enhancement sub-network then optimizes the quality of the aligned image, removing amplified noise and enhancing local detail. We also propose a multi-residual cascade attention block (MRCAB) that combines a channel split-and-concatenation strategy with a polarized self-attention mechanism, leading to high perceptual quality in the reconstructed images. Extensive experiments demonstrate the effectiveness of our method on various datasets, where it significantly outperforms other state-of-the-art methods in detail and color reproduction.
10
Low-light image enhancement with geometrical sparse representation. Applied Intelligence 2022. [DOI: 10.1007/s10489-022-04013-1]
11
N2PN: Non-reference two-pathway network for low-light image enhancement. Applied Intelligence 2022. [DOI: 10.1007/s10489-021-02627-5]
12
Wang D, Xia S, Yang W, Liu J. Combining Progressive Rethinking and Collaborative Learning: A Deep Framework for In-Loop Filtering. IEEE Transactions on Image Processing 2021; 30:4198-4211. [PMID: 33798081] [DOI: 10.1109/tip.2021.3068638]
Abstract
In this paper, we address two issues in deep learning-based in-loop filtering: (1) joint spatial-temporal modeling and (2) side information injection. For (1), we design a deep network with progressive rethinking and collaborative learning mechanisms to improve the quality of reconstructed intra-frames and inter-frames, respectively. For intra coding, a Progressive Rethinking Network (PRN) is designed to simulate the human decision mechanism for effective spatial modeling: each block adds an inter-block connection that carries a high-dimensional informative feature past the bottleneck module, so that later blocks can review the complete past memorized experience and rethink progressively. For inter coding, the current reconstructed frame interacts with reference frames (the peak-quality frame and the nearest adjacent frame) collaboratively at the feature level. For (2), we extract both intra-frame and inter-frame side information for better context modeling: a coarse-to-fine partition map based on HEVC partition trees serves as the intra-frame side information, while warped features of the reference frames serve as the inter-frame side information. Our PRN with intra-frame side information provides a 9.0% BD-rate reduction on average over the HEVC baseline under the All-Intra (AI) configuration. Under the Low-Delay B (LDB), Low-Delay P (LDP), and Random Access (RA) configurations, our PRN with inter-frame side information provides 9.0%, 10.6%, and 8.0% BD-rate reductions on average, respectively. Our project webpage is https://dezhao-wang.github.io/PRN-v2/.
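To make the inter-block bypass concrete, here is a schematic PyTorch sketch of the idea as the abstract describes it: alongside the usual residual path, a high-dimensional "memory" feature skips the bottleneck and travels from block to block. The module and channel names are our own; this is not the authors' PRN code.

```python
import torch
import torch.nn as nn

class RethinkingBlock(nn.Module):
    """Schematic 'progressive rethinking' block: a residual body plus a
    high-dimensional memory feature that bypasses the bottleneck, letting
    later blocks revisit earlier evidence. Illustrative only."""
    def __init__(self, ch: int = 64, mem_ch: int = 128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch + mem_ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.to_mem = nn.Conv2d(ch, mem_ch, 1)  # refresh the bypass feature

    def forward(self, x: torch.Tensor, mem: torch.Tensor):
        out = x + self.body(torch.cat([x, mem], dim=1))  # residual update
        return out, self.to_mem(out)                     # feature + new memory

x, mem = torch.randn(1, 64, 32, 32), torch.zeros(1, 128, 32, 32)
for block in [RethinkingBlock(), RethinkingBlock()]:  # memory chains across blocks
    x, mem = block(x, mem)
```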