1
Huang Y, Liu C, Li B, Huang H, Zhang R, Ke W, Jing X. Frequency-Aware Divide-and-Conquer for Efficient Real Noise Removal. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:8429-8441. [PMID: 39178076 DOI: 10.1109/tnnls.2024.3439591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/25/2024]
Abstract
Deep-learning-based approaches have achieved remarkable progress in denoising complex real scenes, yet their accuracy-efficiency tradeoff, which is particularly critical for mobile devices, remains understudied. As real noise is unevenly distributed relative to the underlying signals in different frequency bands, we introduce a frequency-aware divide-and-conquer strategy to develop a frequency-aware denoising network (FADN). FADN is built by stacking frequency-aware denoising blocks (FADBs), in which a denoised image is progressively predicted by a series of frequency-aware noise dividing and conquering operations. For noise dividing, FADBs decompose the noisy and clean image pairs into low- and high-frequency representations via a wavelet transform (WT) followed by an invertible network and recover the final denoised image by integrating the denoised information from different frequency bands. For noise conquering, the separated low-frequency representation of the noisy image is kept as clean as possible under the supervision of its clean counterpart, while the high-frequency representation, combined with the estimated residual from the successive FADB, is purified under the corresponding supervision for residual compensation. Since FADN denoises progressively and in a targeted manner across frequency bands, the accuracy-efficiency tradeoff can be controlled as required through the number of FADBs. Experimental results on the SIDD, DND, and NAM datasets show that FADN outperforms state-of-the-art methods, improving the peak signal-to-noise ratio (PSNR) while reducing the number of model parameters. The code is released at https://github.com/NekoDaiSiki/FADN.
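To make the "noise dividing" step concrete, the following is a minimal sketch of a one-level 2-D Haar wavelet transform that splits an image into one low-frequency and three high-frequency sub-bands and then reconstructs it exactly. It is an illustration only: the abstract does not say which wavelet FADN uses, the invertible network that follows the WT is not modeled, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical.

```python
def haar_dwt2(image):
    """One-level orthonormal 2-D Haar transform of a list-of-lists image.

    Assumes even height and width. Returns four half-size sub-bands:
    the low-frequency average plus three high-frequency differences.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    low, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        low_r, rd_r, cd_r, dg_r = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            low_r.append((a + b + d + e) / 2.0)  # local average (low frequency)
            rd_r.append((a + b - d - e) / 2.0)   # difference between the two rows
            cd_r.append((a - b + d - e) / 2.0)   # difference between the two columns
            dg_r.append((a - b - d + e) / 2.0)   # diagonal difference
        low.append(low_r)
        row_diff.append(rd_r)
        col_diff.append(cd_r)
        diag.append(dg_r)
    return low, row_diff, col_diff, diag


def haar_idwt2(low, row_diff, col_diff, diag):
    """Invert haar_dwt2: rebuild the full-size image from the four sub-bands."""
    image = []
    for i in range(len(low)):
        top, bottom = [], []
        for j in range(len(low[0])):
            s, u = low[i][j], row_diff[i][j]
            v, w = col_diff[i][j], diag[i][j]
            top.extend([(s + u + v + w) / 2.0, (s + u - v - w) / 2.0])
            bottom.extend([(s - u + v - w) / 2.0, (s - u - v + w) / 2.0])
        image.append(top)
        image.append(bottom)
    return image
```

Because the transform is invertible, any processing applied to the sub-bands can be mapped back to the image domain without loss, which is the property a divide-and-conquer scheme over frequency bands relies on.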
2
Wan X, Liu J, Gan X, Liu X, Wang S, Wen Y, Wan T, Zhu E. One-Step Multi-View Clustering With Diverse Representation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:5774-5786. [PMID: 38557633 DOI: 10.1109/tnnls.2024.3378194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Multi-view clustering has attracted broad attention due to its capacity to utilize consistent and complementary information among views. Although tremendous progress has been made recently, most existing methods suffer from high complexity, preventing them from being applied to large-scale tasks. Multi-view clustering via matrix factorization is a representative approach to this issue. However, most such methods map the data matrices into a fixed dimension, limiting the model's expressiveness. Moreover, many methods rely on a two-step process, i.e., multimodal learning followed by k-means, which inevitably leads to a suboptimal clustering result. In light of this, we propose a one-step multi-view clustering with diverse representation (OMVCDR) method, which incorporates multi-view learning and k-means into a unified framework. Specifically, we first project the original data matrices into various latent spaces to attain comprehensive information and auto-weight them in a self-supervised manner. Then, we directly use the information matrices under diverse dimensions to obtain consensus discrete clustering labels. Unifying representation learning and clustering boosts the quality of the final results. Furthermore, we develop an efficient optimization algorithm with proven convergence to solve the resultant problem. Comprehensive experiments on various datasets demonstrate the promising clustering performance of our proposed method. The code is publicly available at https://github.com/wanxinhang/OMVCDR.
3
Yue Z, Yong H, Zhao Q, Zhang L, Meng D, Wong KYK. Deep Variational Network Toward Blind Image Restoration. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:7011-7026. [PMID: 38349822 DOI: 10.1109/tpami.2024.3365745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2024]
Abstract
Blind image restoration (IR) is a common yet challenging problem in computer vision. Classical model-based methods and recent deep learning (DL)-based methods represent two different methodologies for this problem, each with its own merits and drawbacks. In this paper, we propose a novel blind image restoration method that aims to integrate the advantages of both. Specifically, we construct a general Bayesian generative model for blind IR, which explicitly depicts the degradation process. In the proposed model, a pixel-wise non-i.i.d. Gaussian distribution is employed to fit the image noise. It is more flexible than the simple i.i.d. Gaussian or Laplacian distributions adopted in most conventional methods and can therefore handle the more complicated noise types contained in image degradation. To solve the model, we design a variational inference algorithm in which all the expected posterior distributions are parameterized as deep neural networks to increase their modeling capability. Notably, this inference algorithm induces a unified framework that jointly handles degradation estimation and image restoration. Further, the degradation information estimated in the former task is used to guide the latter IR process. Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-art methods.
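The difference between an i.i.d. and a pixel-wise non-i.i.d. Gaussian noise model can be sketched with the Gaussian negative log-likelihood, where each pixel is allowed its own variance. This is only the textbook likelihood under stated assumptions, not the paper's variational objective; the function name `gaussian_nll` and the toy values in the usage note are hypothetical.

```python
import math


def gaussian_nll(noisy, clean, variances):
    """Negative log-likelihood of noisy pixels given clean pixels.

    Each pixel i is modeled as noisy[i] ~ N(clean[i], variances[i]).
    Passing the same variance for every pixel recovers the i.i.d. case.
    """
    if not (len(noisy) == len(clean) == len(variances)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for y, x, var in zip(noisy, clean, variances):
        if var <= 0:
            raise ValueError("variances must be positive")
        total += 0.5 * math.log(2.0 * math.pi * var) + (y - x) ** 2 / (2.0 * var)
    return total
```

For example, with residuals of 0.1 and 2.0 at two pixels, per-pixel variances of 0.01 and 4.0 give a lower negative log-likelihood than a single shared variance of 2.005 (the mean squared residual), which is the extra flexibility the abstract refers to.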
4
Bu S, Li Y, Ren W, Liu G. ARU-DGAN: A dual generative adversarial network based on attention residual U-Net for magneto-acousto-electrical image denoising. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:19661-19685. [PMID: 38052619 DOI: 10.3934/mbe.2023871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Magneto-Acousto-Electrical Tomography (MAET) is a multi-physics coupling imaging modality that integrates the high resolution of ultrasound imaging with the high contrast of electrical impedance imaging. However, images obtained with this technique are easily degraded by environmental or experimental noise. Existing methods for magneto-acousto-electrical image denoising cannot model both the local and global features of these images, nor can they extract the most relevant multi-scale contextual information needed to model the joint distribution of clean and noisy images. To address this issue, we propose a Dual Generative Adversarial Network based on Attention Residual U-Net (ARU-DGAN) for magneto-acousto-electrical image denoising. Specifically, our model approximates the joint distribution of magneto-acousto-electrical clean and noisy images from two perspectives: noise removal and noise generation. First, it transforms noisy images into clean ones through a denoiser; second, it converts clean images into noisy ones via a generator. We also design an Attention Residual U-Net (ARU) to serve as the backbone of both the denoiser and the generator in the Dual Generative Adversarial Network (DGAN). The ARU network adopts a residual mechanism and introduces a linear Self-Attention based on Cross-Normalization (CNorm-SA), which is proposed in this paper. This design allows the model to extract the most relevant multi-scale contextual information while maintaining high resolution, thereby better modeling the local and global features of magneto-acousto-electrical images. Finally, extensive experiments on a real-world magneto-acousto-electrical image dataset constructed in this paper demonstrate that ARU-DGAN significantly improves the preservation of image details. Furthermore, compared with state-of-the-art competing methods, it achieves a 0.3 dB increase in PSNR and a 0.47% improvement in SSIM.
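For readers unfamiliar with the PSNR figure quoted above, here is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), together with a helper that converts a dB gain into the corresponding MSE ratio. The function names and the default peak value of 255 are assumptions for illustration; the paper's evaluation code is not shown in the abstract.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition, a 0.3 dB gain corresponds to an MSE ratio of about 0.933, that is, roughly 7% lower mean squared error at the same peak value.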
Affiliation(s)
- Shuaiyu Bu
  - Institute of Electrical Engineering, Chinese Academy of Sciences, Beijing 100190, China
  - School of Electronic Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
  - State Grid Beijing Electric Power Company, Beijing 100031, China
- Yuanyuan Li
  - Institute of Electrical Engineering, Chinese Academy of Sciences, Beijing 100190, China
  - School of Electronic Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Wenting Ren
  - Department of Radiation Oncology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
- Guoqiang Liu
  - Institute of Electrical Engineering, Chinese Academy of Sciences, Beijing 100190, China
  - School of Electronic Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
5
Hu P, Zhu H, Lin J, Peng D, Zhao YP, Peng X. Unsupervised Contrastive Cross-Modal Hashing. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:3877-3889. [PMID: 35617190 DOI: 10.1109/tpami.2022.3177356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
In this paper, we study how to make unsupervised cross-modal hashing (CMH) benefit from contrastive learning (CL) by overcoming two challenges. Specifically, i) to address the performance degradation caused by binary optimization for hashing, we propose a novel momentum optimizer that makes the hashing operation learnable in CL, thus making off-the-shelf deep cross-modal hashing possible. In other words, unlike most existing methods, our method does not involve binary-continuous relaxation and thus enjoys better retrieval performance; ii) to alleviate the influence of false-negative pairs (FNPs), i.e., within-class pairs that are wrongly treated as negative pairs, we propose a Cross-modal Ranking Learning loss (CRL) that utilizes the discrimination from all negative pairs instead of only the hard ones. Thanks to this global strategy, CRL endows our method with better performance because it does not overuse the FNPs while ignoring the true-negative pairs. To the best of our knowledge, the proposed method could be one of the first successful contrastive hashing methods. To demonstrate its effectiveness, we carry out experiments on five widely used datasets and compare against 13 state-of-the-art methods. The code is available at https://github.com/penghu-cs/UCCH.
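As background for the retrieval setting, the sketch below shows what binary hash codes and Hamming-distance ranking look like: real-valued features from one modality are binarized with a sign function and database items from another modality are ranked by the number of differing bits. It does not reproduce the paper's momentum optimizer or the CRL loss; all function names and the toy codes in the usage note are hypothetical.

```python
def binarize(features):
    """Map real-valued features to a {-1, +1} hash code with a sign function."""
    return [1 if value >= 0 else -1 for value in features]


def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda index: hamming_distance(query_code, database_codes[index]),
    )
```

For instance, a query with features `[0.3, -1.2, 0.8, -0.1]` is binarized to `[1, -1, 1, -1]`; against the database codes `[1, -1, 1, -1]`, `[-1, 1, -1, 1]`, and `[1, -1, -1, -1]`, the ranking is item 0 (distance 0), then item 2 (distance 1), then item 1 (distance 4).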
6
Li Y, Zhou J, Tian J, Zheng X, Tang YY. Weighted Error Entropy-Based Information Theoretic Learning for Robust Subspace Representation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:4228-4242. [PMID: 33606640 DOI: 10.1109/tnnls.2021.3056188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
In most existing representation learning frameworks, the noise contaminating the data points is assumed to be independent and identically distributed (i.i.d.), and a Gaussian distribution is often imposed. Although this assumption greatly simplifies the resulting representation problems, it may not hold in many practical scenarios. For example, the noise in face representation is usually attributable to local variation, random occlusion, and unconstrained illumination; it is essentially structural and hence satisfies neither the i.i.d. property nor Gaussianity. In this article, we devise a generic noise model, referred to as the independent and piecewise identically distributed (i.p.i.d.) model, for robust representation learning, in which the statistical behavior of the underlying noise is characterized using a union of distributions. We demonstrate that the proposed i.p.i.d. model can better describe the complex noise encountered in practical scenarios and includes the traditional i.i.d. model as a special case. Building on this noise model, we then develop a new information-theoretic learning framework for robust subspace representation through a novel minimum weighted error entropy criterion. Thanks to the superior modeling capability of the i.p.i.d. model, the proposed learning method achieves superior robustness against various types of noise. When applying our scheme to subspace clustering and image recognition problems, we observe significant performance gains over existing approaches.
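For context, the classical (unweighted) minimum error entropy criterion from information-theoretic learning can be sketched as follows: the errors' Renyi quadratic entropy is estimated with a Gaussian Parzen window, and minimizing it amounts to maximizing the "information potential". The weighting scheme that this paper introduces is not described in the abstract and is not reproduced here; the function names and the kernel width `sigma` are assumptions.

```python
import math


def information_potential(errors, sigma=1.0):
    """Parzen estimate V = (1 / N^2) * sum_i sum_j G_sigma(e_i - e_j).

    G_sigma is a zero-mean Gaussian density with standard deviation sigma.
    V is largest when all errors are concentrated around a single value.
    """
    if not errors or sigma <= 0:
        raise ValueError("need at least one error and a positive sigma")
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    total = 0.0
    for e_i in errors:
        for e_j in errors:
            total += norm * math.exp(-((e_i - e_j) ** 2) / (2.0 * sigma ** 2))
    return total / (len(errors) ** 2)


def quadratic_error_entropy(errors, sigma=1.0):
    """Renyi quadratic entropy estimate H2 = -log V of the error samples."""
    return -math.log(information_potential(errors, sigma))
```

Because the estimate depends only on pairwise error differences, a few large outlying errors lower the information potential smoothly instead of dominating it quadratically as they would under a mean-squared-error loss.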
7
Zhang W, Wu QMJ, Yang Y, Akilan T. Multimodel Feature Reinforcement Framework Using Moore-Penrose Inverse for Big Data Analysis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:5008-5021. [PMID: 33021948 DOI: 10.1109/tnnls.2020.3026621] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
Fully connected representation learning (FCRL) is one of the most widely used network structures in multimodel image classification frameworks. However, most FCRL-based structures, such as stacked autoencoders, encode features and produce the final decision with separate building blocks, resulting in loosely connected feature representations. This article achieves a robust representation by considering a low-dimensional feature and the classifier model simultaneously. To this end, a new hierarchical subnetwork-based neural network (HSNN) is proposed. The novelties of this framework are as follows: 1) it uses an iterative learning process, instead of stacking separate blocks, to obtain the discriminative encoding and the final classification results, and in this sense generates the optimal global features; 2) it applies a Moore-Penrose (MP) inverse-based batch-by-batch learning strategy to handle large-scale data sets, so that large data sets, such as Place365 containing 1.8 million images, can be processed effectively. The experimental results on multiple domains with the number of training samples varying from ~1 K to ~2 M show that the proposed feature reinforcement framework achieves better generalization performance than most state-of-the-art FCRL methods.
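One standard way to combine a Moore-Penrose inverse with batch-by-batch processing is sketched below. It relies on the identity that, when the feature matrix H has full column rank, the least-squares weights H⁺t equal (HᵀH)⁻¹Hᵀt, and both HᵀH and Hᵀt are sums over samples that can be accumulated one batch at a time. The abstract does not give HSNN's actual update rule, so this is an assumption-laden illustration; `accumulate_batch` and `solve_linear_system` are hypothetical helper names.

```python
def accumulate_batch(gram, cross, features, targets):
    """Add one batch to the running sums gram = H^T H and cross = H^T t."""
    for h, t in zip(features, targets):
        for i in range(len(h)):
            cross[i] += h[i] * t
            for j in range(len(h)):
                gram[i][j] += h[i] * h[j]


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: H lacks full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - partial) / aug[i][i]
    return x


def fit_in_batches(batches, dim):
    """Least-squares output weights from an iterable of (features, targets) batches."""
    gram = [[0.0] * dim for _ in range(dim)]
    cross = [0.0] * dim
    for features, targets in batches:
        accumulate_batch(gram, cross, features, targets)
    return solve_linear_system(gram, cross)
```

Only the `dim`-by-`dim` running sums stay in memory, so the cost per batch is independent of how many samples have already been seen, which is what makes this style of update attractive for data sets with millions of images.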
8
Wang CD, Chen MS, Huang L, Lai JH, Yu PS. Smoothness Regularized Multiview Subspace Clustering With Kernel Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:5047-5060. [PMID: 33027007 DOI: 10.1109/tnnls.2020.3026686] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 06/11/2023]
Abstract
Multiview subspace clustering has attracted an increasing amount of attention in recent years. However, most existing multiview subspace clustering methods either assume linear relations between multiview data points when learning the affinity representation by means of self-expression, or fail to preserve the locality property of the original feature space in the learned affinity representation. To address these issues, in this article, we propose a new multiview subspace clustering method termed smoothness regularized multiview subspace clustering with kernel learning (SMSCK). To capture the nonlinear relations between multiview data points, the proposed model maps the concatenated multiview observations into a high-dimensional kernel space, in which the linear relations reflect the nonlinear relations between multiview data points in the original space. In addition, to explicitly preserve the locality property of the original feature space in the learned affinity representation, a smoothness regularization is deployed in the subspace learning in the kernel space. Theoretical analysis is provided to ensure that the optimal solution of the proposed model exhibits the grouping effect. The unique optimal solution of the proposed model can be obtained by an optimization strategy, and a theoretical convergence analysis is also conducted. Extensive experiments are conducted on both image and document data sets, and comparisons with state-of-the-art methods demonstrate the effectiveness of our method.
9
Remote Sensing Image Denoising via Low-Rank Tensor Approximation and Robust Noise Modeling. REMOTE SENSING 2020. [DOI: 10.3390/rs12081278] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Noise removal is a fundamental problem in remote sensing image processing. Most existing methods, however, have not yet attained sufficient robustness in practice, because they neglect, to varying degrees, the intrinsic structures of remote sensing images and/or underestimate the complexity of realistic noise. In this paper, we propose a new remote sensing image denoising method that integrates intrinsic image characterization and robust noise modeling. Specifically, we use low-Tucker-rank tensor approximation to capture the global multi-factor correlation within the underlying image, and adopt a non-independent and non-identically distributed mixture of Gaussians (non-i.i.d. MoG) assumption to encode the statistical configurations of the embedded noise. Then, we incorporate the proposed image and noise priors into a full Bayesian generative model and design an efficient variational Bayesian algorithm to infer all involved variables by closed-form equations. Moreover, adaptive strategies for selecting hyperparameters are developed to free our algorithm from burdensome hyperparameter tuning. Extensive experiments on both simulated and real multispectral/hyperspectral images demonstrate the superiority of the proposed method over the compared state-of-the-art methods.