1
Yang J, Lai S, Wang X, Wang Y, Qian X. Diversity-Learning Block: Conquer Feature Homogenization of Multibranch. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:7563-7576. [PMID: 36322499 DOI: 10.1109/tnnls.2022.3214993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Visual Geometry Group (VGG)-style ConvNets are neural processing unit (NPU)-friendly networks; however, their accuracy cannot keep up with that of other well-designed network structures. Although some reparameterization methods have been proposed to remedy this weakness, their performance suffers from the homogenization of parallel branches, and the preset shape of convolution kernels also influences spatial perception. To address these problems, we propose a diversity-learning (DL) block to build the DLNet, which can adaptively learn various features to enrich the feature space. To balance floating-point operations (FLOPs) and accuracy, a groupwise operation is introduced, yielding a lightweight DL ConvNet, DLGNet. Extensive evaluations have been conducted on different computer vision tasks, e.g., image classification [Canadian Institute For Advanced Research (CIFAR) and ImageNet], object detection [PASCAL visual object classes (VOC) and Microsoft Common Objects in Context (MS COCO)], and semantic segmentation (Cityscapes). The experimental results show that the proposed DLGNet achieves performance comparable to state-of-the-art networks while running 183% faster than GhostNet and over 600% faster than MobileNetV3 at similar accuracy on an NPU.
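As background for the reparameterization methods the abstract refers to, the following is a minimal sketch of the general idea of folding parallel convolution branches into one kernel at inference time. It is not the DL block itself: the helper names `conv2d_same` and `merge_branches` are hypothetical, and the example assumes a single channel, stride 1, and zero padding.

```python
def conv2d_same(image, kernel):
    """Naive single-channel 2-D cross-correlation with zero padding (same output size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def merge_branches(kernel_3x3, kernel_1x1):
    """Fold a parallel 1x1 branch into a 3x3 kernel by adding it at the centre tap."""
    merged = [row[:] for row in kernel_3x3]  # copy so the input kernel is not mutated
    merged[1][1] += kernel_1x1[0][0]
    return merged
```

Because convolution is linear, summing the outputs of a 3x3 branch and a 1x1 branch gives the same result as one 3x3 convolution with the merged kernel, which is why the multibranch structure can be collapsed into a plain VGG-style layer after training.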
2
Wu S, Lv X, Liu Y, Jiang M, Li X, Jiang D, Yu J, Gong Y, Jiang R. Enhanced SSD framework for detecting defects in cigarette appearance using variational Bayesian inference under limited sample conditions. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:3281-3303. [PMID: 38454728 DOI: 10.3934/mbe.2024145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
In high-speed cigarette manufacturing, the sporadic occurrence of minor cosmetic defects and the resulting scarcity of samples significantly hinder rapid and accurate defect detection. To tackle this challenge, we propose an enhanced single-shot multibox detector (SSD) model that uses variational Bayesian inference for improved detection of tiny defects given sporadic occurrences and limited samples. The enhanced SSD model incorporates a bounded intersection over union (BIoU) loss function to reduce sensitivity to minor deviations and uses exponential linear unit (ELU) and leaky rectified linear unit (ReLU) activation functions to mitigate vanishing gradients and neuron death in deep neural networks. Empirical results show that the enhanced SSD300 and SSD512 models increase detection accuracy, measured as mean average precision (mAP), by up to 1.2% for small defects. Ablation studies further show a 1.5% increase in mAP together with a reduction of 5.92 GFLOPs in computational requirements. The model also shows improved inference in scenarios with limited samples, highlighting its effectiveness and applicability in high-speed, precision-oriented cigarette manufacturing.
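The two activation functions named in the abstract have standard textbook definitions, sketched below for scalars. The default values `alpha=1.0` and `negative_slope=0.01` are common conventions assumed here, not settings reported by the paper.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for x > 0, a small linear slope otherwise."""
    return x if x > 0 else negative_slope * x
```

Both functions keep a non-zero output and gradient for negative inputs, which is the property that counters the "neuron death" of the plain ReLU.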
Affiliation(s)
- Shichao Wu
  - School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming 650221, China
- Xianzhou Lv
  - Hongyun Honghe Tobacco (Group) Co., Ltd. Huize Cigarette Factory, Qujing 654200, China
- Yingbo Liu
  - School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming 650221, China
- Ming Jiang
  - Hongyun Honghe Tobacco (Group) Co., Ltd. Huize Cigarette Factory, Qujing 654200, China
- Xingxu Li
  - School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming 650221, China
- Dan Jiang
  - School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming 650221, China
- Jing Yu
  - School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming 650221, China
- Yunyu Gong
  - School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming 650221, China
- Rong Jiang
  - Yunnan Key Laboratory of Service Computing, Kunming 650221, China
3
Wang S, Zhao J, Cai Y, Li Y, Qi X, Qiu X, Yao X, Tian Y, Zhu Y, Cao W, Zhang X. A method for small-sized wheat seedlings detection: from annotation mode to model construction. PLANT METHODS 2024; 20:15. [PMID: 38287423 PMCID: PMC10826033 DOI: 10.1186/s13007-024-01147-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 01/23/2024] [Indexed: 01/31/2024]
Abstract
The number of seedlings is an important indicator that reflects the size of the wheat population during the seedling stage. Researchers increasingly use deep learning to detect and count wheat seedlings from unmanned aerial vehicle (UAV) images. However, due to the small size and diverse postures of wheat seedlings, it can be challenging to estimate their numbers accurately during the seedling stage. Most related work on wheat seedling detection labels the whole plant, which often results in a high proportion of soil background within the annotated bounding boxes. This imbalance between wheat seedlings and soil background in the annotated bounding boxes decreases detection performance. This study proposes a wheat seedling detection method based on local annotation instead of global annotation. Moreover, the detection model is improved by replacing convolutional and pooling layers with the Space-to-depth Conv module and adding a micro-scale detection layer to the YOLOv5 head network to better extract small-scale features in these small annotation boxes. This optimization of the detection model reduces the number of detection errors caused by leaf occlusion between wheat seedlings and by their small size. The results show that the proposed method achieves a detection accuracy of 90.1%, outperforming other state-of-the-art detection methods. The proposed method provides a reference for future wheat seedling detection and yield prediction.
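The space-to-depth rearrangement behind the module mentioned above can be sketched in a few lines. This is an illustrative version on nested lists, not the authors' implementation: the function name `space_to_depth`, the H x W x C layout, and the order in which the sub-pixels are concatenated are all assumptions made for the example.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange an H x W x C nested list into (H/block) x (W/block) x (C*block*block)."""
    h, w = len(feature_map), len(feature_map[0])
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for i in range(0, h, block):
        row = []
        for j in range(0, w, block):
            cell = []
            # Stack the channels of every pixel in the block x block window.
            for di in range(block):
                for dj in range(block):
                    cell.extend(feature_map[i + di][j + dj])
            row.append(cell)
        out.append(row)
    return out
```

Unlike strided convolution or pooling, this downsampling step discards no values: every input pixel survives as an extra channel, which is the usual argument for using it on very small objects.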
Affiliation(s)
- Suwan Wang
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
- Jianqing Zhao
  - College of Geography, Jiangsu Second Normal University, Nanjing, 211200, China
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Yucheng Cai
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Yan Li
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Xuerui Qi
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Xiaolei Qiu
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
  - Jiangsu Key Laboratory for Information Agriculture, Nanjing, 210095, China
- Xia Yao
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
  - Jiangsu Key Laboratory for Information Agriculture, Nanjing, 210095, China
- Yongchao Tian
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
  - Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing, 210095, China
- Yan Zhu
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Weixing Cao
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Xiaohu Zhang
  - National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China.
  - Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China.
  - Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing, 210095, China.
4
Gao R, Ma Y, Zhao Z, Li B, Zhang J. Real-Time Detection of an Undercarriage Based on Receptive Field Blocks and Coordinate Attention. SENSORS (BASEL, SWITZERLAND) 2023; 23:9861. [PMID: 38139707 PMCID: PMC10747497 DOI: 10.3390/s23249861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 11/24/2023] [Accepted: 12/11/2023] [Indexed: 12/24/2023]
Abstract
Currently, aeroplane images captured by camera sensors are characterized by their small size and intricate backgrounds, posing a challenge for existing deep learning algorithms in effectively detecting small targets. To address this issue, this paper incorporates the receptive field block of RFBNet, a coordinate attention mechanism, and the SIoU loss function into the YOLOv5 algorithm. The result is the YOLOv5-RSC model for aeroplane and undercarriage detection. The primary goal is to combine camera sensors with deep learning algorithms, improving image capture precision. YOLOv5-RSC makes three enhancements. First, it introduces the receptive field block into the backbone network, increasing the size of the receptive field of the feature map, strengthening the connection between shallow and deep feature maps, and further improving the model's utilization of feature information. Second, the coordinate attention mechanism is added to the feature fusion network to help the model locate targets of interest more accurately by considering attention in both the channel and spatial dimensions. This strengthens the model's attention to key information and improves detection precision. Finally, the SIoU bounding box loss function is adopted to address IoU's insensitivity to scale and to speed up bounding box convergence. A Basler camera experimental platform was then constructed for experimental verification. The results show that the AP values of the YOLOv5-RSC detection model for aeroplane and undercarriage are 92.4% and 80.5%, respectively, and the mAP is 86.4%; these values are 2.0%, 5.4%, and 3.7% higher than those of the original YOLOv5 algorithm, respectively. The detection speed reaches 89.2 FPS. These findings indicate that the model exhibits high detection precision and speed, providing a valuable reference for aeroplane undercarriage detection.
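For readers unfamiliar with the quantity that SIoU extends, the sketch below computes plain intersection over union (IoU) for two axis-aligned boxes. It is not the SIoU loss from the paper; the `(x1, y1, x2, y2)` corner format and the function name `iou` are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2) corners."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

Note that plain IoU is zero for every pair of disjoint boxes regardless of how far apart they are, so it gives no gradient signal in that case; IoU-based losses with extra penalty terms are generally motivated by limitations of this kind.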
Affiliation(s)
- Ruizhen Gao
  - School of Mechanical Engineering and Equipment, Hebei University of Engineering, Handan 056038, China
  - Key Laboratory of Intelligent Industrial Equipment Technology of Hebei Province, Hebei University of Engineering, Handan 056038, China
  - Collaborative Innovation Center for Modern Equipment Manufacturing of Jinan New Area (Hebei), Handan 056038, China
- Ya’nan Ma
  - School of Mechanical Engineering and Equipment, Hebei University of Engineering, Handan 056038, China
- Ziyue Zhao
  - School of Mechanical Engineering and Equipment, Hebei University of Engineering, Handan 056038, China
- Baihua Li
  - Department of Computer Science, Loughborough University, Loughborough LE11 3TU, UK
- Jingjun Zhang
  - School of Mechanical Engineering and Equipment, Hebei University of Engineering, Handan 056038, China
  - Key Laboratory of Intelligent Industrial Equipment Technology of Hebei Province, Hebei University of Engineering, Handan 056038, China
  - Collaborative Innovation Center for Modern Equipment Manufacturing of Jinan New Area (Hebei), Handan 056038, China
5
Nadler EO, Darragh-Ford E, Desikan BS, Conaway C, Chu M, Hull T, Guilbeault D. Divergences in color perception between deep neural networks and humans. Cognition 2023; 241:105621. [PMID: 37716312 DOI: 10.1016/j.cognition.2023.105621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Revised: 06/23/2023] [Accepted: 09/09/2023] [Indexed: 09/18/2023]
Abstract
Deep neural networks (DNNs) are increasingly proposed as models of human vision, bolstered by their impressive performance on image classification and object recognition tasks. Yet, the extent to which DNNs capture fundamental aspects of human vision such as color perception remains unclear. Here, we develop novel experiments for evaluating the perceptual coherence of color embeddings in DNNs, and we assess how well these algorithms predict human color similarity judgments collected via an online survey. We find that state-of-the-art DNN architectures - including convolutional neural networks and vision transformers - provide color similarity judgments that strikingly diverge from human color judgments of (i) images with controlled color properties, (ii) images generated from online searches, and (iii) real-world images from the canonical CIFAR-10 dataset. We compare DNN performance against an interpretable and cognitively plausible model of color perception based on wavelet decomposition, inspired by foundational theories in computational neuroscience. While one deep learning model - a convolutional DNN trained on a style transfer task - captures some aspects of human color perception, our wavelet algorithm provides more coherent color embeddings that better predict human color judgments compared to all DNNs we examine. These results hold when altering the high-level visual task used to train similar DNN architectures (e.g., image classification versus image segmentation), as well as when examining the color embeddings of different layers in a given DNN architecture. These findings break new ground in the effort to analyze the perceptual representations of machine learning algorithms and to improve their ability to serve as cognitively plausible models of human vision. Implications for machine learning, human perception, and embodied cognition are discussed.
Affiliation(s)
- Ethan O Nadler
  - Carnegie Observatories, USA; Department of Physics, University of Southern California, USA.
- Elise Darragh-Ford
  - Kavli Institute for Particle Astrophysics and Cosmology and Department of Physics, Stanford University, USA
- Bhargav Srinivasa Desikan
  - School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Switzerland; Knowledge Lab, University of Chicago, USA
- Mark Chu
  - School of the Arts, Columbia University, USA
6
Kim H, Lee W, Lee S, Lee J. Bridged adversarial training. Neural Netw 2023; 167:266-282. [PMID: 37666185 DOI: 10.1016/j.neunet.2023.08.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2021] [Revised: 06/07/2023] [Accepted: 08/13/2023] [Indexed: 09/06/2023]
Abstract
Adversarial robustness is considered a required property of deep neural networks. In this study, we discover that adversarially trained models might have significantly different characteristics in terms of margin and smoothness, even though they show similar robustness. Inspired by the observation, we investigate the effect of different regularizers and discover the negative effect of the smoothness regularizer on maximizing the margin. Based on the analyses, we propose a new method called bridged adversarial training that mitigates the negative effect by bridging the gap between clean and adversarial examples. We provide theoretical and empirical evidence that the proposed method provides stable and better robustness, especially for large perturbations.
Affiliation(s)
- Hoki Kim
  - Institute of Engineering Research, Seoul National University, Gwanak-gu 08826, Republic of Korea
- Woojin Lee
  - School of AI Convergence, Dongguk University-Seoul, Jung-gu 04620, Republic of Korea
- Sungyoon Lee
  - Department of Computer Science, Hanyang University, Seongdong-gu 04763, Republic of Korea
- Jaewook Lee
  - Department of Industrial Engineering, Seoul National University, Gwanak-gu 08826, Republic of Korea.
7
Sun C, Chen J, Li Y, Wang W, Ma T. Random pruning: channel sparsity by expectation scaling factor. PeerJ Comput Sci 2023; 9:e1564. [PMID: 37705629 PMCID: PMC10495938 DOI: 10.7717/peerj-cs.1564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 08/13/2023] [Indexed: 09/15/2023]
Abstract
Pruning is an efficient method for deep neural network model compression and acceleration. However, existing pruning strategies, both at the filter level and at the channel level, often introduce a large amount of computation and adopt complex methods for finding sub-networks. It is found that there is a linear relationship between the sum of matrix elements of the channels in convolutional neural networks (CNNs) and the expectation scaling ratio of the image pixel distribution, which reflects the expectation change of the pixel distribution between the feature mapping and the input data. This implies that channels with similar expectation scaling factors (δE) cause similar expectation changes to the input data, thus producing redundant feature mappings. Thus, this article proposes a new structured pruning method called EXP. In the proposed method, the channels with similar δE are randomly removed in each convolutional layer, and thus the whole network achieves random sparsity to obtain non-redundant and non-unique sub-networks. Experiments on pruning various networks show that EXP can achieve a significant reduction of FLOPs. For example, on the CIFAR-10 dataset, EXP reduces the FLOPs of the ResNet-56 model by 71.9% with a 0.23% loss in Top-1 accuracy. On ILSVRC-2012, it reduces the FLOPs of the ResNet-50 model by 60.0% with a 1.13% loss of Top-1 accuracy. Our code is available at: https://github.com/EXP-Pruning/EXP_Pruning and DOI: 10.5281/zenodo.8141065.
Affiliation(s)
- Chuanmeng Sun
  - North University of China, State Key Laboratory of Dynamic Measurement Technology, Taiyuan, Shanxi, China
  - North University of China, School of Electrical and Control Engineering, Taiyuan, Shanxi, China
- Jiaxin Chen
  - North University of China, State Key Laboratory of Dynamic Measurement Technology, Taiyuan, Shanxi, China
  - North University of China, School of Electrical and Control Engineering, Taiyuan, Shanxi, China
- Yong Li
  - Chongqing University, State Key Laboratory of Coal Mine Disaster Dynamics and Control, Chongqing, China
- Wenbo Wang
  - North University of China, State Key Laboratory of Dynamic Measurement Technology, Taiyuan, Shanxi, China
  - North University of China, School of Electrical and Control Engineering, Taiyuan, Shanxi, China
- Tiehua Ma
  - North University of China, State Key Laboratory of Dynamic Measurement Technology, Taiyuan, Shanxi, China
  - North University of China, School of Electrical and Control Engineering, Taiyuan, Shanxi, China
8
Wei M, Zhou Y, Li Z, Xu X. Class-imbalanced complementary-label learning via weighted loss. Neural Netw 2023; 166:555-565. [PMID: 37586256 DOI: 10.1016/j.neunet.2023.07.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 06/17/2023] [Accepted: 07/23/2023] [Indexed: 08/18/2023]
Abstract
Complementary-label learning (CLL) is widely used in weakly supervised classification, but it faces a significant challenge in real-world datasets when confronted with class-imbalanced training samples. In such scenarios, the number of samples in one class is considerably lower than in other classes, which consequently leads to a decline in the accuracy of predictions. Unfortunately, existing CLL approaches have not investigated this problem. To alleviate this challenge, we propose a novel problem setting that enables learning from class-imbalanced complementary labels for multi-class classification. To tackle this problem, we propose a novel CLL approach called Weighted Complementary-Label Learning (WCLL). The proposed method models a weighted empirical risk minimization loss by utilizing the class-imbalanced complementary labels, which is also applicable to multi-class imbalanced training samples. Furthermore, we derive an estimation error bound to provide theoretical assurance. To evaluate our approach, we conduct extensive experiments on several widely used benchmark datasets and a real-world dataset, and compare our method with existing state-of-the-art methods. The proposed approach shows significant improvement on these datasets, even in the case of multiple class-imbalanced scenarios. Notably, the proposed method not only utilizes complementary labels to train a classifier but also solves the problem of class imbalance.
Affiliation(s)
- Meng Wei
  - School of Computer Science & Technology, China University of Mining and Technology, Xuzhou, China
- Yong Zhou
  - School of Computer Science & Technology, China University of Mining and Technology, Xuzhou, China
- Zhongnian Li
  - School of Computer Science & Technology, China University of Mining and Technology, Xuzhou, China
- Xinzheng Xu
  - School of Computer Science & Technology, China University of Mining and Technology, Xuzhou, China.
9
Xia K, Lv Z, Liu K, Lu Z, Zhou C, Zhu H, Chen X. Global contextual attention augmented YOLO with ConvMixer prediction heads for PCB surface defect detection. Sci Rep 2023; 13:9805. [PMID: 37328545 DOI: 10.1038/s41598-023-36854-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Accepted: 06/11/2023] [Indexed: 06/18/2023] Open
Abstract
To solve the problem of missed and false detections caused by the large number of tiny targets and complex background textures in a printed circuit board (PCB), we propose a global contextual attention augmented YOLO model with ConvMixer prediction heads (GCC-YOLO). In this study, we apply a high-resolution feature layer (P2) to gain more detail and positional information about small targets. Moreover, to suppress noisy background information and further enhance the feature extraction capability, a global contextual attention module (GC) is introduced in the backbone network and combined with a C3 module. Furthermore, to reduce the loss of shallow feature information due to the deepening of network layers, a bi-directional weighted feature pyramid (BiFPN) feature fusion structure is introduced. Finally, a ConvMixer module is introduced and combined with the C3 module to create a new prediction head, which improves the small target detection capability of the model while reducing the parameters. Test results on the PCB dataset show that GCC-YOLO improved the Precision, Recall, mAP@0.5, and mAP@0.5:0.95 by 0.2%, 1.8%, 0.5%, and 8.3%, respectively, compared to YOLOv5s; moreover, it has a smaller model size and faster inference speed compared to other algorithms.
Affiliation(s)
- Kewen Xia
  - School of Mechanical and Power Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China
- Zhongliang Lv
  - School of Mechanical and Power Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China.
- Kang Liu
  - School of Mechanical and Power Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China
- Zhenyu Lu
  - School of Mechanical and Power Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China
- Chuande Zhou
  - School of Mechanical and Power Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China.
- Hong Zhu
  - School of Mechanical and Power Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China
- Xuanlin Chen
  - School of Mechanical and Power Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China
10
Xue P, Lu Y, Chang J, Wei X, Wei Z. IR²Net: information restriction and information recovery for accurate binary neural networks. Neural Comput Appl 2023. [DOI: 10.1007/s00521-023-08495-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
11
Fan J, Zeng Y. Challenging deep learning models with image distortion based on the abutting grating illusion. PATTERNS (NEW YORK, N.Y.) 2023; 4:100695. [PMID: 36960449 PMCID: PMC10028432 DOI: 10.1016/j.patter.2023.100695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 11/07/2022] [Accepted: 02/01/2023] [Indexed: 03/06/2023]
Abstract
Even state-of-the-art deep learning models lack fundamental abilities compared with humans. While many image distortions have been proposed to compare deep learning with humans, they depend on mathematical transformations instead of human cognitive functions. Here, we propose an image distortion based on the abutting grating illusion, which is a phenomenon discovered in humans and animals. The distortion generates illusory contour perception using line gratings abutting each other. We applied the method to MNIST, high-resolution MNIST, and "16-class-ImageNet" silhouettes. Many models, including models trained from scratch and 109 models pretrained with ImageNet or various data augmentation techniques, were tested. Our results show that abutting grating distortion is challenging even for state-of-the-art deep learning models. We discovered that DeepAugment models outperformed other pretrained models. Visualization of early layers indicates that better-performing models exhibit the endstopping property, which is consistent with neuroscience discoveries. Twenty-four human subjects classified distorted samples to validate the distortion.
Affiliation(s)
- Jinyu Fan
  - Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Yi Zeng
  - Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - School of Future Technology, University of Chinese Academy of Sciences, Beijing 100049, China
  - School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  - Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China
  - Corresponding author
12
Picot M, Messina F, Boudiaf M, Labeau F, Ayed IB, Piantanida P. Adversarial Robustness Via Fisher-Rao Regularization. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:2698-2710. [PMID: 35552150 DOI: 10.1109/tpami.2022.3174724] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
Adversarial robustness has become a topic of growing interest in machine learning since it was observed that neural networks tend to be brittle. We propose an information-geometric formulation of adversarial defense and introduce Fire, a new Fisher-Rao regularization for the categorical cross-entropy loss, which is based on the geodesic distance between the softmax outputs corresponding to natural and perturbed input features. Based on the information-geometric properties of the class of softmax distributions, we derive an explicit characterization of the Fisher-Rao Distance (FRD) for the binary and multiclass cases, and draw some interesting properties as well as connections with standard regularization metrics. Furthermore, we verify on a simple linear and Gaussian model that all Pareto-optimal points in the accuracy-robustness region can be reached by Fire while other state-of-the-art methods fail. Empirically, we evaluate the performance of various classifiers trained with the proposed loss on standard datasets, showing a simultaneous improvement of up to 1% in clean and robust performance while reducing the training time by 20% relative to the best-performing methods.
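The geodesic distance mentioned in the abstract can be illustrated with the standard closed form of the Fisher-Rao distance between two categorical distributions, 2·arccos(Σ √(p_i q_i)). The sketch below assumes this textbook form and is not taken from the paper's code; the function names and the clamping step are illustrative choices.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fisher_rao_distance(p, q):
    """Fisher-Rao geodesic distance between two categorical distributions."""
    # Bhattacharyya coefficient; clamp to [0, 1] to guard against rounding error.
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    bc = min(1.0, max(0.0, bc))
    return 2.0 * math.acos(bc)
```

Under this form, a regularizer in the spirit of the abstract would evaluate `fisher_rao_distance(softmax(clean_logits), softmax(perturbed_logits))`, where both logit vectors are hypothetical model outputs for a natural input and its perturbed version. The distance is bounded by π, which is reached when the two distributions have disjoint support.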
13
Image synthesis: a review of methods, datasets, evaluation metrics, and future outlook. Artif Intell Rev 2023. [DOI: 10.1007/s10462-023-10434-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
14
Chen X, Li Y, Chen C. An Online Hashing Algorithm for Image Retrieval Based on Optical-Sensor Network. SENSORS (BASEL, SWITZERLAND) 2023; 23:2576. [PMID: 36904780 PMCID: PMC10007520 DOI: 10.3390/s23052576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 02/20/2023] [Accepted: 02/24/2023] [Indexed: 06/18/2023]
Abstract
Online hashing is an effective storage and online retrieval scheme that meets both the rapid growth of data in optical-sensor networks and users' real-time processing needs in the era of big data. Existing online-hashing algorithms rely excessively on data tags to construct the hash function and ignore the structural features of the data itself, resulting in a serious loss of image-streaming features and reduced retrieval accuracy. In this paper, an online hashing model that fuses global and local dual semantics is proposed. First, to preserve the local features of the streaming data, an anchor hash model based on the idea of manifold learning is constructed. Second, a global similarity matrix, which is used to constrain the hash codes, is built from the balanced similarity between the newly arrived data and previous data, so that the hash codes retain global data features as much as possible. Then, under a unified framework, an online hash model that integrates global and local dual semantics is learned, and an effective discrete binary-optimization solution is proposed. Extensive experiments on three datasets, CIFAR10, MNIST, and Places205, show that the proposed algorithm effectively improves the efficiency of image retrieval compared with several existing advanced online-hashing algorithms.
Affiliation(s)
- Xiao Chen: Department of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
- Yanlong Li: Department of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China; Ministry of Education Key Laboratory of Cognitive Radio and Information Processing, Guilin University of Electronic Technology, Guilin 541004, China
- Chen Chen: Department of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
|
15
|
Regularization-based pruning of irrelevant weights in deep neural architectures. APPL INTELL 2023. [DOI: 10.1007/s10489-022-04353-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
|
16
|
Texture and material classification with multi-scale ternary and septenary patterns. JOURNAL OF KING SAUD UNIVERSITY - COMPUTER AND INFORMATION SCIENCES 2022. [DOI: 10.1016/j.jksuci.2022.12.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
|
17
|
Tuggener L, Schmidhuber J, Stadelmann T. Is it enough to optimize CNN architectures on ImageNet? FRONTIERS IN COMPUTER SCIENCE 2022. [DOI: 10.3389/fcomp.2022.1041703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Classification performance based on ImageNet is the de-facto standard metric for CNN development. In this work we challenge the notion that CNN architecture design solely based on ImageNet leads to generally effective convolutional neural network (CNN) architectures that perform well on a diverse set of datasets and application domains. To this end, we investigate and ultimately improve ImageNet as a basis for deriving such architectures. We conduct an extensive empirical study for which we train 500 CNN architectures, sampled from the broad AnyNetX design space, on ImageNet as well as 8 additional well-known image classification benchmark datasets from a diverse array of application domains. We observe that the performances of the architectures are highly dataset dependent. Some datasets even exhibit a negative error correlation with ImageNet across all architectures. We show how to significantly increase these correlations by utilizing ImageNet subsets restricted to fewer classes. These contributions can have a profound impact on the way we design future CNN architectures and help alleviate the tilt we see currently in our community with respect to over-reliance on one dataset.
|
18
|
Mi JX, Wang XD, Zhou LF, Cheng K. Adversarial Examples based on Object Detection tasks: A Survey. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.10.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
19
|
WSAGrad: a novel adaptive gradient based method. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04205-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
20
|
Naeem A, Anees T, Ahmed KT, Naqvi RA, Ahmad S, Whangbo T. Deep learned vectors’ formation using auto-correlation, scaling, and derivations with CNN for complex and huge image retrieval. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-022-00866-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2022]
Abstract
Deep learning for image retrieval has been used in this era, but image retrieval with the highest accuracy is the biggest challenge, which still lacks auto-correlation for feature extraction and description. In this paper, a novel deep learning technique for achieving highly accurate results for image retrieval is proposed, which implements a convolutional neural network with auto-correlation, gradient computation, scaling, filter, and localization coupled with state-of-the-art content-based image retrieval methods. For this purpose, novel image features are fused with signatures produced by the VGG-16. In the initial step, images from rectangular neighboring key points are auto-correlated. The image smoothing is achieved by computing intensities according to the local gradient. The result of Gaussian approximation with the lowest scale and suppression is adjusted by the by-box filter with the standard deviation adjusted to the lowest scale. The parameterized images are smoothed at different scales at various levels to achieve high accuracy. The principal component analysis has been used to reduce feature vectors and combine them with the VGG features. These features are integrated with the spatial color coordinates to represent color channels. This experimentation has been performed on Cifar-100, Cifar-10, Tropical fruits, 17 Flowers, Oxford, and Corel-1000 datasets. This study has achieved an extraordinary result for the Cifar-10 and Cifar-100 datasets. Similarly, the results of the study have shown efficient results for texture datasets of 17 Flowers and Tropical fruits. Moreover, when compared to state-of-the-art approaches, this research produced outstanding results for the Corel-1000 dataset.
|
21
|
Lu Y, Zhang Z, Lu G, Zhou Y, Li J, Zhang D. Addi-Reg: A Better Generalization-Optimization Tradeoff Regularization Method for Convolutional Neural Networks. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:10827-10842. [PMID: 33750731 DOI: 10.1109/tcyb.2021.3062881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
In convolutional neural networks (CNNs), generating noise for the intermediate feature is a hot research topic in improving generalization. The existing methods usually regularize the CNNs by producing multiplicative noise (regularization weights), called multiplicative regularization (Multi-Reg). However, Multi-Reg methods usually focus on improving generalization but fail to jointly consider optimization, leading to unstable learning with slow convergence. Moreover, Multi-Reg methods are not flexible enough since the regularization weights are generated from a definite manual-design distribution. Besides, most popular methods are not universal enough, because these methods are only designed for the residual networks. In this article, we, for the first time, experimentally and theoretically explore the nature of generating noise in the intermediate features for popular CNNs. We demonstrate that injecting noise in the feature space can be transformed to generating noise in the input space, and these methods regularize the networks in a Mini-batch in Mini-batch (MiM) sampling manner. Based on these observations, this article further discovers that generating multiplicative noise can easily degenerate the optimization due to its high dependence on the intermediate feature. Based on these studies, we propose a novel additional regularization (Addi-Reg) method, which can adaptively produce additional noise with low dependence on intermediate feature in CNNs by employing a series of mechanisms. Particularly, these well-designed mechanisms can stabilize the learning process in training, and our Addi-Reg method can pertinently learn the noise distributions for every layer in CNNs. Extensive experiments demonstrate that the proposed Addi-Reg method is more flexible and universal, and meanwhile achieves better generalization performance with faster convergence against the state-of-the-art Multi-Reg methods.
|
22
|
Schwarz Schuler JP, Also SR, Puig D, Rashwan H, Abdel-Nasser M. An Enhanced Scheme for Reducing the Complexity of Pointwise Convolutions in CNNs for Image Classification Based on Interleaved Grouped Filters without Divisibility Constraints. ENTROPY (BASEL, SWITZERLAND) 2022; 24:1264. [PMID: 36141151 PMCID: PMC9497893 DOI: 10.3390/e24091264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2022] [Revised: 09/01/2022] [Accepted: 09/05/2022] [Indexed: 06/16/2023]
Abstract
In image classification with Deep Convolutional Neural Networks (DCNNs), the number of parameters in pointwise convolutions rapidly grows due to the multiplication of the number of filters by the number of input channels that come from the previous layer. Existing studies demonstrated that a subnetwork can replace pointwise convolutional layers with significantly fewer parameters and fewer floating-point computations, while maintaining the learning capacity. In this paper, we propose an improved scheme for reducing the complexity of pointwise convolutions in DCNNs for image classification based on interleaved grouped filters without divisibility constraints. The proposed scheme utilizes grouped pointwise convolutions, in which each group processes a fraction of the input channels. It requires a number of channels per group as a hyperparameter Ch. The subnetwork of the proposed scheme contains two consecutive convolutional layers K and L, connected by an interleaving layer in the middle, and summed at the end. The number of groups of filters and filters per group for layers K and L is determined by exact divisions of the original number of input channels and filters by Ch. If the divisions were not exact, the original layer could not be substituted. In this paper, we refine the previous algorithm so that input channels are replicated and groups can have different numbers of filters to cope with non exact divisibility situations. Thus, the proposed scheme further reduces the number of floating-point computations (11%) and trainable parameters (10%) achieved by the previous method. We tested our optimization on an EfficientNet-B0 as a baseline architecture and made classification tests on the CIFAR-10, Colorectal Cancer Histology, and Malaria datasets. For each dataset, our optimization achieves a saving of 76%, 89%, and 91% of the number of trainable parameters of EfficientNet-B0, while keeping its test classification accuracy.
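As a rough illustration of why grouping shrinks pointwise layers, the sketch below counts weights for a standard 1x1 convolution and for a single grouped 1x1 convolution in the exactly divisible case. The function names and the 256/512/16 sizes are hypothetical, biases are ignored, and the two-layer K/L subnetwork with interleaving described in the abstract is not modeled.

```python
def pointwise_params(in_channels, filters):
    """Weights in a standard 1x1 convolution (bias ignored)."""
    return in_channels * filters


def grouped_pointwise_params(in_channels, filters, ch_per_group):
    """Weights in one grouped 1x1 convolution with ch_per_group input channels per group.

    Only the exactly divisible case is handled; the non-divisible case is
    what the cited paper addresses and is not reproduced here.
    """
    if in_channels % ch_per_group:
        raise ValueError("in_channels must be divisible by ch_per_group")
    groups = in_channels // ch_per_group
    if filters % groups:
        raise ValueError("filters must be divisible by the number of groups")
    # Each filter only sees the ch_per_group channels of its own group.
    return ch_per_group * filters


# Hypothetical layer sizes, chosen only for illustration.
dense = pointwise_params(256, 512)
grouped = grouped_pointwise_params(256, 512, ch_per_group=16)
saving = 1 - grouped / dense
print(dense, grouped, saving)
```

With these illustrative sizes the grouped layer keeps 1/16 of the weights, because each filter connects to 16 input channels instead of 256.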
Affiliation(s)
- Joao Paulo Schwarz Schuler: Departament d’Enginyeria Informatica i Matemátiques, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Santiago Romani Also: Departament d’Enginyeria Informatica i Matemátiques, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Domenec Puig: Departament d’Enginyeria Informatica i Matemátiques, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Hatem Rashwan: Departament d’Enginyeria Informatica i Matemátiques, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Mohamed Abdel-Nasser: Departament d’Enginyeria Informatica i Matemátiques, Universitat Rovira i Virgili, 43007 Tarragona, Spain; Electronics and Communication Engineering Section, Electrical Engineering Department, Aswan University, Aswan 81528, Egypt
|
23
|
Auditory Speech Based Alerting System for Detecting Dummy Number Plate via Video Processing Data sets. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4423744. [PMID: 36093477 PMCID: PMC9462979 DOI: 10.1155/2022/4423744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 06/29/2022] [Accepted: 07/26/2022] [Indexed: 11/17/2022]
Abstract
A spectrum of applications in computer vision use object detection algorithms driven by the power of AI and ML algorithms. State-of-the-art detection models such as Faster Region-based Convolutional Neural Network (Faster R-CNN), Single Shot Multibox Detector (SSD), and You Only Look Once (YOLO) have demonstrated good performance for object detection, but many fail to detect small objects. In view of this, an improved network structure of YOLOv4 is proposed in this paper. This work presents an algorithm for small object detection trained using real-time high-resolution data for porting to embedded platforms. The license plate, which is a small object in a car image, is considered for detection, and an auditory speech signal is generated when fake license plates are detected. The proposed network is improved in two aspects: training the classifier using a positive data set formed from the core patterns of an image, and training YOLOv4 on the features obtained by decomposing the image into low-frequency and high-frequency components. The resultant values are processed and presented via speech alerting signals and messages. This contributes to reducing the computation load and increasing the accuracy. The algorithm was tested on eight real-time video data sets. The results show that the proposed method greatly reduces computing effort while maintaining comparable accuracy. It runs at 45 fps when the input size is 1280 × 960, which maintains real-time speed. The proposed algorithm works well for tilted, blurred, and occluded license plates. Also, an auditory traffic monitoring system can reduce criminal attacks by detecting suspicious license plates. The proposed algorithm is highly applicable to autonomous driving applications.
|
24
|
Elephant motorbikes and too many neckties: epistemic spatialization as a framework for investigating patterns of bias in convolutional neural networks. AI & SOCIETY 2022. [DOI: 10.1007/s00146-022-01542-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
|
25
|
Mohamed E, Sirlantzis K, Howells G, Hoque S. Optimisation of Deep Learning Small-Object Detectors with Novel Explainable Verification. SENSORS 2022; 22:s22155596. [PMID: 35898097 PMCID: PMC9330345 DOI: 10.3390/s22155596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Revised: 07/13/2022] [Accepted: 07/18/2022] [Indexed: 11/16/2022]
Abstract
In this paper, we present a novel methodology based on machine learning for identifying the most appropriate from a set of available state-of-the-art object detectors for a given application. Our particular interest is to develop a road map for identifying verifiably optimal selections, especially for challenging applications such as detecting small objects in a mixed-size object dataset. State-of-the-art object detection systems often find the localisation of small-size objects challenging since most are usually trained on large-size objects. These contain abundant information as they occupy a large number of pixels relative to the total image size. This fact is normally exploited by the model during training and inference processes. To dissect and understand this process, our approach systematically examines detectors’ performances using two very distinct deep convolutional networks. The first is the single-stage YOLO V3 and the second is the double-stage Faster R-CNN. Specifically, our proposed method explores and visually illustrates the impact of feature extraction layers, number of anchor boxes, data augmentation, etc., utilising ideas from the field of explainable Artificial Intelligence (XAI). Our results, for example, show that multi-head YOLO V3 detectors trained using augmented data produce better performance even with fewer anchor boxes. Moreover, robustness regarding the detector’s ability to explain how a specific decision was reached is investigated using different explanation techniques. Finally, two new visualisation techniques are proposed, WS-Grad and Concat-Grad, for identifying explanation cues of different detectors. These are applied to specific object detection tasks to illustrate their reliability and transparency with respect to the decision process. It is shown that the proposed techniques can result in high resolution and comprehensive heatmaps of the image areas, significantly affecting detector decisions as compared to the state-of-the-art techniques tested.
Affiliation(s)
- Elhassan Mohamed: School of Engineering, University of Kent, Canterbury CT2 7NT, UK
- Konstantinos Sirlantzis: School of Engineering, University of Kent, Canterbury CT2 7NT, UK
- Gareth Howells: School of Computing, University of Kent, Canterbury CT2 7NZ, UK
- Sanaul Hoque: School of Engineering, University of Kent, Canterbury CT2 7NT, UK
- Correspondence: E.M.; K.S.
|
26
|
Approximate Nearest Neighbor Search Using Enhanced Accumulative Quantization. ELECTRONICS 2022. [DOI: 10.3390/electronics11142236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Approximate nearest neighbor (ANN) search is fundamental for fast content-based image retrieval. While vector quantization is one key to performing an effective ANN search, in order to further improve ANN search accuracy, we propose an enhanced accumulative quantization (E-AQ). Based on our former work, we introduced the idea of the quarter point into accumulative quantization (AQ). Instead of finding the nearest centroid, the quarter vector was used to quantize the vector and was computed for each vector according to its nearest centroid and second nearest centroid. Then, the error produced through codebook training and vector quantization was reduced without increasing the number of centroids in each codebook. To evaluate the accuracy with which vectors were approximated by their quantization outputs, we realized an E-AQ-based exhaustive method for ANN search. Experimental results show that our approach achieved a Recall@100 of up to 0.996 and 0.776 with eight codebooks of size 256 on the SIFT and GIST datasets, respectively, which is at least 1.6% and 4.9% higher than six other state-of-the-art methods. Moreover, based on the experimental results, E-AQ needs fewer codebooks while still providing the same ANN search accuracy.
|
27
|
Tripathy PK, Shrivastava A, Agarwal V, Shah DU, L. CSR, Akilandeeswari S. Federated learning algorithm based on matrix mapping for data privacy over edge computing. INTERNATIONAL JOURNAL OF PERVASIVE COMPUTING AND COMMUNICATIONS 2022. [DOI: 10.1108/ijpcc-03-2022-0113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Purpose
This paper aims to provide security and privacy for Byzantine clients against different types of attacks.
Design/methodology/approach
In this paper, the authors use a federated learning algorithm based on matrix mapping for data privacy over edge computing.
Findings
By using the Softmax layer probability distribution for the model, Byzantine tolerance can be increased from 40% to 45% in the blocking-convergence attack, and the edge backdoor attack can be stopped.
Originality/value
According to the test results, by using the Softmax layer probability distribution for the model, the aggregation method can protect at least 30% of Byzantine clients.
|
28
|
SFCC: Data Augmentation with Stratified Fourier Coefficients Combination for Time Series Classification. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10965-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
29
|
Hofmann M, Mader P. Synaptic Scaling-An Artificial Neural Network Regularization Inspired by Nature. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:3094-3108. [PMID: 33502984 DOI: 10.1109/tnnls.2021.3050422] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Nature has always inspired the human spirit, and scientists have frequently developed new methods based on observations from nature. Recent advances in imaging and sensing technology allow fascinating insights into biological neural processes. With the objective of finding new strategies to enhance the learning capabilities of neural networks, we focus on a phenomenon that is closely related to learning tasks and neural stability in biological neural networks, called homeostatic plasticity. Among the theories that have been developed to describe homeostatic plasticity, synaptic scaling has been found to be the most mature and applicable. We systematically discuss previous studies on the synaptic scaling theory and how they could be applied to artificial neural networks. To this end, we utilize information theory to analytically evaluate how mutual information is affected by synaptic scaling. Based on these analytic findings, we propose two flavors in which synaptic scaling can be applied in the training process of simple and complex, feedforward, and recurrent neural networks. We compare our approach with state-of-the-art regularization techniques on standard benchmarks. We found that the proposed method yields the lowest error in both regression and classification tasks compared to previous regularization approaches in our experiments across a wide range of feedforward and recurrent network topologies and data sets.
|
30
|
Zhang L, Su G, Yin J, Li Y, Lin Q, Zhang X, Shao L. Bioinspired Scene Classification by Deep Active Learning With Remote Sensing Applications. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5682-5694. [PMID: 33635802 DOI: 10.1109/tcyb.2020.2981480] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Accurately classifying sceneries with different spatial configurations is an indispensable technique in computer vision and intelligent systems, for example, scene parsing, robot motion planning, and autonomous driving. Remarkable performance has been achieved by the deep recognition models in the past decade. As far as we know, however, these deep architectures are incapable of explicitly encoding the human visual perception, that is, the sequence of gaze movements and the subsequent cognitive processes. In this article, a biologically inspired deep model is proposed for scene classification, where the human gaze behaviors are robustly discovered and represented by a unified deep active learning (UDAL) framework. More specifically, to characterize objects' components with varied sizes, an objectness measure is employed to decompose each scenery into a set of semantically aware object patches. To represent each region at a low level, a local-global feature fusion scheme is developed which optimally integrates multimodal features by automatically calculating each feature's weight. To mimic the human visual perception of various sceneries, we develop the UDAL that hierarchically represents the human gaze behavior by recognizing semantically important regions within the scenery. Importantly, UDAL combines the semantically salient region detection and the deep gaze shifting path (GSP) representation learning into a principled framework, where only the partial semantic tags are required. Meanwhile, by incorporating the sparsity penalty, the contaminated/redundant low-level regional features can be intelligently avoided. Finally, the learned deep GSP features from the entire scene images are integrated to form an image kernel machine, which is subsequently fed into a kernel SVM to classify different sceneries. Experimental evaluations on six well-known scenery sets (including remote sensing images) have shown the competitiveness of our approach.
|
31
|
Mo Y, Wu Y, Yang X, Liu F, Liao Y. Review the state-of-the-art technologies of semantic segmentation based on deep learning. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.01.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 11/29/2022]
|
32
|
Algan G, Ulusoy I. MetaLabelNet: Learning to Generate Soft-Labels From Noisy-Labels. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:4352-4362. [PMID: 35731778 DOI: 10.1109/tip.2022.3183841] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Real-world datasets commonly have noisy labels, which negatively affect the performance of deep neural networks (DNNs). To address this problem, we propose a label-noise-robust learning algorithm in which the base classifier is trained on soft-labels that are produced according to a meta-objective. In each iteration, before conventional training, the meta-training loop updates the soft-labels so that the resulting gradient updates on the base classifier would yield minimum loss on meta-data. Soft-labels are generated from extracted features of data instances, and the mapping function is learned by a single layer perceptron (SLP) network, which is called MetaLabelNet. The base classifier is then trained using these generated soft-labels. These iterations are repeated for each batch of training data. Our algorithm uses a small amount of clean data as meta-data, which can be obtained effortlessly in many cases. We perform extensive experiments on benchmark datasets with both synthetic and real-world noise. Results show that our approach outperforms existing baselines. The source code of the proposed model is available at https://github.com/gorkemalgan/MetaLabelNet.
|
33
|
Xu M, Zhang T, Li Z, Zhang D. InfoAT: Improving Adversarial Training Using the Information Bottleneck Principle. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:1255-1264. [PMID: 35731762 DOI: 10.1109/tnnls.2022.3183095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Adversarial training (AT) has shown excellent performance in defending against adversarial examples. Recent studies demonstrate that examples are not equally important to the final robustness of models during AT, that is, the so-called hard examples that can be attacked easily exhibit more influence than robust examples on the final robustness. Therefore, guaranteeing the robustness of hard examples is crucial for improving the final robustness of the model. However, defining effective heuristics to search for hard examples is still difficult. In this article, inspired by the information bottleneck (IB) principle, we uncover that an example with high mutual information between the input and its associated latent representation is more likely to be attacked. Based on this observation, we propose a novel and effective adversarial training method (InfoAT). InfoAT is encouraged to find examples with high mutual information and exploit them efficiently to improve the final robustness of models. Experimental results show that InfoAT achieves the best robustness among different datasets and models in comparison with several state-of-the-art methods.
|
34
|
Lu Y, Zhang L, Yang X, Zhou Y. Efficient Harmonic Neural Networks With Compound Discrete Cosine Transform Filters and Shared Reconstruction Filters. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:693-707. [PMID: 35622805 DOI: 10.1109/tnnls.2022.3176611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The harmonic neural network (HNN) learns a combination of discrete cosine transform (DCT) filters to obtain an integrated feature from all spectra in the frequency domain. HNN, however, faces two challenges in learning and inference processes. First, the spectrum feature learned by HNN is insufficient and limited because the number of DCT filters is much smaller than that of feature maps. In addition, the number of parameters and the computation costs of HNN are significantly high because the intermediate spectrum layers are expanded multiple times. These two challenges will severely harm the performance and efficiency of HNN. To solve these problems, we first propose the compound DCT (C-DCT) filters integrating the nearest DCT filters to retrieve rich spectrum features to improve the performance. To significantly reduce the model size and computation complexity for improving the efficiency, the shared reconstruction filter is then proposed to share and dynamically drop the meta-filters in every frequency branch. Integrating the C-DCT filters with the shared reconstruction filters, the efficient harmonic network (EH-Net) is introduced. Extensive experiments on different datasets demonstrate that the proposed EH-Nets can effectively reduce the model size and computation complexity while maintaining the model performance. The code has been released at https://github.com/zhangle408/EH-Nets.
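For readers unfamiliar with DCT filters, the following sketch builds a small bank of orthonormal 2D DCT-II basis filters in plain Python. It only illustrates the standard DCT basis; the 3x3 size and the helper names are assumptions, and the compound (C-DCT) and shared reconstruction filters from the abstract are not reproduced.

```python
import math


def dct_filter(u, v, size=3):
    """Return the size x size orthonormal DCT-II basis filter for frequency (u, v)."""
    def alpha(k):
        # Normalisation that makes the basis orthonormal.
        return math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)

    return [
        [
            alpha(u) * alpha(v)
            * math.cos((2 * i + 1) * u * math.pi / (2 * size))
            * math.cos((2 * j + 1) * v * math.pi / (2 * size))
            for j in range(size)
        ]
        for i in range(size)
    ]


def inner(a, b):
    """Elementwise inner product of two equally sized 2D filters."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# One filter per frequency pair: 9 filters for a 3x3 window.
bank = [dct_filter(u, v) for u in range(3) for v in range(3)]
print(len(bank), round(inner(bank[0], bank[0]), 6), round(inner(bank[0], bank[4]), 6))
```

A k x k window yields only k*k such filters, which is consistent with the abstract's remark that the number of DCT filters is much smaller than the number of feature maps.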
|
35
|
A Novel Hierarchical Adaptive Feature Fusion Method for Meta-Learning. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12115458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
Meta-learning aims to teach the machine how to learn. Embedding model-based meta-learning performs well in solving the few-shot problem. The methods use an embedding model, usually a convolutional neural network, to extract features from samples and use a classifier to measure the features extracted from a particular stage of the embedding model. However, the feature of the embedding model at the low stage contains richer visual information, while the feature at the high stage contains richer semantic information. Existing methods fail to consider the impact of the information carried by the features at different stages on the performance of the classifier. Therefore, we propose a meta-learning method based on adaptive feature fusion and weight optimization. The main innovations of the method are as follows: firstly, a feature fusion strategy is used to fuse the feature of each stage of the embedding model based on certain weights, effectively utilizing the information carried by different stage features. Secondly, the particle swarm optimization algorithm was used to optimize the weight of feature fusion, and determine each stage feature’s weight in the process of feature fusion. Compared to current mainstream baseline methods on multiple few-shot image recognition benchmarks, the method performs better.
|
36
|
REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01625-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
37
|
Sagong MC, Yeo YJ, Shin YG, Ko SJ. Conditional Convolution Projecting Latent Vectors on Condition-Specific Space. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:1386-1393. [PMID: 35584073 DOI: 10.1109/tnnls.2022.3172512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Despite rapid advancements over the past several years, conditional generative adversarial networks (cGANs) are still far from perfect. Although one of the major concerns of cGANs is how to provide the conditional information to the generator, no approach is regarded as the optimal solution, and related research is scarce. This brief presents a novel convolution layer, called the conditional convolution (cConv) layer, which incorporates the conditional information into the generator of the generative adversarial networks (GANs). Unlike the most general framework of the cGANs using the conditional batch normalization (cBN) that transforms the normalized feature maps after convolution, the proposed method directly produces conditional features by adjusting the convolutional kernels depending on the conditions. More specifically, in each cConv layer, the weights are conditioned in a simple but effective way through filter-wise scaling and channel-wise shifting operations. In contrast to the conventional methods, the proposed method with a single generator can effectively handle condition-specific characteristics. The experimental results on CIFAR, LSUN, and ImageNet datasets show that the generator with the proposed cConv layer achieves a higher quality of conditional image generation than that with the standard convolution layer.
|
38
|
Yeo YJ, Shin YG, Park S, Ko SJ. Simple Yet Effective Way for Improving the Performance of GAN. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:1811-1818. [PMID: 33385312 DOI: 10.1109/tnnls.2020.3045000] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/12/2023]
Abstract
In adversarial learning, the discriminator often fails to guide the generator successfully since it distinguishes between real and generated images using silly or nonrobust features. To alleviate this problem, this brief presents a simple but effective way that improves the performance of the generative adversarial network (GAN) without imposing the training overhead or modifying the network architectures of existing methods. The proposed method employs a novel cascading rejection (CR) module for discriminator, which extracts multiple nonoverlapped features in an iterative manner using the vector rejection operation. Since the extracted diverse features prevent the discriminator from concentrating on nonmeaningful features, the discriminator can guide the generator effectively to produce images that are more similar to the real images. In addition, since the proposed CR module requires only a few simple vector operations, it can be readily applied to existing frameworks with marginal training overheads. Quantitative evaluations on various data sets, including CIFAR-10, CelebA, CelebA-HQ, LSUN, and tiny-ImageNet, confirm that the proposed method significantly improves the performance of GAN and conditional GAN in terms of the Frechet inception distance (FID), indicating the diversity and visual appearance of the generated images.
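The vector rejection operation named in this abstract can be illustrated with a minimal sketch. It shows only the standard definition (the part of a vector `a` that is orthogonal to a vector `b`), not the authors' cascading rejection module, whose details the abstract does not give. The helper names `dot` and `reject` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def reject(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Vector rejection of a from b: a minus its projection onto b.

    The result is orthogonal to b. If b is the zero vector there is
    nothing to remove, so a copy of a is returned (an assumption made
    here to keep the sketch total).
    """
    denom = dot(b, b)
    if denom == 0.0:
        return list(a)
    scale = dot(a, b) / denom
    return [x - scale * y for x, y in zip(a, b)]


if __name__ == "__main__":
    # Removing the x-direction from (3, 4) leaves only the y-component.
    print(reject([3.0, 4.0], [1.0, 0.0]))  # → [0.0, 4.0]
```

Applying `reject` repeatedly, each time against a previously extracted direction, yields residuals that no longer overlap with that direction, which is one plausible reading of extracting features "in an iterative manner".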
|
39
|
Convy I, Huggins W, Liao H, Whaley KB. Mutual Information Scaling for Tensor Network Machine Learning. MACHINE LEARNING: SCIENCE AND TECHNOLOGY 2022; 3:015017. [PMID: 35211672 PMCID: PMC8862112 DOI: 10.1088/2632-2153/ac44a9] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/12/2022] Open
Abstract
Tensor networks have emerged as promising tools for machine learning, inspired by their widespread use as variational ansatze in quantum many-body physics. It is well known that the success of a given tensor network ansatz depends in part on how well it can reproduce the underlying entanglement structure of the target state, with different network designs favoring different scaling patterns. We demonstrate here how a related correlation analysis can be applied to tensor network machine learning, and explore whether classical data possess correlation scaling patterns similar to those found in quantum states which might indicate the best network to use for a given dataset. We utilize mutual information as a measure of correlations in classical data, and show that it can serve as a lower bound on the entanglement needed for a probabilistic tensor network classifier. We then develop a logistic regression algorithm to estimate the mutual information between bipartitions of data features, and verify its accuracy on a set of Gaussian distributions designed to mimic different correlation patterns. Using this algorithm, we characterize the scaling patterns in the MNIST and Tiny Images datasets, and find clear evidence of boundary-law scaling in the latter. This quantum-inspired classical analysis offers insight into the design of tensor networks which are best suited for specific learning tasks.
Affiliation(s)
- Ian Convy
- Department of Chemistry, University of California, Berkeley, CA 94720, USA; Berkeley Quantum Information and Computation Center, University of California, Berkeley, CA 94720, USA
- William Huggins
- Department of Chemistry, University of California, Berkeley, CA 94720, USA; Berkeley Quantum Information and Computation Center, University of California, Berkeley, CA 94720, USA
- Haoran Liao
- Department of Physics, University of California, Berkeley, CA 94720, USA; Berkeley Quantum Information and Computation Center, University of California, Berkeley, CA 94720, USA
- K. Birgitta Whaley
- Department of Chemistry, University of California, Berkeley, CA 94720, USA; Berkeley Quantum Information and Computation Center, University of California, Berkeley, CA 94720, USA
|
40
|
Self-distribution binary neural networks. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03348-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
41
|
Kataoka H, Okayasu K, Matsumoto A, Yamagata E, Yamada R, Inoue N, Nakamura A, Satoh Y. Pre-Training Without Natural Images. Int J Comput Vis 2022. [DOI: 10.1007/s11263-021-01555-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
Is it possible to use convolutional neural networks pre-trained without any natural images to assist natural image understanding? The paper proposes a novel concept, Formula-driven Supervised Learning (FDSL). We automatically generate image patterns and their category labels by assigning fractals, which are based on a natural law. Theoretically, the use of automatically generated images instead of natural images in the pre-training phase allows us to generate an infinitely large dataset of labeled images. The proposed framework is similar to yet different from Self-Supervised Learning because the FDSL framework enables the creation of image patterns based on any mathematical formulas in addition to self-generated labels. Further, unlike pre-training with a synthetic image dataset, a dataset under the framework of FDSL is not required to define object categories, surface texture, lighting conditions, and camera viewpoint. In the experimental section, we find a better dataset configuration through an exploratory study, e.g., increase of #category/#instance, patch rendering, image coloring, and training epoch. Although models pre-trained with the proposed Fractal DataBase (FractalDB), a database without natural images, do not necessarily outperform models pre-trained with human annotated datasets in all settings, we are able to partially surpass the accuracy of ImageNet/Places pre-trained models. The FractalDB pre-trained CNN also outperforms other pre-trained models on auto-generated datasets based on FDSL such as Bezier curves and Perlin noise. This is reasonable since natural objects and scenes existing around us are constructed according to fractal geometry. Image representation with the proposed FractalDB captures a unique feature in the visualization of convolutional layers and attentions.
|
42
|
Milde MB, Afshar S, Xu Y, Marcireau A, Joubert D, Ramesh B, Bethi Y, Ralph NO, El Arja S, Dennler N, van Schaik A, Cohen G. Neuromorphic Engineering Needs Closed-Loop Benchmarks. Front Neurosci 2022; 16:813555. [PMID: 35237122 PMCID: PMC8884247 DOI: 10.3389/fnins.2022.813555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Accepted: 01/24/2022] [Indexed: 12/02/2022] Open
Abstract
Neuromorphic engineering aims to build (autonomous) systems by mimicking biological systems. It is motivated by the observation that biological organisms—from algae to primates—excel in sensing their environment, reacting promptly to their perils and opportunities. Furthermore, they do so more resiliently than our most advanced machines, at a fraction of the power consumption. It follows that the performance of neuromorphic systems should be evaluated in terms of real-time operation, power consumption, and resiliency to real-world perturbations and noise using task-relevant evaluation metrics. Yet, following in the footsteps of conventional machine learning, most neuromorphic benchmarks rely on recorded datasets that foster sensing accuracy as the primary measure for performance. Sensing accuracy is but an arbitrary proxy for the actual system's goal—taking a good decision in a timely manner. Moreover, static datasets hinder our ability to study and compare closed-loop sensing and control strategies that are central to survival for biological organisms. This article makes the case for a renewed focus on closed-loop benchmarks involving real-world tasks. Such benchmarks will be crucial in developing and progressing neuromorphic Intelligence. The shift towards dynamic real-world benchmarking tasks should usher in richer, more resilient, and robust artificially intelligent systems in the future.
|
43
|
Large-Scale Data Clustering Using Manifold-Regularized Ensemble of Posterior in GAN. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2022. [DOI: 10.1007/s13369-021-05809-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
44
|
PConv: simple yet effective convolutional layer for generative adversarial network. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06846-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
45
|
Ashraf M, Robles WRQ, Kim M, Ko YS, Yi MY. A loss-based patch label denoising method for improving whole-slide image analysis using a convolutional neural network. Sci Rep 2022; 12:1392. [PMID: 35082315 PMCID: PMC8791954 DOI: 10.1038/s41598-022-05001-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/25/2021] [Accepted: 01/05/2022] [Indexed: 12/24/2022] Open
Abstract
This paper proposes a deep learning-based patch label denoising method (LossDiff) for improving the classification of whole-slide images of cancer using a convolutional neural network (CNN). Automated whole-slide image classification is often challenging, requiring a large amount of labeled data. Pathologists annotate the region of interest by marking malignant areas, which poses a high risk of introducing patch-based label noise by involving benign regions that are typically small in size within the malignant annotations, resulting in low classification accuracy with many Type-II errors. To overcome this critical problem, this paper presents a simple yet effective method for noisy patch classification. The proposed method, validated using stomach cancer images, provides a significant improvement compared to other existing methods in patch-based cancer classification, with accuracies of 98.81%, 97.30% and 89.47% for binary, ternary, and quaternary classes, respectively. Moreover, we conduct several experiments at different noise levels using a publicly available dataset to further demonstrate the robustness of the proposed method. Given the high cost of producing explicit annotations for whole-slide images and the unavoidable error-prone nature of the human annotation of medical images, the proposed method has practical implications for whole-slide image annotation and automated cancer diagnosis.
|
46
|
Boundary-Aware Hashing for Hamming Space Retrieval. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12010508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Hamming space retrieval is a hot area of research in deep hashing because it is effective for large-scale image retrieval. Existing hashing algorithms have not fully used the absolute boundary to discriminate the data inside and outside the Hamming ball, and their performance is not satisfactory. In this paper, a boundary-aware contrastive loss is designed. It involves an exponential function with absolute boundary (i.e., Hamming radius) information for dissimilar pairs and a logarithmic function to encourage small distance for similar pairs. It achieves a push that is bigger than the pull inside the Hamming ball, and a pull that is bigger than the push outside the ball. Furthermore, a novel Boundary-Aware Hashing (BAH) architecture is proposed. It discriminatively penalizes the dissimilar data inside and outside the Hamming ball. BAH enables the influence of extremely imbalanced data to be reduced without up-weighting similar pairs or other optimization strategies because its exponential function rapidly converges outside the absolute boundary, making a huge contrast difference between the gradients of the logarithmic and exponential functions. Extensive experiments conducted on four benchmark datasets show that the proposed BAH obtains higher performance for different code lengths, and it has the advantage of handling extremely imbalanced data.
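As background for the Hamming ball mentioned in this abstract, the following minimal sketch shows plain Hamming-radius retrieval over binary codes: an item is returned when its code differs from the query in at most `radius` bits. It does not implement the boundary-aware loss or the BAH architecture. Codes are stored as Python integers, and the function names are hypothetical.

```python
from typing import List, Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def within_hamming_ball(query: int, database: Sequence[int], radius: int) -> List[int]:
    """Indices of database codes that lie inside the Hamming ball
    of the given radius centered on the query code."""
    return [
        index
        for index, code in enumerate(database)
        if hamming_distance(query, code) <= radius
    ]


if __name__ == "__main__":
    codes = [0b1011, 0b1001, 0b0100, 0b1111]
    # Distances from 0b1011 are 0, 1, 4, and 1 respectively.
    print(within_hamming_ball(0b1011, codes, radius=1))  # → [0, 1, 3]
```

The radius acts as the absolute boundary: codes at distance greater than the radius are never retrieved, regardless of how they rank among themselves.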
|
47
|
Salari A, Djavadifar A, Liu XR, Najjaran H. Object recognition datasets and challenges: A review. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.01.022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/18/2023]
|
48
|
Mantegazza D, Giusti A, Gambardella LM, Guzzi J. An Outlier Exposure Approach to Improve Visual Anomaly Detection Performance for Mobile Robots. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3192794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Dario Mantegazza
- Dalle Molle Institute for Artificial Intelligence (IDSIA), USI-SUPSI, Lugano, Switzerland
- Alessandro Giusti
- Dalle Molle Institute for Artificial Intelligence (IDSIA), USI-SUPSI, Lugano, Switzerland
- Luca Maria Gambardella
- Dalle Molle Institute for Artificial Intelligence (IDSIA), USI-SUPSI, Lugano, Switzerland
- Jerome Guzzi
- Dalle Molle Institute for Artificial Intelligence (IDSIA), USI-SUPSI, Lugano, Switzerland
|
49
|
Szadkowski R, Drchal J, Faigl J. Continually trained life-long classification. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06154-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
50
|
Singh R, Dubey AK, Kapoor R. Deep Neural Network Regularization (DNNR) on Denoised Image. INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES 2022. [DOI: 10.4018/ijiit.309584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Image dehazing in supervised learning models suffers from overfitting and underfitting problems. To avoid overfitting, the authors use regularization techniques like dropout and L2 norm. Dropout helps in reducing overfitting and batch normalization reduces the training time. In this paper, they have conducted experiments to analyze combinations of various hyperparameters to achieve better network performance using a deep neural network (DNN) on the CIFAR-10 dataset. The qualitative and quantitative study is performed by estimating the accuracy of the model on training and test images with and without batch normalization. The proposed model performs better and is more stable. The results show that the dropout regularization technique is better than the L2 technique for networks containing hidden layers with large numbers of neurons. The paper assesses the performance of DNN for any denoised model with techniques like batch normalization and dropout, feature map, and adding more layers to the network. The authors quantitatively identify the values of model loss and accuracy in the absence and presence of these parameters.
Affiliation(s)
- Richa Singh
- Amity Institute of Information Technology, Amity University, Noida, India
|