1. Pei Z, Yao X, Zhao W, Yu B. Quantization via Distillation and Contrastive Learning. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:17164-17176. PMID: 37610897. DOI: 10.1109/tnnls.2023.3300309.
Abstract
Quantization is a critical technique employed across various research fields for compressing deep neural networks (DNNs) to facilitate deployment within resource-limited environments. This process necessitates a delicate balance between model size and performance. In this work, we explore knowledge distillation (KD) as a promising approach for improving quantization performance by transferring knowledge from high-precision networks to low-precision counterparts. We specifically investigate feature-level information loss during distillation and emphasize the importance of feature-level network quantization perception. We propose a novel quantization method that combines feature-level distillation and contrastive learning to extract and preserve more valuable information during the quantization process. Furthermore, we utilize the hyperbolic tangent function to estimate gradients with respect to the rounding function, which smoothens the training procedure. Our extensive experimental results demonstrate that the proposed approach achieves competitive model performance with the quantized network compared to its full-precision counterpart, thus validating its efficacy and potential for real-world applications.
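The tanh-based gradient estimator this abstract mentions can be illustrated with a soft rounding function. The exact parameterization is not given in the abstract, so the form below (floor plus a temperature-controlled tanh step) is an assumption for illustration only:

```python
import math

def soft_round(x, k):
    """Differentiable surrogate for round(): floor(x) plus a tanh step.
    As the temperature k grows, soft_round(x, k) -> round(x)."""
    f = math.floor(x)
    return f + 0.5 * (1.0 + math.tanh(k * (x - f - 0.5)))

def soft_round_grad(x, k):
    """d/dx soft_round(x, k): finite everywhere, unlike the true rounding
    function, whose derivative is zero almost everywhere."""
    f = math.floor(x)
    t = math.tanh(k * (x - f - 0.5))
    return 0.5 * k * (1.0 - t * t)
```

As `k` grows, `soft_round` approaches the hard rounding used at inference while its gradient remains nonzero, which is what smooths training relative to a plain straight-through estimator.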
2. Boudardara F, Boussif A, Meyer PJ, Ghazel M. INNAbstract: An INN-Based Abstraction Method for Large-Scale Neural Network Verification. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:18455-18469. PMID: 37792651. DOI: 10.1109/tnnls.2023.3316551.
Abstract
Neural networks (NNs) have witnessed widespread deployment across various domains, including some safety-critical applications. In this regard, the demand for means of verifying such artificial intelligence techniques is increasingly pressing. Nowadays, the development of evaluation approaches for NNs is a hot topic that is attracting considerable interest, and a number of verification methods have been proposed. Yet, a challenging issue for NN verification pertains to scalability when NNs of practical interest have to be evaluated. This work presents INNAbstract, an abstraction method that reduces the size of NNs and thereby improves the scalability of NN verification and reachability analysis methods. This is achieved by merging neurons while ensuring that the obtained model (i.e., the abstract model) overapproximates the original one. INNAbstract supports networks with numerous activation functions. In addition, we propose a heuristic for node selection to build more precise abstract models, in the sense that their outputs are closer to those of the original network. The experimental results illustrate the efficiency of the proposed approach compared to existing relevant abstraction techniques. Furthermore, they demonstrate that INNAbstract can help existing verification tools be applied to larger networks while considering various activation functions.
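INNAbstract builds on interval neural networks (INNs), in which merged neurons carry weight intervals rather than scalars. The sketch below (hypothetical helper names, not the paper's algorithm) shows the core soundness idea: propagating an input box through a layer whose weights are only known up to intervals yields output bounds that enclose the original network's outputs:

```python
def interval_affine(x_lo, x_hi, W_lo, W_hi, b):
    """Propagate the input box [x_lo, x_hi] through y = W x + b, where each
    weight is only known to lie in [W_lo, W_hi]; returns an output box."""
    out_lo, out_hi = [], []
    for row_lo, row_hi, bias in zip(W_lo, W_hi, b):
        acc_lo = acc_hi = bias
        for l, h, wl, wh in zip(x_lo, x_hi, row_lo, row_hi):
            # extrema of w * x over w in [wl, wh], x in [l, h]
            c = (wl * l, wl * h, wh * l, wh * h)
            acc_lo += min(c)
            acc_hi += max(c)
        out_lo.append(acc_lo)
        out_hi.append(acc_hi)
    return out_lo, out_hi

def relu_box(lo, hi):
    """ReLU is monotone, so it maps boxes to boxes."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]
```

When two neurons are merged, replacing their scalar weights with the interval spanning both keeps the abstract output box a superset of the original outputs, which is the over-approximation property the verification relies on.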
3. Fabre W, Haroun K, Lorrain V, Lepecq M, Sicard G. From Near-Sensor to In-Sensor: A State-of-the-Art Review of Embedded AI Vision Systems. Sensors (Basel) 2024; 24:5446. PMID: 39205141. PMCID: PMC11360785. DOI: 10.3390/s24165446.
Abstract
In modern cyber-physical systems, the integration of AI into vision pipelines is now a standard practice for applications ranging from autonomous vehicles to mobile devices. Traditional AI integration often relies on cloud-based processing, which faces challenges such as data access bottlenecks, increased latency, and high power consumption. This article reviews embedded AI vision systems, examining the diverse landscape of near-sensor and in-sensor processing architectures that incorporate convolutional neural networks. We begin with a comprehensive analysis of the critical characteristics and metrics that define the performance of AI-integrated vision systems. These include sensor resolution, frame rate, data bandwidth, computational throughput, latency, power efficiency, and overall system scalability. Understanding these metrics provides a foundation for evaluating how different embedded processing architectures impact the entire vision pipeline, from image capture to AI inference. Our analysis delves into near-sensor systems that leverage dedicated hardware accelerators and commercially available components to efficiently process data close to their source, minimizing data transfer overhead and latency. These systems offer a balance between flexibility and performance, allowing for real-time processing in constrained environments. In addition, we explore in-sensor processing solutions that integrate computational capabilities directly into the sensor. This approach addresses the rigorous demand constraints of embedded applications by significantly reducing data movement and power consumption while also enabling in-sensor feature extraction, pre-processing, and CNN inference. By comparing these approaches, we identify trade-offs related to flexibility, power consumption, and computational performance. 
Ultimately, this article provides insights into the evolving landscape of embedded AI vision systems and suggests new research directions for the development of next-generation machine vision systems.
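Several of the metrics surveyed here (resolution, frame rate, bit depth) combine into the raw data rate a sensor pushes toward the processing stage, which is the quantity near-sensor and in-sensor designs try to shrink. A back-of-the-envelope helper, with illustrative numbers:

```python
def raw_data_rate_mbps(width, height, fps, bits_per_pixel):
    """Raw (uncompressed) sensor output in megabits per second."""
    return width * height * fps * bits_per_pixel / 1e6

# Illustrative: a 1920x1080 sensor at 30 fps with 12-bit raw output
rate = raw_data_rate_mbps(1920, 1080, 30, 12)  # ~746.5 Mbit/s
```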
Affiliation(s)
- William Fabre
- Université Paris-Saclay, CEA, List, F-91120 Palaiseau, France; (K.H.); (V.L.); (M.L.); (G.S.)
4. Lin M, Ji R, Li S, Wang Y, Wu Y, Huang F, Ye Q. Network Pruning Using Adaptive Exemplar Filters. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:7357-7366. PMID: 34101606. DOI: 10.1109/tnnls.2021.3084856.
Abstract
Popular network pruning algorithms reduce redundant information by optimizing hand-crafted models, which may cause suboptimal performance and long filter-selection times. We innovatively introduce adaptive exemplar filters to simplify the algorithm design, resulting in an automatic and efficient pruning approach called EPruner. Inspired by the face recognition community, we use the message-passing algorithm Affinity Propagation on the weight matrices to obtain an adaptive number of exemplars, which then act as the preserved filters. EPruner breaks the dependence on the training data in determining the "important" filters and runs on a CPU in seconds, an order of magnitude faster than GPU-based state-of-the-art methods. Moreover, we show that the weights of exemplars provide a better initialization for fine-tuning. On VGGNet-16, EPruner achieves a 76.34% FLOPs reduction by removing 88.80% of parameters, with a 0.06% accuracy improvement on CIFAR-10. On ResNet-152, EPruner achieves a 65.12% FLOPs reduction by removing 64.18% of parameters, with only a 0.71% top-5 accuracy loss on ILSVRC-2012. Our code is available at https://github.com/lmbxmu/EPruner.
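EPruner's central step, running Affinity Propagation on filter similarities so that the returned exemplars become the preserved filters, can be sketched as follows. This is a minimal textbook implementation of Affinity Propagation (Frey and Dueck), not the authors' code, and the toy similarity matrix stands in for filter-weight similarities:

```python
import numpy as np

def affinity_propagation(S, damping=0.9, iters=200):
    """Minimal Affinity Propagation. S is a similarity matrix whose diagonal
    holds each point's 'preference' to become an exemplar; returns the
    indices of the chosen exemplars."""
    n = S.shape[0]
    idx = np.arange(n)
    R = np.zeros((n, n))  # responsibilities
    A = np.zeros((n, n))  # availabilities
    for _ in range(iters):
        # r(i,k) <- s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        M = A + S
        best = np.argmax(M, axis=1)
        first = M[idx, best].copy()
        M[idx, best] = -np.inf
        second = M.max(axis=1)
        Rnew = S - first[:, None]
        Rnew[idx, best] = S[idx, best] - second
        R = damping * R + (1 - damping) * Rnew
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        P = np.maximum(R, 0)
        np.fill_diagonal(P, R.diagonal())
        col = P.sum(axis=0)
        Anew = np.minimum(0.0, col[None, :] - P)
        Anew[idx, idx] = col - R.diagonal()
        A = damping * A + (1 - damping) * Anew
    return np.flatnonzero((A + R).diagonal() > 0)

# Toy stand-in for filter similarities: two tight groups of "filters"
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
S = -((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(S, np.median(S[~np.eye(len(pts), dtype=bool)]))
exemplars = affinity_propagation(S)
```

The number of exemplars is not fixed in advance; it emerges from the preferences on the diagonal, which is exactly what lets EPruner adapt the number of preserved filters per layer.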
5. Hu SG, Qiao GC, Liu XK, Liu YH, Zhang CM, Zuo Y, Zhou P, Liu YA, Ning N, Yu Q, Liu Y. A Co-Designed Neuromorphic Chip With Compact (17.9K F²) and Weak Neuron Number-Dependent Neuron/Synapse Modules. IEEE Transactions on Biomedical Circuits and Systems 2022; 16:1250-1260. PMID: 36150001. DOI: 10.1109/tbcas.2022.3209073.
Abstract
Many efforts have been made to improve neuron integration efficiency on neuromorphic chips, such as using emerging memory devices and shrinking CMOS technology nodes. However, in a fully connected (FC) neuromorphic core, increasing the number of neurons leads to a quadratic increase in synapse and dendrite costs and a high-slope linear increase in soma costs, resulting in explosive growth of the core hardware cost. We propose a co-designed neuromorphic core (SRCcore) based on quantized spiking neural network (SNN) technology and a compact chip design methodology. The cost of the neuron/synapse module in SRCcore depends only weakly on the neuron number, which effectively relieves the pressure on core area caused by increasing the neuron number. In the proposed BICS chip based on SRCcore, although the neuron/synapse module implements 1∼16 times the neurons and 1∼66 times the synapses of previous works, it only costs an area of 1.79 × 10⁷ F², which is 7.9%∼38.6% of that in previous works. Based on a weight quantization strategy matched to SRCcore, the quantized SNNs achieve 0.05%∼2.19% higher accuracy than previous works, thus supporting the design and application of SRCcore. Finally, a cross-modeling application is demonstrated based on the chip. We hope this work will accelerate the development of cortical-scale neuromorphic systems.
6. Lin M, Cao L, Li S, Ye Q, Tian Y, Liu J, Tian Q, Ji R. Filter Sketch for Network Pruning. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:7091-7100. PMID: 34125685. DOI: 10.1109/tnnls.2021.3084206.
Abstract
We propose a novel network pruning approach based on preserving the information of pretrained network weights (filters). Network pruning with information preserving is formulated as a matrix sketch problem, which is efficiently solved by the off-the-shelf Frequent Directions method. Our approach, referred to as FilterSketch, encodes the second-order information of pretrained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure. FilterSketch requires neither training from scratch nor data-driven iterative optimization, leading to a reduction of several orders of magnitude in the time cost of pruning optimization. Experiments on CIFAR-10 show that FilterSketch reduces 63.3% of floating-point operations (FLOPs) and prunes 59.9% of network parameters with negligible accuracy cost for ResNet-110. On ILSVRC-2012, it reduces 45.5% of FLOPs and removes 43.0% of parameters with only a 0.69% accuracy drop for ResNet-50. Our code and pruned models can be found at https://github.com/lmbxmu/FilterSketch.
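The sketching primitive FilterSketch relies on, the Frequent Directions method, maintains a small sketch B whose Gram matrix approximates that of the streamed rows. A compact sketch of the algorithm (illustrative, not the authors' implementation), with the standard covariance-error guarantee checked below:

```python
import numpy as np

def frequent_directions(A, ell):
    """Sketch the rows of A (n x d, ell <= d) into B (ell x d) so that
    ||A.T @ A - B.T @ B||_2 <= 2 * ||A||_F**2 / ell."""
    n, d = A.shape
    B = np.zeros((ell, d))
    for row in A:
        B[-1] = row                      # the last row is always zero here
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        delta = s[ell // 2] ** 2         # shrink by the median energy
        s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        B = s[:, None] * Vt              # bottom half collapses to zero rows
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 8))         # rows stand in for flattened filters
B = frequent_directions(A, ell=6)
err = np.linalg.norm(A.T @ A - B.T @ B, 2)
bound = 2 * np.linalg.norm(A) ** 2 / 6
```

The deterministic error bound is what makes the sketch usable as a stand-in for the full second-order statistics of the filters.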
7. Fei W, Dai W, Li C, Zou J, Xiong H. General Bitwidth Assignment for Efficient Deep Convolutional Neural Network Quantization. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5253-5267. PMID: 33830929. DOI: 10.1109/tnnls.2021.3069886.
Abstract
Model quantization is essential to deploy deep convolutional neural networks (DCNNs) on resource-constrained devices. In this article, we propose a general bitwidth assignment algorithm based on theoretical analysis for efficient layerwise weight and activation quantization of DCNNs. The proposed algorithm develops a prediction model to explicitly estimate the loss of classification accuracy led by weight quantization with a geometrical approach. Consequently, dynamic programming is adopted to achieve optimal bitwidth assignment on weights based on the estimated error. Furthermore, we optimize bitwidth assignment for activations by considering the signal-to-quantization-noise ratio (SQNR) between weight and activation quantization. The proposed algorithm is general to reveal the tradeoff between classification accuracy and model size for various network architectures. Extensive experiments demonstrate the efficacy of the proposed bitwidth assignment algorithm and the error rate prediction model. Furthermore, the proposed algorithm is shown to be well extended to object detection.
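The layerwise assignment step described above, choosing a bitwidth per layer to minimize a predicted accuracy loss under a model-size budget, is naturally a knapsack-style dynamic program. The error and size tables below are hypothetical placeholders for the paper's prediction model:

```python
def assign_bitwidths(err, sizes, budget):
    """Choose one bitwidth option per layer, minimizing the summed predicted
    accuracy loss subject to an integer model-size budget.

    err[l][b]  : predicted loss if layer l uses bitwidth option b
    sizes[l][b]: size cost (e.g., in KB) of that option
    Returns (total_loss, chosen option per layer), or None if infeasible.
    """
    states = {0: (0.0, [])}          # used size -> (best loss, choices)
    for e_row, s_row in zip(err, sizes):
        nxt = {}
        for used, (tot, picks) in states.items():
            for b, (e, s) in enumerate(zip(e_row, s_row)):
                u = used + s
                if u > budget:
                    continue
                if u not in nxt or tot + e < nxt[u][0]:
                    nxt[u] = (tot + e, picks + [b])
        states = nxt
    return min(states.values(), key=lambda v: v[0]) if states else None

# Hypothetical per-layer tables for options (2, 4, 8 bits)
loss, picks = assign_bitwidths(
    err=[[0.5, 0.2, 0.05], [0.4, 0.1, 0.02]],
    sizes=[[1, 2, 4], [1, 2, 4]],
    budget=6,
)
```

Keying the states on the exact size used so far is valid because future feasibility depends only on the remaining budget, not on which layers consumed it.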
8. Hu Y, Wen G, Luo M, Dai D, Cao W, Yu Z, Hall W. Inner-Imaging Networks: Put Lenses Into Convolutional Structure. IEEE Transactions on Cybernetics 2022; 52:8547-8560. PMID: 34398768. DOI: 10.1109/tcyb.2020.3034605.
Abstract
Despite the tremendous success in computer vision, deep convolutional networks suffer from serious computation costs and redundancies. Although previous works address that by enhancing the diversities of filters, they have not considered the complementarity and the completeness of the internal convolutional structure. To address this problem, we propose a novel inner-imaging (InI) architecture, which allows relationships between channels to meet the above requirement. Specifically, we organize the channel signal points in groups using convolutional kernels to model both the intragroup and intergroup relationships simultaneously. A convolutional filter is a powerful tool for modeling spatial relations and organizing grouped signals, so the proposed methods map the channel signals onto a pseudoimage, like putting a lens into the internal convolution structure. Consequently, not only is the diversity of channels increased but also the complementarity and completeness can be explicitly enhanced. The proposed architecture is lightweight and easy to implement. It provides an efficient self-organization strategy for convolutional networks to improve their performance. Extensive experiments are conducted on multiple benchmark datasets, including CIFAR, SVHN, and ImageNet. Experimental results verify the effectiveness of the InI mechanism with the most popular convolutional networks as the backbones.
9. Verma S, Wang C, Zhu L, Liu W. Attn-HybridNet: Improving Discriminability of Hybrid Features With Attention Fusion. IEEE Transactions on Cybernetics 2022; 52:6567-6578. PMID: 33739927. DOI: 10.1109/tcyb.2021.3060176.
Abstract
The principal component analysis network (PCANet) is an unsupervised deep network, utilizing principal components as convolution filters in its layers. Albeit powerful, the PCANet suffers from two fundamental problems responsible for its performance degradation. First, the principal components transform the data as column vectors (which we call the amalgamated view) and incur a loss of spatial information present in the data. Second, the generalized pooling in the PCANet is unable to incorporate spatial statistics of the natural images, and it also induces redundancy among the features. In this research, we first propose a tensor-factorization-based deep network called the tensor factorization network (TFNet). The TFNet extracts features by preserving the spatial view of the data (which we call the minutiae view). We then propose HybridNet, which simultaneously extracts information with the two views of the data since their integration can improve the performance of classification systems. Finally, to alleviate the feature redundancy among hybrid features, we propose Attn-HybridNet to perform attention-based feature selection and fusion to improve their discriminability. Classification results on multiple real-world datasets using features extracted by our proposed Attn-HybridNet achieve significantly better performance over other popular baseline methods, demonstrating the effectiveness of the proposed techniques.
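The PCANet idea this work starts from, using principal components of image patches as convolution filters, can be sketched in a few lines. This is a simplified single-stage version; the patch size and filter count are illustrative:

```python
import numpy as np

def pca_filters(images, k=7, n_filters=4):
    """PCANet-style filter learning: gather all k x k patches, remove each
    patch's mean, and take the leading principal components as filters."""
    patches = []
    for img in images:
        H, W = img.shape
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                p = img[i:i + k, j:j + k].ravel()
                patches.append(p - p.mean())       # per-patch mean removal
    X = np.stack(patches)                           # (num_patches, k*k)
    cov = X.T @ X / len(X)
    _, vecs = np.linalg.eigh(cov)                   # eigenvalues ascending
    top = vecs[:, ::-1][:, :n_filters]              # leading components
    return top.T.reshape(n_filters, k, k)

rng = np.random.default_rng(1)
filters = pca_filters(rng.standard_normal((2, 12, 12)), k=7, n_filters=4)
```

Because the filters come from an eigendecomposition, they are orthonormal, which is a useful sanity check on any implementation.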
10. Jin X, Xie Y, Wei XS, Zhao BR, Zhang Y, Tan X, Yu Y. A Lightweight Encoder-Decoder Path for Deep Residual Networks. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:866-878. PMID: 33180736. DOI: 10.1109/tnnls.2020.3029613.
Abstract
In this article, we present a novel lightweight path for deep residual neural networks. The proposed method integrates a simple plug-and-play module, i.e., a convolutional encoder-decoder (ED), as an augmented path to the original residual building block. Due to the abstract design and ability of the encoding stage, the decoder part tends to generate feature maps where highly semantically relevant responses are activated, while irrelevant responses are restrained. By a simple elementwise addition operation, the learned representations derived from the identity shortcut and original transformation branch are enhanced by our ED path. Furthermore, we exploit lightweight counterparts by removing a portion of channels in the original transformation branch. Fortunately, our lightweight processing does not cause an obvious performance drop but yields computational savings. By conducting comprehensive experiments on ImageNet, MS-COCO, CUB200-2011, and CIFAR, we demonstrate the consistent accuracy gain obtained by our ED path for various residual architectures, with comparable or even lower model complexity. Concretely, it decreases the top-1 error of ResNet-50 and ResNet-101 by 1.22% and 0.91% on the task of ImageNet classification and increases the mmAP of Faster R-CNN with ResNet-101 by 2.5% on the MS-COCO object detection task. The code is available at https://github.com/Megvii-Nanjing/ED-Net.
12. Xu TB, Liu CL. Deep Neural Network Self-Distillation Exploiting Data Representation Invariance. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:257-269. PMID: 33074828. DOI: 10.1109/tnnls.2020.3027634.
Abstract
To harvest small networks with high accuracies, most existing methods mainly utilize compression techniques such as low-rank decomposition and pruning to compress a trained large model into a small network, or transfer knowledge from a powerful large model (teacher) to a small network (student). Despite their success in generating small models of high performance, the dependence on accompanying assistive models complicates the training process and increases memory and time costs. In this article, we propose an elegant self-distillation (SD) mechanism to obtain high-accuracy models directly without going through an assistive model. Inspired by invariant recognition in the human visual system, different distorted instances of the same input should possess similar high-level data representations. Thus, we can learn data representation invariance between different distorted versions of the same sample. Specifically, in our SD-based learning algorithm, the single network utilizes the maximum mean discrepancy metric to learn global feature consistency and the Kullback-Leibler divergence to constrain posterior class probability consistency across the different distorted branches. Extensive experiments on MNIST, CIFAR-10/100, and ImageNet data sets demonstrate that the proposed method can effectively reduce the generalization error for various network architectures, such as AlexNet, VGGNet, ResNet, Wide ResNet, and DenseNet, and outperform existing model distillation methods with little extra training effort.
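The two consistency terms described here can be sketched directly: a Kullback-Leibler term on class posteriors and a maximum mean discrepancy term on global features across two distorted branches. For brevity the sketch uses a linear-kernel MMD, which is an assumption rather than the paper's kernel choice:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def kl_consistency(logits_a, logits_b, eps=1e-12):
    """Mean KL(p_a || p_b) between class posteriors of two distorted branches."""
    p, q = softmax(logits_a), softmax(logits_b)
    return float(np.mean((p * (np.log(p + eps) - np.log(q + eps))).sum(axis=1)))

def mmd_linear(feat_a, feat_b):
    """Linear-kernel MMD between the branches' global feature batches."""
    d = feat_a.mean(axis=0) - feat_b.mean(axis=0)
    return float(d @ d)

# In training, za/zb would come from two augmentations of the same batch;
# random stand-ins here
rng = np.random.default_rng(2)
za, zb = rng.standard_normal((8, 10)), rng.standard_normal((8, 10))
```

Both terms vanish when the two branches agree exactly, so minimizing them pushes the network toward representation invariance across distortions.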
13. Bao T, Zaidi SAR, Xie S, Yang P, Zhang ZQ. Inter-Subject Domain Adaptation for CNN-Based Wrist Kinematics Estimation Using sEMG. IEEE Transactions on Neural Systems and Rehabilitation Engineering 2021; 29:1068-1078. PMID: 34086574. DOI: 10.1109/tnsre.2021.3086401.
Abstract
Recently, convolutional neural network (CNN) has been widely investigated to decode human intentions using surface Electromyography (sEMG) signals. However, a pre-trained CNN model usually suffers from severe degradation when testing on a new individual, and this is mainly due to domain shift where characteristics of training and testing sEMG data differ substantially. To enhance inter-subject performances of CNN in the wrist kinematics estimation, we propose a novel regression scheme for supervised domain adaptation (SDA), based on which domain shift effects can be effectively reduced. Specifically, a two-stream CNN with shared weights is established to exploit source and target sEMG data simultaneously, such that domain-invariant features can be extracted. To tune CNN weights, both regression losses and a domain discrepancy loss are employed, where the former enable supervised learning and the latter minimizes distribution divergences between two domains. In this study, eight healthy subjects were recruited to perform wrist flexion-extension movements. Experimental results illustrated that the proposed regression SDA outperformed fine-tuning, a state-of-the-art transfer learning method, in both single-single and multiple-single scenarios of kinematics estimation. Unlike fine-tuning which suffers from catastrophic forgetting, regression SDA can maintain much better performances in original domains, which boosts the model reusability among multiple subjects.
14. Wang P, He X, Chen Q, Cheng A, Liu Q, Cheng J. Unsupervised Network Quantization via Fixed-Point Factorization. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:2706-2720. PMID: 32706647. DOI: 10.1109/tnnls.2020.3007749.
Abstract
The deep neural network (DNN) has achieved remarkable performance in a wide range of applications at the cost of huge memory and computational complexity. Fixed-point network quantization emerges as a popular acceleration and compression method but still suffers from huge performance degradation when extremely low-bit quantization is utilized. Moreover, current fixed-point quantization methods rely heavily on supervised retraining using large amounts of the labeled training data, while labeled data are hard to obtain in real-world applications. In this article, we propose an efficient framework, namely, fixed-point factorized network (FFN), to turn all weights into ternary values, i.e., {-1, 0, 1}. We highlight that the proposed FFN framework can achieve negligible degradation even without any supervised retraining on the labeled data. Note that the activations can be easily quantized into an 8-bit format; thus, the resulting networks only have low-bit fixed-point additions that are significantly more efficient than 32-bit floating-point multiply-accumulate operations (MACs). Extensive experiments on large-scale ImageNet classification and object detection on MS COCO show that the proposed FFN can achieve more than 20× compression and remove most of the multiply operations with comparable accuracy. Codes are available on GitHub at https://github.com/wps712/FFN.
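FFN obtains its ternary weights through fixed-point factorization; as a simpler point of comparison, the sketch below ternarizes a weight tensor with a TWN-style magnitude threshold (explicitly not the paper's factorization method) to show what an alpha-scaled {-1, 0, 1} representation looks like:

```python
import numpy as np

def ternarize(W, t=0.7):
    """Map weights to alpha * {-1, 0, 1} with a magnitude threshold
    delta = t * mean|W| (a TWN-style heuristic)."""
    delta = t * np.abs(W).mean()
    T = (np.sign(W) * (np.abs(W) > delta)).astype(np.int8)
    mask = T != 0
    alpha = float(np.abs(W[mask]).mean()) if mask.any() else 0.0
    return alpha, T

rng = np.random.default_rng(3)
W = rng.standard_normal((16, 16))
alpha, T = ternarize(W)
```

With weights restricted to {-1, 0, 1}, multiply-accumulates reduce to additions and subtractions, which is the source of the efficiency gain the abstract describes.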
15. Zhang Y, Cui M, Shen L, Zeng Z. Memristive Quantized Neural Networks: A Novel Approach to Accelerate Deep Learning On-Chip. IEEE Transactions on Cybernetics 2021; 51:1875-1887. PMID: 31059463. DOI: 10.1109/tcyb.2019.2912205.
Abstract
Existing deep neural networks (DNNs) are computationally expensive and memory intensive, which hinders their further deployment in novel nanoscale devices and applications with lower memory resources or strict latency requirements. In this paper, a novel approach to accelerate on-chip learning systems using memristive quantized neural networks (M-QNNs) is presented. A real problem of multilevel memristive synaptic weights due to device-to-device (D2D) and cycle-to-cycle (C2C) variations is considered. Different levels of Gaussian noise are added to the memristive model during each adjustment. Another method of using memristors with binary states to build M-QNNs is presented, which suffers from fewer D2D and C2C variations compared with using multilevel memristors. Furthermore, methods of solving the sneak path issues in the memristive crossbar arrays are proposed. The M-QNN approach is evaluated on two image classification data sets, that is, ten-digit number images and the handwritten images of the Modified National Institute of Standards and Technology (MNIST) data set. In addition, input images with different levels of zero-mean Gaussian noise are tested to verify the robustness of the proposed method. Another highlight of the proposed method is that it can significantly reduce computational time and memory during the process of image recognition.
16. Liu X, Li L, Wang S, Zha ZJ, Huang Q. Local-binarized very deep residual network for visual categorization. Neurocomputing 2021. DOI: 10.1016/j.neucom.2020.11.041.
17. Chen H, Wang Y, Xu C, Xu C, Tao D. Learning Student Networks via Feature Embedding. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:25-35. PMID: 32092018. DOI: 10.1109/tnnls.2020.2970494.
Abstract
Deep convolutional neural networks have been widely used in numerous applications, but their demanding storage and computational resource requirements prevent their applications on mobile devices. Knowledge distillation aims to optimize a portable student network by taking the knowledge from a well-trained heavy teacher network. Traditional teacher-student-based methods used to rely on additional fully connected layers to bridge intermediate layers of teacher and student networks, which brings in a large number of auxiliary parameters. In contrast, this article aims to propagate information from teacher to student without introducing new variables that need to be optimized. We regard the teacher-student paradigm from a new perspective of feature embedding. By introducing the locality preserving loss, the student network is encouraged to generate the low-dimensional features that could inherit intrinsic properties of their corresponding high-dimensional features from the teacher network. The resulting portable network, thus, can naturally maintain the performance as that of the teacher network. Theoretical analysis is provided to justify the lower computation complexity of the proposed method. Experiments on benchmark data sets and well-trained networks suggest that the proposed algorithm is superior to state-of-the-art teacher-student learning methods in terms of computational and storage complexity.
18. RGB Image Prioritization Using Convolutional Neural Network on a Microprocessor for Nanosatellites. Remote Sensing 2020. DOI: 10.3390/rs12233941.
Abstract
Nanosatellites are being widely used in various missions, including remote sensing applications. However, the difficulty lies in mission operation due to downlink speed limitation in nanosatellites. Considering the global cloud fraction of 67%, retrieving clear images through the limited downlink capacity becomes a larger issue. In order to solve this problem, we propose an image prioritization method based on cloud coverage using CNN. The CNN is designed to be lightweight and to be able to prioritize RGB images for nanosatellite application. As previous CNNs are too heavy for onboard processing, new strategies are introduced to lighten the network. The input size is reduced, and patch decomposition is implemented for reduced memory usage. Replication padding is applied on the first block to suppress border ambiguity in the patches. The depth of the network is reduced for small input size adaptation, and the number of kernels is reduced to decrease the total number of parameters. Lastly, a multi-stream architecture is implemented to suppress the network from optimizing on color features. As a result, the number of parameters was reduced down to 0.4%, and the inference time was reduced down to 4.3% of the original network while maintaining approximately 70% precision. We expect that the proposed method will enhance the downlink capability of clear images in nanosatellites by 112%.
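Two of the lightening strategies mentioned, patch decomposition for reduced memory usage and replication padding to suppress border ambiguity, can be sketched together for a grayscale image. The paper applies replication padding inside the first network block; padding each patch here is a simplification:

```python
import numpy as np

def decompose_into_patches(img, patch=4, pad=1):
    """Split a grayscale image into non-overlapping patches and replication-
    pad each patch border (np.pad mode='edge' repeats edge pixels)."""
    H, W = img.shape
    out = []
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            p = img[i:i + patch, j:j + patch]
            out.append(np.pad(p, ((pad, pad), (pad, pad)), mode="edge"))
    return out

img = np.arange(64, dtype=float).reshape(8, 8)
patches = decompose_into_patches(img)
```

Processing small padded patches one at a time bounds peak memory, which is what makes onboard inference feasible on a microprocessor.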
19. Wu K, Guo Y, Zhang C. Compressing Deep Neural Networks With Sparse Matrix Factorization. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:3828-3838. PMID: 31725393. DOI: 10.1109/tnnls.2019.2946636.
Abstract
Modern deep neural networks (DNNs) are usually overparameterized and composed of a large number of learnable parameters. One of a few effective solutions attempts to compress DNN models via learning sparse weights and connections. In this article, we follow this line of research and present an alternative framework of learning sparse DNNs, with the assistance of matrix factorization. We provide an underlying principle for substituting the original parameter matrices with the multiplications of highly sparse ones, which constitutes the theoretical basis of our method. Experimental results demonstrate that our method substantially outperforms the previous state of the art in compressing various DNNs, giving rich empirical evidence in support of its effectiveness. It is also worth mentioning that, unlike many other works that focus on feedforward networks like multi-layer perceptrons and convolutional neural networks only, we also evaluate our method on a series of recurrent networks in practice.
22. Lin S, Ji R, Li Y, Deng C, Li X. Toward Compact ConvNets via Structure-Sparsity Regularized Filter Pruning. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:574-588. PMID: 30990448. DOI: 10.1109/tnnls.2019.2906563.
Abstract
The success of convolutional neural networks (CNNs) in computer vision applications has been accompanied by a significant increase in computation and memory costs, which prohibits their use in resource-limited environments, such as mobile systems or embedded devices. To this end, research on CNN compression has recently emerged. In this paper, we propose a novel filter pruning scheme, termed structured sparsity regularization (SSR), to simultaneously speed up the computation and reduce the memory overhead of CNNs, which can be well supported by various off-the-shelf deep learning libraries. Concretely, the proposed scheme incorporates two different regularizers of structured sparsity into the original objective function of filter pruning, which fully coordinates the global output and local pruning operations to adaptively prune filters. We further propose an alternative updating with Lagrange multipliers (AULM) scheme to efficiently solve its optimization. AULM follows the principle of the alternating direction method of multipliers (ADMM) and alternates between promoting the structured sparsity of CNNs and optimizing the recognition loss, which leads to a very efficient solver (2.5× faster than the most recent work that directly solves the group sparsity-based regularization). Moreover, by imposing the structured sparsity, online inference is extremely memory-light, since the number of filters and the output feature maps are simultaneously reduced. The proposed scheme has been deployed on a variety of state-of-the-art CNN structures, including LeNet, AlexNet, VGGNet, ResNet, and GoogLeNet, over different data sets. Quantitative results demonstrate that the proposed scheme achieves superior performance over the state-of-the-art methods. We further demonstrate the proposed compression scheme for the task of transfer learning, including domain adaptation and object detection, which also shows exciting performance gains over state-of-the-art filter pruning methods.
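As a rough illustration of the structured-sparsity idea behind filter pruning (a minimal sketch, not the paper's AULM/ADMM solver; the function names and `keep_ratio` parameter are hypothetical), one can treat each filter as a group, penalize the sum of per-filter L2 norms, and zero out the filters with the smallest norms:

```python
import numpy as np

def group_sparsity_penalty(filters):
    """Group-lasso-style penalty: sum of L2 norms, one group per filter."""
    # filters: (num_filters, in_channels, k, k)
    return sum(np.linalg.norm(f) for f in filters)

def prune_filters(filters, keep_ratio=0.5):
    """Keep the filters with the largest L2 norms; zero out the rest."""
    norms = np.array([np.linalg.norm(f) for f in filters])
    n_keep = max(1, int(round(keep_ratio * len(filters))))
    keep = np.argsort(norms)[::-1][:n_keep]
    mask = np.zeros(len(filters), dtype=bool)
    mask[keep] = True
    pruned = filters.copy()
    pruned[~mask] = 0.0  # removed filters also remove their output feature maps
    return pruned, mask
```

Zeroed filters can then be dropped entirely, which is why both the filter count and the output feature maps shrink at inference time.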
Collapse
|
23
|
Leyva R, Sanchez V, Li CT. Compact and Low-Complexity Binary Feature Descriptor and Fisher Vectors for Video Analytics. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:6169-6184. [PMID: 31251186 DOI: 10.1109/tip.2019.2922826] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
In this paper, we propose a compact and low-complexity binary feature descriptor for video analytics. Our binary descriptor encodes the motion information of a spatio-temporal support region into a low-dimensional binary string. The descriptor is based on a binning strategy and a construction that separately binarizes the horizontal and vertical motion components of the spatio-temporal support region. We pair our descriptor with a novel Fisher Vector (FV) scheme for binary data, which projects a set of binary features into a fixed-length vector so that the similarity between feature sets can be evaluated. We test the effectiveness of our binary feature descriptor with FVs on action recognition, one of the most challenging tasks in computer vision, as well as on gait recognition and animal behavior clustering. Several experiments on the KTH, UCF50, UCF101, CASIA-B, and TIGdog datasets show that the proposed binary feature descriptor outperforms state-of-the-art feature descriptors in terms of computation time as well as memory and storage requirements. When paired with FVs, the proposed feature descriptor attains very competitive performance, outperforming several state-of-the-art feature descriptors and some methods based on convolutional neural networks.
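The core construction, separate binarization of the horizontal and vertical motion components over a binned support region, can be sketched as follows (a simplified illustration under assumed details: the cell grid, the median threshold, and the function name are all hypothetical, not the paper's exact binning scheme):

```python
import numpy as np

def binary_motion_descriptor(u, v, grid=(2, 2)):
    """Encode a support region's motion as a short binary string.
    u, v: 2D arrays of horizontal/vertical motion (e.g. optical flow).
    The region is split into grid cells; each cell's mean motion, for each
    component separately, is thresholded by the region median -> 1 bit."""
    def bits(comp):
        h, w = comp.shape
        gh, gw = grid
        cells = [comp[i*h//gh:(i+1)*h//gh, j*w//gw:(j+1)*w//gw].mean()
                 for i in range(gh) for j in range(gw)]
        thr = np.median(comp)
        return np.array([1 if c > thr else 0 for c in cells], dtype=np.uint8)
    # concatenating the two component bit strings gives 2 * gh * gw bits
    return np.concatenate([bits(u), bits(v)])
```

Because the output is a short bit string rather than a float vector, sets of such descriptors can be compared cheaply, which is where the binary-data FV scheme comes in.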
Collapse
|
24
|
Passalis N, Tefas A. Training Lightweight Deep Convolutional Neural Networks Using Bag-of-Features Pooling. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:1705-1715. [PMID: 30369453 DOI: 10.1109/tnnls.2018.2872995] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Convolutional neural networks (CNNs) are predominantly used for several challenging computer vision tasks, achieving state-of-the-art performance. However, CNNs are complex models that require powerful hardware, both for training and for deployment. To this end, a quantization-based pooling method is proposed in this paper. The proposed method is inspired by the bag-of-features model and can be used for learning more lightweight deep neural networks. Trainable radial basis function neurons are used to quantize the activations of the final convolutional layer, reducing the number of parameters in the network and allowing images of various sizes to be classified natively. The proposed method employs differentiable quantization and aggregation layers, leading to an end-to-end trainable CNN architecture. Furthermore, a fast linear variant of the proposed method is introduced and discussed, providing new insight into convolutional neural architectures. The ability of the proposed method to reduce the size of CNNs and outperform other competitive methods is demonstrated using seven data sets and three different learning tasks (classification, regression, and retrieval).
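The pooling idea can be sketched in NumPy (a minimal illustration, assuming Gaussian RBF memberships and average aggregation; the function name, `sigma` parameter, and codebook shapes are hypothetical, not the paper's exact formulation):

```python
import numpy as np

def bof_pooling(features, codewords, sigma=1.0):
    """Quantize a variable-size set of conv activations against learned
    codewords via RBF memberships, then average-pool into a fixed-length
    histogram (one bin per codeword)."""
    # features: (N, D) activation vectors; N varies with the input image size
    # codewords: (K, D) trainable RBF centers
    d2 = ((features[:, None, :] - codewords[None, :, :]) ** 2).sum(-1)
    sim = np.exp(-d2 / (2 * sigma ** 2))                 # (N, K) RBF responses
    memberships = sim / sim.sum(axis=1, keepdims=True)   # soft quantization
    return memberships.mean(axis=0)                      # (K,) histogram
```

The output length depends only on the number of codewords K, not on N, which is what lets the network natively handle images of different sizes; with every step differentiable, the codewords can be trained end to end.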
Collapse
|