1
Yuan T, Li Z, Liu B, Tang Y, Liu Y. ARPruning: An automatic channel pruning based on attention map ranking. Neural Netw 2024; 174:106220. [PMID: 38447427] [DOI: 10.1016/j.neunet.2024.106220]
Abstract
Structured pruning is a representative model compression technique for convolutional neural networks (CNNs) that removes less important filters or channels. Most recent structured pruning methods establish criteria to measure filter importance, mainly based on the magnitude of weights or other parameters in CNNs. However, these criteria lack explainability, and relying solely on the numerical values of network parameters is insufficient to assess the relationship between a channel and model performance. Moreover, directly applying these criteria for global pruning may lead to suboptimal solutions; search algorithms are therefore needed to determine the pruning ratio for each layer. To address these issues, we propose ARPruning (Attention-map-based Ranking Pruning), which constructs a new intra-layer channel importance criterion and develops a new local neighborhood search algorithm for determining the optimal inter-layer pruning ratio. To measure the relationship between a channel to be pruned and model performance, we build the intra-layer channel importance criterion from the attention map of each layer. We then propose an automatic pruning strategy search method that finds the optimal solution effectively and efficiently. By integrating the well-designed pruning criterion and search strategy, ARPruning maintains a high compression rate while achieving outstanding accuracy. Our experiments further show that ARPruning achieves better compression results than state-of-the-art pruning methods. The code can be obtained at https://github.com/dozingLee/ARPruning.
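The attention-map ranking idea can be sketched in a few lines: score each channel by the mean absolute value of its (attention-weighted) map, rank channels, and drop the lowest fraction. This is a simplified illustration, not the authors' implementation; the scoring function and the toy maps are assumptions.

```python
def channel_importance(feature_maps):
    """Score each channel by the mean absolute value of its map."""
    return [sum(abs(v) for v in fmap) / len(fmap) for fmap in feature_maps]

def prune_by_ranking(feature_maps, prune_ratio):
    """Rank channels by importance and drop the lowest fraction."""
    scores = channel_importance(feature_maps)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_prune = int(len(scores) * prune_ratio)
    pruned = set(order[:n_prune])  # least important channels
    return [i for i in range(len(scores)) if i not in pruned]

# Four toy channels; channel 2 carries almost no activation.
maps = [[0.9, -0.8], [0.5, 0.4], [0.01, -0.02], [0.7, 0.6]]
kept = prune_by_ranking(maps, prune_ratio=0.25)  # channel 2 is removed
```

ARPruning additionally searches per-layer pruning ratios; here the ratio is a fixed input for brevity.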
Affiliation(s)
- Zulin Li
- Beijing University of Technology, China.
- Bo Liu
- Beijing University of Technology, China.
- Yinan Tang
- Inspur Electronic Information Industry Co., Ltd, China.
- Yujia Liu
- Beijing University of Technology, China.
2
Zhang Z, Lu Y, Wang T, Wei X, Wei Z. DDK: Dynamic structure pruning based on differentiable search and recursive knowledge distillation for BERT. Neural Netw 2024; 173:106164. [PMID: 38367353] [DOI: 10.1016/j.neunet.2024.106164]
Abstract
Large-scale pre-trained models, such as BERT, have demonstrated outstanding performance in Natural Language Processing (NLP). Nevertheless, the large number of parameters in these models increases the demand for hardware storage and computational resources, posing a challenge for practical deployment. In this article, we propose a combined method of model pruning and knowledge distillation to compress and accelerate large-scale pre-trained language models. Specifically, we introduce DDK, a dynamic structure pruning method based on differentiable search and recursive knowledge distillation that automatically prunes the BERT model. We define the search space for network pruning as all feed-forward layer channels and self-attention heads at each layer of the network, and utilize differentiable methods to determine their optimal number. Additionally, we design a recursive knowledge distillation method that employs adaptive weighting to extract the most important features from multiple intermediate layers of the teacher model and fuse them to supervise the student network's learning. Experimental results on the GLUE benchmark and ablation analysis demonstrate that our proposed method outperforms other advanced methods in average performance.
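The adaptive-weighting step of the distillation can be illustrated as follows: softmax-normalized weights fuse several teacher intermediate-layer features into a single supervision target for the student. A minimal sketch with fixed weights (in the paper they are learned), not the actual training code.

```python
import math

def adaptive_fuse(teacher_feats, weights):
    """Fuse teacher intermediate-layer features with softmax-normalised
    adaptive weights (learned in the paper; fixed here for illustration)."""
    exps = [math.exp(w) for w in weights]
    z = sum(exps)
    alphas = [e / z for e in exps]
    dim = len(teacher_feats[0])
    return [sum(a * f[d] for a, f in zip(alphas, teacher_feats))
            for d in range(dim)]

def distill_loss(student_feat, fused_feat):
    """Mean-squared error between the student feature and the fused target."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, fused_feat)) / len(fused_feat)

# Equal weights average three teacher layers into one supervision signal.
target = adaptive_fuse([[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]], [0.0, 0.0, 0.0])
```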
Affiliation(s)
- Zhou Zhang
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China.
- Yang Lu
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; Anhui Mine IOT and Security Monitoring Technology Key Laboratory, Hefei 230088, China.
- Tengfei Wang
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China.
- Xing Wei
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; Intelligent Manufacturing Institute of Hefei University of Technology, Hefei 230009, China.
- Zhen Wei
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; Intelligent Manufacturing Institute of Hefei University of Technology, Hefei 230009, China.
3
Wang Y, Guo S, Guo J, Zhang J, Zhang W, Yan C, Zhang Y. Towards performance-maximizing neural network pruning via global channel attention. Neural Netw 2024; 171:104-113. [PMID: 38091754] [DOI: 10.1016/j.neunet.2023.11.065]
Abstract
Network pruning has attracted increasing attention recently for its capability of transferring large-scale neural networks (e.g., CNNs) to resource-constrained devices. Such a transfer is typically achieved by removing redundant network parameters while retaining generalization performance, in either a static or a dynamic manner. Static pruning usually produces a larger, fit-to-all-samples compressed network by removing the same channels for all samples, which cannot fully exploit the redundancy in the given network. In contrast, dynamic pruning can adaptively remove (more) different channels for different samples and obtains state-of-the-art performance along with a higher compression ratio. However, since the system must preserve the complete network information for sample-specific pruning, dynamic pruning methods are usually not memory-efficient. In this paper, we explore a static alternative, dubbed GlobalPru, from a different perspective that respects the differences among data. Specifically, a novel channel-attention-based learn-to-rank framework is proposed to learn a global ranking of channels with respect to network redundancy. In this method, each sample-wise (local) channel attention is forced to reach an agreement on the global ranking across different data. Hence, all samples can empirically share the same ranking of channels, and pruning can be performed statically in practice. Extensive experiments on ImageNet, SVHN, and CIFAR-10/100 demonstrate that the proposed GlobalPru outperforms state-of-the-art static and dynamic pruning methods by significant margins.
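Forcing sample-wise channel attentions to agree on one global ordering can be approximated by rank aggregation; the Borda-style rank averaging below is an illustrative stand-in for the paper's learn-to-rank framework, not its actual objective.

```python
def global_ranking(per_sample_scores):
    """Borda-style aggregation: sum each channel's rank across samples
    so that all samples share one static channel ordering."""
    n = len(per_sample_scores[0])
    rank_sum = [0] * n
    for scores in per_sample_scores:
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, ch in enumerate(order):
            rank_sum[ch] += rank
    # channels ordered from least to most important
    return sorted(range(n), key=lambda ch: rank_sum[ch])
```

Once the global ranking is fixed, the lowest-ranked channels can be removed identically for every input, which is what makes the result a static pruning.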
Affiliation(s)
- Yingchun Wang
- BDKE Lab, School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China; Department of Computing, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China.
- Song Guo
- Department of Computing, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China.
- Jingcai Guo
- Department of Computing, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China.
- Jie Zhang
- Department of Computing, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China.
- Weizhan Zhang
- BDKE Lab, School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China.
- Caixia Yan
- BDKE Lab, School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China.
- Yuanhong Zhang
- BDKE Lab, School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China.
4
Niyaz U, Sambyal AS, Bathula DR. Leveraging different learning styles for improved knowledge distillation in biomedical imaging. Comput Biol Med 2024; 168:107764. [PMID: 38056210] [DOI: 10.1016/j.compbiomed.2023.107764]
Abstract
Learning style refers to the type of training mechanism an individual adopts to gain new knowledge. As suggested by the VARK model, humans have different learning preferences, like Visual (V), Auditory (A), Read/Write (R), and Kinesthetic (K), for acquiring and effectively processing information. Our work leverages this concept of knowledge diversification to improve the performance of model compression techniques like Knowledge Distillation (KD) and Mutual Learning (ML). Consequently, we use a single-teacher, two-student network in a unified framework that not only allows for the transfer of knowledge from teacher to students (KD) but also encourages collaborative learning between students (ML). Unlike the conventional approach, where the teacher shares the same knowledge in the form of predictions or feature representations with the student network, our proposed approach employs a more diversified strategy by training one student with predictions and the other with feature maps from the teacher. We further extend this knowledge diversification by facilitating the exchange of predictions and feature maps between the two student networks, enriching their learning experiences. We have conducted comprehensive experiments on three benchmark datasets for both classification and segmentation tasks using two different network architecture combinations. The results demonstrate that knowledge diversification in a combined KD and ML framework outperforms conventional KD or ML techniques (with similar network configurations) that only use predictions, with an average improvement of 2%. Furthermore, consistent improvement in performance across different tasks, with various network architectures, and over state-of-the-art techniques establishes the robustness and generalizability of the proposed model.
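The diversification idea, one student distilling the teacher's predictions and the other its feature maps, with the complementary signal exchanged between students, can be sketched as a pair of losses. The specific loss composition and the unit weighting are illustrative assumptions, not the paper's exact formulation.

```python
import math

def kl_div(p, q):
    """KL divergence between two prediction distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mse(a, b):
    """Mean-squared error between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def diversified_losses(t_pred, t_feat, s1_pred, s1_feat, s2_pred, s2_feat):
    """Student 1 distils the teacher's predictions, student 2 the teacher's
    feature maps; each also learns the complementary signal from its peer
    (mutual learning)."""
    loss_s1 = kl_div(t_pred, s1_pred) + mse(s2_feat, s1_feat)
    loss_s2 = mse(t_feat, s2_feat) + kl_div(s1_pred, s2_pred)
    return loss_s1, loss_s2
```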
Affiliation(s)
- Usma Niyaz
- Department of Computer Science and Engineering, Indian Institute of Technology Ropar, Rupnagar, 140001, Punjab, India.
- Abhishek Singh Sambyal
- Department of Computer Science and Engineering, Indian Institute of Technology Ropar, Rupnagar, 140001, Punjab, India.
- Deepti R Bathula
- Department of Computer Science and Engineering, Indian Institute of Technology Ropar, Rupnagar, 140001, Punjab, India.
5
López-González CI, Gascó E, Barrientos-Espillco F, Besada-Portas E, Pajares G. Filter pruning for convolutional neural networks in semantic image segmentation. Neural Netw 2024; 169:713-732. [PMID: 37976595] [DOI: 10.1016/j.neunet.2023.11.010]
Abstract
The remarkable performance of Convolutional Neural Networks (CNNs) has increased their use in real-time systems and devices with limited resources. Hence, compacting these networks while preserving accuracy has become necessary, leading to multiple compression methods. However, the majority require intensive iterative procedures and do not delve into the influence of the data used. To overcome these issues, this paper presents several contributions, framed in the context of explainable Artificial Intelligence (xAI): (a) two filter pruning methods for CNNs, which remove the less significant convolutional kernels; (b) a fine-tuning strategy to recover generalization; (c) a layer pruning approach for U-Net; and (d) an explanation of the relationship between performance and the data used. Filter and feature-map information is used in the pruning process: Principal Component Analysis (PCA) is combined with a next-convolution influence metric, while the latter and the mean standard deviation are used in an importance-score distribution-based method. The developed strategies are generic and therefore applicable to different models. Experiments demonstrating their effectiveness are conducted on distinct CNNs and datasets, focusing mainly on semantic segmentation (using U-Net, DeepLabv3+, SegNet, and VGG-16 as highly representative models). Pruned U-Net on agricultural benchmarks achieves a 98.7% parameter and 97.5% FLOPs drop, with a 0.35% gain in accuracy. DeepLabv3+ and SegNet on CamVid reach 46.5% and 72.4% parameter reductions and 51.9% and 83.6% FLOPs drops respectively, with almost no decrease in accuracy. VGG-16 on CIFAR-10 obtains up to an 86.5% parameter and 82.2% FLOPs decrease with a 0.78% accuracy gain.
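An importance score combining feature-map statistics with a next-convolution influence metric might look like the sketch below; the exact combination (product of the map's standard deviation and the mean absolute next-layer weight) is an assumption for illustration, not the paper's formula.

```python
import statistics

def filter_scores(feature_maps, next_layer_weights):
    """Hypothetical importance score: each feature map's standard deviation
    times the mean |weight| with which the next convolution consumes it."""
    scores = []
    for fmap, w in zip(feature_maps, next_layer_weights):
        influence = sum(abs(v) for v in w) / len(w)
        scores.append(statistics.pstdev(fmap) * influence)
    return scores

# A constant feature map scores zero however strongly the next layer reads it.
scores = filter_scores([[1.0, 1.0, 1.0], [0.0, 2.0]], [[0.5, 0.5], [1.0, -1.0]])
```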
Affiliation(s)
- Clara I López-González
- Department of Software Engineering and Artificial Intelligence, Complutense University of Madrid, Madrid, 28040, Spain.
- Esther Gascó
- Department of Software Engineering and Artificial Intelligence, Complutense University of Madrid, Madrid, 28040, Spain.
- Fredy Barrientos-Espillco
- Department of Computer Architecture and Automation, Complutense University of Madrid, Madrid, 28040, Spain.
- Eva Besada-Portas
- Department of Computer Architecture and Automation, Complutense University of Madrid, Madrid, 28040, Spain.
- Gonzalo Pajares
- Institute for Knowledge Technology, Complutense University of Madrid, Madrid, 28040, Spain.
6
Wang W, Zhang Y, Zhu L. DRF-DRC: dynamic receptive field and dense residual connections for model compression. Cogn Neurodyn 2023; 17:1561-1573. [PMID: 37974581] [PMCID: PMC10640440] [DOI: 10.1007/s11571-022-09913-z]
Abstract
Deep convolutional neural networks have achieved remarkable progress on computer vision tasks in recent years. These architectures are mostly designed manually by human experts, which is a time-consuming process and rarely yields the best solution. Hence, neural architecture search (NAS) has become a hot research topic for the design of neural architectures. In this paper, we propose the dynamic receptive field (DRF) operation and measurable dense residual connections (DRC) in the search space for designing efficient networks, i.e., DRENet. The search method can be deployed on the MobileNetV2-based search space. Experimental results on the CIFAR10/100, SVHN, CUB-200-2011, ImageNet, and COCO benchmark datasets, together with an application example in a railway intelligent surveillance system, demonstrate the effectiveness of our scheme, which achieves superior performance.
Affiliation(s)
- Wei Wang
- Avic Xi'an Aircraft Industry Group Company Ltd., Xi'an, 710089, China.
- Yongde Zhang
- Avic Xi'an Aircraft Industry Group Company Ltd., Xi'an, 710089, China.
- Liqiang Zhu
- School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, 100044, China.
7
Zhen C, Zhang W, Mo J, Ji M, Zhou H, Zhu J. RASP: Regularization-based Amplitude Saliency Pruning. Neural Netw 2023; 168:1-13. [PMID: 37734135] [DOI: 10.1016/j.neunet.2023.09.002]
Abstract
Due to the prevalent data-dependent nature of existing pruning criteria, data-independent norm criteria play a crucial role in filter pruning, offering promising prospects for deploying deep neural networks on resource-constrained devices. However, norm criteria based on amplitude measurements have long posed challenges in terms of theoretical feasibility. Existing methods rely on data-derived information, such as derivatives, to establish reasonable pruning standards. Nonetheless, a quantitative analysis of the "smaller-norm-less-important" notion remains elusive within the norm-criterion context. To address the need for data independence and theoretical feasibility, we conducted saliency analysis on filters and proposed a regularization-based amplitude saliency pruning criterion (RASP). This amplitude saliency not only attains data independence but also establishes usage guidelines for norm criteria. We further investigated the amplitude saliency, addressing the issues of data dependency in model evaluation and inter-class filter selection, and introduced model saliency together with an adaptive parameter group lasso (AGL) regularization approach that is sensitive to different layers. Theoretically, we thoroughly analyzed the feasibility of amplitude saliency and employed quantitative saliency analysis to validate the advantages of our method over previous approaches. Experimentally, on the CIFAR-10 and ImageNet image classification benchmarks, we extensively validated the improved performance of our method compared to previous methods. Even when the pruned model has the same or a smaller number of FLOPs, our method achieves equivalent or higher accuracy. Notably, in our ImageNet experiment, RASP achieved a 51.9% reduction in FLOPs while maintaining an accuracy of 76.19% on ResNet-50.
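A norm criterion paired with group-lasso regularization can be sketched generically: penalize each filter's L2 norm during training so that small-norm filters genuinely become unimportant, then prune by a norm threshold. This is a plain group-lasso illustration, not RASP's adaptive, layer-sensitive variant; the threshold and coefficient are assumed values.

```python
import math

def group_lasso_penalty(filters, lam):
    """Group-lasso regulariser: lam times the sum of per-filter L2 norms,
    which drives whole filters toward zero during training."""
    return lam * sum(math.sqrt(sum(w * w for w in f)) for f in filters)

def prune_small_norm(filters, threshold):
    """Keep only the indices of filters whose L2 norm exceeds the threshold,
    i.e. apply the smaller-norm-less-important criterion."""
    return [i for i, f in enumerate(filters)
            if math.sqrt(sum(w * w for w in f)) > threshold]

# A near-zero filter is dropped; the penalty is what pushed it there.
kept = prune_small_norm([[3.0, 4.0], [0.01, 0.02]], threshold=1.0)
```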
Affiliation(s)
- Chenghui Zhen
- College of Information Science and Engineering, Huaqiao University, Xiamen, 361021, Fujian, China.
- Weiwei Zhang
- College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China.
- Jian Mo
- College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China.
- Ming Ji
- College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China.
- Hongbo Zhou
- College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China; Intelligent Software Research Center, Institute of Software Chinese Academy of Sciences, 100190, Beijing, China.
- Jianqing Zhu
- College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China.
8
Gong Z, Chen C, Chen C, Li C, Tian X, Gong Z, Lv X. RamanCMP: A Raman spectral classification acceleration method based on lightweight model and model compression techniques. Anal Chim Acta 2023; 1278:341758. [PMID: 37709483] [DOI: 10.1016/j.aca.2023.341758]
Abstract
In recent years, Raman spectroscopy combined with deep learning techniques has been widely used in fields such as medicine, chemistry, and geology. However, there is still room to optimize the deep learning techniques and model compression algorithms used to process Raman spectral data. To further optimize deep learning models applied to Raman spectroscopy, this study uses time, accuracy, sensitivity, specificity, and the number of floating-point operations (FLOPs) as evaluation metrics, and the resulting model is named RamanCompact (RamanCMP). The experimental data are selected from the RRUFF public dataset, which consists of 723 Raman spectroscopy samples from 10 different mineral categories. In this paper, 1D-EfficientNet and 1D-DRSN, both adapted to spectral data, are proposed to improve classification accuracy. To achieve better classification accuracy while optimizing the time parameters, three model compression methods are designed: knowledge distillation using the 1D-EfficientNet model as a teacher to train a convolutional neural network (CNN); a channel conversion method to optimize the 1D-DRSN model; and using the 1D-DRSN model as a feature extractor in combination with a linear discriminant analysis (LDA) classifier. Compared with traditional LDA and CNN models, the accuracy of 1D-EfficientNet and 1D-DRSN is improved by more than 20%. The inference time of the distilled model is reduced by 9680.9 s compared with the teacher model 1D-EfficientNet, at the cost of 2.07% accuracy. The accuracy of the distilled model is improved by 20% compared to the CNN student model while keeping inference efficiency constant. The 1D-DRSN optimized with the channel conversion method saves 60% of the original 1D-DRSN model's inference time. Feature extraction reduces the inference time of the 1D-DRSN model by 93% with 94.48% accuracy.
This study innovatively combines lightweight models and model compression algorithms to improve the classification speed of deep learning models in the field of Raman spectroscopy, forming a complete set of analysis methods and laying a foundation for future research.
Affiliation(s)
- Zengyun Gong
- College of Software, Xinjiang University, Urumqi, 830046, Xinjiang, China.
- Chen Chen
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, Xinjiang, China.
- Cheng Chen
- College of Software, Xinjiang University, Urumqi, 830046, Xinjiang, China.
- Chenxi Li
- Oncological Department of Oral and Maxillofacial Surgery, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830054, Xinjiang, China.
- Xuecong Tian
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, Xinjiang, China.
- Zhongcheng Gong
- Oncological Department of Oral and Maxillofacial Surgery, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830054, Xinjiang, China; Hospital of Stomatology Xinjiang Medical University, Urumqi, 830054, Xinjiang, China; Stomatological Research Institute of Xinjiang Uygur Autonomous Region, Urumqi, 830054, Xinjiang, China.
- Xiaoyi Lv
- College of Software, Xinjiang University, Urumqi, 830046, Xinjiang, China; Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi, 830046, Xinjiang, China.
9
Jantre S, Bhattacharya S, Maiti T. Layer adaptive node selection in Bayesian neural networks: Statistical guarantees and implementation details. Neural Netw 2023; 167:309-330. [PMID: 37666188] [DOI: 10.1016/j.neunet.2023.08.029]
Abstract
Sparse deep neural networks have proven to be efficient for predictive model building in large-scale studies. Although several works have studied theoretical and numerical properties of sparse neural architectures, they have primarily focused on edge selection. Sparsity through edge selection might be intuitively appealing; however, it does not necessarily reduce the structural complexity of a network. Instead, pruning excess nodes leads to a structurally sparse network with significant computational speedup during inference. To this end, we propose a Bayesian sparse solution using spike-and-slab Gaussian priors to allow for automatic node selection during training. The use of a spike-and-slab prior alleviates the need for an ad-hoc thresholding rule for pruning. In addition, we adopt a variational Bayes approach to circumvent the computational challenges of traditional Markov Chain Monte Carlo (MCMC) implementations. In the context of node selection, we establish the fundamental result of variational posterior consistency together with the characterization of prior parameters. In contrast to previous works, our theoretical development relaxes the assumptions of an equal number of nodes and uniform bounds on all network weights, thereby accommodating sparse networks with layer-dependent node structures or coefficient bounds. With a layer-wise characterization of prior inclusion probabilities, we discuss the optimal contraction rates of the variational posterior. We empirically demonstrate that our proposed approach outperforms the edge selection method in computational complexity with similar or better predictive performance. Our experimental evidence further substantiates that our theoretical work facilitates layer-wise optimal node recovery.
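The appeal of the spike-and-slab prior is that pruning needs no threshold: a node is either exactly zero (spike) or active with a Gaussian weight (slab). A toy sampler illustrating the generative side (the paper's variational training of the inclusion probabilities is not shown):

```python
import random

def spike_and_slab_sample(incl_probs, slab_std, rng):
    """Draw node weights: with probability p a node is 'on' and its weight
    comes from the Gaussian slab; otherwise it sits exactly at the spike
    (zero), so pruning needs no post-hoc threshold."""
    weights = []
    for p in incl_probs:
        if rng.random() < p:
            weights.append(rng.gauss(0.0, slab_std))
        else:
            weights.append(0.0)
    return weights

# A node with inclusion probability 0 is pruned exactly, not approximately.
w = spike_and_slab_sample([1.0, 0.0], slab_std=1.0, rng=random.Random(0))
```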
Affiliation(s)
- Sanket Jantre
- Department of Statistics and Probability, Michigan State University, United States of America.
- Shrijita Bhattacharya
- Department of Statistics and Probability, Michigan State University, United States of America.
- Tapabrata Maiti
- Department of Statistics and Probability, Michigan State University, United States of America.
10
Sun C, Chen J, Li Y, Wang W, Ma T. Random pruning: channel sparsity by expectation scaling factor. PeerJ Comput Sci 2023; 9:e1564. [PMID: 37705629] [PMCID: PMC10495938] [DOI: 10.7717/peerj-cs.1564]
Abstract
Pruning is an efficient method for deep neural network model compression and acceleration. However, existing pruning strategies, both at the filter level and at the channel level, often introduce a large amount of computation and adopt complex methods for finding sub-networks. We find that there is a linear relationship between the sum of matrix elements of a channel in convolutional neural networks (CNNs) and the expectation scaling ratio of the image pixel distribution, which reflects how the expectation of the pixel distribution changes between the input data and the feature mapping. This implies that channels with similar expectation scaling factors (δE) cause similar expectation changes to the input data, thus producing redundant feature mappings. This article therefore proposes a new structured pruning method called EXP. In the proposed method, channels with similar δE are randomly removed in each convolutional layer, so the whole network achieves random sparsity and yields non-redundant, non-unique sub-networks. Experiments on pruning various networks show that EXP achieves a significant reduction in FLOPs. For example, on the CIFAR-10 dataset, EXP reduces the FLOPs of the ResNet-56 model by 71.9% with a 0.23% loss in Top-1 accuracy. On ILSVRC-2012, it reduces the FLOPs of the ResNet-50 model by 60.0% with a 1.13% loss in Top-1 accuracy. Our code is available at https://github.com/EXP-Pruning/EXP_Pruning (DOI: 10.5281/zenodo.8141065).
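The δE grouping can be sketched as: compute each channel's kernel-element sum, group channels whose δE values are close, and randomly keep one representative per group. The tolerance value and the greedy one-pass grouping over sorted factors are illustrative simplifications of the paper's procedure.

```python
import random

def expectation_scaling_factors(channels):
    """delta-E of a channel: the sum of its kernel elements, which scales
    the expectation of the input pixel distribution."""
    return [sum(kernel) for kernel in channels]

def random_prune_similar(channels, tol, rng):
    """Group channels whose delta-E values are within tol of each other
    (one greedy pass over the sorted factors), then randomly keep a single
    representative per group."""
    factors = expectation_scaling_factors(channels)
    order = sorted(range(len(factors)), key=lambda i: factors[i])
    groups, current = [], [order[0]]
    for i in order[1:]:
        if abs(factors[i] - factors[current[-1]]) < tol:
            current.append(i)
        else:
            groups.append(current)
            current = [i]
    groups.append(current)
    return sorted(rng.choice(g) for g in groups)

# Channels 0 and 1 have near-identical delta-E; one of them is dropped at random.
kept = random_prune_similar([[1.0, 1.0], [1.01, 1.0], [5.0, 5.0]], tol=0.5,
                            rng=random.Random(0))
```

The random choice within each group is what makes the surviving sub-network non-unique, matching the abstract's "random sparsity" claim.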
Affiliation(s)
- Chuanmeng Sun
- North University of China, State Key Laboratory of Dynamic Measurement Technology, Taiyuan, Shanxi, China; North University of China, School of Electrical and Control Engineering, Taiyuan, Shanxi, China.
- Jiaxin Chen
- North University of China, State Key Laboratory of Dynamic Measurement Technology, Taiyuan, Shanxi, China; North University of China, School of Electrical and Control Engineering, Taiyuan, Shanxi, China.
- Yong Li
- Chongqing University, State Key Laboratory of Coal Mine Disaster Dynamics and Control, Chongqing, China.
- Wenbo Wang
- North University of China, State Key Laboratory of Dynamic Measurement Technology, Taiyuan, Shanxi, China; North University of China, School of Electrical and Control Engineering, Taiyuan, Shanxi, China.
- Tiehua Ma
- North University of China, State Key Laboratory of Dynamic Measurement Technology, Taiyuan, Shanxi, China; North University of China, School of Electrical and Control Engineering, Taiyuan, Shanxi, China.
11
Shang R, Li W, Zhu S, Jiao L, Li Y. Multi-teacher knowledge distillation based on joint Guidance of Probe and Adaptive Corrector. Neural Netw 2023; 164:345-356. [PMID: 37163850] [DOI: 10.1016/j.neunet.2023.04.015]
Abstract
Knowledge distillation (KD) has been widely used in model compression. However, in current multi-teacher KD algorithms, the student can only passively acquire the knowledge of the teachers' middle layers in a single form, and all teachers apply an identical guiding scheme to the student. To solve these problems, this paper proposes a multi-teacher KD method based on the joint Guidance of a Probe and an Adaptive Corrector (GPAC). First, GPAC proposes a teacher selection strategy guided by the Linear Classifier Probe (LCP), which allows the student to select better teachers at the middle layer; teachers are evaluated using the classification accuracy detected by the LCP. Then, GPAC designs an adaptive multi-teacher instruction mechanism that uses instructional weights to emphasize the student's predicted direction and reduce the student's difficulty in learning from teachers. At the same time, each teacher can formulate its guiding scheme according to the Kullback-Leibler divergence loss between the student and itself. Finally, GPAC develops a multi-level mechanism for adjusting the spatial attention loss. This mechanism uses a piecewise function that varies with the number of epochs and classifies the student's learning of spatial attention into three levels, which efficiently exploits the teachers' spatial attention. GPAC and current state-of-the-art distillation methods are tested on the CIFAR-10 and CIFAR-100 datasets. The experimental results demonstrate that the proposed method obtains higher classification accuracy.
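The LCP-guided selection reduces to: attach a linear probe to each candidate teacher layer, measure the probe's classification accuracy, and let the student learn from the best-scoring teacher. A sketch of that selection step only (probe training itself is omitted, and the toy predictions are assumptions):

```python
def probe_accuracy(probe_preds, labels):
    """Accuracy of a linear classifier probe attached to one teacher layer."""
    return sum(p == y for p, y in zip(probe_preds, labels)) / len(labels)

def select_teacher(teacher_probe_preds, labels):
    """Pick, for this layer, the teacher whose probe classifies best."""
    accs = [probe_accuracy(preds, labels) for preds in teacher_probe_preds]
    return max(range(len(accs)), key=lambda i: accs[i])

# Teacher 0's probe is perfect, teacher 1's is at chance: teacher 0 is chosen.
best = select_teacher([[0, 1, 1, 0], [0, 0, 0, 0]], labels=[0, 1, 1, 0])
```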
Affiliation(s)
- Ronghua Shang
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi, China.
- Wenzheng Li
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Guangzhou Institute of Technology, Xidian University, Guangzhou, Guangdong, China.
- Songling Zhu
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi, China.
- Licheng Jiao
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi, China.
- Yangyang Li
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi, China.
12
|
Yun HI, Park JS. End-to-end emotional speech recognition using acoustic model adaptation based on knowledge distillation. Multimed Tools Appl 2023; 82:22759-22776. [PMID: 36817556 PMCID: PMC9923643 DOI: 10.1007/s11042-023-14680-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/16/2022] [Revised: 07/28/2022] [Accepted: 02/03/2023] [Indexed: 06/01/2023]
Abstract
The end-to-end approach provides better performance in speech recognition than the traditional hidden Markov model-deep neural network (HMM-DNN)-based approach, but still performs poorly on abnormal speech, especially emotional speech. The ideal solution would be to build an acoustic model for each emotion using only that emotion's speech data, but this is impractical because it is difficult to collect a sufficient amount of emotional speech data per emotion. In this study, we propose a method to improve emotional speech recognition performance using knowledge distillation, a technique originally introduced to decrease the computational intensity of deep learning-based approaches by reducing the number of model parameters. Beyond model compression, we employ this technique for model adaptation to emotional speech. The proposed method builds a basic model (referred to as a teacher model) with a large number of parameters using a large amount of normal speech data, and then constructs a target model (referred to as a student model) with fewer parameters using a small amount of emotional speech data (i.e., adaptation data). Since the student model is built with emotional speech data, it is expected to reflect the characteristics of each emotion well. In the emotional speech recognition experiment, the student model maintained recognition performance regardless of the number of model parameters, whereas the teacher model's performance degraded significantly as the number of parameters decreased, with a degradation of about 10% in word error rate. This result demonstrates that the student model serves as an acoustic model suitable for emotional speech recognition even though it does not require much emotional speech data.
Affiliation(s)
- Hong-In Yun
- Department of English Linguistics, Hankuk University of Foreign Studies, Seoul, Republic of Korea
- Jeong-Sik Park
- Department of English Linguistics & Language Technology, Hankuk University of Foreign Studies, Seoul, Republic of Korea

13
Li L, Su W, Liu F, He M, Liang X. Knowledge Fusion Distillation: Improving Distillation with Multi-scale Attention Mechanisms. Neural Process Lett 2023; 55:1-16. [PMID: 36619739 PMCID: PMC9807430 DOI: 10.1007/s11063-022-11132-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 12/12/2022] [Indexed: 01/04/2023]
Abstract
The success of deep learning has brought breakthroughs in many fields. However, the increased performance of deep learning models is often accompanied by an increase in their depth and width, which conflicts with the storage, energy, and computational constraints of edge devices. Knowledge distillation, an effective model compression method, can transfer knowledge from complex teacher models to student models. Self-distillation is a special type of knowledge distillation that does not require a pre-trained teacher model. However, existing self-distillation methods rarely consider how to use the early features of the model effectively. Furthermore, most self-distillation methods use features from the deepest layers of the network to guide the training of its branches, which we find is not the optimal choice. In this paper, we observe that the feature maps obtained by early feature fusion do not serve as a good teacher for guiding their own training. Based on this, we propose a selective feature fusion module and, building on it, a new self-distillation method, knowledge fusion distillation. Extensive experiments on three datasets demonstrate that our method achieves performance comparable to state-of-the-art distillation methods. In addition, the performance of the network can be further enhanced when fused features are integrated into the network.
Affiliation(s)
- Linfeng Li
- School of Artificial Intelligence, Tiangong University, Tianjin 300387, China
- Weixing Su
- School of Computer Science and Technology, Tiangong University, Tianjin 300387, China
- Fang Liu
- School of Software, Tiangong University, Tianjin 300387, China
- Maowei He
- School of Computer Science and Technology, Tiangong University, Tianjin 300387, China
- Xiaodan Liang
- School of Computer Science and Technology, Tiangong University, Tianjin 300387, China

14
Abrar S, Samad MD. Perturbation of deep autoencoder weights for model compression and classification of tabular data. Neural Netw 2022; 156:160-169. [PMID: 36270199 PMCID: PMC9669225 DOI: 10.1016/j.neunet.2022.09.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2022] [Revised: 07/18/2022] [Accepted: 09/19/2022] [Indexed: 11/16/2022]
Abstract
Fully connected deep neural networks (DNN) often include redundant weights leading to overfitting and high memory requirements. Additionally, in tabular data classification, DNNs are challenged by the often superior performance of traditional machine learning models. This paper proposes periodic perturbations (prune and regrow) of DNN weights, especially at the self-supervised pre-training stage of deep autoencoders. The proposed weight perturbation strategy outperforms dropout learning or weight regularization (L1 or L2) for four out of six tabular data sets in downstream classification tasks. Unlike dropout learning, the proposed weight perturbation routine additionally achieves 15% to 40% sparsity across six tabular data sets, resulting in compressed pretrained models. The proposed pretrained model compression improves the accuracy of downstream classification, unlike traditional weight pruning methods that trade off performance for model compression. Our experiments reveal that a pretrained deep autoencoder with weight perturbation can outperform traditional machine learning in tabular data classification, whereas baseline fully-connected DNNs yield the worst classification accuracy. However, traditional machine learning models are superior to any deep model when a tabular data set contains uncorrelated variables. Therefore, the performance of deep models with tabular data is contingent on the types and statistics of constituent variables.
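The periodic prune-and-regrow perturbation described above can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code; the pruning/regrowth fractions and the small-Gaussian re-initialization scale are assumptions.

```python
import numpy as np

def perturb_weights(w, prune_frac=0.2, regrow_frac=0.1, rng=None):
    """One prune-and-regrow step: zero the smallest-magnitude fraction
    of weights, then re-initialize a smaller fraction of the zeroed
    positions with small random values, leaving net sparsity behind."""
    rng = np.random.default_rng(0) if rng is None else rng
    w = w.astype(float).copy()
    flat = w.ravel()                      # view into the copy
    k_prune = int(prune_frac * flat.size)
    k_regrow = int(regrow_frac * flat.size)
    order = np.argsort(np.abs(flat))
    flat[order[:k_prune]] = 0.0           # prune smallest magnitudes
    zeros = np.flatnonzero(flat == 0.0)
    regrow = rng.choice(zeros, size=min(k_regrow, zeros.size),
                        replace=False)
    flat[regrow] = rng.normal(0.0, 0.01, size=regrow.size)
    return flat.reshape(w.shape)
```

Applied periodically during pre-training, each step removes 20% of the weights and revives 10%, so the surviving zeros accumulate as the compressed-model sparsity the abstract reports.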
Affiliation(s)
- Sakib Abrar
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, United States
- Manar D Samad
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, United States

15
Li G, Togo R, Ogawa T, Haseyama M. Compressed gastric image generation based on soft-label dataset distillation for medical data sharing. Comput Methods Programs Biomed 2022; 227:107189. [PMID: 36323177 DOI: 10.1016/j.cmpb.2022.107189] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/27/2021] [Revised: 07/07/2022] [Accepted: 10/17/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Sharing of medical data is required to enable the cross-agency flow of healthcare information and construct high-accuracy computer-aided diagnosis systems. However, the large sizes of medical datasets, the massive amount of memory of saved deep convolutional neural network (DCNN) models, and patients' privacy protection are problems that can lead to inefficient medical data sharing. Therefore, this study proposes a novel soft-label dataset distillation method for medical data sharing. METHODS The proposed method distills valid information of medical image data and generates several compressed images with different data distributions for anonymous medical data sharing. Furthermore, our method can extract essential weights of DCNN models to reduce the memory required to save trained models for efficient medical data sharing. RESULTS The proposed method can compress tens of thousands of images into several soft-label images and reduce the size of a trained model to a few hundredths of its original size. The compressed images obtained after distillation have been visually anonymized; therefore, they do not contain the private information of the patients. Furthermore, we can realize high-detection performance with a small number of compressed images. CONCLUSIONS The experimental results show that the proposed method can improve the efficiency and security of medical data sharing.
Affiliation(s)
- Guang Li
- Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Ren Togo
- Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Takahiro Ogawa
- Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Miki Haseyama
- Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan

16
Li J, Zhao B, Liu D. DMPP: Differentiable multi-pruner and predictor for neural network pruning. Neural Netw 2021; 147:103-112. [PMID: 34998270 DOI: 10.1016/j.neunet.2021.12.020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2021] [Revised: 12/13/2021] [Accepted: 12/23/2021] [Indexed: 10/19/2022]
Abstract
Neural network pruning can trim over-parameterized neural networks effectively by removing a number of network parameters. However, traditional rule-based approaches depend on manual experience, and existing heuristic search methods in discrete search spaces are usually time-consuming and sub-optimal. In this paper, we develop a differentiable multi-pruner and predictor (DMPP) to prune neural networks automatically. The pruner, composed of learnable parameters, generates the pruning ratios of all convolutional layers as a continuous representation of the network. A neural network-based predictor is employed to predict the performance of different structures, which accelerates the search process. Together, the pruner and predictor enable direct gradient-based optimization to find a better structure. In addition, a multi-pruner is presented to improve search efficiency, and knowledge distillation is leveraged to improve the performance of the pruned network. To evaluate the effectiveness of the proposed method, extensive experiments are performed on the CIFAR-10, CIFAR-100, and ImageNet datasets with VGGNet and ResNet. Results show that the proposed DMPP achieves better performance than many previous state-of-the-art methods.
Affiliation(s)
- Jiaxin Li
- School of Automation, Guangdong University of Technology, Guangzhou 510006, China
- Bo Zhao
- School of System Science, Beijing Normal University, Beijing 100875, China
- Derong Liu
- School of Automation, Guangdong University of Technology, Guangzhou 510006, China; Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA

17
Deng X, Zhang Z. Sparsity-control ternary weight networks. Neural Netw 2021; 145:221-232. [PMID: 34773898 DOI: 10.1016/j.neunet.2021.10.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2021] [Revised: 08/10/2021] [Accepted: 10/21/2021] [Indexed: 11/18/2022]
Abstract
Deep neural networks (DNNs) have been widely and successfully applied to various applications, but they require large amounts of memory and computational power, which severely restricts their deployment on resource-limited devices. To address this issue, many efforts have been made on training low-bit-weight DNNs. In this paper, we focus on training ternary weight {-1, 0, +1} networks, which avoid multiplications and dramatically reduce the memory and computation requirements. A ternary weight network can be considered a sparser version of its binary counterpart, obtained by replacing some -1s or 1s in the binary weights with 0s, leading to more efficient inference but higher memory cost. However, existing approaches to training ternary weight networks cannot control the sparsity (i.e., the percentage of 0s) of the ternary weights, which undermines the advantage of ternary weights. In this paper, we propose, to the best of our knowledge, the first sparsity-control approach (SCA) for training ternary weight networks, achieved simply by a weight discretization regularizer (WDR). SCA differs from all existing regularizer-based approaches in that it controls the sparsity of the ternary weights through a controller α and does not rely on gradient estimators. We theoretically and empirically show that the sparsity of the trained ternary weights is positively related to α. SCA is extremely simple and easy to implement, and is shown to consistently and significantly outperform state-of-the-art approaches over several benchmark datasets, even matching the performance of the full-precision counterparts.
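The role of a sparsity controller can be illustrated with the ternary mapping itself: a threshold sends small-magnitude weights to zero, and a larger threshold yields higher sparsity. Note that SCA actually controls sparsity during training through the regularizer weight α rather than a hand-set threshold; this sketch (hypothetical function names) only shows the monotone threshold-to-sparsity relationship that the controller exploits.

```python
import numpy as np

def ternarize(w, delta):
    """Map full-precision weights to {-1, 0, +1}. Weights within
    [-delta, delta] become 0, so a larger delta zeros more weights."""
    t = np.zeros_like(w, dtype=float)
    t[w > delta] = 1.0
    t[w < -delta] = -1.0
    return t

def sparsity(t):
    # Fraction of zeros in the ternary weight tensor.
    return float(np.mean(t == 0.0))
```

For example, raising the threshold from 0.1 to 0.5 strictly grows the set of zeroed weights, mirroring the abstract's claim that sparsity is positively related to the controller.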
Affiliation(s)
- Xiang Deng
- State University of New York at Binghamton, Binghamton, NY, United States
- Zhongfei Zhang
- State University of New York at Binghamton, Binghamton, NY, United States

18
Zhang R, Chung ACS. MedQ: Lossless ultra-low-bit neural network quantization for medical image segmentation. Med Image Anal 2021; 73:102200. [PMID: 34416578 DOI: 10.1016/j.media.2021.102200] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2020] [Revised: 06/30/2021] [Accepted: 07/26/2021] [Indexed: 10/20/2022]
Abstract
Implementing deep convolutional neural networks (CNNs) with boolean arithmetic is ideal for eliminating the notoriously high computational expense of deep learning models. However, although lossless model compression via weight-only quantization has been achieved in previous works, how to reduce the computation precision of CNNs without losing performance remains an open problem, especially for medical image segmentation tasks, where data dimensionality is high and annotation is scarce. This paper presents a novel CNN quantization framework that can squeeze a deep model (both parameters and activations) to extremely low bitwidth, e.g., 1∼2 bits, while maintaining its high performance. In the new method, we first design a strong baseline quantizer with an optimizable quantization range. Then, to relieve the back-propagation difficulty caused by the discontinuous quantization function, we design a radical residual connection scheme that allows gradients to flow freely through every quantized layer. Moreover, a tanh-based derivative function is used to further boost gradient flow, and a distributional loss is employed to regularize the model output. Extensive experiments and ablation studies are conducted on two well-established public 3D segmentation datasets, BRATS2020 and LiTS. The results show that our framework not only outperforms state-of-the-art quantization approaches significantly but also achieves lossless performance on both datasets with ternary (2-bit) quantization.
Affiliation(s)
- Rongzhao Zhang
- Lo Kwee-Seong Medical Image Analysis Laboratory, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong
- Albert C S Chung
- Lo Kwee-Seong Medical Image Analysis Laboratory, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong

19
Abstract
The use of deep neural networks (DNNs) has dramatically elevated the performance of speech enhancement over the last decade. However, to achieve strong enhancement performance typically requires a large DNN, which is both memory and computation consuming, making it difficult to deploy such speech enhancement systems on devices with limited hardware resources or in applications with strict latency requirements. In this study, we propose two compression pipelines to reduce the model size for DNN-based speech enhancement, which incorporates three different techniques: sparse regularization, iterative pruning and clustering-based quantization. We systematically investigate these techniques and evaluate the proposed compression pipelines. Experimental results demonstrate that our approach reduces the sizes of four different models by large margins without significantly sacrificing their enhancement performance. In addition, we find that the proposed approach performs well on speaker separation, which further demonstrates the effectiveness of the approach for compressing speech separation models.
Affiliation(s)
- Ke Tan
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210-1277, USA
- DeLiang Wang
- Department of Computer Science and Engineering and the Center for Cognitive and Brain Sciences, The Ohio State University, Columbus, OH 43210-1277, USA

20
Alkhulaifi A, Alsahli F, Ahmad I. Knowledge distillation in deep learning and its applications. PeerJ Comput Sci 2021; 7:e474. [PMID: 33954248 PMCID: PMC8053015 DOI: 10.7717/peerj-cs.474] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2020] [Accepted: 03/16/2021] [Indexed: 05/20/2023]
Abstract
Deep learning based models are relatively large, and it is hard to deploy such models on resource-limited devices such as mobile phones and embedded devices. One possible solution is knowledge distillation, whereby a smaller model (the student model) is trained using information from a larger model (the teacher model). In this paper, we present an overview of knowledge distillation techniques applied to deep learning models. To compare the performance of different techniques, we propose a new measure called the distillation metric, which compares knowledge distillation solutions based on model size and accuracy score. Based on the survey, some interesting conclusions are drawn and presented, including current challenges and possible research directions.
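One plausible form of such a distillation metric — combining the student/teacher size ratio with accuracy retention so that lower is better — might look like the following. This is a hypothetical sketch; the paper's exact definition may differ, and `alpha` is an assumed trade-off parameter.

```python
def distillation_metric(student_size, teacher_size,
                        student_acc, teacher_acc, alpha=0.5):
    """Score a distillation outcome: smaller models and accuracy
    closer to the teacher both push the score toward 0 (better)."""
    size_term = student_size / teacher_size          # in (0, 1] when compressed
    acc_term = 1.0 - student_acc / teacher_acc       # 0 when accuracy retained
    return alpha * size_term + (1.0 - alpha) * acc_term
```

Under this form, a 10× smaller student that keeps the teacher's accuracy scores far better than an uncompressed one, and losing accuracy raises the score even at the same size.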
21
Wen L, Zhang X, Bai H, Xu Z. Structured pruning of recurrent neural networks through neuron selection. Neural Netw 2019; 123:134-141. [PMID: 31855748 DOI: 10.1016/j.neunet.2019.11.018] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2019] [Revised: 10/01/2019] [Accepted: 11/19/2019] [Indexed: 11/17/2022]
Abstract
Recurrent neural networks (RNNs) have recently achieved remarkable success in a number of applications. However, the huge size and computational burden of these models make their deployment on edge devices difficult. A practically effective approach is to reduce the overall storage and computation costs of RNNs through network pruning. However, pruning methods based on Lasso produce irregular sparse patterns in weight matrices, which do not translate into practical speedup. To address this issue, we propose a structured pruning method through neuron selection that can remove independent neurons of RNNs. More specifically, we introduce two sets of binary random variables, which can be interpreted as gates or switches on the input neurons and the hidden neurons, respectively. We demonstrate that the corresponding optimization problem can be addressed by minimizing the L0 norm of the weight matrix. Finally, experimental results on language modeling and machine reading comprehension tasks indicate the advantages of the proposed method over state-of-the-art pruning competitors. In particular, nearly 20× practical speedup during inference was achieved without losing performance for the language model on the Penn TreeBank dataset, indicating the promising performance of the proposed method.
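The gate-per-neuron idea can be illustrated in a feed-forward simplification: a binary gate that switches off a hidden neuron zeros an entire column of the weight matrix, so the sparsity is structured and the gated columns can be physically removed. This is a sketch with assumed helper names; the paper learns the gates via L0 minimization rather than fixing them by hand.

```python
import numpy as np

def gated_forward(x, W, gates):
    # One binary gate per hidden neuron: gate 0 kills that neuron's
    # whole output column, giving structured (column-wise) sparsity.
    return (x @ W) * gates

def prune_columns(W, gates):
    # Physically remove the gated-off neurons' columns.
    return W[:, np.flatnonzero(gates)]
```

The pruned matrix computes exactly the surviving neurons' activations, which is why structured pruning yields real wall-clock speedup rather than irregular zeros.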
Affiliation(s)
- Liangjian Wen
- SMILE Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610031, China
- Xuanyang Zhang
- SMILE Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610031, China
- Haoli Bai
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin NT 999077, Hong Kong SAR
- Zenglin Xu
- SMILE Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610031, China; Center of Artificial Intelligence, Peng Cheng Lab, Shenzhen, Guangdong, China

22
Patra A, Cai Y, Chatelain P, Sharma H, Drukker L, Papageorghiou A, Noble JA. Efficient Ultrasound Image Analysis Models with Sonographer Gaze Assisted Distillation. Med Image Comput Comput Assist Interv 2019; 22:394-402. [PMID: 31942569 DOI: 10.1007/978-3-030-32251-9_43] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/10/2023]
Abstract
Recent automated medical image analysis methods have attained state-of-the-art performance but rely on memory- and compute-intensive deep learning models. Reducing model size without significant loss in performance metrics is crucial for time- and memory-efficient automated image-based decision-making. Traditional deep learning based image analysis uses expert knowledge only in the form of manual annotations. Recently, there has been interest in introducing other forms of expert knowledge into deep learning architecture design. This is the approach considered in this paper, where we propose to combine ultrasound video with the point-of-gaze tracked from expert sonographers as they scan, in order to train memory-efficient ultrasound image analysis models. Specifically, we develop teacher-student knowledge transfer models for the exemplar task of frame classification for the fetal abdomen, head, and femur. The best-performing memory-efficient models attain performance within 5% of conventional models that are 1000× larger in size.
23
Guo J, Zhou B, Zeng X, Freyberg Z, Xu M. Model Compression for Faster Structural Separation of Macromolecules Captured by Cellular Electron Cryo-Tomography. Image Anal Recognit 2018; 10882:144-152. [PMID: 31231722 PMCID: PMC6588193 DOI: 10.1007/978-3-319-93000-8_17] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/13/2023]
Abstract
Electron Cryo-Tomography (ECT) enables 3D visualization of macromolecule structure inside single cells. Macromolecule classification approaches based on convolutional neural networks (CNN) have been developed to systematically separate the millions of macromolecules captured by ECT. However, given the fast accumulation of ECT data, it will soon become necessary to use CNN models to separate substantially more macromolecules efficiently and accurately at the prediction stage, which incurs additional computational cost. To speed up prediction, we compress the classification models into compact neural networks with little loss in accuracy for deployment. Specifically, we propose to perform model compression through knowledge distillation: first, a complex teacher network is trained to generate soft labels with better classification feasibility; then, customized student networks with simple architectures are trained on the soft labels to reduce model complexity. Our tests demonstrate that the compressed models significantly reduce the number of parameters and time cost while maintaining similar classification accuracy.
Affiliation(s)
- Bo Zhou
- School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
- Xiangrui Zeng
- School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
- Zachary Freyberg
- Departments of Psychiatry and Cell Biology, University of Pittsburgh, Pittsburgh, USA
- Min Xu
- School of Computer Science, Carnegie Mellon University, Pittsburgh, USA

24
Heinrich MP, Blendowski M, Oktay O. TernaryNet: faster deep model inference without GPUs for medical 3D segmentation using sparse and binary convolutions. Int J Comput Assist Radiol Surg 2018; 13:1311-1320. [PMID: 29850978 DOI: 10.1007/s11548-018-1797-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2018] [Accepted: 05/21/2018] [Indexed: 10/16/2022]
Abstract
PURPOSE Deep convolutional neural networks (DCNN) are currently ubiquitous in medical imaging. While their versatility and high-quality results for common image analysis tasks, including segmentation, localisation and prediction, are astonishing, this large representational power comes at the cost of highly demanding computational effort. This limits their practical application to image-guided interventions and diagnostic (point-of-care) support on mobile devices without graphics processing units (GPU). METHODS We propose a new scheme that approximates both trainable weights and neural activations in deep networks by ternary values and tackles the open question of backpropagation when dealing with non-differentiable functions. Our solution enables the removal of the expensive floating-point matrix multiplications throughout any convolutional neural network and replaces them with energy- and time-preserving binary operators and population counts. RESULTS We evaluate our approach for the segmentation of the pancreas in CT. Here, our ternary approximation within a fully convolutional network leads to more than 90% memory reduction and high accuracy (without any post-processing), with a Dice overlap of 71.0% that comes close to the one obtained with networks using high-precision weights and activations. We further provide a concept for sub-second inference without GPUs and demonstrate significant improvements in comparison with binary quantisation and with a variant lacking our proposed ternary hyperbolic tangent continuation. CONCLUSIONS We present a key enabling technique for highly efficient DCNN inference without GPUs that will help bring the advances of deep learning to practical clinical applications. It also has great promise for improving accuracies in large-scale medical data retrieval.
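The replacement of floating-point multiplications by binary operators and population counts rests on a standard identity: for two ±1 vectors packed as bitmasks (bit set ↔ +1), the dot product equals n − 2·popcount(a XOR b), since XOR counts the mismatched positions. A minimal sketch of that identity (illustrative, not the authors' implementation):

```python
def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n ±1 vectors encoded as bitmasks.
    Matching bits contribute +1, mismatching bits -1, so
    dot = (n - d) - d = n - 2*d, with d = popcount(a XOR b)."""
    return n - 2 * bin(a_bits ^ b_bits).count("1")
```

This is what lets a binary or ternary convolution swap every multiply-accumulate for an XOR plus a population count, which is the source of the CPU-only speedups reported above.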
Affiliation(s)
- Mattias P Heinrich
- Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562, Lübeck, Germany
- Max Blendowski
- Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562, Lübeck, Germany
- Ozan Oktay
- Biomedical Image Analysis Group, Department of Computing, Imperial College London, London, SW7 2AZ, UK