1. Tian Y, Xie L, Fang J, Jiao J, Ye Q, Tian Q. Exploring Complicated Search Spaces With Interleaving-Free Sampling. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:7764-7771. PMID: 39024083. DOI: 10.1109/tnnls.2024.3408329.
Abstract
Conventional neural architecture search (NAS) algorithms typically work on search spaces with short-distance node connections. We argue that such designs, though safe and stable, are obstacles to exploring more effective network architectures. In this brief, we explore search algorithms on a complicated search space with long-distance connections and show that existing weight-sharing search algorithms fail due to the presence of interleaved connections (ICs). Based on this observation, we present a simple yet effective algorithm, termed interleaving-free neural architecture search (IF-NAS). We further design a periodic sampling strategy to construct subnetworks during the search procedure so that ICs do not emerge in any of them. In the proposed search space, IF-NAS outperforms both random sampling and previous weight-sharing search algorithms by significant margins. It also generalizes well to cell-based (micro) search spaces. This study emphasizes the importance of macro structure, and we look forward to further efforts in this direction. The code is available at github.com/sunsmarterjie/IFNAS.
2. Yang A, Liu Y, Li C, Ren Q. Deeply Supervised Block-Wise Neural Architecture Search. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2451-2464. PMID: 38231812. DOI: 10.1109/tnnls.2023.3347542.
Abstract
Neural architecture search (NAS) has shown great promise in automatically designing neural network models. Recently, block-wise NAS has been proposed to alleviate the deep coupling between architectures and weights that exists in weight-sharing NAS, by training the huge weight-sharing supernet block by block. However, existing block-wise NAS methods, which resort to either a supervised distillation or a self-supervised contrastive learning scheme to enable block-wise optimization, incur massive computational cost: the former introduces an external high-capacity teacher model, while the latter involves a supernet-scale momentum model and requires a long training schedule. Considering this, we propose a resource-friendly deeply supervised block-wise NAS (DBNAS) method. In DBNAS, we construct a lightweight deeply-supervised module after each block to enable a simple supervised learning scheme and leverage ground-truth labels to indirectly supervise the optimization of each block progressively. In addition, the deeply-supervised module is specifically designed as a structural and functional condensation of the supernet, which establishes global awareness for progressive block-wise optimization and helps search for promising architectures. Experimental results show that DBNAS takes less than one GPU day to discover promising architectures on ImageNet, with a smaller GPU memory footprint than other block-wise NAS works. The best-performing model in the searched DBNAS family achieves 75.6% top-1 accuracy on ImageNet, which is competitive with state-of-the-art NAS models. Our DBNAS family models also achieve good transfer performance on CIFAR-10/100, as well as on two downstream tasks: object detection and semantic segmentation.
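For orientation, the sketch below illustrates the generic pattern of block-wise deep supervision that the abstract describes: a lightweight auxiliary head after each block receives ground-truth labels, and gradients are kept local to each block. This is a minimal sketch assuming a generic PyTorch setting; the DeepSupHead module, the block_wise_step function, and the detach-based progressive schedule are illustrative simplifications, not the authors' exact DBNAS modules.

```python
# Minimal sketch of block-wise deep supervision (illustrative, not the DBNAS implementation).
import torch
import torch.nn as nn

class DeepSupHead(nn.Module):
    """Lightweight auxiliary head attached after a block to receive ground-truth supervision."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        return self.fc(self.pool(x).flatten(1))

def block_wise_step(blocks, heads, images, labels, criterion=nn.CrossEntropyLoss()):
    """One training step: each block is supervised through its own auxiliary head."""
    losses = []
    feat = images
    for block, head in zip(blocks, heads):
        # Detach the block input so gradients stay local to the current block,
        # mimicking progressive block-wise optimization.
        feat = block(feat.detach())
        losses.append(criterion(head(feat), labels))
    return sum(losses)
```

In a weight-sharing setting, each block would be a supernet stage and the auxiliary heads would be discarded after the search.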
3. Zhang C, Meng G, Fan B, Tian K, Zhang Z, Xiang S, Pan C. Reusable Architecture Growth for Continual Stereo Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:6167-6184. PMID: 38502627. DOI: 10.1109/tpami.2024.3378884.
Abstract
The remarkable performance of recent stereo depth estimation models benefits from the successful use of convolutional neural networks to regress dense disparity. As in most tasks, this requires gathering training data that covers the heterogeneous scenes encountered at deployment time. However, training samples are typically acquired continuously in practical applications, making the capability to learn new scenes continually even more crucial. For this purpose, we propose to perform continual stereo matching, where a model is tasked to 1) continually learn new scenes, 2) overcome forgetting of previously learned scenes, and 3) continuously predict disparities at inference. We achieve this goal by introducing a Reusable Architecture Growth (RAG) framework. RAG leverages task-specific neural unit search and architecture growth to learn new scenes continually in both supervised and self-supervised manners, and it maintains high reusability during growth by reusing previous units while obtaining good performance. Additionally, we present a Scene Router module that adaptively selects the scene-specific architecture path at inference. Comprehensive experiments on numerous datasets show that our framework performs impressively under various weather, road, and city conditions and surpasses state-of-the-art methods in more challenging cross-dataset settings. Further experiments also demonstrate the adaptability of our method to unseen scenes, which can facilitate end-to-end stereo architecture learning and practical deployment.
4. Zhen C, Zhang W, Mo J, Ji M, Zhou H, Zhu J. RASP: Regularization-based Amplitude Saliency Pruning. Neural Netw 2023; 168:1-13. PMID: 37734135. DOI: 10.1016/j.neunet.2023.09.002.
Abstract
Because most existing pruning criteria are data-dependent, data-independent norm criteria play a crucial role in filter pruning and offer promising prospects for deploying deep neural networks on resource-constrained devices. However, norm criteria based on amplitude measurements have long posed challenges in terms of theoretical feasibility: existing methods rely on data-derived information, such as derivatives, to establish reasonable pruning standards, and a quantitative analysis of the "smaller-norm-less-important" notion remains elusive within the norm-criterion context. To address the need for data independence and theoretical feasibility, we conduct a saliency analysis on filters and propose a regularization-based amplitude saliency pruning criterion (RASP). This amplitude saliency not only attains data independence but also establishes usage guidelines for norm criteria. We further investigate the amplitude saliency to address data dependency in model evaluation and inter-class filter selection, and introduce model saliency together with an adaptive parameter group lasso (AGL) regularization approach that is sensitive to different layers. Theoretically, we analyze the feasibility of amplitude saliency and employ quantitative saliency analysis to validate the advantages of our method over previous approaches. Experimentally, on the CIFAR-10 and ImageNet image classification benchmarks, we extensively validate the improved performance of our method compared with previous methods: even when the pruned model has the same or a smaller number of FLOPs, it achieves equivalent or higher accuracy. Notably, in our ImageNet experiment, RASP achieves a 51.9% reduction in FLOPs while maintaining 76.19% accuracy on ResNet-50.
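The sketch below shows the two generic ingredients the abstract refers to: an amplitude (norm-based) saliency score per filter and a group-lasso style regularizer that pushes whole filters toward zero. It is a minimal sketch assuming a PyTorch setting; the exact RASP saliency definition and the adaptive per-layer weighting of AGL are not reproduced, and lam is a hypothetical regularization strength.

```python
# Illustrative norm-based filter saliency and group-lasso penalty (not the exact RASP/AGL formulas).
import torch
import torch.nn as nn

def filter_amplitude_saliency(conv: nn.Conv2d) -> torch.Tensor:
    """L2 amplitude of each output filter; smaller norm is treated as less important."""
    w = conv.weight.detach()                 # shape (out_ch, in_ch, k, k)
    return w.flatten(1).norm(p=2, dim=1)     # shape (out_ch,)

def group_lasso_penalty(model: nn.Module, lam: float = 1e-4) -> torch.Tensor:
    """Sum of per-filter L2 norms over all conv layers, driving whole filters toward zero amplitude."""
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            penalty = penalty + m.weight.flatten(1).norm(p=2, dim=1).sum()
    return lam * penalty

# Usage (hypothetical): loss = criterion(model(x), y) + group_lasso_penalty(model)
```

Filters whose amplitude saliency falls below a threshold after regularized training would then be pruned.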
Affiliation(s)
- Chenghui Zhen: College of Information Science and Engineering, Huaqiao University, Xiamen, 361021, Fujian, China.
- Weiwei Zhang: College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China.
- Jian Mo: College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China.
- Ming Ji: College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China.
- Hongbo Zhou: College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China; Intelligent Software Research Center, Institute of Software, Chinese Academy of Sciences, 100190, Beijing, China.
- Jianqing Zhu: College of Engineering, Huaqiao University, Quanzhou, 362021, Fujian, China.
5. Du Y, Zhou X, Huang T, Yang C. A hierarchical evolution of neural architecture search method based on state transition algorithm. Int J Mach Learn Cyb 2023. DOI: 10.1007/s13042-023-01794-w.
6. He J, Ding Y, Zhang M, Li D. Towards efficient network compression via Few-Shot Slimming. Neural Netw 2021; 147:113-125. PMID: 34999388. DOI: 10.1016/j.neunet.2021.12.011.
Abstract
While previous network compression methods achieve great success, most of them rely on abundant training data, which is often unavailable in practice for reasons such as privacy issues, storage constraints, and transmission limitations. A promising way to solve this problem is to perform compression with a small amount of unlabeled data. Proceeding along this way, we propose a novel few-shot network compression framework named Few-Shot Slimming (FSS). FSS follows the student/teacher paradigm and contains two steps: (1) construct the student by inheriting principal feature maps from the teacher; (2) refine the student's feature representation by knowledge distillation with an enhanced mixing data augmentation method called GridMix. Specifically, in the first step, we employ normalized cross-correlation to perform principal feature analysis and then theoretically construct a new indicator to select the most informative feature maps from the teacher for the student. The indicator is based on the variances of feature maps, which efficiently quantifies the information richness of the input feature maps in a feature-agnostic manner. In the second step, we perform knowledge distillation for the student initialized in the first step with a novel grid-based mixing data augmentation technique that greatly extends the limited sample set. In this way, the student refines its feature representation and achieves a better result. Extensive experiments on multiple benchmarks demonstrate the state-of-the-art performance of FSS. For example, using only 0.2% of the full training set as label-free data, FSS yields a 60% FLOPs reduction for DenseNet-40 on CIFAR-10 with a loss of only 0.8% in top-1 accuracy, achieving a result on par with that obtained by conventional full-data methods.
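The sketch below illustrates the two ideas described above in generic form: ranking teacher feature maps by their variance as a proxy for information richness, and a grid-based mixing augmentation that swaps grid cells between two images. It is a minimal sketch assuming a PyTorch setting; the function names, the 4x4 grid, and the swap probability are illustrative assumptions rather than the exact FSS/GridMix implementation.

```python
# Illustrative variance-based channel selection and grid-based mixing (not the exact FSS/GridMix code).
import torch

def select_informative_channels(feats: torch.Tensor, k: int) -> torch.Tensor:
    """feats: (N, C, H, W) teacher feature maps; keep the k channels with the largest variance."""
    per_channel_var = feats.var(dim=(0, 2, 3))        # one variance per channel
    return per_channel_var.topk(k).indices

def grid_mix(x1: torch.Tensor, x2: torch.Tensor, grid: int = 4, p: float = 0.5) -> torch.Tensor:
    """Randomly replace grid cells of x1 with the corresponding cells of x2. x1, x2: (N, C, H, W)."""
    _, _, h, w = x1.shape
    out = x1.clone()
    ch, cw = h // grid, w // grid
    for i in range(grid):
        for j in range(grid):
            if torch.rand(1).item() < p:
                out[:, :, i * ch:(i + 1) * ch, j * cw:(j + 1) * cw] = \
                    x2[:, :, i * ch:(i + 1) * ch, j * cw:(j + 1) * cw]
    return out
```

The mixed images would then be fed to both teacher and student during distillation to enlarge the effective few-shot dataset.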
Affiliation(s)
- Junjie He: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, 310007, China; Zhejiang Provincial Key Laboratory of Information Processing, Communication and Networking, Hangzhou, 310007, China.
- Yinzhang Ding: DAMO Academy, Alibaba Group, Hangzhou, 311121, China.
- Ming Zhang: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, 310007, China; Zhejiang Provincial Key Laboratory of Information Processing, Communication and Networking, Hangzhou, 310007, China.
- Dongxiao Li: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, 310007, China; Zhejiang Provincial Key Laboratory of Information Processing, Communication and Networking, Hangzhou, 310007, China.
7. Zhang M, Li H, Pan S, Chang X, Zhou C, Ge Z, Su S. One-Shot Neural Architecture Search: Maximising Diversity to Overcome Catastrophic Forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021; 43:2921-2935. PMID: 33147140. DOI: 10.1109/tpami.2020.3035351.
Abstract
One-shot neural architecture search (NAS) has recently become mainstream in the NAS community because it significantly improves computational efficiency through weight sharing. However, the supernet training paradigm in one-shot NAS introduces catastrophic forgetting: each training step can deteriorate the performance of other architectures that share part of their weights with the current architecture. To overcome this problem, we formulate supernet training for one-shot NAS as a constrained continual learning optimization problem, such that learning the current architecture does not degrade the validation accuracy of previous architectures. The key to solving this constrained optimization problem is a novelty search based architecture selection (NSAS) loss function that regularizes supernet training by using a greedy novelty search method to find the most representative subset of architectures. We applied the NSAS loss function to two one-shot NAS baselines and extensively tested them on both a common search space and a NAS benchmark dataset. We further derive three variants of the NSAS loss function: NSAS with a depth constraint (NSAS-C) to improve transferability, and NSAS-G and NSAS-LG to handle situations with a limited number of constraints. The experiments on the common NAS search space demonstrate that NSAS and its variants improve the predictive ability of supernet training in one-shot NAS, with remarkable and efficient performance on the CIFAR-10, CIFAR-100, and ImageNet datasets. The results on the NAS benchmark dataset also confirm the significant improvements these one-shot NAS baselines can make.
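To make the "greedy novelty search for a representative subset" step concrete, the sketch below greedily picks architectures that are maximally dissimilar from those already selected. It is a minimal sketch under assumptions of my own: architectures are encoded as fixed-length tuples, the Hamming-style distance and the farthest-point rule are illustrative, and none of this reproduces the exact NSAS loss.

```python
# Illustrative greedy, diversity-maximizing subset selection (not the exact NSAS formulation).
import random

def novelty(candidate, selected):
    """Average Hamming-style distance of a candidate encoding to the already selected subset."""
    if not selected:
        return float("inf")
    return sum(sum(a != b for a, b in zip(candidate, s)) for s in selected) / len(selected)

def greedy_novelty_subset(architectures, subset_size):
    """Greedily add the most novel architecture; assumes subset_size <= len(architectures)."""
    selected = [random.choice(architectures)]
    while len(selected) < subset_size:
        best = max((a for a in architectures if a not in selected),
                   key=lambda a: novelty(a, selected))
        selected.append(best)
    return selected
```

The resulting subset could then serve as the set of constraint architectures whose performance the supernet update is regularized to preserve.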
8. Zhang X, Chang J, Guo Y, Meng G, Xiang S, Lin Z, Pan C. DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021; 43:2905-2920. PMID: 32866094. DOI: 10.1109/tpami.2020.3020315.
Abstract
Neural architecture search (NAS) is inherently subject to a gap between the architectures used during searching and validating. To bridge this gap effectively, we develop Differentiable ArchiTecture Approximation (DATA) with an Ensemble Gumbel-Softmax (EGS) estimator and an Architecture Distribution Constraint (ADC) to automatically approximate architectures during searching and validating in a differentiable manner. Technically, the EGS estimator consists of a group of Gumbel-Softmax estimators, which is capable of converting probability vectors to binary codes while passing gradients backward, reducing the estimation bias in a differentiable way. Further, to narrow the distribution gap between sampled architectures and the supernet, the ADC is introduced to reduce the variance of sampling during searching. Benefiting from such modeling, architecture probabilities and network weights in the NAS model can be jointly optimized with standard back-propagation, yielding an end-to-end learning mechanism for searching deep neural architectures in an extended search space. Consequently, in the validation process, a high-performance architecture close to the one learned during searching is readily built. Extensive experiments on various tasks, including image classification, few-shot learning, unsupervised clustering, semantic segmentation, and language modeling, strongly demonstrate that DATA is capable of discovering high-performance architectures while guaranteeing the required efficiency. Code is available at https://github.com/XinbangZhang/DATA-NAS.
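The sketch below shows one way an "ensemble of Gumbel-Softmax estimators" can turn operation logits into a binary code while keeping gradients via the straight-through trick. It is a minimal sketch assuming a PyTorch setting; combining the draws by elementwise maximum and the group size k are assumptions of mine, not the exact EGS estimator of DATA.

```python
# Illustrative ensemble of hard Gumbel-Softmax draws combined into a binary code (not the exact EGS).
import torch
import torch.nn.functional as F

def ensemble_gumbel_softmax(logits: torch.Tensor, k: int = 3, tau: float = 1.0) -> torch.Tensor:
    """logits: (num_ops,) architecture parameters for one edge.
    Returns a binary code with straight-through gradients from each draw."""
    draws = torch.stack([F.gumbel_softmax(logits, tau=tau, hard=True) for _ in range(k)])
    return draws.max(dim=0).values  # union of the k one-hot draws

# Usage (hypothetical): mask = ensemble_gumbel_softmax(edge_logits)
#                       out = sum(m * op(x) for m, op in zip(mask, candidate_ops))
```

Averaging or summing the draws instead of taking the maximum would be an equally plausible combination rule under these assumptions.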