Fast and Effective: A Novel Sequential Single-Path Search for Mixed-Precision-Quantized Networks

IEEE Transactions on Cybernetics · 2022
Cited by 18
ABS 3

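Summary

The paper proposes a sequential single-path search method that quickly determines the quantization bit width of each layer of a deep neural network under constraints such as hardware resources. Experiments show that the method outperforms uniform-precision models across multiple architectures and datasets.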

Abstract

Model quantization can reduce model size and computational latency, and it has been successfully applied in many applications on mobile phones, embedded devices, and smart chips. Mixed-precision quantization models can assign a different bit precision to each layer according to that layer's sensitivity, and thereby achieve strong performance. However, it is difficult to quickly determine the quantization bit precision of each layer in a deep neural network under given constraints (for example, hardware resources, energy consumption, model size, and computational latency). In this article, a novel sequential single-path search (SSPS) method for mixed-precision model quantization is proposed, in which the given constraints are introduced to guide the search process. A single-path search cell is proposed and used to build a fully differentiable supernet, which can be optimized by gradient-based algorithms. Moreover, we sequentially determine the candidate precisions according to their selection certainties, which exponentially reduces the search space and speeds up the convergence of the search process. Experiments show that our method can efficiently search mixed-precision models for different architectures (for example, ResNet-20, ResNet-18, ResNet-34, ResNet-50, and MobileNet-V2) and datasets (for example, CIFAR-10, ImageNet, and COCO) under given constraints, and our experimental results verify that the models found by SSPS significantly outperform their uniform-precision counterparts.
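The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of fixing per-layer precisions in order of selection certainty. It is not the authors' implementation. Everything in it is an assumption made for illustration: the candidate set `CANDIDATE_BITS`, the use of a softmax over per-layer logits as the "selection certainty", the threshold value, and the helper names `sequential_fix` and `expected_size_bits`.

```python
import numpy as np

# Hypothetical candidate bit widths; the abstract does not list the real ones.
CANDIDATE_BITS = np.array([2, 4, 8])


def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def sequential_fix(logits, threshold=0.9):
    """Fix the bit width of every layer whose top probability reaches `threshold`.

    `logits` has shape (num_layers, num_candidates). The returned dict maps a
    layer index to its chosen bit width; undecided layers are left out.
    (Assumed stand-in for "selection certainty", not the paper's definition.)
    """
    probs = softmax(logits)
    certainty = probs.max(axis=1)
    choice = probs.argmax(axis=1)
    return {
        i: int(CANDIDATE_BITS[choice[i]])
        for i in range(logits.shape[0])
        if certainty[i] >= threshold
    }


def expected_size_bits(logits, params_per_layer):
    """Expected model size in bits under the current selection probabilities.

    A differentiable quantity of this kind could serve as a constraint term
    in a search loss (an assumption; the abstract only says constraints guide
    the search).
    """
    expected_bits = softmax(logits) @ CANDIDATE_BITS
    return float(expected_bits @ params_per_layer)


def remaining_search_space(num_layers, decided):
    """Number of precision assignments still open after fixing some layers."""
    return len(CANDIDATE_BITS) ** (num_layers - len(decided))
```

With three layers and three candidates there are 27 possible assignments. Fixing two layers leaves only 3, which illustrates why deciding layers one after another shrinks the search space exponentially.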

Model quantization · Mixed precision · Neural networks · Deep learning · Computer vision