Overcoming Oscillations in Quantization-Aware Training

Paper title

Overcoming Oscillations in Quantization-Aware Training

Authors

Markus Nagel, Marios Fournarakis, Yelysei Bondarenko, Tijmen Blankevoort

Abstract

When training neural networks with simulated quantization, we observe that quantized weights can, rather unexpectedly, oscillate between two grid points. The importance of this effect and its impact on quantization-aware training (QAT) are not well understood or investigated in the literature. In this paper, we delve deeper into the phenomenon of weight oscillations and show that it can lead to significant accuracy degradation due to wrongly estimated batch-normalization statistics during inference and increased noise during training. These effects are particularly pronounced in low-bit ($\leq$ 4 bits) quantization of efficient networks with depth-wise separable layers, such as MobileNets and EfficientNets. In our analysis, we investigate several previously proposed QAT algorithms and show that most of them are unable to overcome oscillations. Finally, we propose two novel QAT algorithms to overcome oscillations during training: oscillation dampening and iterative weight freezing. We demonstrate that our algorithms achieve state-of-the-art accuracy for low-bit (3 and 4 bits) weight and activation quantization of efficient architectures, such as MobileNetV2, MobileNetV3, and EfficientNet-lite, on ImageNet. Our source code is available at https://github.com/qualcomm-ai-research/oscillations-qat.
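
The oscillation described in the abstract can be reproduced in a toy setting. The following is a minimal sketch, not taken from the paper's source code. It assumes a single scalar latent weight, a squared-error loss against a target lying between two grid points, round-to-nearest quantization, and a straight-through estimator (the gradient of the rounding step is treated as 1). The names `quantize`, `simulate`, `w_target`, `w_init`, and `lr` are hypothetical and chosen only for illustration.

```python
def quantize(w, scale=1.0):
    """Round-to-nearest onto a uniform grid with step `scale`."""
    return round(w / scale) * scale


def simulate(w_target=0.3, w_init=0.45, lr=0.1, steps=40, scale=1.0):
    """Run SGD on the latent weight and record the quantized weight per step.

    Loss: 0.5 * (quantize(w) - w_target) ** 2.
    Straight-through estimator (assumption): d quantize(w) / d w := 1,
    so the gradient with respect to the latent weight is quantize(w) - w_target.
    """
    w = w_init
    history = []
    for _ in range(steps):
        q = quantize(w, scale)
        history.append(q)
        grad = q - w_target  # STE gradient
        w -= lr * grad       # update the latent (full-precision) weight
    return history


if __name__ == "__main__":
    history = simulate()
    flips = sum(1 for a, b in zip(history, history[1:]) if a != b)
    print("quantized weights:", history)
    print("grid-point flips:", flips)
```

In this sketch the latent weight is pushed toward the rounding threshold from both sides, so the quantized weight keeps switching between the two neighbouring grid points instead of settling on one. Because the updates must balance out over time, the share of steps spent on the upper grid point tends toward the target's relative position between the two points.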
