Paper Title
LANCE: Efficient Low-Precision Quantized Winograd Convolution for Neural Networks Based on Graphics Processing Units
Paper Authors

Paper Abstract
Accelerating deep convolutional neural networks has become an active topic and has sparked interest in both academia and industry. In this paper, we propose an efficient low-precision quantized Winograd convolution algorithm, called LANCE, which combines the advantages of fast convolution and quantization techniques. By embedding linear quantization operations into the Winograd domain, the fast convolution can be performed efficiently under low-precision computation on graphics processing units. We test neural network models with LANCE on representative image classification datasets, including SVHN, CIFAR, and ImageNet. The experimental results show that our 8-bit quantized Winograd convolution improves performance by up to 2.40x over full-precision convolution with trivial accuracy loss.
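The core idea described in the abstract can be illustrated with a small NumPy sketch (our own illustration under stated assumptions, not the authors' implementation): for a single Winograd F(2x2, 3x3) tile, the filter and input transforms are computed in full precision, the transformed tiles are linearly quantized to 8-bit integer values, the element-wise multiplication is carried out on the quantized values, and the result is dequantized before the output transform. The matrices B^T, G, A^T are the standard Winograd transforms, and the helper `quantize` is a generic symmetric linear quantizer, not code from the paper.

```python
import numpy as np

# Standard Winograd F(2x2, 3x3) transform matrices.
B_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=np.float32)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=np.float32)
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=np.float32)

def quantize(x, bits=8):
    """Symmetric linear quantization: returns integer values and the scale factor."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax + 1e-12
    return np.round(x / scale).astype(np.int32), scale

def quantized_winograd_f2x2_3x3(d, g, bits=8):
    """One 4x4 input tile `d` and one 3x3 filter `g` -> 2x2 output tile.
    The element-wise product is performed on quantized Winograd-domain values."""
    U = G @ g @ G.T          # filter in the Winograd domain (4x4)
    V = B_T @ d @ B_T.T      # input tile in the Winograd domain (4x4)
    Uq, su = quantize(U, bits)
    Vq, sv = quantize(V, bits)
    # Low-precision multiply (accumulated here in 32-bit), then dequantize.
    M = (Uq * Vq).astype(np.float32) * (su * sv)
    return A_T @ M @ A_T.T   # output transform back to the spatial domain (2x2)

# Quick check against direct (sliding-window) convolution on a random tile;
# the difference should be small, bounded by the 8-bit quantization error.
rng = np.random.default_rng(0)
d = rng.standard_normal((4, 4)).astype(np.float32)
g = rng.standard_normal((3, 3)).astype(np.float32)
ref = np.array([[(d[i:i + 3, j:j + 3] * g).sum() for j in range(2)]
                for i in range(2)])
print(np.abs(quantized_winograd_f2x2_3x3(d, g) - ref).max())
```

In this sketch the quantization scales are taken per tile from the data; an actual GPU kernel would additionally map the integer multiply-accumulate onto low-precision hardware instructions, which is what yields the reported speedup over full-precision convolution.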