Paper Title
FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA
Paper Authors
Paper Abstract
Autoregressive convolutional neural networks (CNNs) have been widely exploited for sequence generation tasks such as audio synthesis, language modeling, and neural machine translation. WaveNet is a deep autoregressive CNN composed of several stacked layers of dilated convolution that is used for sequence generation. While WaveNet produces state-of-the-art audio generation results, the naive inference implementation is quite slow; it takes a few minutes to generate just one second of audio on a high-end GPU. In this work, we develop \textit{FastWave}, the first accelerator platform for autoregressive convolutional neural networks, and address the associated design challenges. We design the Fast-Wavenet inference model in Vivado HLS and perform a wide range of optimizations including fixed-point implementation, array partitioning, and pipelining. Our model uses a fully parameterized parallel architecture for fast matrix-vector multiplication that enables per-layer customized latency fine-tuning for further throughput improvement. Our experiments comparatively assess the trade-off between throughput and resource utilization for various optimizations. Our best WaveNet design on the Xilinx XCVU13P FPGA, which uses only on-chip memory, achieves 66x faster generation than a CPU implementation and 11x faster generation than a GPU implementation.
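To make the abstract's contrast between naive and Fast-Wavenet inference concrete, here is a minimal sketch (not the authors' implementation) of the queue-based caching trick for one dilated causal convolution layer with kernel size 2: the layer caches its past inputs in a queue of length equal to its dilation, so each generated sample costs O(1) work per layer instead of recomputing the whole receptive field. The struct name, the std::deque-based cache, and the kernel-size-2 restriction are all illustrative assumptions.

    // One dilated causal conv layer with a Fast-Wavenet-style input cache.
    #include <deque>
    #include <iostream>

    struct DilatedLayer {
        int dilation;
        float w_past, w_curr;     // the two taps of a kernel-size-2 filter
        std::deque<float> cache;  // last `dilation` inputs, oldest in front

        DilatedLayer(int d, float wp, float wc)
            : dilation(d), w_past(wp), w_curr(wc), cache(d, 0.0f) {}

        // One autoregressive step: combine the input from `dilation` steps
        // ago with the current input, then cache the current input.
        float step(float x) {
            float past = cache.front();
            cache.pop_front();
            cache.push_back(x);
            return w_past * past + w_curr * x;
        }
    };

    int main() {
        DilatedLayer layer(4, 0.5f, 0.5f);  // dilation 4, averaging taps
        for (int t = 0; t < 8; ++t)
            std::cout << layer.step(1.0f) << "\n";  // 0.5 until the cache fills
    }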
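The abstract also names the HLS optimizations applied to the matrix-vector multiply at the core of each layer. The following is a minimal Vivado HLS sketch, an illustration rather than the paper's kernel, showing fixed-point arithmetic, array partitioning, and pipelining together. The dimensions N and M, the ap_fixed<16,6> format, and the function name matvec are assumptions for the example; compiling it requires the Vivado HLS ap_fixed.h header.

    // Pipelined, array-partitioned fixed-point matrix-vector multiply.
    #include "ap_fixed.h"

    typedef ap_fixed<16, 6> dtype;  // 16-bit fixed point, 6 integer bits (assumed)

    const int N = 64;  // number of output neurons (illustrative)
    const int M = 64;  // input vector length (illustrative)

    void matvec(const dtype W[N][M], const dtype x[M], dtype y[N]) {
    // Partition the weight columns and input vector into registers so all
    // M multiply-accumulates of a row can issue in the same cycle.
    #pragma HLS ARRAY_PARTITION variable=W complete dim=2
    #pragma HLS ARRAY_PARTITION variable=x complete dim=1
    rows:
        for (int i = 0; i < N; ++i) {
    // Pipelining the row loop with II=1 unrolls the inner column loop,
    // producing one output element per clock once the pipeline fills.
    #pragma HLS PIPELINE II=1
            dtype acc = 0;
        cols:
            for (int j = 0; j < M; ++j)
                acc += W[i][j] * x[j];
            y[i] = acc;
        }
    }

In the fully parameterized design the abstract describes, the partition and unroll factors would presumably be per-layer knobs rather than the complete partitioning shown here, which is how latency can be fine-tuned against resource utilization layer by layer.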