Adaptive Activation-based Structured Pruning

Paper Title

Adaptive Activation-based Structured Pruning

Authors

Kaiqi Zhao, Animesh Jain, Ming Zhao

Abstract

Pruning is a promising approach to compress complex deep learning models in order to deploy them on resource-constrained edge devices. However, many existing pruning solutions are based on unstructured pruning, which yields models that cannot efficiently run on commodity hardware and require users to manually explore and tune the pruning process, which is time-consuming and often leads to sub-optimal results. To address these limitations, this paper presents an adaptive, activation-based, structured pruning approach to automatically and efficiently generate small, accurate, and hardware-efficient models that meet user requirements. First, it proposes iterative structured pruning using activation-based attention feature maps to effectively identify and prune unimportant filters. Then, it proposes adaptive pruning policies for automatically meeting the pruning objectives of accuracy-critical, memory-constrained, and latency-sensitive tasks. A comprehensive evaluation shows that the proposed method can substantially outperform the state-of-the-art structured pruning works on CIFAR-10 and ImageNet datasets. For example, on ResNet-56 with CIFAR-10, without any accuracy drop, our method achieves the largest parameter reduction (79.11%), outperforming the related works by 22.81% to 66.07%, and the largest FLOPs reduction (70.13%), outperforming the related works by 14.13% to 26.53%.
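The abstract does not give the scoring formula, so the following is only a minimal sketch of what activation-based filter ranking could look like. It assumes each filter is scored by the mean of |activation|^p over all samples and spatial positions, and that a fixed fraction of the lowest-scoring filters is removed in one pruning step. The names `filter_scores`, `filters_to_prune`, `p`, and `prune_ratio` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence

# A feature map is a 2-D grid (height x width) of activations for one filter.
FeatureMap = Sequence[Sequence[float]]


def filter_scores(activations: Sequence[Sequence[FeatureMap]], p: int = 2) -> List[float]:
    """Score each filter by the mean of |activation|**p over samples and positions.

    activations[n][c] is the feature map of filter c for input sample n.
    This scoring rule is an assumption, not the paper's stated formula.
    """
    num_filters = len(activations[0])
    totals = [0.0] * num_filters
    counts = [0] * num_filters
    for sample in activations:
        for c, fmap in enumerate(sample):
            for row in fmap:
                for value in row:
                    totals[c] += abs(value) ** p
                    counts[c] += 1
    return [total / count for total, count in zip(totals, counts)]


def filters_to_prune(scores: Sequence[float], prune_ratio: float) -> List[int]:
    """Return the indices of the lowest-scoring filters, in ascending index order."""
    num_prune = int(len(scores) * prune_ratio)
    # Rank filter indices from least to most important.
    ranked = sorted(range(len(scores)), key=lambda c: scores[c])
    return sorted(ranked[:num_prune])
```

In an iterative scheme like the one the abstract describes, a step of this kind would presumably be repeated, with retraining between rounds, until the accuracy, memory, or latency target is met. That loop and its stopping policy are not shown here.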
