Paper Title


MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation

Paper Authors

Benlin Liu, Yongming Rao, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh

Paper Abstract


Knowledge Distillation (KD) has been one of the most popular methods to learn a compact model. However, it still suffers from the high demand in time and computational resources caused by the sequential training pipeline. Furthermore, the soft targets from deeper models do not often serve as good cues for the shallower models due to the gap in compatibility. In this work, we consider these two problems at the same time. Specifically, we propose that better soft targets with higher compatibility can be generated by using a label generator to fuse the feature maps from deeper stages in a top-down manner, and we can employ the meta-learning technique to optimize this label generator. Utilizing the soft targets learned from the intermediate feature maps of the model, we can achieve better self-boosting of the network in comparison with the state-of-the-art. The experiments are conducted on two standard classification benchmarks, namely CIFAR-100 and ILSVRC2012. We test various network architectures to show the generalizability of our MetaDistiller. The experimental results on the two datasets strongly demonstrate the effectiveness of our method.
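The abstract only outlines the mechanism, so the fragment below is a minimal, hypothetical PyTorch sketch of the core idea: fusing deeper-stage feature maps top-down into per-stage soft targets that supervise shallower auxiliary classifiers via a distillation loss. All names and choices here (LabelGenerator, self_distillation_loss, the 256-channel fusion width, temperature T=4) are illustrative assumptions rather than the authors' implementation, and the meta-learning outer loop that actually optimizes the label generator is omitted.

```python
# Hypothetical sketch of top-down soft-target generation for self-distillation.
# Not the authors' code; the meta-learning update of the generator is not shown.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelGenerator(nn.Module):
    """Fuses deeper-stage feature maps top-down into per-stage soft targets."""
    def __init__(self, stage_channels, num_classes):
        super().__init__()
        # 1x1 convs project each stage to a common width before fusion (width is an assumption)
        self.lateral = nn.ModuleList([nn.Conv2d(c, 256, 1) for c in stage_channels])
        self.heads = nn.ModuleList([nn.Linear(256, num_classes) for _ in stage_channels])

    def forward(self, feats):
        # feats: list of feature maps ordered shallow -> deep
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        fused = laterals[-1]
        soft_targets = []
        # walk from the deepest stage down to the shallowest, fusing as we go
        for i in range(len(laterals) - 1, -1, -1):
            if i < len(laterals) - 1:
                fused = laterals[i] + F.interpolate(
                    fused, size=laterals[i].shape[-2:], mode="nearest")
            pooled = F.adaptive_avg_pool2d(fused, 1).flatten(1)
            soft_targets.append(self.heads[i](pooled))
        return list(reversed(soft_targets))  # back to shallow -> deep order

def self_distillation_loss(aux_logits, soft_targets, T=4.0):
    """KL divergence between each auxiliary classifier and its generated soft target."""
    loss = 0.0
    for student, teacher in zip(aux_logits, soft_targets):
        loss = loss + F.kl_div(F.log_softmax(student / T, dim=1),
                               F.softmax(teacher.detach() / T, dim=1),
                               reduction="batchmean") * (T * T)
    return loss / len(aux_logits)
```

In the paper, the generator itself would additionally be optimized with a meta objective (i.e., updated so that the main network improves after training on the generated soft targets); this sketch only covers the top-down fusion and the distillation loss.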
