Paper Title
Dynamic Kernel Selection for Improved Generalization and Memory Efficiency in Meta-learning
Paper Authors
Paper Abstract
Gradient-based meta-learning methods are prone to overfitting on the meta-training set, and this behaviour is more prominent with large and complex networks. Moreover, large networks restrict the application of meta-learning models on low-power edge devices. While choosing smaller networks avoids these issues to a certain extent, it affects the overall generalization, leading to reduced performance. Clearly, there is an approximately optimal choice of network architecture that is best suited for every meta-learning problem; however, identifying it beforehand is not straightforward. In this paper, we present MetaDOCK, a task-specific dynamic kernel selection strategy for designing compressed CNN models that generalize well on unseen tasks in meta-learning. Our method is based on the hypothesis that for a given set of similar tasks, not all kernels of the network are needed by each individual task. Rather, each task uses only a fraction of the kernels, and the selection of the kernels per task can be learnt dynamically as a part of the inner update steps. MetaDOCK compresses the meta-model as well as the task-specific inner models, thus providing a significant reduction in model size for each task, and by constraining the number of active kernels for every task, it implicitly mitigates the issue of meta-overfitting. We show that for the same inference budget, pruned versions of large CNN models obtained using our approach consistently outperform the conventional choices of CNN models. MetaDOCK couples well with popular meta-learning approaches such as iMAML. The efficacy of our method is validated on the CIFAR-FS and mini-ImageNet datasets, and we have observed that our approach can provide improvements in model accuracy of up to 2% on standard meta-learning benchmarks, while reducing the model size by more than 75%.
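The abstract does not specify implementation details, so the following is only a minimal illustrative sketch of the core idea it describes: per-task kernel (filter) gates that are adapted in the inner loop, with a sparsity penalty so each task activates only a fraction of the kernels, and a hard top-k selection at inference to realize the pruned per-task model. All names here (`GatedConv2d`, `inner_adapt`, `keep_ratio`, `sparsity`) are assumptions for illustration, not the authors' code or the MetaDOCK/iMAML training procedure.

```python
# Illustrative sketch only: per-task kernel gating for a conv layer (assumed design,
# not the authors' released implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedConv2d(nn.Module):
    """Conv layer whose output kernels (filters) can be switched on/off per task."""

    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        # One learnable gate logit per output kernel; meta-initialisation shared across tasks.
        self.gate_logits = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x, hard=False, keep_ratio=0.5):
        g = torch.sigmoid(self.gate_logits)              # soft gates in (0, 1)
        if hard:                                         # inference: keep only top-k kernels
            k = max(1, int(keep_ratio * g.numel()))
            idx = torch.topk(g, k).indices
            g = torch.zeros_like(g).scatter_(0, idx, 1.0)
        # Masked weights/bias; gated-off kernels contribute nothing and can be pruned away.
        return F.conv2d(x, self.conv.weight * g.view(-1, 1, 1, 1),
                        self.conv.bias * g, padding=self.conv.padding)


def inner_adapt(layer, support_x, support_y, loss_fn, steps=5, lr=0.1, sparsity=1e-3):
    """Per-task inner loop: adapt gates (and weights) on the support set.
    An L1-style penalty on the soft gates encourages each task to use few kernels."""
    params = [layer.gate_logits, layer.conv.weight, layer.conv.bias]
    opt = torch.optim.SGD(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = layer(support_x).mean(dim=(2, 3))       # toy head: pooled features as logits
        loss = loss_fn(logits, support_y) + sparsity * torch.sigmoid(layer.gate_logits).sum()
        loss.backward()
        opt.step()
    return layer


if __name__ == "__main__":
    layer = GatedConv2d(3, 8)
    x = torch.randn(4, 3, 32, 32)
    y = torch.randint(0, 8, (4,))
    inner_adapt(layer, x, y, nn.CrossEntropyLoss())
    out = layer(x, hard=True, keep_ratio=0.25)           # only 2 of 8 kernels stay active
    print(out.shape)
```

In this reading, the meta-model would carry the shared weights and gate initialisation, while each task's inner updates specialise the gates; hard top-k gating then yields the compressed, task-specific model referred to in the abstract.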