Paper Title
Integral Continual Learning Along the Tangent Vector Field of Tasks
Authors
Abstract
We propose a lightweight continual learning method which incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models. The tangent plane to the specialist model acts as a generalist guide and avoids the kind of over-fitting that leads to catastrophic forgetting, while exploiting the convexity of the optimization landscape in the tangent plane. It maintains a small fixed-size memory buffer, as low as 0.4% of the source datasets, which is updated by simple resampling. Our method achieves strong performance across various buffer sizes for different datasets. Specifically, in the class-incremental setting we outperform the existing methods that do not require distillation by an average of 18.77% and 28.48%, for Seq-CIFAR-10 and Seq-TinyImageNet respectively. Our method can easily be used in conjunction with existing replay-based continual learning methods. When memory buffer constraints are relaxed to allow storage of metadata such as logits, we attain an error reduction of 17.84% towards the paragon performance on Seq-CIFAR-10.
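The "tangent plane" of the abstract refers to a first-order Taylor expansion of the network around a fixed set of weights: the linearized model's output is linear in the weight perturbation, so any convex loss remains convex in that perturbation. Below is a minimal sketch of such a linearized forward pass in PyTorch; the names `linearized_forward`, `theta_star`, and `delta` are ours for illustration, not the paper's, and the paper's actual construction along the vector field of tasks may differ.

```python
import torch
from torch.func import functional_call, jvp

def linearized_forward(model, theta_star, delta, x):
    """First-order Taylor expansion of the network around theta_star:

        f_lin(x; delta) = f(x; theta*) + J_theta f(x; theta*) @ delta

    The output is linear in `delta`, so a convex loss (e.g. cross-entropy)
    stays convex in `delta` -- the convexity the abstract exploits.

    theta_star: dict of parameter tensors (the expansion point)
    delta:      dict with the same structure (the tangent-plane offset)
    """
    def f(params):
        return functional_call(model, params, (x,))

    # jvp returns (f(theta_star), J(theta_star) @ delta) in one pass
    out, jvp_out = jvp(f, (theta_star,), (delta,))
    return out + jvp_out
```

Training then optimizes `delta` while `theta_star` stays frozen, which is one way the tangent plane can act as a guide that resists over-fitting to the current task.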
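The abstract says the fixed-size buffer is "updated by simple resampling" without further detail. Reservoir sampling is the standard such scheme in replay-based continual learning, so the sketch below should be read as an assumption about the mechanism, not as the paper's exact procedure.

```python
import random

class ReservoirBuffer:
    """Fixed-size memory buffer updated by reservoir sampling: every
    example seen so far ends up in the buffer with equal probability,
    without ever storing more than `capacity` items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.num_seen = 0

    def add(self, example):
        self.num_seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Replace a stored item with probability capacity / num_seen
            j = random.randrange(self.num_seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        """Draw a replay mini-batch of up to k stored examples."""
        return random.sample(self.data, min(k, len(self.data)))
```

A buffer this simple is what makes the method easy to combine with existing replay-based approaches: relaxing the constraint to also store metadata (such as logits) only changes what each stored `example` contains.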