Paper Title
Integral Continual Learning Along the Tangent Vector Field of Tasks
Authors
Abstract
We propose a lightweight continual learning method which incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models. The tangent plane to the specialist model acts as a generalist guide and avoids the kind of over-fitting that leads to catastrophic forgetting, while exploiting the convexity of the optimization landscape in the tangent plane. It maintains a small fixed-size memory buffer, as low as 0.4% of the source datasets, which is updated by simple resampling. Our method achieves strong performance across various buffer sizes for different datasets. Specifically, in the class-incremental setting we outperform the existing methods that do not require distillation by an average of 18.77% and 28.48%, for Seq-CIFAR-10 and Seq-TinyImageNet respectively. Our method can easily be used in conjunction with existing replay-based continual learning methods. When memory buffer constraints are relaxed to allow storage of metadata such as logits, we attain an error reduction of 17.84% towards the paragon performance on Seq-CIFAR-10.
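The "tangent plane" of the abstract refers to a first-order Taylor expansion of the network around a fixed set of weights: the linearized model's output is linear in the weight perturbation, so any convex loss remains convex in that perturbation. Below is a minimal sketch of such a linearized forward pass in PyTorch; the names `linearized_forward`, `theta_star`, and `delta` are ours for illustration, not the paper's, and the paper's actual construction along the vector field of tasks may differ.

```python
import torch
from torch.func import functional_call, jvp

def linearized_forward(model, theta_star, delta, x):
    """First-order Taylor expansion of the network around theta_star:

        f_lin(x; delta) = f(x; theta*) + J_theta f(x; theta*) @ delta

    The output is linear in `delta`, so a convex loss (e.g. cross-entropy)
    stays convex in `delta` -- the convexity the abstract exploits.

    theta_star: dict of parameter tensors (the expansion point)
    delta:      dict with the same structure (the tangent-plane offset)
    """
    def f(params):
        return functional_call(model, params, (x,))

    # jvp returns (f(theta_star), J(theta_star) @ delta) in one pass
    out, jvp_out = jvp(f, (theta_star,), (delta,))
    return out + jvp_out
```

Training then optimizes `delta` while `theta_star` stays frozen, which is one way the tangent plane can act as a guide that resists over-fitting to the current task.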
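The abstract says the fixed-size buffer is "updated by simple resampling" without further detail. Reservoir sampling is the standard such scheme in replay-based continual learning, so the sketch below should be read as an assumption about the mechanism, not as the paper's exact procedure.

```python
import random

class ReservoirBuffer:
    """Fixed-size memory buffer updated by reservoir sampling: every
    example seen so far ends up in the buffer with equal probability,
    without ever storing more than `capacity` items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.num_seen = 0

    def add(self, example):
        self.num_seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Replace a stored item with probability capacity / num_seen
            j = random.randrange(self.num_seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        """Draw a replay mini-batch of up to k stored examples."""
        return random.sample(self.data, min(k, len(self.data)))
```

A buffer this simple is what makes the method easy to combine with existing replay-based approaches: relaxing the constraint to also store metadata (such as logits) only changes what each stored `example` contains.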