Paper Title

kDecay: Just adding k-decay items on Learning-Rate Schedule to improve Neural Networks

Paper Authors

Tao Zhang, Wei Li

Paper Abstract

Recent work has shown that optimizing the learning rate (LR) schedule can be a very accurate and efficient way to train deep neural networks. We observe that the rate of change (ROC) of the LR correlates with the training process, but how can this relationship be used to control training and improve accuracy? We propose a new method, k-decay, which simply adds an extra term to commonly used and simple LR schedules (exponential, cosine, and polynomial). It effectively improves the performance of these schedules and also outperforms state-of-the-art LR schedule algorithms such as SGDR, CLR, and AutoLRS. In k-decay, different LR schedules are generated by adjusting the hyper-parameter \(k\); as \(k\) increases, performance improves. We evaluate the k-decay method on the CIFAR and ImageNet datasets with different neural networks (ResNet, Wide ResNet). Our experiments show that this method improves most of them: accuracy is improved by 1.08\% on CIFAR-10, by 2.07\% on CIFAR-100, and by 1.25\% on ImageNet. Our method is not only a general approach that can be applied to other LR schedules, but also incurs no additional computational cost.
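
The abstract names the extra k-decay term but does not give its exact form. As an illustrative sketch only, the following assumes the k-decay term replaces the training-progress ratio t/T in a cosine schedule with t^k/T^k, so that \(k = 1\) recovers the plain schedule and larger \(k\) holds the LR high for longer before decaying faster near the end. Function and variable names here are our own, not from the paper.

```python
import math

def cosine_k_decay_lr(t, T, lr_max, lr_min=0.0, k=1.0):
    """Cosine LR schedule with an assumed k-decay exponent.

    With k = 1 this is standard cosine annealing from lr_max (t = 0)
    down to lr_min (t = T). The exact way the paper injects k is an
    assumption based on the abstract, not a verified transcription.
    """
    progress = (t ** k) / (T ** k)  # k-decay term: reduces to t/T when k == 1
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

# Example: 100-epoch run with peak LR 0.1; larger k decays later but faster.
for k in (1, 2, 3):
    lrs = [round(cosine_k_decay_lr(t, 100, 0.1, k=k), 4) for t in (0, 50, 90)]
    print(f"k={k}: LR at epochs 0/50/90 -> {lrs}")
```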
