Paper Title

Stochastic Batch Augmentation with An Effective Distilled Dynamic Soft Label Regularizer

Paper Authors

Qian Li, Qingyuan Hu, Yong Qi, Saiyu Qi, Jie Ma, Jian Zhang

Paper Abstract

Data augmentation has been used intensively in training deep neural networks to improve generalization, whether in the original space (e.g., image space) or in a representation space. Despite its success, the connection between the synthesized data and the original data is largely ignored in training: the distributional information that the synthesized samples surround the original samples is not taken into account, so the network's behavior is not optimized for it. Yet this behavior is crucially important for generalization, even in the adversarial setting, and hence for the safety of deep learning systems. In this work, we propose a framework called Stochastic Batch Augmentation (SBA) to address these problems. SBA stochastically decides whether to augment at each iteration, controlled by a batch scheduler, and introduces a "distilled" dynamic soft-label regularization that incorporates the similarity in the vicinity distribution with respect to the raw samples. The proposed regularization provides direct supervision via the KL divergence between the output softmax distributions of the original and virtual data. Our experiments on CIFAR-10, CIFAR-100, and ImageNet show that SBA can improve the generalization of neural networks and speed up the convergence of network training.
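To make the mechanism concrete, below is a minimal PyTorch sketch of the two ingredients the abstract describes: a stochastic gate that decides per iteration whether to augment, and a KL-divergence regularizer between the softmax outputs on original and virtual (augmented) samples. This is an illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian vicinity perturbation, and the `augment_prob`, `reg_weight`, and `temperature` parameters are all hypothetical.

```python
import torch
import torch.nn.functional as F

def sba_kl_regularizer(model, x_raw, noise_std=0.1, temperature=1.0):
    """Sketch of the distilled dynamic soft-label regularizer:
    KL divergence between the model's softmax outputs on original
    and virtual (vicinity) inputs. The Gaussian perturbation is one
    assumed choice of augmentation, not the paper's exact recipe."""
    # Virtual samples drawn from the vicinity of the raw samples.
    x_virtual = x_raw + noise_std * torch.randn_like(x_raw)

    logits_raw = model(x_raw)
    logits_virtual = model(x_virtual)

    # Soft labels from the raw samples act as the dynamic teacher;
    # detach so only the virtual branch is pushed toward them.
    p_raw = F.softmax(logits_raw.detach() / temperature, dim=1)
    log_q_virtual = F.log_softmax(logits_virtual / temperature, dim=1)

    # KL(p_raw || q_virtual), averaged over the batch.
    return F.kl_div(log_q_virtual, p_raw, reduction="batchmean")

def training_step(model, x, y, optimizer, augment_prob=0.5, reg_weight=1.0):
    """One training iteration. The batch scheduler is modeled here as
    a simple Bernoulli gate deciding whether to add the regularized
    augmentation loss; the paper's scheduler may be more elaborate."""
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    if torch.rand(1).item() < augment_prob:
        loss = loss + reg_weight * sba_kl_regularizer(model, x)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Detaching the raw-sample logits is one plausible reading of "direct supervision": the original sample's output distribution serves as a fixed soft target for its virtual neighbors at each step, so the target evolves with the network (hence "dynamic") without gradients flowing through the teacher branch.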
