Title
MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
Authors
Abstract
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. Based on this new insight, we propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
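The interpolation the abstract describes can be sketched in a few lines. This is a minimal illustration of vanilla Mixup (not the paper's improved MixupE variant); the parameter `alpha=0.2` is a common choice for the Beta distribution, not a value taken from the paper.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Generate one Mixup sample from a pair of inputs and one-hot labels.

    Draws lam ~ Beta(alpha, alpha) and returns the convex combinations
    lam * (x1, y1) + (1 - lam) * (x2, y2).
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)  # interpolation weight in [0, 1]
    x_mix = lam * x1 + (1 - lam) * x2
    y_mix = lam * y1 + (1 - lam) * y2
    return x_mix, y_mix
```

During training, pairs are typically drawn by shuffling the minibatch, and the mixed samples replace (rather than augment) the originals for that step.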