Paper Title
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
Paper Authors
Paper Abstract
From CNNs to attention mechanisms, encoding inductive biases into neural networks has been a fruitful source of improvement in machine learning. Adding auxiliary losses to the main objective function is a general way of encoding biases that can help networks learn better representations. However, since auxiliary losses are minimized only on training data, they suffer from the same generalization gap as regular task losses. Moreover, by adding a term to the loss function, the model optimizes a different objective than the one we care about. In this work we address both problems: first, we take inspiration from transductive learning and note that, after receiving an input but before making a prediction, we can fine-tune our networks on any unsupervised loss. We call this process tailoring, because we customize the model to each input to ensure our prediction satisfies the inductive bias. Second, we formulate meta-tailoring, a nested optimization similar to that in meta-learning, and train our models to perform well on the task objective after adapting them using an unsupervised loss. The advantages of tailoring and meta-tailoring are discussed theoretically and demonstrated empirically on a diverse set of examples.
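To make the two ideas concrete, here is a minimal JAX sketch, not the authors' implementation: `predict`, `unsup_loss`, `tailor`, and `meta_tailoring_loss` are hypothetical names, the linear model is a stand-in for any network, and the zero-sum constraint stands in for whatever unsupervised inductive-bias loss one wants to enforce.

```python
import jax
import jax.numpy as jnp

def predict(params, x):
    # Toy model: a single linear layer (stand-in for any network).
    return params["w"] @ x + params["b"]

def unsup_loss(params, x):
    # Hypothetical inductive bias: predictions should sum to zero,
    # e.g. a conservation law. Any unsupervised loss could be used here.
    return jnp.sum(predict(params, x)) ** 2

def tailor(params, x, steps=5, lr=1e-2):
    # Tailoring: a few gradient steps on the unsupervised loss for this
    # particular input, customizing the model before it predicts.
    grad_fn = jax.grad(unsup_loss)
    for _ in range(steps):
        grads = grad_fn(params, x)
        params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params

def meta_tailoring_loss(params, x, y_true):
    # Meta-tailoring: evaluate the supervised task loss *after* the
    # unsupervised adaptation; differentiating through the adaptation
    # trains the initial parameters so that the tailored model performs
    # well (a nested optimization, as in meta-learning).
    tailored = tailor(params, x)
    return jnp.mean((predict(tailored, x) - y_true) ** 2)

# Usage: adapt to one input at prediction time, then predict.
params = {"w": jnp.ones((3, 4)), "b": jnp.zeros(3)}
x = jnp.arange(4.0)
y = predict(tailor(params, x), x)

# Outer training step for meta-tailoring: the gradient flows through
# the inner `tailor` loop back to the initial parameters.
meta_grads = jax.grad(meta_tailoring_loss)(params, x, jnp.zeros(3))
```

Because the inner adaptation is written as pure functional updates, `jax.grad` differentiates straight through it; that outer-loop-through-inner-loop structure is the nested optimization the abstract compares to meta-learning.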