Paper Title
Robustness to Transformations Across Categories: Is Robustness To Transformations Driven by Invariant Neural Representations?
Paper Authors
Paper Abstract
Deep Convolutional Neural Networks (DCNNs) have demonstrated impressive robustness in recognizing objects under transformations (e.g., blur or noise) when these transformations are included in the training set. A hypothesis to explain such robustness is that DCNNs develop invariant neural representations that remain unaltered when the image is transformed. However, to what extent this hypothesis holds true is an outstanding question, as robustness to transformations could be achieved through properties other than invariance, e.g., parts of the network could be specialized to recognize either transformed or non-transformed images. This paper investigates the conditions under which invariant neural representations emerge, leveraging the fact that they facilitate robustness to transformations beyond the training distribution. Concretely, we analyze a training paradigm in which only some object categories are seen transformed during training, and we evaluate whether the DCNN is robust to transformations for categories not seen transformed. Our results with state-of-the-art DCNNs indicate that invariant neural representations do not always drive robustness to transformations, as networks show robustness for categories seen transformed during training even in the absence of invariant neural representations. Invariance emerges only as the number of transformed categories in the training set increases. This phenomenon is much more prominent with local transformations such as blurring and high-pass filtering than with geometric transformations such as rotation and thinning, which entail changes in the spatial arrangement of the object. Our results contribute to a better understanding of invariant neural representations in deep learning and the conditions under which they spontaneously emerge.
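The training paradigm described in the abstract can be illustrated with a short sketch: a category-conditional transformation is applied only to a subset of categories during training, and the remaining categories are held out to evaluate robustness to transformations they were never seen under. The choice of Gaussian blur, the 10-category split, and all variable names below are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a category-wise transformation split (illustrative only).
from PIL import ImageFilter

def make_training_image(image, category, transformed_categories):
    """Apply the transformation (Gaussian blur, as one example) only when the
    sample's category belongs to the subset seen transformed during training."""
    if category in transformed_categories:
        return image.filter(ImageFilter.GaussianBlur(radius=2))
    return image

# Hypothetical split: of 10 categories, only the first k are ever seen transformed.
all_categories = list(range(10))
k = 3
transformed_categories = set(all_categories[:k])

# Robustness across categories is then measured by transforming test images from
# the remaining categories, which were only seen non-transformed during training.
held_out_categories = [c for c in all_categories if c not in transformed_categories]
```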