Paper title
Cyclic orthogonal convolutions for long-range integration of features
Paper authors
Paper abstract
In Convolutional Neural Networks (CNNs) information flows across a small neighbourhood of each pixel of an image, preventing long-range integration of features before reaching deep layers in the network. We propose a novel architecture that allows flexible information flow between features $z$ and locations $(x,y)$ across the entire image with a small number of layers. This architecture uses a cycle of three orthogonal convolutions, not only in $(x,y)$ coordinates, but also in $(x,z)$ and $(y,z)$ coordinates. We stack a sequence of such cycles to obtain our deep network, named CycleNet. As this only requires a permutation of the axes of a standard convolution, its performance can be directly compared to a CNN. Our model obtains competitive results on image classification on the CIFAR-10 and ImageNet datasets when compared to CNNs of similar size. We hypothesise that long-range integration favours recognition of objects by shape rather than texture, and we show that CycleNet transfers better than CNNs to stylised images. On the Pathfinder challenge, where integration of distant features is crucial, CycleNet outperforms CNNs by a large margin. We also show that even when employing a small convolutional kernel, the size of receptive fields of CycleNet reaches its maximum after one cycle, while conventional CNNs require a large number of layers.
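The key mechanism described in the abstract is that a standard convolution is fully connected along its channel axis, so permuting which axis plays the channel role lets each of the three convolutions in a cycle mix one axis globally. The toy numpy sketch below illustrates this receptive-field claim only; it is not the paper's implementation, and the all-ones kernel, tensor sizes, and function names (`conv`, `cycle`) are illustrative assumptions.

```python
import numpy as np

def conv(t, k=3):
    # Toy standard convolution with an all-ones kernel: fully connected
    # across the channel (first) axis, k x k window over the last two axes.
    c, h, w = t.shape
    p = k // 2
    mixed = np.pad(t, ((0, 0), (p, p), (p, p))).sum(axis=0)  # channel mixing
    out = np.zeros((h, w))
    for i in range(k):
        for j in range(k):
            out += mixed[i:i + h, j:j + w]
    return np.broadcast_to(out, (c, h, w)).copy()

def cycle(t):
    # One cycle on a (z, x, y) tensor: convolve in (x, y), then (x, z),
    # then (y, z), realised purely by permuting the axes before a
    # standard convolution, as the abstract describes.
    t = conv(t)                                        # (x, y) conv, z mixed
    t = conv(t.transpose(2, 1, 0)).transpose(2, 1, 0)  # (x, z) conv, y mixed
    t = conv(t.transpose(1, 2, 0)).transpose(2, 0, 1)  # (y, z) conv, x mixed
    return t

# Receptive-field check: an impulse in one corner reaches every position
# after a single cycle, but not after three plain spatial convolutions.
x0 = np.zeros((8, 8, 8))
x0[0, 0, 0] = 1.0
cyc = cycle(x0)            # nonzero everywhere: global receptive field
cnn = conv(conv(conv(x0))) # 3x3 kernels reach only ~3 pixels in (x, y)
```

With a 3x3 kernel, the plain stack leaves the far corner untouched (`cnn[0, 7, 7] == 0`), while one cycle makes every entry of `cyc` nonzero, matching the abstract's claim that CycleNet's receptive field reaches its maximum after a single cycle.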