Paper Title


Pooling Revisited: Your Receptive Field is Suboptimal

Authors

Dong-Hwan Jang, Sanghyeok Chu, Joonhyuk Kim, Bohyung Han

Abstract


The size and shape of the receptive field determine how a network aggregates local information and considerably affect the overall performance of a model. Many components of a neural network, such as the kernel sizes and strides of convolution and pooling operations, influence the configuration of the receptive field. However, these settings remain hyperparameters, and the receptive fields of existing models end up with suboptimal shapes and sizes. Hence, we propose a simple yet effective Dynamically Optimized Pooling operation, referred to as DynOPool, which optimizes the scale factors of feature maps end-to-end by learning the desirable size and shape of its receptive field at each layer. Any resizing module in a deep neural network can be replaced by DynOPool at minimal cost. In addition, DynOPool controls the complexity of a model by introducing an extra loss term that constrains computational cost. Our experiments show that models equipped with the proposed learnable resizing module outperform the baseline networks on multiple datasets for image classification and semantic segmentation.
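The core idea in the abstract, learning per-layer scale factors that set the output resolution of a resizing module, and penalizing the resulting compute, can be sketched in a few lines. The sketch below is illustrative only: the function names, the nearest-neighbor resize, and the activation-count penalty are assumptions for exposition, not the paper's actual differentiable formulation or loss.

```python
def resize_by_scale(fmap, r_h, r_w):
    """Resize a 2D feature map by scale factors (r_h, r_w).

    In DynOPool these scale factors would be trainable parameters
    learned end-to-end; this sketch just applies given values with a
    plain nearest-neighbor resize (names and method are illustrative).
    """
    h, w = len(fmap), len(fmap[0])
    out_h = max(1, round(r_h * h))  # learned scale sets output height
    out_w = max(1, round(r_w * w))  # learned scale sets output width
    return [[fmap[min(h - 1, int(i / r_h))][min(w - 1, int(j / r_w))]
             for j in range(out_w)]
            for i in range(out_h)]


def cost_penalty(out_h, out_w, channels, lam=1e-9):
    """Toy stand-in for the additional loss term that constrains
    computational cost: penalize the activation count of the resized
    map, scaled by a coefficient lam (hypothetical formulation)."""
    return lam * out_h * out_w * channels


# Example: downscale a 4x4 map to 2x2 and measure the toy penalty.
fm = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
small = resize_by_scale(fm, 0.5, 0.5)
penalty = cost_penalty(len(small), len(small[0]), channels=64)
```

Because the penalty grows with the output resolution chosen by the scale factors, adding it to the task loss pushes the learned receptive fields toward configurations that balance accuracy against compute, which is the trade-off the abstract describes.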
