Paper Title

DRTAM: Dual Rank-1 Tensor Attention Module

Paper Authors

Hanxing Chi, Baihong Lin, Jun Hu, Liang Wang

Paper Abstract

Recently, attention mechanisms have been extensively investigated in computer vision, but few of them show excellent performance on both large and mobile networks. This paper proposes the Dual Rank-1 Tensor Attention Module (DRTAM), a novel residual-attention-learning-guided attention module for feed-forward convolutional neural networks. Given a 3D feature tensor map, DRTAM first generates three 2D feature descriptors along the three axes. Then, using the three descriptors, DRTAM sequentially infers two rank-1 tensor attention maps, the initial attention map and the complement attention map, combines them, and multiplies the result with the input feature map for adaptive feature refinement (see Fig. 1(c)). To generate the two attention maps, DRTAM introduces the rank-1 tensor attention module (RTAM) and the residual descriptors extraction module (RDEM). RTAM divides each 2D feature descriptor into several chunks and generates the three factor vectors of a rank-1 tensor attention map by applying strip pooling on each chunk, so that local and long-range contextual information can be captured along the three dimensions respectively. RDEM generates three 2D feature descriptors of the residual feature to produce the complement attention map, using the three factor vectors of the initial attention map and the three descriptors of the input feature. Extensive experimental results on ImageNet-1K, MS COCO and PASCAL VOC demonstrate that DRTAM achieves competitive performance on both large and mobile networks compared with other state-of-the-art attention modules.
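To make the mechanism concrete, below is a minimal PyTorch sketch of a rank-1 tensor attention map in the spirit of DRTAM: pool the C x H x W feature tensor along each axis to get three 2D descriptors, reduce each descriptor to a factor vector, and form the 3D attention map as the outer product of the three factors. The class name Rank1TensorAttention, the plain mean pooling, and the sigmoid gating are illustrative assumptions; the paper's RTAM uses chunked strip pooling, and the RDEM-generated complement map is omitted here.

```python
import torch
import torch.nn as nn

class Rank1TensorAttention(nn.Module):
    """Minimal sketch of a rank-1 tensor attention map (not the authors' exact design)."""

    def forward(self, x):
        # x: (B, C, H, W)
        # Three 2D descriptors, one per axis of the C x H x W tensor.
        d_hw = x.mean(dim=1)   # (B, H, W): pooled over channels
        d_ch = x.mean(dim=3)   # (B, C, H): pooled over width
        d_cw = x.mean(dim=2)   # (B, C, W): pooled over height

        # Factor vectors; the paper derives these via chunked strip pooling,
        # which this sketch approximates with a second mean pooling.
        v_c = torch.sigmoid(d_ch.mean(dim=2))   # (B, C)
        v_h = torch.sigmoid(d_hw.mean(dim=2))   # (B, H)
        v_w = torch.sigmoid(d_cw.mean(dim=1))   # (B, W)

        # Rank-1 3D attention map: outer product of the three factor vectors,
        # multiplied onto the input for adaptive feature refinement.
        attn = torch.einsum('bc,bh,bw->bchw', v_c, v_h, v_w)
        return x * attn

# Usage: the module is parameter-free and preserves the input shape.
x = torch.randn(2, 64, 32, 32)
print(Rank1TensorAttention()(x).shape)  # torch.Size([2, 64, 32, 32])
```

Because the map is a rank-1 tensor, it costs only C + H + W attention values instead of a full C x H x W map, which is what lets the module stay cheap enough for mobile networks.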
