TSA-Net: Tube Self-Attention Network for Action Quality Assessment

Paper Title

TSA-Net: Tube Self-Attention Network for Action Quality Assessment

Authors

Shunli Wang, Dingkang Yang, Peng Zhai, Chixiao Chen, Lihua Zhang

Abstract

In recent years, assessing action quality from videos has attracted growing attention in the computer vision and human-computer interaction communities. Most existing approaches tackle this problem by directly migrating models from action recognition tasks, which ignores intrinsic differences within the feature map, such as foreground and background information. To address this issue, we propose a Tube Self-Attention Network (TSA-Net) for action quality assessment (AQA). Specifically, we introduce a single object tracker into AQA and propose the Tube Self-Attention (TSA) module, which efficiently generates rich spatio-temporal contextual information by adopting sparse feature interactions. The TSA module is embedded in existing video networks to form TSA-Net. Overall, TSA-Net has the following merits: 1) high computational efficiency, 2) high flexibility, and 3) state-of-the-art performance. Extensive experiments are conducted on popular action quality assessment datasets, including AQA-7 and MTL-AQA. In addition, a dataset named Fall Recognition in Figure Skating (FR-FS) is proposed to explore basic action assessment in figure skating scenes.
