Paper Title
A Comprehensive Study on Temporal Modeling for Online Action Detection
Authors
Abstract
Online action detection (OAD) is a practical yet challenging task that has attracted increasing attention in recent years. A typical OAD system consists mainly of three modules: a frame-level feature extractor, usually based on pre-trained deep Convolutional Neural Networks (CNNs); a temporal modeling module; and an action classifier. Among them, the temporal modeling module is crucial, as it aggregates discriminative information from historical and current features. Although many temporal modeling methods have been developed for OAD and other topics, their effects on OAD have not been fairly investigated. This paper aims to provide a comprehensive study of temporal modeling for OAD, covering four meta types of temporal modeling methods, i.e., temporal pooling, temporal convolution, recurrent neural networks, and temporal attention, and to uncover good practices for producing a state-of-the-art OAD system. Many of these methods are explored in OAD for the first time and are extensively evaluated with various hyperparameters. Furthermore, based on this study, we present several hybrid temporal modeling methods that outperform recent state-of-the-art methods by sizable margins on THUMOS-14 and TVSeries.
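To make the role of the temporal modeling module concrete, the following is a minimal sketch of two of the four method families named in the abstract: temporal pooling and temporal attention. It is not the paper's implementation. The function names, the use of the current (last) frame as the attention query, and the toy two-dimensional features are all assumptions made for illustration.

```python
import math

def avg_pool(features):
    """Temporal average pooling: mean of each feature dimension over time."""
    steps = len(features)
    return [sum(column) / steps for column in zip(*features)]

def max_pool(features):
    """Temporal max pooling: maximum of each feature dimension over time."""
    return [max(column) for column in zip(*features)]

def attention_pool(features):
    """Scaled dot-product attention over the history.

    Assumption: the current (last) frame feature serves as the query,
    and the raw frame features serve as both keys and values.
    """
    query = features[-1]
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, frame)) / math.sqrt(dim)
        for frame in features
    ]
    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of frame features.
    return [
        sum(w * frame[d] for w, frame in zip(weights, features))
        for d in range(dim)
    ]

# Hypothetical history of three frame-level features (oldest first).
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled_avg = avg_pool(history)
pooled_max = max_pool(history)
pooled_att = attention_pool(history)
```

In each case the output is a single vector summarizing the historical and current features, which an action classifier would then consume. Pooling weights all frames equally (or keeps only the per-dimension maximum), whereas attention weights each frame by its similarity to the current one.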