Paper Title

Adaptive Video Highlight Detection by Learning from User History

Paper Authors

Mrigank Rochan, Mahesh Kumar Krishna Reddy, Linwei Ye, Yang Wang

Paper Abstract

Recently, there has been increasing interest in highlight detection research, where the goal is to create a short-duration video from a longer video by extracting its interesting moments. However, most existing methods ignore the fact that the definition of a video highlight is highly subjective: different users may have different highlight preferences for the same input video. In this paper, we propose a simple yet effective framework that learns to adapt highlight detection to a user by exploiting the user's history, in the form of highlights that the user has previously created. Our framework consists of two sub-networks: a fully temporal convolutional highlight detection network $H$ that predicts highlights for an input video, and a history encoder network $M$ for the user history. We introduce a newly designed temporal-adaptive instance normalization (T-AIN) layer into $H$, through which the two sub-networks interact. T-AIN has affine parameters that are predicted by $M$ from the user history, and it carries the user-adaptive signal into $H$. Extensive experiments on a large-scale dataset show that our framework makes more accurate and user-specific highlight predictions.
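To make the T-AIN mechanism concrete, below is a minimal PyTorch sketch of a temporal-adaptive instance normalization layer, assuming the history encoder $M$ produces a fixed-size embedding of a user's past highlights. The class name `TAIN`, the tensor shapes, and the two linear prediction heads are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a temporal-adaptive instance normalization (T-AIN) layer.
# Assumption: video features inside H have shape (batch, channels, time), and
# the history encoder M yields a (batch, history_dim) user embedding.
import torch
import torch.nn as nn


class TAIN(nn.Module):
    """Instance-normalize features over time, then apply per-channel affine
    parameters (gamma, beta) predicted from a user-history embedding."""

    def __init__(self, num_channels: int, history_dim: int):
        super().__init__()
        # No learned affine parameters here; the affine transform instead
        # comes from the user history, which is what makes the layer adaptive.
        self.norm = nn.InstanceNorm1d(num_channels, affine=False)
        self.to_gamma = nn.Linear(history_dim, num_channels)
        self.to_beta = nn.Linear(history_dim, num_channels)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) features in the highlight network H
        # h: (batch, history_dim) embedding from the history encoder M
        gamma = self.to_gamma(h).unsqueeze(-1)  # (batch, channels, 1)
        beta = self.to_beta(h).unsqueeze(-1)    # (batch, channels, 1)
        return gamma * self.norm(x) + beta


# Usage: modulate one video's features with a user-specific embedding.
features = torch.randn(2, 256, 120)  # 120 time steps, 256 channels (assumed)
history = torch.randn(2, 128)        # pooled embedding of past highlights
out = TAIN(num_channels=256, history_dim=128)(features, history)
print(out.shape)  # torch.Size([2, 256, 120])
```

In this sketch the user-adaptive signal enters $H$ only through the predicted scale and shift, so the same highlight network can produce different predictions for different users without retraining its convolutional weights.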
