Sentence Guided Temporal Modulation for Dynamic Video Thumbnail Generation

Paper Title

Sentence Guided Temporal Modulation for Dynamic Video Thumbnail Generation

Authors

Mrigank Rochan, Mahesh Kumar Krishna Reddy, Yang Wang

Abstract

We consider the problem of sentence specified dynamic video thumbnail generation. Given an input video and a user query sentence, the goal is to generate a video thumbnail that not only provides the preview of the video content, but also semantically corresponds to the sentence. In this paper, we propose a sentence guided temporal modulation (SGTM) mechanism that utilizes the sentence embedding to modulate the normalized temporal activations of the video thumbnail generation network. Unlike the existing state-of-the-art method that uses recurrent architectures, we propose a non-recurrent framework that is simple and allows much more parallelization. Extensive experiments and analysis on a large-scale dataset demonstrate the effectiveness of our framework.
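
The abstract describes SGTM only at a high level, so the following is a rough sketch of one way a sentence embedding could modulate normalized temporal activations, not the authors' implementation. It assumes per-channel normalization over the temporal axis and a scale and shift predicted from the sentence embedding by plain linear maps. The function name and the parameters `w_gamma` and `w_beta` are hypothetical and do not come from the paper.

```python
import numpy as np


def sentence_guided_modulation(features, sentence_emb, w_gamma, w_beta, eps=1e-5):
    """Modulate temporal activations with a sentence embedding (illustrative only).

    features:     (T, C) array, one C-dimensional activation per time step
    sentence_emb: (D,) sentence embedding
    w_gamma:      (D, C) hypothetical linear map producing per-channel scales
    w_beta:       (D, C) hypothetical linear map producing per-channel shifts
    """
    # Assumption: normalize each channel over the temporal axis.
    mean = features.mean(axis=0, keepdims=True)
    var = features.var(axis=0, keepdims=True)
    normalized = (features - mean) / np.sqrt(var + eps)

    # Assumption: the sentence embedding predicts a scale and a shift per channel.
    gamma = sentence_emb @ w_gamma  # shape (C,)
    beta = sentence_emb @ w_beta    # shape (C,)

    # Broadcast the per-channel scale and shift across all time steps.
    return gamma * normalized + beta


# Toy usage with random inputs: 8 time steps, 4 channels, 6-dim sentence embedding.
rng = np.random.default_rng(0)
T, C, D = 8, 4, 6
features = rng.normal(size=(T, C))
sentence_emb = rng.normal(size=D)
w_gamma = rng.normal(size=(D, C))
w_beta = rng.normal(size=(D, C))
out = sentence_guided_modulation(features, sentence_emb, w_gamma, w_beta)
```

Every time step is processed with the same array operations and no step depends on the previous one, which is the kind of parallelism the abstract contrasts with recurrent architectures.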
