Paper Title
Sentence Guided Temporal Modulation for Dynamic Video Thumbnail Generation
Paper Authors
Paper Abstract
We consider the problem of sentence-specified dynamic video thumbnail generation. Given an input video and a user query sentence, the goal is to generate a video thumbnail that not only provides a preview of the video content but also corresponds semantically to the sentence. In this paper, we propose a sentence-guided temporal modulation (SGTM) mechanism that uses the sentence embedding to modulate the normalized temporal activations of the video thumbnail generation network. Unlike the existing state-of-the-art method, which uses recurrent architectures, we propose a non-recurrent framework that is simple and allows much more parallelization. Extensive experiments and analysis on a large-scale dataset demonstrate the effectiveness of our framework.
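
The abstract only states that a sentence embedding modulates normalized temporal activations. It does not give the exact formulation. The sketch below is one plausible reading, not the authors' implementation. It assumes that each channel is normalized across time steps and then scaled and shifted by per-channel values predicted from the sentence embedding. The function names, the linear projections, and the weights `w_gamma`, `b_gamma`, `w_beta`, `b_beta` are all hypothetical.

```python
import math


def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def temporal_normalize(features, eps=1e-5):
    """Normalize each channel to zero mean and unit variance across time.

    `features` is a list of T time steps, each a list of C channel values.
    Normalizing over the time axis is an assumption of this sketch.
    """
    num_steps = len(features)
    num_channels = len(features[0])
    normalized = [[0.0] * num_channels for _ in range(num_steps)]
    for c in range(num_channels):
        column = [features[t][c] for t in range(num_steps)]
        mean = sum(column) / num_steps
        var = sum((v - mean) ** 2 for v in column) / num_steps
        for t in range(num_steps):
            normalized[t][c] = (column[t] - mean) / math.sqrt(var + eps)
    return normalized


def sentence_guided_modulation(features, sentence_embedding,
                               w_gamma, b_gamma, w_beta, b_beta):
    """Scale and shift normalized activations using the sentence embedding.

    gamma and beta (one value per channel) come from hypothetical linear
    projections of the sentence embedding.
    """
    gamma = linear(w_gamma, b_gamma, sentence_embedding)
    beta = linear(w_beta, b_beta, sentence_embedding)
    normalized = temporal_normalize(features)
    return [[g * v + b for v, g, b in zip(step, gamma, beta)]
            for step in normalized]


# Toy input: 2 time steps, 2 channels, a 2-dimensional sentence embedding.
clip_features = [[1.0, 10.0], [3.0, 10.0]]
query_embedding = [1.0, 0.0]
modulated = sentence_guided_modulation(
    clip_features, query_embedding,
    w_gamma=[[2.0, 0.0], [1.0, 0.0]], b_gamma=[0.0, 0.0],
    w_beta=[[0.0, 0.0], [0.0, 0.0]], b_beta=[0.5, -1.0],
)
```

In this sketch every time step is processed independently once the per-channel statistics are known, so there is no recurrent dependency between steps. This matches the abstract's point about a non-recurrent framework allowing more parallelization.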