Semantic-Aware Pretraining for Dense Video Captioning

Paper title

Semantic-Aware Pretraining for Dense Video Captioning

Authors

Teng Wang, Zhu Liu, Feng Zheng, Zhichao Lu, Ran Cheng, Ping Luo

Abstract

This report describes the details of our approach for the event dense-captioning task in ActivityNet Challenge 2021. We present a semantic-aware pretraining method for dense video captioning, which empowers the learned features to recognize high-level semantic concepts. Diverse video features of different modalities are fed into an event captioning module to generate accurate and meaningful sentences. Our final ensemble model achieves a 10.00 METEOR score on the test set.
