Paper title
QueryProp: Object Query Propagation for High-Performance Video Object Detection
Paper authors
Paper abstract
Video object detection is an important yet challenging topic in computer vision. Traditional methods mainly focus on designing image-level or box-level feature propagation strategies to exploit temporal information. This paper argues that, with a more effective and efficient feature propagation framework, video object detectors can improve in both accuracy and speed. To this end, it studies object-level feature propagation and proposes an object query propagation (QueryProp) framework for high-performance video object detection. QueryProp contains two propagation strategies: 1) queries are propagated from sparse key frames to dense non-key frames to reduce redundant computation on non-key frames; 2) queries are propagated from previous key frames to the current key frame to improve feature representation through temporal context modeling. To further facilitate query propagation, an adaptive propagation gate is designed to enable flexible key frame selection. We conduct extensive experiments on the ImageNet VID dataset. QueryProp achieves accuracy comparable to state-of-the-art methods and strikes a decent accuracy/speed trade-off. Code is available at https://github.com/hf1995/QueryProp.
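The control flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the scalar "frames", the blending weights, and the hand-written threshold gate are all hypothetical stand-ins (the abstract does not specify how the gate is built or how queries are represented). The sketch also assumes that non-key frames refine the most recent queries frame to frame. It only shows how the two propagation paths and the gate might fit together.

```python
from typing import List, Optional, Sequence, Tuple

Queries = List[float]  # toy stand-in for a set of object query embeddings


def detect_key_frame(frame: float, prev_key: Optional[Queries]) -> Queries:
    """Hypothetical full detector: builds queries from scratch, then blends in
    the previous key frame's queries as a crude form of temporal context."""
    fresh = [frame, frame + 1.0]
    if prev_key is None:
        return fresh
    return [0.8 * f + 0.2 * p for f, p in zip(fresh, prev_key)]


def refine_non_key_frame(frame: float, propagated: Queries) -> Queries:
    """Hypothetical lightweight head: nudges the propagated queries toward the
    current frame instead of recomputing them from scratch."""
    target = [frame, frame + 1.0]
    return [0.5 * q + 0.5 * t for q, t in zip(propagated, target)]


def gate_requests_key_frame(frame: float, key_frame: float, threshold: float) -> bool:
    """Hypothetical gate: requests a new key frame once the content has drifted
    too far from the last key frame. This hand-written rule is an assumption."""
    return abs(frame - key_frame) > threshold


def run_sketch(frames: Sequence[float], threshold: float = 1.0) -> List[Tuple[str, Queries]]:
    results: List[Tuple[str, Queries]] = []
    key_queries: Optional[Queries] = None  # queries of the last key frame
    key_frame: Optional[float] = None      # content of the last key frame
    queries: Queries = []                  # most recent queries of any frame
    for frame in frames:
        is_key = key_frame is None or gate_requests_key_frame(frame, key_frame, threshold)
        if is_key:
            # Strategy 2: previous key frame -> current key frame.
            key_queries = detect_key_frame(frame, key_queries)
            key_frame = frame
            queries = key_queries
            results.append(("key", queries))
        else:
            # Strategy 1: propagated queries -> non-key frame (cheap path).
            queries = refine_non_key_frame(frame, queries)
            results.append(("non-key", queries))
    return results


if __name__ == "__main__":
    for kind, q in run_sketch([0.0, 0.1, 0.2, 5.0, 5.1]):
        print(kind, q)
```

With the toy input above, the first frame and the frame where the content jumps to 5.0 are treated as key frames, and the remaining frames take the cheap path. A gate that adapts to the content, rather than a fixed key frame interval, spends the expensive detector only where the propagated queries are likely to be stale.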