Paper Title
Revisiting Sequence-to-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory
Paper Authors
Paper Abstract
Video Object Segmentation (VOS) is an active research area in the visual domain. One of its fundamental sub-tasks is semi-supervised / one-shot learning: given only the segmentation mask for the first frame, the task is to provide pixel-accurate masks for the object over the rest of the sequence. Despite much progress in recent years, we have noticed that many existing approaches lose objects in longer sequences, especially when the object is small or briefly occluded. In this work, we build upon a sequence-to-sequence approach that employs an encoder-decoder architecture together with a memory module for exploiting the sequential data. We further improve this approach by proposing a model that manipulates multi-scale spatio-temporal information using memory-equipped skip connections. Furthermore, we incorporate an auxiliary task based on distance classification, which greatly enhances the quality of edges in the segmentation masks. We compare our approach to the state of the art and show considerable improvement in the contour accuracy metric and the overall segmentation accuracy.
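To make the two ideas in the abstract concrete, below is a minimal PyTorch sketch of (1) a "skip-memory" connection, i.e. an encoder skip feature routed through a convolutional recurrent cell so multi-scale information is carried across frames, and (2) a multi-task loss that adds an auxiliary distance-classification term to the usual segmentation loss. The class names, the ConvLSTM choice, the distance-binning scheme, and the loss weighting are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvLSTMCell(nn.Module):
    """Standard convolutional LSTM cell, used here as the per-scale skip memory."""

    def __init__(self, in_ch, hid_ch, kernel_size=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        # Compute input, forget, output gates and candidate cell state.
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, (h, c)

    def init_state(self, x):
        b, _, hgt, wdt = x.shape
        zeros = x.new_zeros(b, self.hid_ch, hgt, wdt)
        return zeros, zeros


class SkipMemory(nn.Module):
    """Skip connection whose features pass through a recurrent memory before
    being handed to the decoder (hypothetical wiring)."""

    def __init__(self, skip_ch, hid_ch):
        super().__init__()
        self.cell = ConvLSTMCell(skip_ch, hid_ch)
        self.state = None

    def reset(self):
        # Call at the start of each new video sequence.
        self.state = None

    def forward(self, skip_feat):
        if self.state is None:
            self.state = self.cell.init_state(skip_feat)
        out, self.state = self.cell(skip_feat, self.state)
        return out


def multi_task_loss(seg_logits, seg_target, dist_logits, dist_target, aux_weight=0.5):
    """Segmentation loss plus an auxiliary distance-classification loss.

    dist_target holds per-pixel class indices obtained by binning each pixel's
    distance to the object boundary; the binning and aux_weight are assumptions.
    """
    seg_loss = F.binary_cross_entropy_with_logits(seg_logits, seg_target)
    dist_loss = F.cross_entropy(dist_logits, dist_target)
    return seg_loss + aux_weight * dist_loss
```

In this sketch the decoder would consume the output of `SkipMemory` at each scale instead of the raw skip feature, and the distance-classification head would share the decoder backbone with the segmentation head so the auxiliary signal sharpens the learned boundaries.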